NIH Data Management and Sharing Policy
Need assistance with NIH Data Management and Sharing?
The 2023 NIH Data Management and Sharing (DMS) Policy applies to all research, funded or conducted in whole or in part by the NIH, that results in the generation of scientific data. The policy details expectations and requirements for managing and sharing scientific data generated by NIH-funded projects, as well as for budgeting the costs of doing so. The policy took effect on January 25, 2023, and applies to all awards with receipt dates on or after that date.
NIH Sharing Website: Contains details on the policy and related information. Access the NIH Data Management and Sharing Policy webpage.
Where to get help understanding the new NIH DMS Policy
- Access FAQs regarding the NIH DMS Policy
- Sample plans from the NIH for 14 different types of data (including secondary data analyses)
- Duke personnel should reach out to researchdata@duke.edu concerning any questions on the NIH DMS policy they may have.
- For inquiries specific to writing a DMS Plan, reach out to datamanagement@duke.edu.
Main Policy Points
- Data Management and Sharing Plan (DMS Plan): As part of the project proposal researchers are required to submit a DMS Plan detailing the data management during the project, AND sharing plans for after the project. This plan will be treated as a term and condition of the award and is expected to be adhered to throughout the lifecycle of the award.
- Sharing Scientific Data: Scientific data generated by the project should be shared as soon as possible, and no later than the time of an associated publication or the end of the project's performance period, whichever comes first.
- Budgeting for the managing and sharing of data: Costs associated with data management and sharing during the project period can now be charged as direct costs. A budget for implementation of the DMS Plan is now required as part of a proposal. Update – NIH has released a Notice of Policy Change on July 31, 2023 announcing a change regarding budgeting for DMS costs. See section below for details.
What funding types does the policy apply to?
The DMS Policy applies to all research, funded or conducted in whole or in part by NIH, that results in the generation of scientific data regardless of funding amount or funding mechanism.
The NIH defines scientific data as follows:
Scientific Data is defined as data commonly accepted in the scientific community as of sufficient quality to validate and replicate research findings, regardless of whether the data are used to support scholarly publications.
- Scientific data includes any data needed to validate and replicate research findings.
- Scientific data does not include laboratory notebooks, preliminary analyses, completed case report forms, drafts of scientific papers, plans for future research, peer reviews, communications with colleagues, or physical objects such as laboratory specimens.
A full list of NIH activity codes subject to the DMS policy (pdf)
The DMS Policy does not apply to research and other activities that do not generate scientific data, including: Ts, Fs, KM1, C06, R13, Gs, S06
Data Management and Sharing Plans
DMS Plans should ideally be no longer than two pages, although longer plans are allowed. The plan is attached to the proposal application as a PDF file.
Plan Elements
Plans should address the following six elements.
1. Data Type
- General summary of types and estimated amounts of scientific data to be generated or used.
- Describe which scientific data will be preserved and shared. If certain data will not be shared, document the reasoning in the plan, based on legal, ethical, and technical factors.
- Brief listing of metadata and documentation that will be made accessible to facilitate interpretation of data.
2. Related Tools, Software and/or Code
- State whether tools, software or code are needed to access or manipulate the data, and if so provide the names and versions if applicable.
- If applicable, specify how needed tools can be accessed, and, whether such tools are likely to remain available for as long as the scientific data remain available.
3. Standards
- State what common data standards (if any) will be applied to the scientific data and associated metadata. Describe how these data standards will be applied (may be related to data formats, data dictionaries, definitions, identifiers, etc.) to the scientific data generated by the research proposed in this project. If applicable, indicate that no consensus standards exist. Here is a link to a helpful resource for documentation and metadata.
4. Data Preservation, Access and Timelines
- Describe when scientific data and metadata/documentation associated with the project will be made available. It is highly recommended that datasets be deposited in an established data repository, an organization that collects, manages, and archives datasets for data analysis, sharing, and reuse. In some cases, a repository may be specified by the funding Institute or Center. In cases where it is not specified, the NIH has guidance on selecting suitable repositories. Small datasets (up to 2 GB in size) may be included as supplementary material to accompany articles submitted to PubMed Central rather than using a separate repository. Duke also provides the Duke Research Data Repository as an option for data that may be openly shared.
- Describe how the data will be findable and identifiable (e.g., with a persistent identifier such as a DOI).
- Describe when the data will be made available for sharing and for how long. Please note that data must be shared no later than time of an associated publication or end of the performance period, whichever comes first. Researchers are encouraged to consider requirements and policies (e.g., data repository policies, award record retention requirements, journal policies) as guidance for the minimum time frame scientific data should be made available.
Note that depositing data into a repository qualifies as data retention for the purposes of the Duke University Retention Policy of 2023. However, the Duke policy states that data must be retained for a minimum of 6 years. If data are removed from a repository less than 6 years after the associated publication was published, the data must be stored in another location accessible to Duke.
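The timing rules above can be sketched as simple date arithmetic. This is an illustrative sketch only, not an official NIH or Duke tool: the function names `sharing_deadline` and `retention_until` are hypothetical, and the sketch assumes that the six-year retention period is counted from the publication date, as the note above describes it.

```python
from datetime import date
from typing import Optional


def sharing_deadline(performance_end: date, publication: Optional[date] = None) -> date:
    """Latest sharing date: publication or end of performance period, whichever comes first."""
    if publication is None:
        # No associated publication: the end of the performance period applies.
        return performance_end
    return min(publication, performance_end)


def retention_until(publication: date, years: int = 6) -> date:
    """Earliest date the minimum retention period ends (assumed to start at publication)."""
    try:
        return publication.replace(year=publication.year + years)
    except ValueError:
        # Publication on Feb 29 and the target year is not a leap year.
        return publication.replace(year=publication.year + years, day=28)


# Hypothetical dates for illustration only.
deadline = sharing_deadline(date(2026, 6, 30), publication=date(2025, 3, 15))
keep_until = retention_until(date(2025, 3, 15))
print(deadline, keep_until)
```

If no publication exists by the end of the award, calling `sharing_deadline` with only the performance end date returns that date.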
5. Access, Distribution, or Reuse Considerations
- Describe any privacy, security, consent or proprietary issues that might affect access and reuse (informed consent, privacy, confidentiality or other regulatory or contractual protections).
- State whether access to data derived from human participants will be controlled (e.g., made available by a data repository only after approval).
- If generating scientific data derived from humans, describe how the privacy, rights, and confidentiality of human research participants will be protected (e.g., through de-identification, Certificates of Confidentiality, and other protective measures). See also the NIH Guidance Protecting Privacy When Sharing Human Research Participant Data.
6. Oversight of Data Management and Sharing
- Describe how compliance with this Plan will be monitored and managed, frequency of oversight, and by whom at Duke (e.g., titles/office name, roles).
As part of this section, Duke University requires you to include the following:
- List the names and titles/roles of everyone on the research team who will be responsible for monitoring compliance with the data management plan.
- State how often compliance with the data management and sharing plan will be verified by team members (e.g. every X months, on the first of each month, etc.)
- The following standard language must also be included in this section: “The Office of Scientific Integrity at Duke University has a data management and sharing plan review and compliance procedure. Consistent with the procedure, the Office will monitor compliance with the DMS plan, including the deposition of award relevant scientific data into the selected data repository, through attestation of the principal investigator(s) listed on the award at milestone reporting periods. The University will also leverage its internal audit function for periodic review of study team adherence with DMS elements.”
Note on the level of Detail in plans:
The NIH does not expect researchers to necessarily have all details at the application stage and the advice is to give the best educated guesses for any information that is not yet available.
If the program officer wants any clarifying details in a plan, they will give feedback, and revisions can be made at Just-in-Time submission.
Plans will be expected to be updated at reporting periods like RPPR to reflect any new changes or plans for the project.
Plan Review
Program staff at the proposed NIH Institute or Center will assess DMS Plans, not peer reviewers. During peer review, reviewers will not be asked to comment on the DMS Plan nor will they factor the DMS Plan into the Overall Impact score. Peer reviewers will only use the information found in the budget justification to determine whether the requested Data Management and Sharing Costs are reasonable.
Templates, Examples and DMPTool
DMS Plan Template
The NIH has provided a template of Data Management and Sharing Plan Format Page for writing a DMS Plan. Please note that the template has text in italics to help explain the answers for each element. This italicized text should be deleted prior to submitting the DMS Plan.
- Additional DMS plan templates from Duke Research Development offices are available within the toolkits on the following pages:
DMS Plan Examples
The NIH has shared sample data management plans for 14 different types of data from four different Institutes that are consistent with the expectations of the NIH. Please note that there may also be additional elements required by the specific institute shown in these examples, but they are labeled as such. These examples are relevant for these types of data even if you are submitting to a different Institute.
The National Institute of Mental Health (NIMH)
- Clinical and/or MRI data from human research participants
- Genomic data from human research participants
- Genomic data from a non-human source
- Secondary Data Analysis
The National Institute of Child Health and Human Development (NICHD)
- Human clinical and genomics data
- Gene expression analysis data from non-human model organism (zebrafish)
- Human survey data
- Human clinical trial data
The National Human Genome Research Institute (NHGRI)
The National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)
- Clinical data from human research participants
- Basic research from a non-human source example
- Secondary data analysis
Additional examples from the NIH and other NIH ICOs will be shared as they become available
DMPTool
The DMPTool (Data Management Plan Tool) is an online tool that can provide templates and helpful information to assist in writing DMS Plans. This is a highly recommended starting place for writing a DMS Plan as it provides detailed guidance on how to write plans as well as sample language for each element. To draft the plan itself, go to the DMPTool website, log in using Duke Shibboleth, and create a new plan using the NIH 2023 template.
Please note that the recommended maximum length for a DMS Plan is two pages.
Budgeting for Data Management and Sharing
Allowable
The new NIH DMS Policy allows for costs related to data management and sharing incurred during the performance period to be charged to the grant. These include:
- Personnel costs regarding the curation and management of data and related documentation: this includes formatting data according to accepted community standards; de-identifying data; attaching metadata to foster discoverability, interpretation, and reuse; and formatting data for transmission and storage at a selected repository for long-term preservation and access.
- Fees for depositing and preserving data in repositories after the award is over. Please note that the costs of preserving and sharing the data in a repository must be paid in full during the performance period.
- Fees for local data management considerations during the active phase of the award, such as specialized infrastructure (e.g., PACE)
- *For research involving genomic data/subject to the NIH GDS policy only: Fees or costs relating to genomic data management and sharing
Unallowable
Budget requests for Data Management and Sharing must not include:
- Costs that are typically covered under Facilities and Administrative costs (aka, indirect costs)
- Costs associated with the routine conduct of research.
- Costs associated with collecting or otherwise gaining access to research data as these are considered costs of doing research.
- Costs that are double charged or inconsistently charged as both direct and indirect costs
Budgeting
Update: The NIH has released a Notice of Policy Change on July 31, 2023 announcing a change regarding budgeting for DMS costs. An excerpt from this notice:
“Effective for applications submitted for due dates on or after October 5, 2023, NIH will no longer require the use of the single DMS cost line item. NIH recognizes that DMS costs may be requested in many cost categories. Therefore, in line with our standard budget instructions, DMS costs must be requested in the appropriate cost category, e.g., personnel, equipment, supplies, and other expenses, following the instructions for the R&R Budget Form or PHS 398 Modular Budget Form, as applicable. While the single cost line item is no longer required, NIH will require applicants to specify estimated DMS cost details within the “Budget Justification” attachment of the R&R Budget Form or “Additional Narrative Justification” attachment of the PHS 398 Modular Budget Form, pursuant to the instructions.”
Use page 3 of the Duke NIH Data Management and Sharing Plan Checklist to aid in developing your budget and budget justification.
External tools and resources to help understand and create a budget can be found below:
- The NIMH Data Archive Data Submission Cost Estimation tool
- The National Academies publications around Forecasting Costs for Preserving, Archiving, and Promoting Access to Biomedical Data
- The NIH DMS website page on Budgeting for Data Management and Sharing
Please note that for Duke related data services it is best practice to reach out to those services during the proposal development stage in order to get cost estimates for budgeting.
R&R Budget:
- Costs to support the activities described in the Data Management and Sharing Plan must be requested in the appropriate cost category(ies), e.g., personnel, equipment, supplies, and other expenses.
- Investigators must also include a justification of the activities proposed in the DMS Plan that will incur costs. This justification must be labeled as "Data Management and Sharing Justification" within the budget justification attachment, followed by the estimated dollar amount.
- Duke guidance is that all salary/effort for Senior/Key personnel in the project must be included in the personnel section and not listed separately in Other Direct Costs (ODC) with other DMS costs, to avoid effort reporting issues. For anyone whose effort is split between research and DMS, leave the entire effort in salary and describe the split in the budget justification.
- Any other budget line item that impacts indirect calculations (e.g. equipment) should be budgeted in their normal line so that indirects are calculated appropriately.
Budget justification:
- Supporting details must be outlined in the budget justification attachment in a section clearly labeled "Data Management and Sharing Justification". The recommended length of the justification should be no more than half a page.
- A justification is required even if there are $0 for DMS costs.
- For modular budgets, use the "Additional Narrative" justification attachment and include a section labeled "Data Management and Sharing Justification" followed by the requested dollar amount and a brief justification of the proposed activities that will incur costs. Enter $0 if no costs are being requested.
- This summary is the only DMS information reviewers will have access to. They will not have access to the full DMSP.
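As a rough illustration of the justification format described above, the sketch below totals DMS costs across cost categories and produces a labeled summary. It is only a sketch under stated assumptions: the line items, dollar amounts, and the `dms_justification` helper are hypothetical and are not drawn from NIH or Duke materials.

```python
# Hypothetical DMS line items: (cost category, description, amount in USD).
dms_costs = [
    ("personnel", "Data curation and de-identification effort", 4000),
    ("other expenses", "Repository deposit and preservation fee", 1500),
]


def dms_justification(costs):
    """Return (total, totals_by_category, justification_text) for DMS line items."""
    totals = {}
    for category, _description, amount in costs:
        # Each cost stays in its own budget category; only the summary is combined.
        totals[category] = totals.get(category, 0) + amount
    total = sum(totals.values())
    # The section label is followed by the estimated dollar amount.
    lines = [f"Data Management and Sharing Justification: ${total:,}"]
    for category, description, amount in costs:
        lines.append(f"- {category}: {description} (${amount:,})")
    return total, totals, "\n".join(lines)


total, by_category, text = dms_justification(dms_costs)
print(text)
```

Passing an empty list yields a justification with a $0 total, which mirrors the requirement to include the section even when no DMS costs are requested.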
Other budget considerations:
- DMS costs are included in any direct cost budget caps.
- Subrecipients can include DMS costs in their own budgets (the costs do not need to appear in the PTE budget), following the NIH instructions.
- For complex grant applications that involve an overall budget and multiple project/core budgets (e.g. PPG), DMS costs must be included within the applicable component(s), as outlined in the application instructions.
- If no costs are expected, enter $0 in the DMS line in the budget and include a note in the justification.
Help & Frequently Asked Questions (FAQs)
Duke Resources to help
For help with creating DMS Plans, writing informed consent language, data curation, and appropriate licensing and data sharing agreements for repositories, contact the Duke University Libraries at datamanagement@duke.edu.
For questions around the NIH DMS Policy or Duke implementation of the NIH DMS Policy: contact researchdata@duke.edu.
For questions related to budgeting for data management and sharing costs, contact your grant manager.
NIH Resources to help
Duke Researcher FAQs on the NIH DMS Policy
These frequently asked questions were raised by Duke researchers during an NIH Data Management & Sharing Policy Overview presentation. Answers are provided by Duke data management specialists as a community resource, based on released NIH information and their interpretations of the policy. They will be updated when new information is available.
Do you have any sample plans that I can look at?
Yes! The NIH has shared sample data management plans for 14 different types of data from four different Institutes that are consistent with the expectations of the NIH. Please note that there may also be additional elements required by the specific institute shown in these examples, but they are labeled as such. These examples are relevant for these types of data even if you are submitting to a different Institute.
How are investigators budgeting for data management and sharing costs?
Costs for data management are determined on a case-by-case basis. Here is a link to the NIH Budgeting for Data Management and Sharing page. Researchers should consult with their desired data management service providers (whether independent or associated with a repository) when preparing a budget. We strongly recommend contacting the repository listed in your DMSP to learn more about associated fees and expected costs, as these costs should be included in the budget. For example, budgets for proposals listing Vivli as the intended repository need to include at least one line item of $10,000 for data anonymization costs.
Because personnel costs related to data management are allowable in addition to the other allowable costs, investigators are encouraged to consider including costs for their time related to data management. It would be a rare instance for there to be zero dollars associated with effort on data management and sharing for a research project. Contact your grants manager for more support on budgeting for data management and sharing costs.
Does this policy apply to a no-cost renewal of an existing grant?
If the grant existed prior to the 2023 requirements then no, it does not apply.
What is a data repository?
A data repository is an organization or group that collects, manages, and stores datasets for data analysis, sharing, reuse, and reporting. Repositories support public discovery and access by assigning persistent identifiers (e.g., DOIs) and metadata, and by providing the infrastructure to archive data into the future. This differs from a project website or internal archiving IT solution. NIH has guidance on selecting a repository, including desirable characteristics to look for.
Can I say that data will be shared upon request instead of using a data repository?
Sharing data upon request is no longer considered an adequate plan for data dissemination. NIH recommends the use of an “established repository” that meets a set of desirable characteristics for sharing scientific data.
Can I say that data will be shared in publications?
The data management and sharing plan is concerned with sharing of the actual scientific data, not scientific results. Therefore, it is not necessary to disclose your plans for publishing traditional articles in your DMSP. The main exception to this is if a journal has the infrastructure to host data for sharing purposes. If this is how you plan on sharing data be sure to include details of the specific journal you intend to submit your manuscripts to (e.g., limitations on hosting, costs, formatting and metadata requirements) along with details on a backup method for data sharing that will be used if the manuscript is not accepted for publication.
What data do I need to share?
You are expected to share what is needed to verify (validate or reproduce) your findings or what may be useful to other researchers for the purposes of their research. Do not share laboratory notebooks, preliminary analyses, completed case report forms, drafts of scientific papers, plans for future research, peer reviews, communications with colleagues, or physical objects, such as laboratory specimens. All scientific data generated from an NIH grant are subject to the Data Management and Sharing Policy, including basic sciences, translational, social, behavioral, quantitative, and qualitative data.
What is the difference between a Data Management & Sharing Plan and a Resource Sharing Plan?
A Resource Sharing Plan is geared toward physical resources to be developed, such as model organisms or antibodies, while a DMSP concerns digital research and genomic data. Almost all NIH research proposals will require a DMSP, and a subset of those proposals will also require a Resource Sharing Plan. The Resource Sharing Plan should not be included within the DMSP.
Does the DMSP have to be two pages?
Two pages is a recommendation from the NIH, but it is not a requirement. Plans longer than two pages may be appropriate for proposals involving very large or complex datasets or for multi-project grants.
I have a huge amount of data, how am I expected to share it?
In cases where your desired repository has size limitations on deposited datasets in comparison to the size of your data, strategies to deposit include: only sharing aggregate files representing raw data, sharing a representative sampling of the raw data, or compressing data.
If I don’t share certain parts of my data, do I need a justification?
Your DMS Plan should include a justification for why you are not including part of your dataset as well as a description of the data that you plan on sharing. For more information, see the NIH FAQ response.
What about data that could lead to a patent?
Evaluating an invention for patent protection or filing a patent application may justify a need to delay disclosure of research findings and scientific data for up to 60 days beyond the standard data sharing timelines. Researchers should consult with the Duke Office for Translation and Commercialization (nadine.wong@duke.edu) well in advance of anticipated publication or data release to evaluate inventions for patent protection and, when appropriate, researchers may then update their data-sharing plan if any delays are necessary.
Does the Duke Research Data Repository (RDR) have requirements for Duke-authorship, such as only hosting data when the Duke author is the lead or PI? Or can it host data on which a Duke author is any kind of collaborator?
One of the authors must be affiliated with Duke; however, they do not have to be the PI. For multi-institutional collaborations, the RDR can work with the team to determine the best deposit method. Contact datamanagement@duke.edu for more information.
Does the RDR have limitations for data deposits for each faculty member/PI?
No. The RDR currently has a 300 GB limit for individual data deposits; however, one PI may make numerous data deposits that may be connected to a publication or a grant project. If your data are larger than 300 GB, please contact datamanagement@duke.edu to discuss options.
I have a data archive set up here at Duke. Will that count as a repository?
It depends on the scope and characteristics of the archive and whether it complies with the desirable characteristics outlined by NIH. NIH recommends the use of an established repository such as disciplinary repositories, NIH funded repositories, or general repositories like the Duke Research Data Repository (RDR). Please see the NIH website on repositories for more information.
The policy says that you have to share data at publication or end of award. What happens if the award ends, there hasn't been a publication, and you think your data is only preliminary? Do you have to share the preliminary data?
You should share the research data that represents the main outputs of the research you conducted per the goals of the grant even if the results were “negative” or no publication has come out of the research.
Am I required to share identifiable data?
No. NIH expects that in drafting their DMS Plans, researchers will attempt to maximize scientific data sharing, but may acknowledge that certain factors (i.e., ethical, legal, or technical) may necessitate limiting sharing to some extent. Foreseeable limitations should be described when drafting DMS Plans. Please note that best efforts to de-identify data should be attempted. In cases where de-identified data still pose a risk to participant confidentiality due to the potential for inappropriate disclosure to occur, these data should not be shared openly, but instead shared via a controlled access repository under specific terms and conditions.
We use GitHub to disseminate code and data, will that count?
NIH recommends the use of an “established repository” that meets a set of desirable characteristics for sharing scientific data. While version control repositories allow for dissemination and collaboration and are optimized for code and software sharing, they differ in scope from “archival data repositories”. See this blog post by the Center for Data and Visualization Sciences, which describes these differences in more detail. For example, you can consider archiving your GitHub repository in Zenodo so you can cite and reference GitHub content.
Does the DMPTool allow you to collaborate with others outside of Duke?
Yes, anyone, regardless of institutional affiliation, can create a DMPTool account and use the tool at no cost. You may invite outside collaborators from other institutions directly via the tool and in many cases their institution may also be a member allowing them to use their institutional login for access.
See the NIH FAQ page for the Data Management and Sharing Policy for additional FAQs posted by NIH.