Mastering the NIH Data Management and Sharing Policy
You're mid-assay, one timer is about to go off, your gloves are wet, and you've just remembered that the next NIH application needs a Data Management and Sharing Plan. Not a vague promise to upload files later. A real plan, with decisions about what data you'll keep, how you'll document it, where it will go, and when it will be shared.
That's the moment many wet-lab researchers are in now. The science hasn't gotten simpler, but the documentation expectations have become much more explicit. What makes this frustrating is that the policy language often lives at the level of repositories, plan elements, and compliance terms, while actual risk starts much earlier, at the bench, when someone delays recording an observation because both hands are occupied.
The good news is that the NIH Data Management and Sharing Policy is manageable if you treat it as an operational workflow problem, not just a grant-writing exercise. The labs that handle it well usually do one thing differently. They decide early how data will be captured, organized, and prepared for sharing, then they build those habits into routine experimental work.
Table of Contents
- Introduction: The New Reality of NIH Grant Compliance
- What the NIH DMS Policy Actually Demands
- Crafting Your Data Management and Sharing Plan (DMSP)
- Budgeting Costs and Selecting Repositories
- Navigating Exemptions, Consent, and Data Reuse
- Integrating DMSP into Your Daily Lab Workflow
- Wet-Lab Compliance Checklist and Final Questions
Introduction: The New Reality of NIH Grant Compliance
For years, a lot of smaller labs could treat formal data-sharing planning as something that mostly hit large awards. That mental model no longer works. The NIH Data Management and Sharing Policy became effective on January 25, 2023, and it applies to all competing grant proposals that generate scientific data, regardless of funding level. That is a major expansion from the earlier policy, which only applied to grants over $500,000 in direct costs and asked for a much less detailed plan, as summarized by the NIH DMS policy overview from HSLS.

That change matters because it shifts data stewardship to the front of the project. You're not just promising that data might be available later. You're expected to submit a plan with your application, include a budget for data management, and think through organization, storage, documentation, and sharing before the work starts.
For wet-lab groups, this creates a practical mismatch. NIH tells you to manage data according to the approved plan and report on it, but a lot of bench work still depends on fragmented note-taking, manual transcription, and end-of-day cleanup. That's where compliance starts to feel disconnected from reality.
Good DMS compliance starts before repository deposit. It starts with whether the record created during the experiment is clear enough to trust later.
The labs that adapt fastest usually stop asking, “How do we fill out the form?” and start asking, “What evidence will we have, months from now, that this experiment was documented in a way someone else can interpret, preserve, and share appropriately?” That's the level where the policy becomes useful instead of just burdensome.
What the NIH DMS Policy Actually Demands
The easiest way to misread the NIH Data Management and Sharing Policy is to treat it like a repository rule. It isn't. It's a research-planning rule that reaches back into how your lab creates and describes data from the first day of the project.
Who is covered now
The current policy applies broadly. If your competing NIH proposal will generate scientific data, you should assume the DMS requirement is part of the application package. That includes work across different funding mechanisms and research settings, not just large multi-center projects.
What changed in practice is the level of specificity. Under the older framework, some investigators could address sharing at a high level or explain why sharing wasn't possible. The current expectation is more operational. NIH wants a Data Management and Sharing Plan and wants the project run in accordance with that plan.
A simple way to think about it is this:
- Old mindset: describe intentions about sharing near the end.
- Current mindset: define handling, documentation, preservation, and sharing from the start.
- Lab consequence: if your records are inconsistent at the bench, the downstream plan becomes harder to execute.
If you want a useful companion piece on handling research records before they become submission problems, this guide on managing scientific data in the lab is aligned with the same operational issue.
What counts as scientific data
Wet-lab teams often overfocus on the final figure-ready dataset. NIH's concern is broader. In practical terms, think about the data needed to validate what you're claiming. That usually includes the underlying experimental records, structured observations, instrument outputs, processed files, and the documentation that lets another person interpret what those files mean.
That doesn't mean every scratch note belongs in a public repository. It does mean your lab should distinguish between:
| Record type | Practical question |
|---|---|
| Raw experimental observations | Could someone understand what happened and when? |
| Processed data files | Is the transformation from raw to processed understandable? |
| Metadata and context | Are sample identity, conditions, timing, and methods documented clearly enough for reuse? |
A common failure point is treating metadata as an afterthought. In wet-lab work, metadata is often the difference between a reusable dataset and a folder of uninterpretable files. Reagent identity, timing, experimental phase, deviations from protocol, and sample context often matter as much as the numerical readout itself.
Practical rule: If a colleague outside your project couldn't tell what a file represents without emailing the person who ran the assay, the documentation is probably too thin.
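To make that rule concrete, here is a minimal sketch of what a metadata "sidecar" record might look like for a single assay run, written as a short Python script. Every field name and value here (sample IDs, reagent lots, the ELISA example) is a hypothetical placeholder, not an NIH-prescribed schema; the point is that context travels with the file.

```python
import json
from datetime import datetime, timezone

# Hypothetical metadata record for one assay run. Field names are
# illustrative placeholders; substitute your lab's own vocabulary.
run_metadata = {
    "file": "2024-03-14_elisa_plate3_raw.csv",  # the data file this record describes
    "experiment": "ELISA pilot, cytokine panel",
    "sample_ids": ["S-041", "S-042", "S-043"],
    "reagent_lots": {"capture_antibody": "LOT-7781", "substrate": "LOT-1204"},
    "instrument": "PlateReader-2 (export v1.3)",
    "protocol_deviations": "Wash step 3 repeated once; see note 2024-03-14T10:42Z",
    "recorded_at": datetime.now(timezone.utc).isoformat(),
    "recorded_by": "initials or ORCID",
}

# Save the record next to the data file it describes, so the pair
# travels together into any repository deposit.
with open("2024-03-14_elisa_plate3_raw.meta.json", "w") as f:
    json.dump(run_metadata, f, indent=2)
```

A colleague who finds the CSV and its sidecar together can answer the "what is this file?" question without emailing anyone, which is exactly the test above.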
Crafting Your Data Management and Sharing Plan (DMSP)
Most researchers freeze when they hear “write the DMSP,” because it sounds like a policy essay. It's better approached as a set of decisions that your future self will have to live with during the award.

The shift from narrative to structured answers
Starting May 25, 2026, NIH will replace the narrative DMS Plan elements with simplified yes/no questions and mandatory repository mapping. Applicants will need to name one or more repositories where the data will be preserved and provide a 300-word rationale when limiting sharing is necessary, according to the University of Michigan guide to the updated NIH policy.
That change matters because hand-wavy language won't carry you very far. A generic statement such as “data will be shared in accordance with institutional policy” is weak because it avoids the operational details NIH is now asking you to name directly.
A practical way to draft each required element
Start with the six familiar planning elements and answer them as if a new lab member had to run your process without guessing.
Data type
Be concrete. Name the kinds of data your project will produce.
For a wet-lab project, that might include:
- Primary experimental records: observation notes, instrument outputs, assay readouts, imaging files
- Processed datasets: normalized values, compiled tables, annotated results files
- Supporting context: run conditions, reagent details, timing records, protocol deviations
Don't write “all relevant data.” List the categories your lab will create.
Related tools, software, and/or code
Some wet-lab groups think this applies only to computational projects. It doesn't. If specific software is needed to open, interpret, or process your data, name it. If code is used for analysis, identify that too.
The point isn't to impress NIH with technical detail. The point is to avoid leaving future users with files they can't interpret.
Standards
This section is where many plans become vague. Describe the formats, naming conventions, and metadata practices your lab will use. In wet-lab settings, standards usually mean file formats, structured templates, and consistent naming rules rather than a single formal schema that everyone in the field already follows.
A useful internal prompt is: what would make this dataset understandable to someone who didn't watch the experiment happen?
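Naming conventions stick better when they're checked automatically rather than enforced by memory. The sketch below assumes a made-up pattern of date_project_assay_replicate; the regular expression and file extensions are illustrative, and you would adjust them to whatever convention your lab actually adopts.

```python
import re

# Hypothetical naming convention: YYYY-MM-DD_project_assay_repN.ext
# e.g., 2024-03-14_cytokine_elisa_rep2.csv
PATTERN = re.compile(
    r"^\d{4}-\d{2}-\d{2}_[a-z0-9]+_[a-z0-9]+_rep\d+\.(csv|tif|xlsx)$"
)

def check_filename(name: str) -> bool:
    """Return True if a filename follows the lab's convention."""
    return bool(PATTERN.match(name))

# Quick self-check against a few examples.
assert check_filename("2024-03-14_cytokine_elisa_rep2.csv")
assert not check_filename("final_data_v2_REAL.xlsx")  # the classic failure mode
```

A check like this can run weekly over a shared drive, which turns "we have a naming standard" from an aspiration in the plan into something you can demonstrate during the award.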
Data preservation, access, and associated timelines
You name the repository and the timing. If you don't know the exact repository, resolve that before the application is finalized. Don't leave it for later if the project depends on a particular deposit path.
Access, distribution, or reuse considerations
State limits transparently. Sensitive human data, proprietary constraints, technical barriers, and ethical restrictions are appropriate if they are real and specific. General discomfort with sharing is not.
Oversight of data management and sharing
Name responsibility. Someone in the project has to own the process. In many labs, the best answer is a shared model where the PI is accountable, but day-to-day implementation sits with a project lead, lab manager, or designated data steward.
The strongest plans usually sound operational, not aspirational. They name who records what, where files are stored, how metadata is captured, and who checks that the plan is actually being followed.
Budgeting Costs and Selecting Repositories
A strong DMSP that has no budget logic behind it is often a sign that the plan was written in isolation from the actual work. Wet-lab researchers run into trouble here because data management takes labor, and labor has to be acknowledged somewhere.
What belongs in the budget
The simplest budgeting approach is to ask what your lab must do beyond the experiment itself in order to preserve and share data as promised. That can include time for organizing files, cleaning metadata, preparing documentation, formatting datasets for deposit, and paying repository-related fees when applicable.
For many labs, the hidden cost is staff time. If a technician, analyst, project manager, or senior trainee has to prepare records for repository submission, that work should be treated as part of the project's data management effort rather than as invisible overtime.
A practical internal checklist looks like this:
- Curation time: hours needed to clean filenames, finalize metadata, and package data for deposit
- Documentation work: preparation of readme files, data dictionaries, and method notes
- Repository-related expenses: any deposit or preservation costs that are allowed and necessary
- Quality review: time to confirm that the deposited package matches the plan
How to choose a repository without guessing
Repository selection should be driven by fit, not convenience alone. In practice, most labs are choosing among a domain-specific repository, an institutional repository, or a generalist repository.
Here's a simple comparison framework:
| Repository type | Usually a good fit when | Common trade-off |
|---|---|---|
| Domain-specific | Your field has an accepted archive and community norms | Rules may be stricter and formatting more demanding |
| Institutional | Your university provides a supported option | Discovery may be less field-specific |
| Generalist | No obvious disciplinary home exists | You may need to do more work to make context clear |
A few decision criteria matter more than the label:
- Repository suitability: Can it preserve the data type you generate?
- Access conditions: Can it handle any restrictions your project requires?
- Metadata support: Does it allow enough description for others to understand the dataset?
- Persistence: Will the record remain stable and citable over time?
For wet-lab teams, the trap is choosing a repository too late. If the archive expects structured metadata or particular formats, those requirements can force cleanup work back onto the project. Labs that pick the destination early usually document more intelligently from the start.
Navigating Exemptions, Consent, and Data Reuse
The NIH policy pushes toward sharing, but it doesn't require reckless sharing. The hard part is distinguishing a legitimate limitation from an unexamined habit.
When limiting sharing is legitimate
Some projects have real constraints. Human participant privacy, ethical restrictions, intellectual property concerns, and technical barriers can all affect what can be shared and under what conditions. The key is specificity. If sharing is limited, your justification should describe the actual reason, not a broad institutional preference.
This matters especially for consent-driven work. If your protocol involves people, your downstream sharing options are shaped by what participants were told, what permissions were obtained, and what level of access control is appropriate. Labs often create future headaches by using consent language that is too narrow or too ambiguous for later data sharing.
A defensible approach usually includes:
- Matching sharing plans to consent terms: don't promise openness your consent process can't support
- Separating restricted and shareable components: not every file has the same access profile
- Documenting the reason for limits: ethical, legal, or technical constraints should be stated plainly
The hard question with reused data
One of the murkiest areas is secondary analysis. NIH says researchers should maximize appropriate sharing, but public guidance doesn't definitively resolve whether teams that reuse already-shared datasets need to reshare the original primary data as part of their own outputs. That ambiguity is noted in the NIH guidance on writing a DMS Plan.
That uncertainty shows up quickly in computational biology, integrative projects, and translational work that combines multiple sources. If you pull from existing datasets, generate cleaned intermediates, merge sources, and produce new analytical outputs, you need to be explicit about what is yours to share, what was obtained from others, and what terms attach to reuse.
For privacy-sensitive work, this broader discussion of data security and compliance in scientific documentation is useful because sharing decisions are only as good as the controls around the original records.
If reused data came with its own access conditions, don't assume your project can erase them. Your plan should distinguish original source data from newly generated derivative outputs.
In practice, the safest approach is to document provenance carefully. Identify the source datasets, the conditions attached to them, and the derivative materials your project creates. That won't eliminate all ambiguity, but it makes your decisions reviewable and defensible.
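A provenance record doesn't need special tooling. As a minimal sketch, the structure below captures the three things the paragraph above calls for: where each source came from, what terms attach to it, and what your project newly generated. All field names and values are hypothetical placeholders.

```python
import json

# Hypothetical provenance record distinguishing reused sources from
# newly generated derivatives. Field names are illustrative.
provenance = {
    "sources": [
        {
            "dataset": "public-cohort-expression-v2",
            "obtained_from": "repository accession or DOI goes here",
            "access_terms": "controlled access; redistribution not permitted",
        }
    ],
    "derivatives": [
        {
            "file": "merged_normalized_counts.csv",
            "generated_by": "this project",
            "depends_on": ["public-cohort-expression-v2"],
            "shareable": True,
            "notes": "Derived summary values only; no source-level records included.",
        }
    ],
}

with open("PROVENANCE.json", "w") as f:
    json.dump(provenance, f, indent=2)
```

When a question comes up later about what your team may deposit, a record like this makes the answer reviewable instead of reconstructed from memory.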
Integrating DMSP into Your Daily Lab Workflow
Most DMS problems aren't caused by bad intentions. They come from delayed capture. Someone means to write down the exact incubation stop time, the deviation in wash volume, or the unexpected change in sample appearance, then fills it in later from memory. That's where data quality weakens.

NIH's public guidance leaves a real operational gap here. It requires researchers to plan and budget for managing and sharing scientific data, but offers minimal practical direction on contemporaneous capture in wet-lab environments, as described in the NIH DMS policy overview. That gap matters because repository compliance depends on record quality long before deposit day.
Why contemporaneous capture matters
In wet-lab work, timing and sequence are often part of the scientific meaning. If the record doesn't show when an observation happened, whether a timer overran, or when a step was repeated, later interpretation gets weaker.
Contemporaneous documentation helps with three things at once:
- Reproducibility: the next person can see what happened, not what was reconstructed later
- Internal review: the PI or lab manager can spot missing context before it spreads into analysis files
- Sharing readiness: exported datasets are easier to describe when the original record is structured and timestamped
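The mechanics can be very simple. As a rough sketch, an append-only log where each entry gets its timestamp at the moment of capture looks like this; the helper name, file path, and example entries are made up for illustration.

```python
import json
from datetime import datetime, timezone

LOG_PATH = "bench_log.jsonl"  # hypothetical append-only log file

def record_observation(text: str, tag: str = "observation") -> None:
    """Append one timestamped entry. The timestamp is set at capture
    time, not reconstructed later."""
    entry = {
        "at": datetime.now(timezone.utc).isoformat(),
        "tag": tag,
        "text": text,
    }
    with open(LOG_PATH, "a") as f:
        f.write(json.dumps(entry) + "\n")

record_observation("Wash volume low on well B4; repeated wash step.", tag="deviation")
record_observation("Timer for incubation step 2 stopped at 30:12, not 30:00.")
```

The design choice that matters is append-only with capture-time timestamps: nothing gets backfilled, so the log is evidence of when things happened, not a tidy reconstruction.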
What fails is familiar. Gloves come off too late. Notes land on scrap paper. A trainee transcribes from memory afterward. A folder of instrument files gets saved with names that made sense in the moment and nowhere else.
What works at the bench and what fails
Bench documentation has to match bench reality. If both hands are occupied, the capture method must be low-friction enough to use during the experiment, not after it. That's why many labs are moving away from purely retrospective entry.
One practical category is voice capture paired with structured note review. Tools in that category don't replace experimental judgment. They reduce the lag between observation and record creation. If you're evaluating options, this overview of electronic lab software for scientists is a good starting point for thinking about fit by workflow rather than by marketing category.
A bench-level example is Verbex, an iPhone app for voice-first lab note capture. It lets scientists speak notes during experiments, structures those notes into ELN-style sections such as Objective, Materials, Procedure, Observations, and Results, timestamps each capture, records timer events into the note history, and processes everything on-device rather than sending data to external servers. For labs worried about delayed note-taking, confidentiality, or auditable timing, that kind of workflow directly addresses a gap the policy itself doesn't really solve.
Better DMS compliance usually doesn't come from writing a longer plan. It comes from making the right action easier at the moment the data is created.
Wet-Lab Compliance Checklist and Final Questions
Use this as a final pre-submission and pre-award review sheet.
| Phase | Task | Status (To Do / Done) |
|---|---|---|
| Planning | Confirm the project generates scientific data and needs a DMSP | |
| Planning | Define the specific data types the lab will create | |
| Planning | Identify software, file formats, and metadata practices needed for interpretation | |
| Planning | Assign responsibility for day-to-day DMS oversight | |
| Repository | Select one or more repositories appropriate for the data type | |
| Budget | Include realistic effort for curation, documentation, and deposit preparation | |
| Lab workflow | Decide how observations, deviations, and timed events will be captured during experiments | |
| Lab workflow | Standardize naming, sectioning, and record-review habits across the team | |
| Pre-submission | Check that any sharing limitations are specific and defensible | |
| Pre-submission | Verify that consent language and sharing plans are aligned where relevant | |
A few final questions come up repeatedly.
Common questions
Does the DMSP get scored by peer reviewers?
Not directly. The DMS Plan isn't part of the scored review criteria, and peer reviewers aren't asked to comment on it unless sharing data is integral to the project design, which still leaves room for interpretation in practice.
Can I use a generic statement about my institution's repository?
That's risky. NIH's more structured format increases the expectation that you identify repository choices specifically rather than relying on vague institutional language.
What if my project reuses data from other sources?
Be explicit about provenance, reuse conditions, and which outputs your team is newly generating. That area remains less clear than many investigators would like.
What's the biggest practical mistake wet labs make?
Treating the DMSP as a writing task instead of a documentation workflow. If the bench record is weak, the formal plan won't rescue it.
If your lab is trying to make NIH compliance more practical at the bench, Verbex is worth a look. It gives individual scientists a way to capture timestamped experiment notes by voice while they work, organize them into structured ELN-style sections, document timer events automatically, and export a clean PDF record without sending data off the device.