Research Data Center (RDC)
Substance Abuse and Mental Health Services Administration (SAMHSA) Restricted Use Data
The National Survey on Drug Use and Health (NSDUH) is a premier population health survey. It collects information on the following topics:
- use of illegal drugs, alcohol, and tobacco, and use and misuse of prescription drugs
- substance use disorder and substance use treatment, major depressive episodes and depression care
- serious psychological distress, mental illness, and mental health care
The data provide estimates of substance use and mental illness at the national, state, and substate levels. NSDUH data also help to identify the extent of substance use and mental illness among different subgroups, trends over time, and the need for treatment services.
NSDUH releases two versions of its data each year. The restricted use file (RUF) is not publicly accessible because it contains sensitive information such as zip codes and other geographic identifiers. The public use file (PUF) is created from the RUF by applying disclosure control techniques and is publicly available online.
The research data center (RDC) program provides a mechanism for data users to access NSDUH restricted-use data files in a secure, confidentiality-compliant manner. SAMHSA does not operate its own RDC sites. Instead, the SAMHSA RDC collaborates with the National Center for Health Statistics (NCHS) RDC and the Federal Statistical Research Data Centers (FSRDC) to carry out the NSDUH RDC program. All SAMHSA RDC users must first carefully read "Guidelines for SAMHSA RDC Data Users."
To learn more, please visit the following sections:
- Confidential data files
- Application process
- Application fee
- Proposal review
- Security protocol
- Accessing files
- Resources for preparing the application
- Direct assistance via a resource mailbox
Confidential data files
NSDUH RUFs are available from 2004 to the present. RUFs contain more data than the public use files, both in number of records and in number of variables.
Comparison between the NSDUH public use and restricted use files
Confidential and Non-Public Use Variables
- Fully Specified Industry and Occupation Codes: These codes allow a worker's employment to be identified by industry and occupation (available through 2014).
- Region, State, and County FIPS Codes: These codes can be used to merge any data at the region, state, and/or county level onto the NSDUH data.
- Non-Public Use Data Elements: These are data elements from our questionnaires that are not directly identifying but are sensitive in nature and not available in the public use files, such as age, detailed race-ethnicity, life experiences, and sexual identity (LGB).
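As a minimal sketch of how geographic codes like those above might serve as merge keys, the following Python builds a five-digit county key from a two-digit state FIPS code and a three-digit county FIPS code. The function name `county_fips_key` and its inputs are hypothetical and chosen for illustration only; the actual variable names and storage formats in the RUF are documented in the codebooks.

```python
def county_fips_key(state_fips, county_fips):
    """Return a 5-character county key: 2-digit state code + 3-digit county code.

    Hypothetical helper for illustration; accepts ints or numeric strings so
    that codes stored as 6 and "06" produce the same key.
    """
    state_n = int(state_fips)
    county_n = int(county_fips)
    # Reject values that cannot be valid 2-digit / 3-digit codes.
    if not (1 <= state_n <= 99 and 1 <= county_n <= 999):
        raise ValueError(
            f"FIPS code out of range: state={state_fips!r}, county={county_fips!r}"
        )
    # Zero-pad so leading zeros lost in numeric storage are restored.
    return f"{state_n:02d}{county_n:03d}"


print(county_fips_key(6, 37))        # 06037 (California, Los Angeles County)
print(county_fips_key("24", "031"))  # 24031 (Maryland, Montgomery County)
```

Storing the key as a zero-padded string rather than an integer avoids silent mismatches when one source drops leading zeros.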
Application process
Prospective researchers must submit an RDC application, including a research proposal, that will be reviewed by the SAMHSA RDC team. Applications are accepted continuously and reviewed every week.
The SAMHSA RDC team reviews materials for completeness, seeks clarification from the researchers when needed, and makes a recommendation to the SAMHSA RDC Governance Board for final approval. As needed, the researcher will revise the application until the proposal is finalized.
Application fee
SAMHSA’s data hosting partner, the NCHS RDC network, charges a user fee for access. The amount of the fee varies by volume, usage, and requests for technical assistance. In addition, researchers using one of the FSRDCs are required to secure Special Sworn Status (‘Triple S’), a security clearance required by the Census Bureau in order to enter and use an FSRDC.
Proposal review
The SAMHSA RDC team coordinates the review of each application. The following will be considered in reviewing proposals:
- The feasibility of the project given the existing data, that is, whether the research can be conducted with the available information. On occasion, it is clear from the outset that the sample will not support the intended analysis. For instance, NSDUH does not allow for individual-level record linkage.
- The risk of disclosure of restricted information, that is, whether the analysis can be conducted without compromising the confidentiality promised to all respondents (children, adults, households, neighborhoods).
Users should note that approval of the proposal does not constitute endorsement by SAMHSA of the substantive, methodological, theoretical, or scientific merit of the proposed research, or of its policy relevance. Approval only constitutes a judgment that the research is possible, given the population survey’s design features, and that it is an appropriate use of the data under confidentiality and privacy protections.
Security protocol
Once the application has been approved, the researcher must complete the confidentiality training and sign the confidentiality forms, documenting that the researcher has read and will follow the RDC disclosure review policies and procedures.
Maintaining confidentiality is the primary objective of the restricted use data program. The confidentiality training, confidentiality forms, and the disclosure manual outline the policies and procedures required to protect the data and prevent the disclosure of confidential information. Both the principal investigator and the analyst must complete the confidentiality training and sign the confidentiality forms. The completed certificate and data user agreement forms must be uploaded with the application for the package to be considered complete.
Researchers wanting to use an FSRDC must also secure Special Sworn Status. This process includes an application, a background check, and a fee. The process takes 3 to 4 months on average and is facilitated by the designated FSRDC after the proposal has been approved.
Accessing the data files
Researchers must conduct their study within the designated RDC, either at one of the four NCHS RDC locations or at one of the 31 FSRDCs located throughout the US. Remote access is not available at this time. Once the application has been approved and the requisite security protocol completed, please reach out to the designated RDC to schedule an in-person appointment.
The SAMHSA Restricted Use Data Program allows researchers to supply their own data to be merged with SAMHSA data at a feasible geographic unit of analysis. The user-supplied data may consist of proprietary data collected and owned by the user. Users must provide the Data Center staff with complete documentation of any data proposed to be merged and written approval for use. The documentation should include descriptive variables and value labels. Users are responsible for communicating with Data Center staff to ensure that the data can be merged and that the formats are consistent.
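To illustrate what merging at a geographic unit can look like, here is a minimal Python sketch that attaches user-supplied area-level values to survey-style records through a shared geographic key and reports keys with no match. Everything in it is an assumption made for illustration: the function `merge_area_data`, the field names `resp_id`, `state_fips`, and `policy_index`, and the synthetic records. It is not the procedure used by Data Center staff, who perform the actual merge.

```python
def merge_area_data(survey_rows, area_rows, key):
    """Left-join area-level attributes onto survey rows by a shared geographic key.

    Returns (merged_rows, unmatched_keys). Hypothetical helper for illustration.
    """
    # Index the user-supplied data by key; each area should appear only once.
    lookup = {}
    for row in area_rows:
        k = row[key]
        if k in lookup:
            raise ValueError(f"duplicate {key} in user-supplied data: {k!r}")
        lookup[k] = row

    merged, unmatched = [], set()
    for row in survey_rows:
        extra = lookup.get(row[key])
        if extra is None:
            # Keep the survey row, but record that no area data matched it.
            unmatched.add(row[key])
            merged.append(dict(row))
        else:
            # Survey values take precedence if a field name appears in both.
            merged.append({**extra, **row})
    return merged, sorted(unmatched)


# Synthetic records; keys are zero-padded strings in both sources.
survey = [
    {"resp_id": 1, "state_fips": "24"},
    {"resp_id": 2, "state_fips": "06"},
    {"resp_id": 3, "state_fips": "51"},
]
area = [
    {"state_fips": "24", "policy_index": 0.4},
    {"state_fips": "06", "policy_index": 0.7},
]

merged, unmatched = merge_area_data(survey, area, "state_fips")
print(unmatched)  # ['51']
```

Checking for duplicate and unmatched keys before submission is one way to confirm that the formats of the two sources are consistent, as the paragraph above requires.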
What output can be taken from the RDC?
- Only populated shell tables that exist within an approved application.
- All materials, including populated table shells, must undergo disclosure review by SAMHSA prior to release.
- In addition to approved tables or figures, researchers may request that the SAMHSA RDC team release the researchers' programming code.
What output cannot be taken from the RDC?
- Output that does not match shell tables or figures within the approved proposal.
- Any output that could identify respondents or small geographic areas, either directly or inferentially.
- Any direct or inferential identifiers not revealed in the public use files.
- Sample case printouts or screenshots.
- Output that does not follow the disclosure rules as outlined in "Guidelines for SAMHSA RDC Data Users".
- Intermediate output, which poses disclosure risks. Your output must be limited to what is needed for a final research paper or journal article. Intermediate output can be created and used onsite at the RDC, but none of it may be released.
Resources for preparing your application
Application process and guidelines
- Guidelines for SAMHSA RDC Data Users
- Example RDC Output Summary Report
- Creating a data dictionary
- NSDUH reports
- 2004 Codebook [PDF, 11.9MB]
- 2005 Codebook [PDF, 11.8MB]
- 2006 Codebook [PDF, 15.5MB]
- 2007 Codebook [PDF, 12.6MB]
- 2008 Codebook [PDF, 15.7MB]
- 2009 Codebook [PDF, 17.2MB]
- 2010 Codebook [PDF, 15.6MB]
- 2011 Codebook [PDF, 15.3MB]
- 2012 Codebook [PDF, 15.5MB]
- 2013 Codebook [PDF, 13.8MB]
- 2014 Codebook [PDF, 15.2MB]
- 2015 Codebook [PDF, 14.5MB]
- 2016 Codebook [PDF, 19.8MB]
- 2017 Codebook [PDF, 22.9MB]
- 2018 Codebook [PDF, 19.9MB]
- 2019 Codebook [PDF, 20.5MB]
- 2020 Codebook [PDF, 12.7MB]
- NSDUH Methodology
Completing confidentiality requirements
- Confidentiality training
- Data user agreement form [PDF, 163KB]
- Access agreement (attachment from CDC) [PDF, 15KB]
- Application [PDF, 121KB]
SAMHSA Restricted Use Data Program
Substance Abuse and Mental Health Services Administration
Center for Behavioral Health Statistics and Quality
5600 Fishers Lane
Rockville, MD 20857