Research Data Center (RDC)
A pathway to accessing SAMHSA's National Survey on Drug Use and Health (NSDUH) Restricted Use File
Substance Abuse and Mental Health Services Administration (SAMHSA) Restricted Use Data
The National Survey on Drug Use and Health (NSDUH) is a premier population health survey. It covers the following topics:
- use of illegal drugs, prescription drugs, alcohol, tobacco, and misuse of prescription drugs
- substance use disorder and substance use treatment, major depressive episodes and depression care
- serious psychological distress, mental illness, and mental health care
NSDUH data provide estimates of substance use and mental illness at the national, state, and substate levels. NSDUH data also help to identify the extent of substance use and mental illness among different subgroups, to estimate trends over time, and to determine the need for treatment services.
NSDUH releases two versions of data yearly. The restricted use file (RUF) is not publicly accessible because it contains sensitive information such as zip code and other geographic identifiers. The public use file (PUF) is created from the RUF by applying disclosure control techniques and is publicly available online.
The research data center (RDC) program provides a mechanism for data users to access NSDUH restricted-use data files in a secure, confidentiality-compliant manner. SAMHSA does not operate its own RDC sites. Instead, it collaborates with the National Center for Health Statistics (NCHS) RDC and the Federal Statistical Research Data Centers (FSRDC) to carry out the NSDUH RDC program. All SAMHSA RDC users should carefully read "Guidelines for SAMHSA RDC Data Users" before accessing RUF data.
Virtual Data Enclave (VDE)
The VDE is coming soon! The VDE will provide greater access to NSDUH data by allowing approved researchers to access the RUF without physically going to an RDC. Instead, the researchers' home institution may designate a room that meets all specified data security requirements, which could save data users time and money. Pilot implementation of the VDE will begin in February 2023, with full implementation projected for later that year. Please visit our webpage for important updates.
To learn more, please visit the following sections:
- Confidential and public use data files
- Application process
- Application fee
- Security protocol
- Accessing files
- Resources for preparing the application
- Contact information
Confidential and public use data files
NSDUH RUFs are available for survey years from 2004 through the most recent release. RUFs contain more data than the public use files, in terms of both the number of records and the number of variables.
Comparison between the NSDUH public use and restricted use files (Note: There could be small variations between years.)
Confidential and Non-Public Use Variables
- Fully Specified Industry and Occupation Codes: These codes allow a worker's employment to be identified by industry and occupation (available through 2014).
- Region, State, and County FIPS Codes: These codes can be used to merge any data at the region, state, and/or county level onto the NSDUH data.
- Non-Public Use Data Elements: These are data elements from our questionnaires that are not directly identifying but are sensitive in nature and not available in the public use files, such as age, detailed race-ethnicity, life experiences, and sexual identity (LGB).
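The FIPS-based merge described above can be sketched as follows. This is a minimal illustration in plain Python, not the RDC's actual tooling: the field names (resp_id, state_fips, county_fips, county_rate) and all values are hypothetical, and real RUF variable names will differ. The only fixed fact it relies on is that a county FIPS key is a 2-digit state code followed by a 3-digit county code.

```python
from typing import Dict, List


def county_key(state_fips: str, county_fips: str) -> str:
    """Build a 5-digit county key: 2-digit state code + 3-digit county code."""
    return state_fips.zfill(2) + county_fips.zfill(3)


def merge_on_county(records: List[dict], external: Dict[str, dict]) -> List[dict]:
    """Left-join county-level attributes onto respondent records."""
    merged = []
    for rec in records:
        key = county_key(rec["state_fips"], rec["county_fips"])
        # Records without a matching county are kept unchanged.
        merged.append({**rec, **external.get(key, {})})
    return merged


# Hypothetical respondent records; real RUF variable names will differ.
respondents = [
    {"resp_id": 1, "state_fips": "24", "county_fips": "31"},
    {"resp_id": 2, "state_fips": "6", "county_fips": "37"},
    {"resp_id": 3, "state_fips": "11", "county_fips": "1"},
]

# Made-up county-level values keyed by 5-digit FIPS code.
county_data = {
    "24031": {"county_rate": 3.1},
    "06037": {"county_rate": 4.9},
}

merged = merge_on_county(respondents, county_data)
```

Padding the codes before joining guards against a common problem where leading zeros are lost when FIPS codes are stored as numbers.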
Application process
Prospective researchers must submit an RDC application, also known as an RDC proposal. The SAMHSA RDC team coordinates the review of each application, and the proposal must be approved before any other steps can proceed. The review ensures that all requirements specified in "Guidelines for SAMHSA RDC Data Users" and "RDC sample proposal" are carefully followed. In addition to format and completeness, the following two aspects are of particular importance for the application to pass SAMHSA review:
- The feasibility of the project given the existing data, that is, whether the research can be conducted with the available information. On occasion, it is clear from the outset that the sample will not support the intended analysis. For instance, NSDUH does not allow for individual-level record linkage.
- The risk of disclosure of restricted information, that is, whether the analysis can be conducted without compromising the confidentiality promised to all respondents (children, adults, households, neighborhoods).
We may ask researchers to provide additional clarifications and revisions if necessary. The application will be approved if all requirements are met. Approval of the proposal does not constitute endorsement by SAMHSA of the substantive, methodological, theoretical, policy relevance, or scientific aspects of the proposed research.
Conventional or non-SAP application
Starting in December 2022, statistical agencies across the federal government, including SAMHSA, will adopt the Standard Application Process (SAP) (see below). Until the SAP is fully functional, data users should continue to use the "RDC sample proposal" to create their application and should submit the RDC application and any subsequent revisions and amendments to SAMHSA by email at RDCA@samhsa.hhs.gov. All RDC projects approved prior to the SAP implementation will be referred to as non-SAP projects and should continue to correspond with SAMHSA RDC in the conventional way.
Application via SAP portal
The application process for requesting access to SAMHSA RDC data is evolving. The Foundations for Evidence-Based Policymaking Act of 2018 calls for the Federal government to establish a standard application process (SAP) through which agencies, the Congressional Budget Office, State, local, and Tribal governments, researchers, and other individuals, as appropriate, may apply for access to confidential microdata. In response, the federal statistical system is developing the SAP Portal at www.ResearchDataGov.org. The SAP Portal is a web-based data catalog and common application that will serve as a “front door” to apply for restricted data from any of the 16 principal federal statistical agencies and units for evidence building purposes.
The SAP Portal is being implemented in a phased approach. On February 28, 2022, the federal statistical agencies launched the SAP Portal data catalog to provide prospective applicants with comprehensive metadata about federal statistical agencies’ confidential microdata. Beginning December 8, 2022, SAMHSA and other principal federal statistical agencies and units will accept applications through the SAP Portal.
Keep in mind that output requests will NOT be affected by the SAP and should be requested directly through RDCA@samhsa.hhs.gov. Also, amendments and revisions of applications submitted prior to December 8, 2022, will continue to be sent to RDCA@samhsa.hhs.gov.
Whenever there is an official notice that the SAP system is not working properly, data users may create and submit applications in the conventional way (see above). Researchers whose applications were approved prior to the SAP implementation should continue to submit documentation (e.g., updated variable selection lists and amendments) via RDCA@samhsa.hhs.gov.
To learn more about the SAP, please visit www.ResearchDataGov.org.
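The submission rules above can be summarized as a small decision function. This is only a sketch of one reading of the rules: the function name, the kind labels, and the sap_outage flag are hypothetical, and sending amendments to post-SAP applications through the portal is an assumption, since the text only addresses amendments to applications submitted before December 8, 2022.

```python
from datetime import date

SAP_START = date(2022, 12, 8)      # first day applications go through the SAP Portal
EMAIL = "RDCA@samhsa.hhs.gov"
PORTAL = "www.ResearchDataGov.org"


def submission_channel(kind: str, application_date: date,
                       sap_outage: bool = False) -> str:
    """Return where a submission should be sent.

    kind is one of 'application', 'amendment', 'output_request'
    (hypothetical labels). application_date is the date the original
    application was, or will be, submitted.
    """
    if kind not in {"application", "amendment", "output_request"}:
        raise ValueError(f"unknown submission kind: {kind}")
    # Output requests are never routed through the SAP.
    if kind == "output_request":
        return EMAIL
    # Applications submitted before the SAP start date, and their
    # amendments and revisions, stay on the conventional email route.
    if application_date < SAP_START:
        return EMAIL
    # Officially announced SAP outage: fall back to the conventional route.
    if sap_outage:
        return EMAIL
    # Assumption: everything else goes through the portal.
    return PORTAL
```

For example, submission_channel("amendment", date(2022, 11, 1)) returns the email address, while submission_channel("application", date(2022, 12, 8)) returns the portal address.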
Application fee
SAMHSA's data hosting partner, the NCHS RDC network, charges a user fee for access. The amount of the fee varies by volume, usage, and requests for technical assistance.
Security protocol
Maintaining confidentiality is the primary objective of the restricted use data program. Once the application has been approved, the confidentiality training must be completed, and the required forms must be signed to document that the researcher has read and will follow the RDC disclosure review policies and procedures.
All researchers on the project must complete confidentiality training, and must complete and sign the Designated Agent Form (DAF) and have it notarized. Signing the DAF allows researchers to become designated agents with access to CIPSEA-protected data. In addition, all analysts entering the RDC must sign the Data Access Agreement (DAA) form. A complete package consists of the training certificate and the signed DAA and DAF. For students who want to access the NSDUH RUF, both the students and their advisors must also sign the SAMHSA RDC Student Data User Acknowledgement form.
Researchers who want to use an FSRDC must also secure Special Sworn Status. This process includes an application, a background check, and a fee. The process takes 3–4 months on average and is facilitated by the designated FSRDC after the proposal has been approved.
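The security requirements above amount to a checklist, which can be sketched as a small helper for tracking what a researcher still has to complete. The function name and item labels are hypothetical and chosen for illustration; this is not an official SAMHSA tool.

```python
from typing import List, Set


def outstanding_requirements(completed: Set[str],
                             is_student: bool = False,
                             uses_fsrdc: bool = False) -> List[str]:
    """List the security requirements a researcher has not yet completed."""
    # Required of every researcher on the project.
    required = ["confidentiality training certificate", "DAF", "DAA"]
    # Students and their advisors sign an additional acknowledgement form.
    if is_student:
        required.append("Student Data User Acknowledgement")
    # FSRDC users must also secure Special Sworn Status.
    if uses_fsrdc:
        required.append("Special Sworn Status")
    return [item for item in required if item not in completed]


# A student working at an FSRDC who has only signed the DAF and DAA so far.
todo = outstanding_requirements({"DAF", "DAA"}, is_student=True, uses_fsrdc=True)
```

An empty list means every requirement named in this section has been recorded as complete.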
Accessing the data files
Researchers must conduct their study within the designated RDC, either at one of the four NCHS RDC locations or at one of the 31 FSRDCs located throughout the US. Once the application has been approved and the requisite security protocol completed, please reach out to the designated RDC to schedule an in-person appointment.
The SAMHSA Restricted Use Data Program allows researchers to merge the RUF with approved external data while working at the RDC. The user-supplied data may consist of proprietary data collected and owned by the user or public use data. Proprietary or restricted use data obtained outside the SAP Portal should be accompanied by written approval for use. All external data must be sent to SAMHSA RDC for approval. If approved, SAMHSA RDC will upload the external data to the researchers' analytic folder.
What output can be taken from the RDC?
- Only populated shell tables that exist within an approved application.
- All materials, including populated table shells, must undergo disclosure review by SAMHSA prior to release.
- Researchers may request the release of the programming code as part of the output package submitted for review.
What output cannot be taken from the RDC?
- Output that does not match shell tables or figures within the approved proposal.
- Any output that could potentially identify respondents or small geographic areas, either directly or inferentially.
- Any direct or inferential identifiers not revealed in the public use files.
- Sample case printouts or screenshots.
- Output that does not follow the disclosure rules as outlined in "Guidelines for SAMHSA RDC Data Users".
- Intermediate output. Intermediate output can be created and used onsite at the RDC, but cannot be included in the output package submitted for review.
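The first screening step implied by the rules above, checking that each requested output matches a table or figure shell in the approved proposal, can be sketched as follows. The function and table names are hypothetical. Matching a shell is only a precondition: every item still has to pass SAMHSA disclosure review before release.

```python
from typing import Iterable, List, Set, Tuple


def screen_output_package(requested: Iterable[str],
                          approved_shells: Set[str]) -> Tuple[List[str], List[str]]:
    """Split requested outputs into those matching an approved shell and the rest.

    Returns (eligible_for_review, not_in_proposal). Items in the first list
    still require disclosure review; items in the second cannot be released.
    """
    eligible, not_in_proposal = [], []
    for name in requested:
        if name in approved_shells:
            eligible.append(name)
        else:
            not_in_proposal.append(name)
    return eligible, not_in_proposal


# Hypothetical shell names from an approved proposal.
approved = {"Table 1", "Table 2", "Figure 1"}
eligible, blocked = screen_output_package(
    ["Table 1", "Table 3", "Figure 1", "intermediate_log"], approved
)
```

Here "Table 3" and "intermediate_log" end up in the blocked list because neither corresponds to a shell in the approved proposal.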
Resources for preparing your application
- RDC locations: (1) NCHS RDC; (2) Census Bureau FSRDC
- RDC fees
- RDC sample proposal
Application resources and guidelines
- Guidelines for SAMHSA RDC Data Users
- RDC Output Summary Report
- Example of Data Dictionary
- Example of Table/Figure Shells
- NSDUH reports
- 2004 Codebook [PDF, 11.9MB]
- 2005 Codebook [PDF, 11.8MB]
- 2006 Codebook [PDF, 15.5MB]
- 2007 Codebook [PDF, 12.6MB]
- 2008 Codebook [PDF, 15.7MB]
- 2009 Codebook [PDF, 17.2MB]
- 2010 Codebook [PDF, 15.6MB]
- 2011 Codebook [PDF, 15.3MB]
- 2012 Codebook [PDF, 15.5MB]
- 2013 Codebook [PDF, 13.8MB]
- 2014 Codebook [PDF, 15.2MB]
- 2015 Codebook [PDF, 14.5MB]
- 2016 Codebook [PDF, 19.8MB]
- 2017 Codebook [PDF, 22.9MB]
- 2018 Codebook [PDF, 19.9MB]
- 2019 Codebook [PDF, 20.5MB]
- 2020 Codebook [PDF, 12.7MB]
- NSDUH Methodology
Completing confidentiality requirements
- Confidentiality training
- Designated Agent Form (DAF) [PDF, 163KB]
- Data Access Agreement (DAA) [PDF, 15KB]
- SAMHSA RDC Student Data User Acknowledgement [PDF, 138KB]
- Application [PDF, 121KB]
Contact information
SAMHSA Restricted Use Data Program
Substance Abuse and Mental Health Services Administration
Center for Behavioral Health Statistics and Quality
5600 Fishers Lane
Rockville, MD 20857