Research Data Center (RDC)

Substance Abuse and Mental Health Services Administration (SAMHSA) Restricted Use Data
The National Survey on Drug Use and Health (NSDUH) is a premier population health survey.

NSDUH measures:

  • use of illegal drugs, prescription drugs, alcohol, and tobacco, and misuse of prescription drugs
  • substance use disorder and substance use treatment
  • major depressive episode and depression care
  • serious psychological distress, mental illness, and mental health care

The data provide estimates of substance use and mental illness at the national, state, and substate levels. NSDUH data also help to identify the extent of substance use and mental illness among different subgroups, estimate trends over time, and determine the need for treatment services.

Researchers can access restricted data files that have not been publicly released for confidentiality reasons through the National Center for Health Statistics (NCHS) Research Data Center (RDC) network or any of the Federal Statistical Research Data Centers (FSRDCs). All projects can be initiated by requesting an RDC consult with SAMHSA or submitting an RDC application to SAMHSA.

To learn more, please visit the following sections:

Confidential data files

NSDUH Restricted Use File (available for 2004-current). The restricted use files contain more data than public use files, both in number of records (10,000 more records) and number and type of variables (1,400 variables). Sensitive data on mental health, healthcare, substance use, demographics, and geography are available within the confidential data files.

Comparison between the NSDUH public use and restricted use files

Confidential and Non-Public Use Variables

  • Fully Specified Industry and Occupation Codes: These codes allow a worker's employment to be identified by industry and occupation (available through 2014).
  • Region, State, and County FIPS Codes: These codes can be used to merge any data at the region, state, and/or county level onto the NSDUH data.
  • Non-Public Use Data Elements: These are data elements from our questionnaires that are not directly identifiable but are sensitive in nature and not available in the public use files, such as age, detailed race-ethnicity, life experiences, and sexual identity (LGB).

Application process

Prospective researchers must submit an RDC application, including a research proposal, that will be reviewed by the SAMHSA Data Center team. Applications are accepted continuously and are reviewed every week.

The SAMHSA Data Center team reviews materials for completeness, requests clarification from the researchers when needed, and makes a recommendation for final approval to the SAMHSA RDC Governance Board. As needed, the researcher will revise the application until the proposal is finalized.

Application fee

SAMHSA’s data hosting partner, the NCHS RDC network, charges a user fee for access. The amount of the fee varies by volume, usage, and requests for technical assistance. In addition, researchers using one of the FSRDCs are required to secure Special Sworn Status (‘Triple S’), a security clearance required by the Census Bureau in order to enter and use an FSRDC.

Proposal review

The SAMHSA Data Center team coordinates the review of each application. The following will be considered in reviewing proposals:

  • The feasibility of the project given the existing data, that is, whether it is possible for the research to be conducted with the available information. On occasion, it is clear from the outset that the data will not support the intended analysis. For instance, NSDUH does not allow for individual-level record linkage.
  • The risk of disclosure of restricted information, that is, whether the analysis can be conducted without compromising the confidentiality promised to all respondents (children, adults, households, neighborhoods).

Users should note that approval of the proposal does not constitute endorsement by SAMHSA of the substantive, methodological, theoretical, or scientific merit, or the policy relevance, of the proposed research. Approval only constitutes a judgment that the research is possible, given the survey’s design features, and that it is an appropriate use of the data under the confidentiality and privacy protections.

Security protocol

Once the application has been approved, the confidentiality training must be completed and the confidentiality forms signed, documenting that the researcher has read and will follow the RDC disclosure review policies and procedures.

Maintaining confidentiality is the primary objective of the restricted use data program. The confidentiality training, confidentiality forms, and the disclosure manual outline the policies and procedures that are required to protect the data and prevent the disclosure of confidential information. Both the principal investigator and the analyst must complete the confidentiality training and sign the confidentiality forms. The completed certificate and data user agreement forms must be uploaded with the application for it to be considered a complete package.

Researchers wanting to use an FSRDC must also secure Special Sworn Status. This process includes an application, a background check, and a fee. The process takes 3-4 months on average and is facilitated by the designated FSRDC after the proposal has been approved.

Accessing the data files

Researchers must execute all computer runs within the designated RDC, either at one of the four NCHS RDC locations or at one of the 31 FSRDCs located throughout the U.S. Remote access is not available at this time. Once the application has been approved and the requisite security protocol completed, please reach out to the designated RDC to schedule an in-person appointment.

The SAMHSA Restricted Use Data Program allows researchers to supply their own data to be merged with SAMHSA data at a feasible geographic unit of analysis. The user-supplied data may consist of proprietary data collected and owned by the user. Users must provide the Data Center staff with complete documentation of any data proposed to be merged and written approval for use. The documentation should include descriptive variables and value labels. Users are responsible for communicating with Data Center staff to ensure that the data can be merged and that the formats are consistent.
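
As an illustration of the format consistency mentioned above, the following minimal Python sketch joins user-supplied county-level data onto survey records by county FIPS code. Everything in it is an assumption made for the example: the column names (`resp_id`, `county_fips`, `clinics_per_100k`), the values, and the use of Python itself; the software and file layouts actually available in an RDC may differ. The point it shows is that FIPS codes should be compared as zero-padded strings, because codes stored as integers lose their leading zeros and then fail to match.

```python
def normalize_fips(code, width=5):
    """Return a FIPS code as a zero-padded string of the given width."""
    text = str(code).strip()
    if not text.isdigit() or len(text) > width:
        raise ValueError(f"not a valid FIPS code: {code!r}")
    return text.zfill(width)


def merge_by_county(survey_rows, county_rows, key="county_fips"):
    """Left-join county-level attributes onto survey rows by county FIPS code.

    Returns the merged rows and a sorted list of survey FIPS codes that had
    no match in the user-supplied data.
    """
    # Build a lookup from normalized FIPS code to the county's attributes.
    lookup = {}
    for row in county_rows:
        fips = normalize_fips(row[key])
        lookup[fips] = {k: v for k, v in row.items() if k != key}

    merged, unmatched = [], set()
    for row in survey_rows:
        fips = normalize_fips(row[key])
        extra = lookup.get(fips)
        if extra is None:
            # Keep the survey row, but record that nothing was merged onto it.
            unmatched.add(fips)
            extra = {}
        merged.append({**row, key: fips, **extra})
    return merged, sorted(unmatched)


# Hypothetical records with made-up values, for illustration only.
survey_rows = [
    {"resp_id": 1, "county_fips": 1001},      # integer: leading zero was lost
    {"resp_id": 2, "county_fips": "24031"},
    {"resp_id": 3, "county_fips": "56045"},   # no user-supplied data for this county
]
county_rows = [
    {"county_fips": "01001", "clinics_per_100k": 4.2},
    {"county_fips": "24031", "clinics_per_100k": 7.9},
]

merged, unmatched = merge_by_county(survey_rows, county_rows)
```

Reporting the unmatched codes, rather than silently dropping those rows, gives the user something concrete to review with Data Center staff before the merge is finalized.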


What output can be taken from the RDC?

  • Only populated shell tables that exist within an approved application.
  • All materials, including populated table shells, to be removed from the RDC also undergo disclosure review by SAMHSA prior to release.
  • In addition to approved tables or figures, researchers may request that the SAMHSA Data Center team release the researchers’ programming code.

What output cannot be taken from the RDC?

  • Output that does not match shell tables or figures within the approved proposal.
  • Any output that could potentially identify respondents or small geographic areas, either directly or inferentially.
  • Any direct or inferential identifiers not revealed on the public use files.
  • Sample case printouts or screenshots.
  • Output that does not follow the suppression rules outlined in the Analytic Guide.
  • Intermediate output, which poses disclosure risks. As a result, your output must be constrained to what is needed for a final research paper or journal article. Intermediate output can be created and used onsite at the RDC, but we do not allow any intermediate output to be released.

Resources for preparing your application

Getting started

Preparing a quality application

Completing confidentiality requirements

Online query tool

Direct assistance

SAMHSA Restricted Use Data Program
Substance Abuse and Mental Health Services Administration
Center for Behavioral Health Statistics and Quality
5600 Fishers Lane
Rockville, MD 20857