NSDUH Data Files
The National Survey on Drug Use and Health (NSDUH) measures substance use, mental illness, and treatment in the civilian noninstitutionalized population aged 12 or older. This page contains NSDUH public use files (PUFs) for download, the PUF documentation, and documentation for restricted use files (RUFs); no RUF data files are provided here.
To find a specific dataset or its documentation, select a year, then the dataset you are looking for. Three kinds are available: single-year PUFs, combined PUFs, and combined RUFs. Combined datasets make it easier to analyze trends or to pool years to examine small populations. However, not all variables in the combined PUF are comparable across years. Please see the variable crosswalk charts below to confirm that the variables are comparable across all the years of interest.
Public Use Files (PUFs)
PUFs are full datasets treated with confidentiality protections. The PUF codebooks give more information on specific data treatments, including the exclusion of most geographic variables. PUFs and RUFs will generate comparable, but not identical, estimates because of the PUFs’ disclosure-avoidance methods.
Restricted Use Files (RUFs)
The documentation available for the combined RUFs can help you select appropriate variables when running cross-tabulations (crosstabs) in the Data Analysis System (DAS). For more information on these codebooks, see the Introduction to Combined Restricted-use Codebooks. Combined RUFs only contain variables that are comparable across the included data years.
The microdata files for single-year RUFs are by application only. See the Research Data Center site for more information. Codebooks for these files can also be found on that page.
Please also see the following FAQs to help with your analysis:
Versions
2013-06-21: Released methodological resource documentation and updated xml file to include variable groupings.
Dataset Documentation
ASCII Setup Files
Publications Using SAMHSA Data
Variable Crosswalk Charts
The questionnaire and variable definitions have changed over the years. For those who want to analyze trends or pool data, it is important that each variable be comparable over the period of interest. These charts list all available variables for each dataset and show, for each variable, 1) whether it was available in the given year and 2) whether it was comparable to the previous year(s). Crosswalks go back to 2002.
In 2021, due to the addition of web interviewing, a new crosswalk chart was started. Data from 2021 and the years that follow cannot be compared or pooled with earlier years for any variable, so the crosswalk chart only includes data from 2021 and onward.
The following variable crosswalk charts are available:
PUF Variable Crosswalk Chart: 2021 and 2022 (xlsx)
PUF Variable Crosswalk Chart: 2019 and Prior (xlsx)
RUFs Variable Crosswalk Chart: Multi-Year (xlsx)
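The check that the crosswalk charts support can be sketched in a few lines of Python. This is only an illustration: the in-memory layout, the function name comparable_across, and the variable names VAR_A and VAR_B are hypothetical and do not reflect the actual structure of the xlsx charts. The sketch assumes a variable is usable over a span of years only if it is available in every year of the span and flagged as comparable to the previous year in every year after the first.

```python
from typing import Dict, Tuple

# Hypothetical in-memory form of a crosswalk chart (NOT the real xlsx layout):
# variable name -> {year: (available, comparable_to_previous_year)}
Crosswalk = Dict[str, Dict[int, Tuple[bool, bool]]]


def comparable_across(crosswalk: Crosswalk, variable: str,
                      first_year: int, last_year: int) -> bool:
    """Return True if `variable` can be used for every year in the span.

    A year missing from the chart is treated conservatively as
    "not available".
    """
    by_year = crosswalk.get(variable, {})
    for year in range(first_year, last_year + 1):
        available, comparable_to_previous = by_year.get(year, (False, False))
        if not available:
            return False
        # The first year of the span has no earlier year inside the span,
        # so its "comparable to previous" flag is not relevant.
        if year > first_year and not comparable_to_previous:
            return False
    return True


# Made-up entries for illustration only.
example: Crosswalk = {
    "VAR_A": {2016: (True, True), 2017: (True, True), 2018: (True, True)},
    "VAR_B": {2016: (True, True), 2017: (True, False), 2018: (True, True)},
}

print(comparable_across(example, "VAR_A", 2016, 2018))  # True
print(comparable_across(example, "VAR_B", 2016, 2018))  # False: break at 2017
print(comparable_across(example, "VAR_B", 2017, 2018))  # True: span starts after the break
```

Treating missing years as unavailable is a deliberately cautious choice; the charts themselves remain the authority on whether a given variable can be trended or pooled.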
Scope and Methodology Notes
GEOGRAPHIC COVERAGE: United States
UNIT OF OBSERVATION: Individual
DATA TYPES: Survey Data
UNIVERSE: Civilian, noninstitutionalized population of the United States aged 12 and older, including residents of noninstitutional group quarters such as college dormitories, group homes, shelters, rooming houses, and civilians living on military installations.
The survey began as the Nationwide Study of Beliefs, Information, and Experiences in 1971. From 1977 to 2001 the survey was known as the National Survey on Drug Abuse. In 2002 the survey was renamed the National Survey on Drug Use and Health (NSDUH).
There have been major changes over the years. For example, before 1990 the survey was only administered every 2 to 3 years and had a very small sample size compared to later iterations. The questionnaire was significantly redesigned in 1994. A rural population supplement was added to allow separate estimates to be calculated for rural areas. Another break occurred in 1999, when the survey administration began to employ a 50-state design with an independent, multistage area probability sample for each of the 50 states and the District of Columbia. The collection mode of the survey changed from personal interviews and self-enumerated answer sheets to computer-assisted personal interviews and audio computer-assisted self-interviews.
There was also a hard break in comparability between 2002 and previous years, when the survey’s title was officially changed, improvements were made to sampling, and respondents began to receive $30 for completing the study, which increased participation.
NSDUH underwent a partial redesign in 2015, so several measures have a break in trend in 2015 as well. For affected measures, data from before 2015 cannot be pooled with data from 2015 or later. Measures that were not affected can be pooled across any years between 2002 and 2019. The SAMHSA Data Website has more information on the partial 2015 redesign and its effects on estimates.
In the fourth quarter of 2020, NSDUH began using web data collection in addition to in-person interviews. This led to a complete break in comparability: estimates from 2020 and later are not comparable to their 2019 and earlier counterparts, and data cannot be pooled across incomparable years. Because there was not a full year of collection in 2020, the 2020 data are also not comparable to 2021.
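The pooling rules described above for 2002 onward can be written down as a small check. The following Python sketch is an illustration only: the function names and era labels are hypothetical, it covers 2002 and later, and it treats 2020 as standing alone and 2021 onward as one era, as the text describes. Passing this check is necessary but not sufficient; each variable still has to be confirmed in the crosswalk charts and codebooks.

```python
from typing import Iterable


def comparability_era(year: int, affected_by_2015_redesign: bool) -> str:
    """Map a survey year to a (hypothetical) comparability-era label."""
    if year < 2002:
        raise ValueError("this sketch only covers survey years 2002 and later")
    if year >= 2021:
        return "2021+"
    if year == 2020:
        return "2020"  # partial-year web collection: not comparable to other years
    if affected_by_2015_redesign:
        return "2002-2014" if year < 2015 else "2015-2019"
    return "2002-2019"


def can_pool(years: Iterable[int], affected_by_2015_redesign: bool) -> bool:
    """Return True if all requested years fall in a single comparability era."""
    eras = {comparability_era(y, affected_by_2015_redesign) for y in years}
    return len(eras) == 1


print(can_pool([2012, 2016], affected_by_2015_redesign=False))  # True
print(can_pool([2012, 2016], affected_by_2015_redesign=True))   # False
print(can_pool([2019, 2020], affected_by_2015_redesign=False))  # False
print(can_pool([2021, 2022], affected_by_2015_redesign=False))  # True
```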
Also, in 2002, 2011, and 2021 the new population data from the 2000, 2010, and 2020 decennial Censuses, respectively, became available for use in the sample weighting procedures.
For historical research, data from 1979 through 2014 can also be accessed from the Inter-university Consortium for Political and Social Research (ICPSR) at the University of Michigan.
NSDUH covers the general civilian population aged 12 and older in the United States. Active-duty military, residents of institutions, and people who are homeless but not in shelters are not included in the survey population.
Substance use topics covered include lifetime, past-year, and past-month use, as well as age at first use, substance use treatment history and perceived need for treatment, and substance use disorders. Respondents are also asked about problems resulting from the use of drugs, perceptions of risks, and potentially protective factors such as participation in drug prevention programs.
Along with substance use, NSDUH also covers several mental health topics. These include major depressive episodes, suicidal ideation and attempts, mental illness, and access to and use of mental health care.
To ensure that individual respondents cannot be identified from their responses, NSDUH public-use files (PUFs) have been treated with a number of disclosure-avoidance methods. Also to protect respondent privacy, some variables are not available, including geographic variables. Estimates generated from PUFs and Restricted Use data files will not be the same due to the disclosure-avoidance methods applied in PUFs.
Although NSDUH is useful for many purposes, it has certain limitations. First, the data are based on self-reports of drug use, and their value depends on respondents’ truthfulness and memory. Although some experimental studies have established the validity of self-reported data in similar contexts, and NSDUH procedures were designed to encourage honesty and recall, some underreporting and overreporting may take place.
Second, the survey is cross-sectional rather than longitudinal. That is, individuals were interviewed only once and were not followed for additional interviews in subsequent years. Each year’s survey provides an overview of the prevalence of drug use at a specific point in time, rather than a view of how drug use changes over time for specific individuals. Measures such as age at first use are based on the respondent’s memory, not past measurements of substance use.
Third, because the target population of the survey is defined as the civilian, noninstitutionalized population of the United States, a small proportion (less than two percent) of the population is excluded. The subpopulations excluded are active-duty members of the military, individuals in institutional group quarters (such as hospitals, prisons, nursing homes, and treatment centers), and homeless people not in shelters. If the drug use or mental illness of these groups differs from that of the civilian, noninstitutionalized population, NSDUH may provide slightly inaccurate estimates of substance use and mental health in the total population. This may be particularly true for prevalence estimates for less commonly used drugs, such as heroin.
Finally, changes in the methodology of the survey over time have limited the comparability of the estimates across years. Trends in drug use and mental health indicators have multiple breaks, in which the estimates before and after cannot be compared. In January 2024, an updated 2021 PUF file was released. This file has an updated weight that allows the data to be compared with 2022. See the codebook for more details. Due to methodology changes, particularly the addition of web-based interviewing, the 2021 NSDUH data are not comparable to data from previous years. Data from 2020 and/or 2021 should not be pooled or compared with prior years. Methodology updates in 1999 and 2002 also cause complete breaks in comparability. Additionally, questionnaire revisions in 2015 mean that many variables are not comparable between data from 2015 onward and data from earlier years.
The estimates yielded by NSDUH are based on sample survey data rather than on complete data for the entire population. This means that the data must be weighted to obtain unbiased estimates for survey outcomes in the population represented by the survey.
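As a rough illustration of why weighting matters, the Python sketch below compares an unweighted proportion with a weighted one. The data values and the function name weighted_prevalence are made up for this example and are not NSDUH variables or results. The sketch computes a point estimate only; standard errors require the design variables described in each codebook and are not attempted here.

```python
from typing import Sequence


def weighted_prevalence(indicators: Sequence[int],
                        weights: Sequence[float]) -> float:
    """Weighted proportion: sum(w * y) / sum(w), where y is a 0/1 indicator."""
    if len(indicators) != len(weights):
        raise ValueError("indicators and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * y for w, y in zip(weights, indicators)) / total_weight


# Made-up respondents: 1 = reported the outcome, 0 = did not.
indicators = [1, 0, 0, 1]
# Made-up analysis weights: the number of people each respondent represents.
weights = [500.0, 3000.0, 2500.0, 4000.0]

unweighted = sum(indicators) / len(indicators)
weighted = weighted_prevalence(indicators, weights)

print(f"unweighted: {unweighted:.2f}")  # 0.50
print(f"weighted:   {weighted:.2f}")    # 0.45
```

In this toy data two of four respondents report the outcome, but they represent 4,500 of the 10,000 people covered by the weights, so the weighted estimate differs from the raw sample proportion.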
For methodological information for a particular year or date range, including how to use the weights and stratification variables to get accurate estimates, please check the codebook for a specific data set above.
General information can also be found in the Methodological Summary and Definitions Reports.