1997 National Household Survey on Drug Abuse:  Preliminary Results




APPENDIX 1: DESCRIPTION OF THE SURVEY

I. Sample Design

The sample design of the survey has changed over time, but it has always been representative of the U.S. general population age 12 and older and has always oversampled youth and young adults. The 1997 NHSDA employed a multistage area probability sample of 24,505 persons. The first stage of selection is a sample of 115 Primary National Sampling Units (PSUs) along with 18 Supplemental California/Arizona PSUs, each consisting of counties (administrative subdivisions of States) or groups of counties such as metropolitan areas. Within these PSUs, segments (such as city blocks or enumeration districts) are selected. In 1997, 2,696 segments were selected (1,940 National segments and 756 Supplemental segments). In each of these segments a listing of all addresses was made, from which a sample of 95,332 addresses was selected. Of these, 81,068 were determined to be eligible sample units. In these sample units (which can be either households or units within group quarters), sample persons were randomly selected (with unequal probabilities) using a screening procedure carried out by interviewers.
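As a quick check on the counts quoted above, the following minimal sketch confirms that the National and Supplemental segment counts add up to the reported total and computes the share of selected addresses found to be eligible sample units. The variable names are illustrative only; the figures are the ones given in the text.

```python
# Counts reported in the text for the 1997 NHSDA sample.
national_segments = 1940
supplemental_segments = 756
total_segments = 2696
addresses_selected = 95332
eligible_units = 81068

# The two segment counts should add up to the reported total.
segments_sum = national_segments + supplemental_segments

# Share of selected addresses that were eligible sample units.
eligibility_rate = eligible_units / addresses_selected

print(f"segments: {segments_sum} (reported total: {total_segments})")
print(f"eligible share of selected addresses: {eligibility_rate:.1%}")
```

About 85 percent of the selected addresses were eligible sample units.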

The 1997 NHSDA sampled segments were allocated equally into four separate samples, one for each three-month period during the year, so that the survey is essentially continuous in the field. Data for the supplemental segments were collected starting in the second quarter (the fourth month of the year). Oversampling of certain subpopulations of interest was accomplished by assigning the appropriate selection probabilities at the PSU, segment, and person levels. In 1997, these subpopulations included younger individuals (age 12-34), African-Americans, Hispanics, and residents of Arizona and California (particularly 12- to 17-year-olds).

II. Data Collection Methodology

The data collection method used in the NHSDA is to conduct in-person interviews with sample persons, incorporating procedures likely to maximize respondents' cooperation and willingness to report honestly about their illicit drug use behavior. Introductory letters are sent to sampled addresses, followed by an interviewer visit. A five-minute screening procedure involves listing all household members along with their basic demographic data and possible selection of sample person(s). This selection process is designed to provide the necessary sample sizes for specified population groups by selecting 0, 1, or 2 persons per household, depending on the composition of the household.

Interviewers attempt to conduct interviews in a private place, away from other household members. The interview averages about an hour and includes a combination of interviewer-administered and self-administered questions. In the self-administered portion, the answers to sensitive questions (such as those on illicit drug use) are recorded by the respondent on answer sheets and are not seen or reviewed by the interviewer. After these answer sheets are completed, the respondent places them in an envelope, which is sealed and mailed to the contractor, Research Triangle Institute, with no personal identifying information attached.

III. Data Processing

Upon receipt, questionnaires are checked for critical identification and demographic data, then keyed to disk. This creates a file consisting of one record for each completed interview. Machine editing routines, called logical imputation, perform extensive within-record consistency checks and resolve most inconsistencies and missing data. For some key variables that still have missing values after the application of logical imputation, statistical imputation is used to replace the missing data with appropriate valid response codes. Two types of statistical imputation procedures are used. Hot-deck imputation involves the replacement of a missing value with a valid code taken from another respondent who is "similar" and has complete data. Logistic regression models are also used to determine replacement values for some variables.
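The hot-deck idea described above can be illustrated with a minimal sketch. It is not the NHSDA procedure: the function name `hot_deck_impute`, the field names, and the definition of "similar" (an exact match on a few classing variables, with a randomly chosen donor) are assumptions made for illustration only.

```python
import random

def hot_deck_impute(records, target, class_vars, seed=0):
    """Fill missing `target` values from a random donor in the same class.

    Hypothetical helper: "similar" is assumed to mean an exact match on
    `class_vars`; a real procedure may define similarity differently.
    """
    rng = random.Random(seed)
    # Collect donor values (complete records) by imputation class.
    donors = {}
    for rec in records:
        if rec[target] is not None:
            key = tuple(rec[v] for v in class_vars)
            donors.setdefault(key, []).append(rec[target])
    imputed = []
    for rec in records:
        new_rec = dict(rec)  # copy so the input records are not modified
        if new_rec[target] is None:
            key = tuple(new_rec[v] for v in class_vars)
            pool = donors.get(key)
            if pool:  # leave the value missing if no similar donor exists
                new_rec[target] = rng.choice(pool)
        imputed.append(new_rec)
    return imputed

# Made-up records for illustration; None marks a missing response.
records = [
    {"age_group": "12-17", "sex": "F", "used_past_month": 0},
    {"age_group": "12-17", "sex": "F", "used_past_month": None},
    {"age_group": "26-34", "sex": "M", "used_past_month": 1},
    {"age_group": "26-34", "sex": "M", "used_past_month": None},
]
completed = hot_deck_impute(records, "used_past_month", ["age_group", "sex"])
```

In this toy data each class has exactly one donor, so each missing value is filled with that donor's code.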

Each record (i.e., respondent) is assigned an analysis weight which incorporates:

a. The inverse of the selection probability for the respondent. This is the product of the inverses of selection probabilities at each stage of sampling.

b. Adjustments for household and person-level nonresponse.

c. Poststratification adjustment to Census projections (of the civilian noninstitutionalized population of the total U.S.) for the midpoint of each NHSDA data collection period. Adjustments are made to age, sex, and race/ethnicity distributions (see Appendix 2 for a discussion of the poststratification adjustment).
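The three weight components listed above can be sketched as a simple product. This is a minimal illustration under stated assumptions: the function names are hypothetical, the nonresponse and poststratification adjustments are assumed to be multiplicative factors, and all numeric values are invented. The actual adjustment procedures are described in Appendix 2.

```python
def base_weight(stage_probs):
    """Inverse of the overall selection probability.

    Computed as the product of the inverses of the selection
    probabilities at each stage of sampling (component a).
    """
    weight = 1.0
    for p in stage_probs:
        weight *= 1.0 / p
    return weight

def poststrat_factor(control_total, weighted_sample_total):
    """Assumed ratio adjustment: population control total divided by the
    weighted sample total for the same age/sex/race-ethnicity cell."""
    return control_total / weighted_sample_total

def analysis_weight(stage_probs, nonresponse_adj, poststrat_adj):
    """Combine components a, b, and c, assuming each is multiplicative."""
    return base_weight(stage_probs) * nonresponse_adj * poststrat_adj

# Invented example: PSU, segment, and person selection probabilities.
stage_probs = [0.5, 0.1, 0.25]
w_base = base_weight(stage_probs)        # 2 * 10 * 4
ps = poststrat_factor(1200.0, 1000.0)    # hypothetical cell totals
w_final = analysis_weight(stage_probs, nonresponse_adj=1.25, poststrat_adj=ps)
print(w_base, w_final)
```

With these made-up inputs the base weight is 80, and the final analysis weight is 80 x 1.25 x 1.2 = 120.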

Data are generally released to the public about six months after the end of data collection. Public use data files are available 1-2 years after completion of data collection.


This page was last updated on February 05, 2009.