
2000 State Estimates of Substance Use & Mental Health


Appendix E: Statistical Methods and Limitations of the Data

E.1 Target Population

An important limitation of the National Household Survey on Drug Abuse (NHSDA) estimates of drug use prevalence is that they are only designed to describe the target population of the survey (i.e., the civilian, noninstitutionalized population aged 12 or older). Although this population includes almost 98 percent of the total U.S. population aged 12 or older, it does exclude some important and unique subpopulations who may have very different drug–using patterns. The survey excludes active military personnel, who have been shown to have significantly lower rates of illicit drug use. Persons living in institutional group quarters, such as prisons and residential drug treatment centers, are not included in the NHSDA and have been shown in other surveys to have higher rates of illicit drug use. Also excluded are homeless persons not living in a shelter on the survey date, another population shown to have higher than average rates of illicit drug use. Appendix F describes other surveys that provide data for these populations.

E.2 Nonsampling Error

Nonsampling errors can occur from nonresponse, coding errors, computer–processing errors, errors in the sampling frame, reporting errors, and other errors not due to sampling. Nonsampling errors are reduced through data editing, statistical adjustments for nonresponse, close monitoring and periodic retraining of interviewers, and improvement in various quality control procedures.

Although nonsampling errors can often be much larger than sampling errors, measurement of most nonsampling errors is difficult or impossible. However, some indication of the effects of some types of nonsampling errors can be obtained through proxy measures, such as response rates, and from other research studies.

E.2.1 Screening and Interview Response Rate Patterns

Response rates for the NHSDA were stable for the period from 1994 to 1998, with the screening response rate at about 93 percent and the interview response rate at about 78 percent (response rates discussed in this appendix are weighted). In 1999, the computer–assisted interviewing (CAI) screening response rate was 89.6 percent, and the interview response rate was about 68.6 percent. A more stable and experienced field interviewer (FI) workforce improved these rates in 2000. Of the 182,576 eligible households sampled for the 2000 NHSDA main study, 169,769 were successfully screened for a weighted screening response rate of 92.8 percent (see Table E.1 at the end of this appendix). In these screened households, a total of 91,961 sample persons were selected, and completed interviews were obtained from 71,764 of these sample persons, for a weighted interview response rate of 73.9 percent (Table E.2). A total of 10,109 (15.0 percent) sample persons were classified as refusals, 4,834 (5.5 percent) were not available or never at home, and 5,254 (5.5 percent) did not participate for various other reasons, such as physical or mental incompetence or language barrier. Tables E.3 and E.4 show the distribution of the selected sample by interview code and age group. The weighted interview response rate was highest among 12 to 17 year olds (82.6 percent), females (75.1 percent), blacks and Hispanics (76.2 and 78.0 percent, respectively), among persons residing in the South (76.4 percent), and among those in nonmetropolitan areas (77.6 percent) (Table E.5).
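As an illustration of why the weighted rates quoted above differ from simple ratios of sample counts, the following minimal sketch computes a weighted response rate as the share of total analysis weight carried by completed cases. This is a common definition and is assumed here; the appendix does not spell out the NHSDA weighting procedure, and the record layout and weights below are hypothetical.

```python
def weighted_response_rate(records):
    """Share of total weight carried by completed cases.

    `records` is an iterable of (weight, completed) pairs. The layout is
    hypothetical and is not taken from the NHSDA data files.
    """
    records = list(records)
    total = sum(weight for weight, _ in records)
    if total <= 0:
        raise ValueError("total weight must be positive")
    completed = sum(weight for weight, done in records if done)
    return completed / total


# Made-up weights for four selected persons (illustration only).
sample = [(1200.0, True), (800.0, False), (500.0, True), (1500.0, True)]
weighted_rate = weighted_response_rate(sample)
unweighted_rate = sum(1 for _, done in sample if done) / len(sample)

# Unweighted 2000 interview ratio from the counts in Table E.2.
unweighted_2000 = 71764 / 91961
```

With the Table E.2 counts, the unweighted 2000 ratio is about 78.0 percent, whereas the published weighted interview response rate is 73.9 percent.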

The increase in nonresponse between the 1998 and 1999 NHSDAs can be attributed primarily to the hiring of many new and inexperienced FIs in 1999 and a larger than usual turnover. By the end of 2000, the interviewer workforce primarily consisted of experienced interviewers, and fewer were leaving for other jobs. In 1999, there were 1,997 FIs hired and trained to conduct the CAI and paper–and–pencil interviewing (PAPI) surveys. More than a third of them did not complete the survey year (37.7 percent). In 2000, the number of trained interviewers decreased to 1,356 (because only CAI interviews were conducted in 2000), and the attrition rate dropped to 29.8 percent. Both prior NHSDA experience and on–the–job experience were shown to be related to nonresponse. Previously experienced interviewers and interviewers with one, two, or three quarters of on–the–job experience were more successful at obtaining an interview.

The overall weighted response rate, defined as the product of the weighted screening response rate and weighted interview response rate, was 61.5 percent in 1999 and 68.6 percent in 2000. Nonresponse bias can be expressed as the product of the nonresponse rate (1 - R) and the difference in the characteristic of interest between respondents and nonrespondents in the population (Pr - Pnr). Thus, assuming the quantity (Pr - Pnr) is fixed over time, the improvement in response rates in 2000 will result in estimates with lower nonresponse bias.
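The two relationships in this paragraph can be sketched as follows, taking nonresponse bias in its usual form, (1 - R)(Pr - Pnr). The screening and interview rates for 2000 come from Tables E.1 and E.2; the prevalence values for respondents and nonrespondents are hypothetical, chosen only to show the direction of the effect.

```python
def overall_response_rate(screening_rate, interview_rate):
    """Overall rate as the product of the screening and interview rates."""
    return screening_rate * interview_rate


def nonresponse_bias(response_rate, p_respondents, p_nonrespondents):
    """Bias of the respondent mean: (1 - R) * (Pr - Pnr)."""
    return (1.0 - response_rate) * (p_respondents - p_nonrespondents)


# 2000 weighted rates from Tables E.1 and E.2 (92.84 and 73.93 percent).
overall_2000 = overall_response_rate(0.9284, 0.7393)

# Hypothetical prevalences: 6 percent among respondents, 8 percent among
# nonrespondents, held fixed while the response rate changes.
bias_lower_r = nonresponse_bias(0.60, 0.06, 0.08)
bias_higher_r = nonresponse_bias(0.70, 0.06, 0.08)
```

Holding the difference (Pr - Pnr) fixed, the bias shrinks in absolute value as the response rate rises, which is the argument made in the text.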

E.2.2 Inconsistent Responses and Item Nonresponse

Among survey participants, item response rates were above 98 percent for most questionnaire items. However, inconsistent responses for some items, including the drug use items, are common. Estimates of substance use from the NHSDA are based on the responses to multiple questions by respondents, so that the maximum amount of information is used in determining whether a respondent is classified as a drug user. Inconsistencies in responses are resolved through a logical editing process that involves some judgment on the part of survey analysts and is a potential source of nonsampling error. Because of the automatic routing through the CAI questionnaire (e.g., lifetime drug use questions that skip entire modules when answered "no"), there is less editing of this type than in the PAPI questionnaire used in previous years.

In addition, less logical editing is used with the CAI data because statistical imputation is relied upon more heavily to determine the final values of drug use variables, even in cases where logical editing could be used to make a determination. The combined amount of editing and imputation in the CAI data is still considerably less than the total amount used in prior PAPI surveys. For the 2000 CAI data, for example, 3.2 percent of the estimate of past month hallucinogen use was based on logically edited cases and 5.4 percent on imputed cases, for a combined amount of 8.6 percent. For the 1999 CAI data, 1.7 percent of the estimate of past month hallucinogen use was based on logically edited cases and 4.6 percent on imputed cases, for a combined amount of 6.2 percent. In the 1998 NHSDA (administered using PAPI), the amount of editing and imputation for past month hallucinogen use was 60 and 0 percent, respectively, for a total of 60 percent. The combined amount of editing and imputation for the estimate of past month heroin use was 5.0 percent for the 2000 CAI, 14.8 percent for the 1999 CAI, and 37.0 percent for the 1998 PAPI data.

E.2.3 Imputation Error in the 1999 NHSDA Estimates

During work on the 2000 NHSDA imputations, a programming error was discovered in the 1999 imputations of recency of use, frequency of use, and age at first use for several drugs. This error resulted in overestimates of past year and past month use of marijuana, inhalants, heroin, and alcohol. Thus, estimates such as past month any illicit drug use and use of any illicit drug other than marijuana were also affected. The error was limited to cases that did not have complete recency information, where it was necessary to maintain consistency between the 30–day frequency and 12–month frequency data during the imputation process. This error did not affect lifetime use measures. Because of the sequential nature of the imputation procedures (i.e., imputed values for a substance processed early are used subsequently in the imputation of data on other substances), it was necessary to reimpute recency of use, frequency of use, and age at first use measures for all substances. Rerunning the imputations for all substances provided the opportunity to employ several minor enhancements to the imputation procedure that had been developed for the 2000 data, thereby improving consistency between the 1999 and 2000 estimates. Due to these enhancements and the random nature of the imputation process, the revised 1999 substance use estimates are slightly different from those previously published for all substances. A more complete discussion of how the imputation error was discovered and corrected can be found in Section 4 of the 1999 NHSDA Methodological Resource Book (Grau, Bowman, Giacoletti, Odom, & Sathe, 2001).

E.2.4 Impact of Field Interviewer Experience on the 1999 and 2000 CAI Estimates

In the 1999 NHSDA Summary of Findings (Office of Applied Studies [OAS], 2000), it was reported that the large change in the distribution of experienced and inexperienced FIs between the 1998 and 1999 surveys was associated with unanticipated and unusually large increases in substance use rates for data collected using the PAPI method. The report also found that data collected from interviewers with prior NHSDA experience resulted in drug use rates that were significantly lower than rates based on data collected from interviewers with no prior NHSDA experience. As a result, the 1999 PAPI estimates presented in the above–mentioned OAS report were based on analysis weights adjusted to measures representing the 1998 FI experience distribution.

Along with fielding PAPI data, the 1999 NHSDA marked the beginning of the use of CAI methods. Data were solicited from over 66,000 respondents in 50 States and the District of Columbia that year. Analysis of 1999 and 2000 CAI data was done to determine the impact of FI experience on drug use estimates (PAPI data were not collected in 2000). Overall, these interviewer effects were still present, although they were less pronounced than in the PAPI data. Based on these findings, it was not necessary to adjust the CAI analysis weights as was done with the 1999 PAPI data. A more detailed explanation of this analysis and its findings can be found in Appendix B of the 2000 Summary of Findings (OAS, 2001).

E.3 Incidence Estimates

The average annual numbers of marijuana initiates and rates by State were obtained using small area estimation (SAE) methods applied to the pooled 1999–2000 survey data and are, therefore, different from incidence estimates reported in the other reports. NHSDA State estimates of each substance use measure are produced by combining an estimate of the measure based on the State sample data with the estimate of the measure based on a national regression model applied to local–area county and Census block group/tract–level estimates from the State. The parameters of the regression model are estimated from the entire national sample. Because the 42 smaller (in terms of population) States and the District of Columbia have smaller samples than the eight large States, estimates for the smaller States rely more heavily on the national model. The model for each substance use measure typically utilizes from 50 to 100 independent variables in the estimation. These variables include basic demographic characteristics of respondents (e.g., age, race/ethnicity, and gender), demographic and socioeconomic characteristics of the Census tract or block group (e.g., average family income and percentage of single–mother households), and county–level substance abuse and other indicators (e.g., rate of substance abuse treatment, drug arrest rate, and drug– and alcohol–related mortality rate). Population counts by State and age group are applied to the estimated rates to obtain the estimated number of persons with the substance use characteristic. Corresponding to each SAE estimate is a 95 percent prediction interval (PI) that indicates the precision of the estimate. The PI accounts for variation due to sampling, as well as variation due to the model, and is derived from the process that generates the State SAE. There is a 95 percent probability that the true value lies within the interval.
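The idea that estimates for smaller States lean more heavily on the national model can be illustrated with a simple precision-weighted average of a direct estimate and a model-based estimate. This is a stand-in for illustration only: the appendix does not give the weighting formula used in the NHSDA SAE procedure, and the function name, rates, and variances below are all hypothetical.

```python
def composite_estimate(direct, direct_var, model, model_var):
    """Blend a direct and a model-based estimate by their precisions.

    Returns the blended estimate and the weight placed on the direct
    estimate. Illustrative only; not the NHSDA SAE procedure.
    """
    if direct_var <= 0 or model_var <= 0:
        raise ValueError("variances must be positive")
    w_direct = model_var / (direct_var + model_var)
    blended = w_direct * direct + (1.0 - w_direct) * model
    return blended, w_direct


# Hypothetical rates: direct State estimate 5.0 percent, model 6.0 percent.
# A large State sample gives a small direct variance; a small one, a large.
large_est, large_w = composite_estimate(0.050, 0.00001, 0.060, 0.00004)
small_est, small_w = composite_estimate(0.050, 0.00016, 0.060, 0.00004)

# Applying a (hypothetical) population count to the estimated rate.
small_state_persons = small_est * 1_000_000
```

In this sketch the State with the noisier direct estimate places a weight of 0.2 on its own sample and 0.8 on the model, so its blended estimate sits closer to the model value.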

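The incidence estimates discussed in this report are based on the combination of two separate SAE measures, calculated from the pooled 1999–2000 data: the number of marijuana initiates in the past 24 months and the number of persons who never used marijuana.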

Each of these measures is generated independently using SAE, by State and age group. The following formula was used to generate the average annual rate of first use of marijuana for each State:

Average annual incidence rate = 0.5 * {Number of initiates in past 24 months / [(Number of initiates in past 24 months * 0.5) + Number of persons who never used]}.

For diseases, the incidence rate for a population, IR, is defined as the number of new cases of the disease, N, divided by the person time, PT, of exposure (i.e., IR = N / PT). The person time of exposure can be measured for the full period of the study or for a shorter period. The person time of exposure ends at the time of diagnosis (e.g., Greenberg, Daniels, Flanders, Eley, & Boring, 1996, pp. 16–19). Similar conventions are applied for defining the incidence of first use of a substance.

Beginning in 1999, the NHSDA questionnaire allows for collection of year and month of first use for recent initiates. Month, day, and year of birth are also obtained directly or imputed in the process. In addition, the questionnaire call record provides the date of the interview. By imputing a day of first use within the year and month of first use reported or imputed, the key respondent inputs in terms of exact dates are known. Using these respondent inputs, one can determine whether a person's first use episode occurred in the 24 months prior to the interview.
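Once exact dates are available, the 24-month check reduces to a date comparison. The sketch below is one way to express it; the function names and the boundary convention (a window that excludes the date exactly 24 months before the interview and includes the interview date) are assumptions, not details given in this appendix.

```python
import calendar
from datetime import date


def months_before(d, months):
    """Return the date `months` calendar months before `d`.

    The day is clamped to the last day of the target month
    (for example, 29 February maps to 28 February in a non-leap year).
    """
    index = d.year * 12 + (d.month - 1) - months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def initiated_in_past_24_months(first_use, interview):
    """True if first use falls in the 24 months ending on the interview date.

    Assumed convention: window is (interview - 24 months, interview].
    """
    return months_before(interview, 24) < first_use <= interview


interview_date = date(2000, 6, 15)
recent = initiated_in_past_24_months(date(1999, 3, 10), interview_date)
earlier = initiated_in_past_24_months(date(1997, 12, 1), interview_date)
```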

With person time of exposure measured in terms of 2–year units of time, the correct multiplier for the number of initiates in the past 24 months in the denominator of the SAE–based Average annual incidence rate is the average fraction of the exposure interval experienced prior to the initiation. Direct survey estimates of this average fraction of exposure experienced prior to the initiation could be formed for each State–by–age–group combination, but direct estimates would be too imprecise to include in the SAE incidence rate estimation. Instead, the average fraction of exposure among initiates was assumed to be ½ of the 2–year exposure period. This approximation follows from the assumption that initiation episodes are distributed uniformly over the 2–year exposure period. Note that the "never" users at interview were all exposed for the full 2–year initiation period. The 24–month SAE incidence rates were then transformed into average 12–month or average annual rates by the ½ multiplier. Alternatively, one can view the final multiplication by ½ as transforming the person time units of exposure in the denominator of the rate from the number of 2–year exposure units to the number of person years of exposure.
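The average annual incidence rate formula given earlier in this section can be written out step by step as a small function. The formula and the ½ exposure fraction are taken from the text; the function name and the counts in the example are hypothetical.

```python
def average_annual_incidence_rate(initiates_24m, never_used,
                                  exposure_fraction=0.5):
    """Average annual rate of first use from 24-month counts.

    Person time is counted in 2-year units: each initiate contributes
    `exposure_fraction` of a unit (assumed to be one-half), and each person
    who never used contributes a full unit. The final factor of 0.5
    converts the 24-month rate to an average annual rate.
    """
    person_time_2yr = exposure_fraction * initiates_24m + never_used
    if person_time_2yr <= 0:
        raise ValueError("person time of exposure must be positive")
    rate_per_2yr = initiates_24m / person_time_2yr
    return 0.5 * rate_per_2yr


# Hypothetical counts for one State and age group.
annual_rate = average_annual_incidence_rate(40_000, 480_000)
```

With these made-up counts, the person time is 20,000 + 480,000 = 500,000 two-year units, the 24-month rate is 0.08, and the average annual rate is 0.04.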

E.4 References

Grau, E. A., Bowman, K. R., Giacoletti, K. E. D., Odom, D. M., & Sathe, N. S. (2001, July). Imputation report. In 1999 National Household Survey on Drug Abuse: Methodological resource book (Vol. 1, Section 4, prepared for the Substance Abuse and Mental Health Services Administration, Office of Applied Studies, under Contract No. 283–98–9008, Deliverable No. 28, RTI 7190). Research Triangle Park, NC: Research Triangle Institute.

Greenberg, R. S., Daniels, S. R., Flanders, W. D., Eley, J. W., & Boring, J. R. (1996). Medical epidemiology. Norwalk, CT: Appleton & Lange.

Office of Applied Studies. (2000). Summary of findings from the 1999 National Household Survey on Drug Abuse (DHHS Publication No. SMA 00–3466, NHSDA Series H–12; available at /p0000016.htm#special). Rockville, MD: Substance Abuse and Mental Health Services Administration.

Office of Applied Studies. (2001). Summary of findings from the 2000 National Household Survey on Drug Abuse (DHHS Publication No. SMA 01–3549, NHSDA Series H–13; available at /p0000016.htm#standard). Rockville, MD: Substance Abuse and Mental Health Services Administration.

Table E.1 Weighted Percentages and Sample Sizes for the 1999 and 2000 NHSDAs, by Screening Result Code

Screening Result 1999 NHSDA 2000 NHSDA
Sample Size Weighted Percent Sample Size Weighted Percent
Total Sample 223,868 100.00 215,860 100.00
     Ineligible cases 36,026 15.78 33,284 15.09
     Eligible cases 187,842 84.22 182,576 84.91
Ineligibles 36,026 100.00 33,284 100.00
     Vacant 18,034 49.71 16,796 50.76
     Not a primary residence 4,516 12.90 4,506 13.26
     Not a dwelling unit 4,626 12.70 3,173 9.33
     All military personnel 482 1.22 414 1.21
     Other, ineligible 8,368 23.46 8,395 25.43
Eligible Cases 187,842 100.00 182,576 100.00
     Screening Complete 169,166 89.63 169,769 92.84
          No one selected 101,537 54.19 99,999 55.36
          One selected 44,436 23.63 46,981 25.46
          Two selected 23,193 11.82 22,789 12.03
     Screening Not Complete 18,676 10.37 12,807 7.16
          No one home 4,291 2.38 3,238 1.82
          Respondent unavailable 651 0.36 415 0.24
          Physically or mentally incompetent 419 0.24 310 0.16
          Language barrier - Hispanic 102 0.06 83 0.05
          Language barrier - other 486 0.28 434 0.27
          Refusal 11,097 5.92 7,535 4.14
          Other, access denied 1,536 1.08 748 0.45
          Other, eligible 38 0.02 7 0.00
          Other, problem case 56 0.03 37 0.02

Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999 and 2000.

Table E.2 Weighted Percentages and Sample Sizes for 1999 and 2000 NHSDAs, by Final Interview Code among Persons Aged 12 or Older

Final Interview Code 1999 NHSDA 2000 NHSDA
Sample Size Weighted Percent Sample Size Weighted Percent
Total Selected Persons 89,883 100.00 91,961 100.00
Interview Complete 66,706 68.55 71,764 73.93
No One at Dwelling Unit 1,795 2.13 1,776 2.02
Respondent Unavailable 3,897 4.53 3,058 3.52
Breakoff 50 0.07 72 0.09
Physically/Mentally Incompetent 1,017 2.62 1,053 2.57
Language Barrier – Spanish 168 0.12 109 0.08
Language Barrier – Other 480 1.46 441 1.06
Refusal 11,276 17.98 10,109 14.99
Parental Refusal 2,888 1.01 2,655 0.88
Other 1,606 1.53 924 0.86

Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999 and 2000.

Table E.3 Weighted Percentages and Sample Sizes for 1999 and 2000 NHSDAs, by Final Interview Code among Persons Aged 12 to 17

Final Interview Code 1999 NHSDA 2000 NHSDA
Sample Size Weighted Percent Sample Size Weighted Percent
Total Selected Persons 32,011 100.00 31,242 100.00
Interview Complete 25,384 78.07 25,756 82.58
No One at Dwelling Unit 322 1.09 278 0.86
Respondent Unavailable 872 3.04 617 2.05
Breakoff 13 0.03 18 0.05
Physically/Mentally Incompetent 244 0.76 234 0.76
Language Barrier - Spanish 15 0.03 10 0.03
Language Barrier - Other 58 0.18 50 0.20
Refusal 1,808 5.97 1,455 4.52
Parental Refusal 2,885 9.50 2,641 8.35
Other 410 1.33 183 0.59

Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999 and 2000.

Table E.4 Weighted Percentages and Sample Sizes for 1999 and 2000 NHSDAs, by Final Interview Code among Persons Aged 18 or Older

Final Interview Code 1999 NHSDA 2000 NHSDA
Sample Size Weighted Percent Sample Size Weighted Percent
Total Selected Persons 57,872 100.00 60,719 100.00
Interview Complete 41,322 67.41 46,008 72.92
No One at Dwelling Unit 1,473 2.25 1,498 2.16
Respondent Unavailable 3,025 4.71 2,441 3.69
Breakoff 37 0.07 54 0.09
Physically/Mentally Incompetent 773 2.85 819 2.78
Language Barrier - Spanish 153 0.13 99 0.09
Language Barrier - Other 422 1.62 391 1.16
Refusal 9,468 19.41 8,654 16.22
Parental Refusal 3 0.00 14 0.01
Other 1,196 1.55 741 0.89

Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999 and 2000.

Table E.5 Response Rates and Sample Sizes for the 1999 and 2000 NHSDAs, by Demographic Characteristics

Demographic Characteristic 1999 NHSDA 2000 NHSDA
Selected Persons Completed Interviews Weighted Response Rate Selected Persons Completed Interviews Weighted Response Rate
Total 89,883 66,706 68.55% 91,961 71,764 73.93%
Age in Years
     12–17 32,011 25,384 78.07% 31,242 25,756 82.58%
     18–25 30,439 22,151 71.21% 29,424 22,849 77.34%
     26 or older 27,433 19,171 66.76% 31,295 23,159 72.17%
Gender
     Male 43,883 31,987 67.12% 44,899 34,375 72.68%
     Female 46,000 34,719 69.81% 47,062 37,389 75.09%
Race/Ethnicity
     Hispanic 11,203 8,755 74.59% 11,454 9,396 77.95%
     Non-Hispanic, white 63,211 46,272 67.98% 64,517 49,631 73.39%
     Non-Hispanic, black 10,552 8,044 70.39% 10,740 8,638 76.19%
     Non-Hispanic, all other races 4,917 3,635 59.28% 5,250 4,099 67.31%
Region
     Northeast 16,794 11,830 64.03% 18,959 14,394 71.68%
     Midwest 24,885 18,103 69.63% 25,428 19,355 73.23%
     South 27,390 21,018 70.93% 27,217 22,041 76.38%
     West 20,814 15,755 67.47% 20,357 15,974 72.68%
County Type
     Large metropolitan 36,101 25,901 65.15% 37,754 28,744 71.77%
     Small metropolitan 30,642 22,612 69.98% 31,400 24,579 74.96%
     Nonmetropolitan 23,140 18,193 74.97% 22,807 18,441 77.58%

Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999 and 2000.


This page was last updated on December 30, 2008.