
2001 State Estimates of Substance Use


Appendix E: State Estimation Methodology

This report includes estimates of 19 substance use measures. Twelve of the measures used the same definition for 1999 through 2001 and have estimates of change between 1999–2000 and 2000–2001, the difference of two 2-year moving averages. Six substance abuse and dependence measures used the same definition for 2000 and 2001, but not for 1999; therefore, only the estimates for 2000–2001 are provided. One new measure, serious mental illness (SMI), was introduced in 2001, and State estimates have been produced for that single year.

This appendix describes the methodology used to measure change in State estimates (Section E.1), the validation of that methodology (Section E.2), the validation of the estimates of prevalence levels based on the combined 1999–2000 National Household Survey on Drug Abuse (NHSDA) data (Section E.3), caveats regarding small area estimation (SAE) (Section E.4), and the general methodology (hierarchical Bayes) used to create the State estimates (Section E.5). Included at the end of this appendix are tables showing the State response rates for 1999–2001, the State sample sizes for 1999–2001, and the State sample sizes for the 2001 incentive experiment.

E.1. Measuring Change in State Estimates Between 1999–2000 and 2000–2001

The estimates of change in State estimates presented in this report are based on the 1999 through 2001 NHSDAs. State estimates for 1999–2000 and 2000–2001 were produced by combining State-level NHSDA data with local-area county and Census block group/tract-level predictor variables from the States for the two time periods. The SAE methodology for estimating change is described in this section, while Section E.5 provides a general overview of SAE methodology. The moving average State prevalence estimates displayed in Appendix A for the overlapping 1999–2000 and 2000–2001 time periods were obtained from independent applications of RTI's survey-weighted hierarchical Bayes (SWHB) methodology.

The State estimates for 1999–2000 are the model-based small area estimates previously published by the Substance Abuse and Mental Health Services Administration (SAMHSA) (see Wright, 2002a, 2002b). These estimates were derived by first fitting logistic mixed models to the pooled 1999–2000 survey dataset. These models fit separate fixed and random effects for each of four age groups. Each age group model had 51 State-level random effects and 300 substate region-level random effects. The fixed predictor variables for each age group were defined at five levels, namely, person-level demographics, 1990 decennial Census block group-level items, tract-level items, county variables, and State variables. The same fixed predictors were used for all 3 years (1999, 2000, and 2001) of data but annual updates were made when more current versions became available.

Having estimated the common fixed and random effects from the pooled 1999–2000 dataset, year-specific predicted probabilities of substance use were formed at the block group (b) level for each of eight gender (2) by race/ethnicity (4) domains (d) within each of four age groups (a).

Year specificity in the State estimates was induced by updating the fixed predictor variables annually and by using year-specific block group-level population projections for the 32 age by gender by race/ethnicity domains to weight together the domain-specific probabilities of use. These year-t population projections, $N_{bad}(t)$, for block group b, age group a, and gender by race/ethnicity domain d were purchased from Claritas Inc. Letting $\pi_{bad}(t)$ denote the predicted probability of substance use for the age group-a by race/ethnicity by gender subpopulation d in block group b for year t, the age group-specific estimates for State i were computed as population-weighted averages of the form

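$$\pi_{ia}(t) = \frac{\sum_{b \in \Omega_i} \sum_{d=1}^{8} N_{bad}(t)\, \pi_{bad}(t)}{\sum_{b \in \Omega_i} \sum_{d=1}^{8} N_{bad}(t)},$$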

where the summation extends over all the block groups b belonging to the State i universe $\Omega_i$ (that is, $b \in \Omega_i$). Note that the domain-d summations extend over the eight age group-specific gender by race/ethnicity domains within each block group.
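The population-weighted average described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the production SAE software: the function name `state_age_prevalence`, the tuple layout, the State code "XX", and all of the numbers are assumptions made for the example.

```python
from collections import defaultdict


def state_age_prevalence(cells):
    """Population-weighted average of block group/domain probabilities.

    `cells` is an iterable of (state, age_group, population, probability)
    tuples, one per block group b and gender by race/ethnicity domain d.
    (Hypothetical layout chosen for this sketch.)
    Returns {(state, age_group): weighted prevalence}.
    """
    numerator = defaultdict(float)    # sum of N_bad(t) * pi_bad(t)
    denominator = defaultdict(float)  # sum of N_bad(t)
    for state, age_group, n_bad, pi_bad in cells:
        numerator[(state, age_group)] += n_bad * pi_bad
        denominator[(state, age_group)] += n_bad
    return {
        key: numerator[key] / denominator[key]
        for key in numerator
        if denominator[key] > 0
    }


# Made-up cells for a hypothetical State "XX".
cells = [
    ("XX", "12-17", 100.0, 0.10),
    ("XX", "12-17", 300.0, 0.20),
    ("XX", "18-25", 200.0, 0.30),
]
estimates = state_age_prevalence(cells)
```

In this toy input, the 12 to 17 estimate is pulled toward the larger cell's probability, because each cell contributes in proportion to its projected population.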

To produce the 1999–2000 pooled estimates, the common fixed and random effect estimates were first employed to form State estimates $\pi_{ia}(99)$ and $\pi_{ia}(00)$, the predicted probabilities of substance use for State i and age group a in 1999 and 2000, respectively. These annualized State estimates were then combined as population-weighted averages of the form

$$\pi_{ia}(99,00) = \frac{N_{ia}(99)\,\pi_{ia}(99) + N_{ia}(00)\,\pi_{ia}(00)}{N_{ia}(99) + N_{ia}(00)},$$

where $N_{ia}(t)$ denotes the year-t population projection for State i and age group a, calculated as the sum of the block group-level projections $N_{bad}(t)$ over all eight gender by race/ethnicity domains d and all the block groups b in State i. The SWHB versions of these pooled estimates were computed as posterior means over 1,250 Gibbs samples drawn from the joint posterior distribution of the fixed and random effects. The 95 percent asymmetric prediction intervals (PIs) for these pooled 1999–2000 prevalence estimates were first formed as symmetric, approximately Gaussian, Bayes credible intervals on the log-odds scale. The end points of these log-odds symmetric intervals then were transformed back to the prevalence scale.
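The interval construction described above (symmetric on the log-odds scale, then back-transformed) can be sketched as follows. This is a minimal illustration under stated assumptions: the interval is centered on the mean of the log-odds draws and uses their standard deviation with a 1.96 multiplier, which the text does not specify, and the five draws stand in for the 1,250 Gibbs samples. The function names are hypothetical.

```python
import math
import statistics


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def expit(x):
    """Inverse of logit: maps a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def prediction_interval(draws, z=1.96):
    """Posterior mean plus an interval that is symmetric on the log-odds scale.

    `draws` are posterior draws of a prevalence. Centering on the mean of the
    log-odds draws is an assumption made for this sketch.
    """
    posterior_mean = statistics.fmean(draws)
    log_odds = [logit(p) for p in draws]
    center = statistics.fmean(log_odds)
    spread = statistics.stdev(log_odds)
    lower = expit(center - z * spread)
    upper = expit(center + z * spread)
    return posterior_mean, lower, upper


# Made-up draws; a real application would use the full set of Gibbs samples.
draws = [0.05, 0.06, 0.07, 0.08, 0.09]
post_mean, lower, upper = prediction_interval(draws)
```

Because the back-transformation is nonlinear, the resulting interval is asymmetric around the point estimate on the prevalence scale, which is why the text calls these asymmetric prediction intervals.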

The State by age group prevalence estimates derived from the pooled 2000 and 2001 survey data were produced by refitting the logistic mixed models. In this independent refitting of the models, updated versions of the fixed predictors were used with the 2001 survey responses when updates were available. This refitting resulted in a new set of age group-specific fixed and random effects for the combined 2000 and 2001 surveys. As described previously, 1,250 Gibbs sample draws from the joint posterior distribution of these fixed and random effect parameters were used to calculate posterior means and 95 percent prediction intervals for the 2000 and 2001 State i by age group-a prevalence estimates $\pi_{ia}(00,01)$, the predicted probabilities of substance use based on the 2000 and 2001 data for State i and age group a.

The 2000 and 2001 models were fit independently of the previously fit 1999 and 2000 models. This independent analysis approach was followed because there was no desire to revise the previous estimates and the associated moving average change measures as the result of jointly modeling all 3 years of survey data. This approach does have a shortcoming when computing the Bayes significance level for an estimated moving average change measure. Specifically, one needs to estimate the posterior variance of a change measure defined as the log-odds ratio:

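$$lor_{ia} = \ln\left[\frac{\pi_{ia}(00,01)\,/\,\bigl(1 - \pi_{ia}(00,01)\bigr)}{\pi_{ia}(99,00)\,/\,\bigl(1 - \pi_{ia}(99,00)\bigr)}\right].$$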

A change measure like the log-odds ratio is favored over the simple difference because the Bayes significance calculation is much less burdensome when the posterior distribution of the change measure is approximately Gaussian, as is the case for $lor_{ia}$ (the log-odds ratio for State i and age group a) but not for the simple difference. Calculating the posterior variance of $lor_{ia}$ can be accomplished by using the posterior variance statistics that were previously obtained from the independent Markov chain Monte Carlo (MCMC) chains.

To complete the variance calculation for $lor_{ia}$, a correlation estimate for the two log-odds statistics is required. To approximate this correlation, the 1999–2000 and 2000–2001 models were fit simultaneously. This simultaneous fit yielded an MCMC sample of 1,250 draws from the joint posterior distribution of both sets of fixed and random effects. To accommodate this simultaneous fitting of the 1999–2000 and 2000–2001 models, a concatenated dataset containing both of the pooled samples was created. Because the PROC GIBBS software allows for separate logistic mixed models for a set of nonoverlapping subpopulations, it was possible to simultaneously fit eight age group (4) by dataset (2) models as if there were no overlap in the two datasets. This simultaneous solution yielded a set of 1,250 MCMC replicates for the two overlapping log-odds statistics. In these simultaneous models, the eight age group by dataset random effects for each State and for each substate region were allowed to have a general variance-covariance matrix. It was hoped that these random effect covariances between datasets would largely account for the 2000 survey overlap.
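The role of the correlation in this variance calculation can be sketched with the standard formula for the variance of a difference of two correlated quantities. The sketch below is an illustration only: the function name `lor_summary` is hypothetical, the inputs are made up, and the two-sided Gaussian tail probability is an assumed stand-in for the Bayes significance level, which the text does not define in detail.

```python
import math


def lor_summary(lo_old, var_old, lo_new, var_new, corr):
    """Log-odds ratio, its posterior SD, and a two-sided Gaussian tail area.

    lo_old, lo_new:   log-odds for the earlier and later pooled periods.
    var_old, var_new: their posterior variances from the independent chains.
    corr:             correlation between the two log-odds statistics.
    """
    lor = lo_new - lo_old
    # Var(X - Y) = Var(X) + Var(Y) - 2 * corr * SD(X) * SD(Y)
    variance = var_old + var_new - 2.0 * corr * math.sqrt(var_old * var_new)
    if variance <= 0.0:
        raise ValueError("inputs imply a non-positive variance")
    sd = math.sqrt(variance)
    z = lor / sd
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return lor, sd, p_value


# Same made-up log-odds and variances under a low and a high correlation.
lor_low, sd_low, p_low = lor_summary(-2.60, 0.010, -2.45, 0.010, corr=0.3)
lor_high, sd_high, p_high = lor_summary(-2.60, 0.010, -2.45, 0.010, corr=0.7)
```

With everything else held fixed, the higher correlation gives a smaller standard deviation for the log-odds ratio and therefore a smaller tail probability, so an underestimated correlation makes the change look less significant than it should.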

In the process of conducting the SAE change measure validation study (reported on in Section E.2), it was observed that the 95 percent prediction intervals for two of the SAE odds ratios (namely, past month alcohol use and past month cigarette use) were approximately the same as or wider than the 95 percent confidence intervals (CIs) for the associated design-based odds ratio estimates. These interval comparisons are displayed in Table E.1. It had also been previously noted that the prediction intervals for the two SAE-based log-odds statistics involved in the log-odds ratios were substantially narrower than the corresponding design-based intervals. Because the variance of a difference of two statistics grows as their correlation shrinks, it was clear that the correlations between the two log-odds statistics over the MCMC samples were substantially smaller than their design-based counterparts. Table E.2 shows these underestimated correlations as compared with their design-based counterparts.

These model-based MCMC correlations were underestimated as a consequence of the faulty assumption that the eight age group by dataset subpopulations in the simultaneous models were nonoverlapping. The overlap associated with the 2000 survey data was not adequately accounted for by the random effect correlations. There is an alternative form of the odds ratio estimator that employs nonoverlapping subpopulations and provides for proper MCMC-based correlation estimation. This odds ratio for change is based on simultaneously fitting the three annual models to produce 1,250 MCMC samples from the joint posterior distribution of the triple $\tilde{\pi}_{ia}(99)$, $\tilde{\pi}_{ia}(00)$, and $\tilde{\pi}_{ia}(01)$, the predicted probabilities of substance use in 1999, 2000, and 2001 for State i and age group a based on simultaneously fitting the three annual models. For this simultaneous model, there are 12 age (4) by year (3) subpopulation-specific models, each with its own set of fixed and random effects. In this case, the general covariance matrices for the State and substate random effects are 12 by 12 matrices corresponding to the 12 element (age group by year) vectors of random effects. The associated odds ratio is based on the pooled prevalences:

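$$\tilde{\pi}_{ia}(99,00) = \frac{N_{ia}(99)\,\tilde{\pi}_{ia}(99) + N_{ia}(00)\,\tilde{\pi}_{ia}(00)}{N_{ia}(99) + N_{ia}(00)}$$

and

$$\tilde{\pi}_{ia}(00,01) = \frac{N_{ia}(00)\,\tilde{\pi}_{ia}(00) + N_{ia}(01)\,\tilde{\pi}_{ia}(01)}{N_{ia}(00) + N_{ia}(01)}.$$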

Note that the survey-weighted Bernoulli-type log likelihood employed in PROC GIBBS was appropriate for this simultaneous model because the 12 age group by year subpopulations were nonoverlapping. The purpose of using the more complex 2-year averaging scheme described previously was to minimize bias. If one assumes the fixed and random effects are common for the 2 years being pooled, this should yield small area estimates that are closer to the design-based estimates than the $\tilde{\pi}_{ia}(t)$ estimators above, where year-specific parameters were assumed. For the odds ratio based on the $\tilde{\pi}_{ia}(t)$ averaged prevalence estimates, it is clear that the correlation between the two log-odds statistics should be high. This follows from the fact that $\tilde{\pi}_{ia}(00)$ is common to the two population-weighted averages. These correlation estimates based on $\tilde{\pi}_{ia}(t)$ more properly reflect the true correlations associated with the $\pi_{ia}(t)$ type of averages presented in the body of this report. Table E.3 is similar to Table E.1 except that the prediction intervals were obtained using the correlations from the alternative method. Table E.4 displays the correlations from the alternative method and the corresponding design-based correlations. Tables E.5 to E.8 contrast the Bayes significance levels for these two correlation estimators.
Note that the revised significance estimates [p value(2)] are smaller than the original ones [p value(1)]; they are about 20 percent smaller for past month use of cigarettes, alcohol, and marijuana, and about 6 percent smaller for past year use of cocaine.

E.2. Validation of Methodology to Measure Change

To validate the SAE models for estimating change between the pooled 1999–2000 small area estimates and the pooled 2000–2001 small area estimates, the design-based estimates of change for the eight large sample States were used as internal benchmarks. The eight large sample States had 2-year sample sizes that ranged between 6,200 and 9,700. Estimates were produced for four outcome variables representative of a range of prevalence rates: past year use of cocaine, past month use of marijuana, past month use of cigarettes, and past month use of alcohol. The goal of the validation was to compare the estimates for small States utilizing the SAE methodology with estimates based on the internal benchmarks.

E.2.1 Replicate Formation Methodology

The validation study was performed by first subsampling the eight large States; for each of these large States, four sample replicates ("pseudo" small States) were formed that mimicked the design properties of the 42 small States and the District of Columbia. A key feature of this replicate formation strategy was mimicking the 50 percent overlap between the 1999 and 2000 samples of 96 area segments and between the 2000 and 2001 segment samples in each small sample State. Because new samples of dwellings and persons were drawn from all sample segments every year, the survey design-induced covariance between years is limited to this 50 percent overlap of sample block groups/segments.

Exhibit E.1 presents the 50 percent segment overlap plan for the 3 survey years. Note that there are 48 field interviewer (FI) regions in each of the eight large States and 12 FI regions in each of the 42 small States and the District of Columbia. Each FI region has four quarters, and each quarter is then expected to have two area segments. For various reasons, some of the FI region-by-quarter slots may be empty. In the following illustration, segments A, C, E, and G in 1999 were kept in 2000. Segments B, D, F, and H were replaced by segments I, J, K, and L in 2000. In 2001, the segments I, J, K, and L of 2000 were kept, and segments A, C, E, and G from 2000 were replaced by segments M, N, O, and P.

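Exhibit E.1 Sample Segment 50 Percent Overlap Plan for the 1999, 2000, and 2001 NHSDAs

| FI Region | Quarter | 1999 Segment | 2000 Segment | 2001 Segment |
|---|---|---|---|---|
| 1 | 1 | A | A | M |
| 1 | 1 | B | I | I |
| 1 | 2 | C | C | N |
| 1 | 2 | D | J | J |
| 1 | 3 | E | E | O |
| 1 | 3 | F | K | K |
| 1 | 4 | G | G | P |
| 1 | 4 | H | L | L |

FI = field interviewer.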

To select the four pseudo small State samples from each large State, 12 pseudo FI regions were first created within each large sample State by pooling their 48 initial FI regions into groups of 4. Each of these pseudo FI regions then was expected to have eight area segments per calendar quarter (see Exhibit E.2). For each of these pseudo FI region-by-quarter sets of eight area segments, any segments that were devoid of interviews were first randomly replaced by a selection from the non-empty segments in the set. The segments for the 1999, 2000, and 2001 NHSDA data were filled in separately. Once complete sets of eight non-empty segments for the 1999, 2000, and 2001 NHSDA data in each of the pseudo FI region-by-quarter sets were assembled, the 1999, 2000, and 2001 data were linked using State-by-pseudo FI region-by-quarter-by-segment identification codes.

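Exhibit E.2 An Example of Sample Segment Assignment in Pseudo FI Regions in 1999, 2000, and 2001 NHSDAs

| Pseudo FI Region | Quarter | 1999 Segment | 2000 Segment | 2001 Segment |
|---|---|---|---|---|
| 1 | 1 | a | a | m |
| 1 | 1 | b | i | i |
| 1 | 1 | c | c | n |
| 1 | 1 | d | j | j |
| 1 | 1 | e | e | o |
| 1 | 1 | f | k | k |
| 1 | 1 | g | g | p |
| 1 | 1 | h | l | l |

FI = field interviewer.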

Let a, b, c, d, e, f, g, and h denote the eight segments in quarter 1 of pseudo FI region 1 in 1999. Approximately half of the eight segments represented cases where the 1999 segments were reused in 2000 (i.e., common segments a, c, e, and g in 1999 and 2000), and the remaining segments b, d, f, and h represented cases where 1999 segments were linked with new 2000 replacement segments i, j, k, and l. Similarly, between 2000 and 2001, segments i, j, k, and l are common segments, whereas segments a, c, e, and g are linked to new segments m, n, o, and p.

Next, the eight linked 1999 and 2000 segment pairs were stratified into two strata: the common segment pairs (the same segment in both years) and the uncommon segment pairs (a 1999 segment linked to its 2000 replacement). One segment pair was then randomly drawn from each of these strata for each of the four pseudo small States, so that each pseudo State received one pair with a common segment in the 1999 and 2000 surveys and one pair with uncommon segments for 1999 and 2000. The 2001 segments then were forced to go into the same pseudo States according to the linkage between the 2000 and 2001 sample segments. For example, if segment "g" was assigned to pseudo State 1 in 1999, then segment "g" in 2000 (common between 1999 and 2000) and its linked 2001 replacement segment "p" also were forced to go into pseudo State 1. Exhibit E.3 demonstrates a typical assignment of segments among the four pseudo States for the 1999, 2000, and 2001 NHSDAs.

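Exhibit E.3 Typical Assignment of Segments among Four Pseudo States for 1999, 2000, and 2001 NHSDAs

| Pseudo FI Region | Quarter | Pseudo State | 1999 Segment | 2000 Segment | 2001 Segment |
|---|---|---|---|---|---|
| 1 | 1 | 1 | g | g | p |
| 1 | 1 | 1 | b | i | i |
| 1 | 1 | 2 | a | a | m |
| 1 | 1 | 2 | h | l | l |
| 1 | 1 | 3 | e | e | o |
| 1 | 1 | 3 | d | j | j |
| 1 | 1 | 4 | c | c | n |
| 1 | 1 | 4 | f | k | k |

FI = field interviewer.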

This subsampling validation exercise was repeated for all four quarters in a pseudo FI region and for all 12 pseudo FI regions in each of the eight large States. This resulted in 32 (8 large States × 4 subsamples from each large State) pseudo small States from eight large States. These pseudo small States mimicked the design properties of small States with the 50 percent sample segment overlap preserved across adjacent survey years.
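The replicate formation for a single pseudo FI region-by-quarter set can be sketched as below. This is a simplified illustration of the stratify-then-draw step, not the procedure's actual implementation: the function name `form_pseudo_states`, the data layout, and the use of a seeded random generator are assumptions. The segment labels follow the example in Exhibit E.2.

```python
import random


def form_pseudo_states(common_pairs, replaced_pairs, link_2001, rng):
    """Split eight linked segment pairs into four pseudo small States.

    common_pairs:   (1999, 2000) pairs where the 1999 segment was reused.
    replaced_pairs: (1999, 2000) pairs where a new 2000 segment was linked.
    link_2001:      maps each 2000 segment to its linked 2001 segment.
    Each pseudo State receives one pair from each stratum, and the 2001
    segment follows its 2000 segment into the same pseudo State.
    """
    common = list(common_pairs)
    replaced = list(replaced_pairs)
    rng.shuffle(common)    # random draw without replacement from stratum 1
    rng.shuffle(replaced)  # random draw without replacement from stratum 2
    states = {}
    for k, (c_pair, r_pair) in enumerate(zip(common, replaced), start=1):
        states[k] = [(s99, s00, link_2001[s00]) for s99, s00 in (c_pair, r_pair)]
    return states


common = [("a", "a"), ("c", "c"), ("e", "e"), ("g", "g")]
replaced = [("b", "i"), ("d", "j"), ("f", "k"), ("h", "l")]
link_2001 = {"a": "m", "c": "n", "e": "o", "g": "p",
             "i": "i", "j": "j", "k": "k", "l": "l"}
states = form_pseudo_states(common, replaced, link_2001, random.Random(0))
```

Repeating this for every quarter and pseudo FI region, and concatenating the results by pseudo State number, would give each pseudo small State the same 50 percent year-to-year segment overlap as a real small sample State.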

E.2.2 Results of Validating the Small Area Estimates of Change Between 1999–2000 and 2000–2001

Tables E.9 to E.12 present the internal benchmark estimate (labeled "design-based") and the corresponding average estimate using the SAE procedures for the four substance use measures for each of the eight large States and the relative absolute bias (RAB) for each of the substance use measures. The estimate in each case is the odds of having used the substance in 2000–2001 divided by the odds of having used the substance in 1999–2000. In general, the average relative biases for the age 12 or older population are fairly small for substance use measures with larger prevalence rates and somewhat larger for the others. The average relative bias is worst for past year use of cocaine (12.7 percent for the population age 12 or older). Note, however, that the relative bias is generally conservative, producing SAE odds ratios that are closer to "no change" relative to the design-based odds ratios. For example, of the 32 pairs of State-by-age group estimates for cocaine, the SAE odds ratios are closer to 1.0 for 29 of the pairs and the design-based odds ratios are closer to 1.0 for only 3 pairs.
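The two comparisons in this paragraph can be sketched as follows. The report does not spell out the RAB formula here, so the definition below (absolute difference divided by the design-based benchmark) is an assumption, as is measuring "closer to 1.0" by absolute distance on the odds ratio scale. The function names and the odds ratio pairs are made up for illustration.

```python
def relative_absolute_bias(sae, design):
    """Assumed definition: |SAE - design-based| / |design-based|."""
    return abs(sae - design) / abs(design)


def closer_to_no_change(sae, design):
    """True when the SAE odds ratio is nearer to 1.0 than the benchmark."""
    return abs(sae - 1.0) < abs(design - 1.0)


# Made-up (SAE odds ratio, design-based odds ratio) pairs.
pairs = [(0.95, 0.85), (1.05, 1.20), (1.10, 1.02)]
rabs = [relative_absolute_bias(sae, design) for sae, design in pairs]
n_closer = sum(closer_to_no_change(sae, design) for sae, design in pairs)
```

In this toy input, two of the three SAE odds ratios sit closer to 1.0 than their benchmarks, which is the kind of tally the text reports as 29 of 32 for cocaine.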

Table E.3 presents the ratio of the widths of the 95 percent prediction intervals from the SAE data to the widths of the 95 percent confidence intervals from a direct estimate based on the same size sample. The estimates in the table are based on the recalculated (larger) estimate of the correlation between the two 2-year moving averages. As one can see, the 95 percent prediction intervals are much narrower on average for each of the four substance measures validated, with width ratios ranging from 0.60 for past month use of marijuana and past year use of cocaine to 0.77 for past month use of cigarettes for persons age 12 or older. This represents an improved precision that is equivalent to a sample size almost 3 times larger for marijuana and cocaine and about 2 times larger for cigarettes, relative to the precision obtained from the corresponding direct design-based estimate.
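The link between a width ratio and an equivalent sample size can be made explicit with a short calculation. It assumes that interval width is proportional to the inverse square root of the sample size, which is the usual large-sample approximation and is not stated in the text; the symbols $r$, $n_{\text{direct}}$, and $n_{\text{equiv}}$ are introduced here for illustration.

```latex
% Assumption: interval width is proportional to 1/sqrt(n).
% r = (SAE prediction interval width) / (design-based confidence interval width)
\[
  r = \sqrt{\frac{n_{\text{direct}}}{n_{\text{equiv}}}}
  \quad\Longrightarrow\quad
  \frac{n_{\text{equiv}}}{n_{\text{direct}}} = \frac{1}{r^{2}}
\]
\[
  r = 0.60:\ \frac{1}{0.60^{2}} \approx 2.78
  \qquad
  r = 0.77:\ \frac{1}{0.77^{2}} \approx 1.69
\]
```

Under this approximation, a ratio of 0.60 corresponds to a sample almost 3 times larger, and a ratio of 0.77 to one roughly 1.7 times larger, which the text rounds to about 2.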

E.3. Validation of Combined Prevalence-Level Estimates for 1999–2000

The 2-year estimates had been validated in the 2000 State report for four variables: past month use of marijuana, past year use of cocaine, past month binge alcohol use, and past month use of cigarettes. The results of that validation are repeated here in Tables E.13 to E.16. On average, the relative absolute biases (RABs) were quite small. For the 12 or older age group, the RABs were as follows:

Also, compared with the design-based confidence intervals, the 95 percent prediction intervals were much narrower: about 75 percent as wide for marijuana, binge alcohol, and cigarettes and 65 percent as wide for cocaine (Table E.17).

In addition, the 2-year estimates were compared with the corresponding 1-year estimates to ascertain the extent of improvement in estimation for the 42 States and the District of Columbia, given that those sample sizes would now be approximately double their size in 1999. For example, comparing the prediction intervals' widths across the 50 States and the District of Columbia, the SAE average prediction interval width for past month use of marijuana among persons 12 or older was 2.40 percent in 1999, but only 1.98 percent for 1999 and 2000 combined (see Section B.4.2 from Wright, 2002b). Just as importantly, because the States (and the District of Columbia) had smaller single-year sample sizes, the national model had a greater relative influence in the SAE estimates for 1999 than for 1999 and 2000 combined. Therefore, the 1999–2000 pooled State estimates would not be shrunk as much toward the national model-based estimate as would similar estimates based on a single year of data. One result is that the 2-year small area estimates would tend to be closer to their corresponding design-based estimates than small area estimates based on a single year of data. The other implication is that States with design-based estimates that were relatively lower or higher than other States would retain that distinction, and the overall range and spread of the State estimates would tend to be larger, for example, than it was in 1999. This should make it easier to identify States that have notably lower or higher substance use prevalence rates than other States.
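The shrinkage argument in this paragraph can be illustrated with a textbook normal-normal model, in which the weight on a State's direct estimate rises as its sample size grows. This is a deliberately simplified stand-in, not the survey-weighted logistic mixed model used for the report, and every name and number below is an assumption made for the illustration.

```python
def shrinkage_weight(n, within_var, between_var):
    """Weight placed on the direct estimate in a normal-normal model.

    within_var / n is the sampling variance of the direct estimate, and
    between_var is the variance of true State values around the model.
    """
    sampling_var = within_var / n
    return between_var / (between_var + sampling_var)


def shrunk_estimate(direct, model, n, within_var, between_var):
    """Weighted compromise between the direct and model-based estimates."""
    w = shrinkage_weight(n, within_var, between_var)
    return w * direct + (1.0 - w) * model


# Made-up inputs: one year of data versus two pooled years for one State.
direct, model = 0.08, 0.05
w_one_year = shrinkage_weight(900, within_var=0.05, between_var=0.0001)
w_two_years = shrinkage_weight(1800, within_var=0.05, between_var=0.0001)
est_one_year = shrunk_estimate(direct, model, 900, 0.05, 0.0001)
est_two_years = shrunk_estimate(direct, model, 1800, 0.05, 0.0001)
```

Doubling the sample size raises the weight on the direct estimate, so the pooled 2-year estimate stays closer to the State's own data and the spread of estimates across States is preserved better.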

E.4. Caveats

Some of the caveats regarding SAE are addressed in Chapter 7 in Volume I of this report. Tables E.18 to E.20 show the screening, interview, and overall response rates for the 50 States and the District of Columbia for 1999, 2000, and 2001, respectively. The response rates are somewhat higher in both 2000 and 2001 than in 1999.

In 2001, an incentive experiment was embedded in the regular data collection during quarters 1 and 2. For that experiment, small random samples were selected in each State proportionate to their population size, and sampled persons were assigned to receive $0, $20, or $40 for completing the questionnaire. Analysis of that data revealed that the response rates were significantly higher among those receiving an incentive than among those who did not receive an incentive and that the overall cost of the survey was less due to the much smaller number of callbacks that were necessary (Eyerman & Bowman, 2002). Initial analysis of that data did not indicate any significant differences in estimated prevalence levels between the incentive and nonincentive cases; however, subsequent analysis has revealed higher prevalence rates for the incentive cases for some of the substance measures. Because the incentive sample size is relatively small compared to the total State sample size, the decision was made to combine both incentive and nonincentive samples in 2001 to produce the national estimates and to produce the State estimates for 2000 and 2001 combined. For example, the incentive sample size for Alabama totaled 98 cases that received either the $20 or $40 incentive (Table E.21), but the total sample size for 2000–2001 for Alabama was 1,821 (Table E.22). The largest allocation of incentive sample cases was in Illinois. There, 442 cases received either the $20 or $40 incentive out of a total combined sample size of 7,218, about 6 percent. Table E.22 also presents the State sample sizes for 1999 through 2001. Table E.21 presents the State sample allocations for just the incentive experiment.

One other possible contributor to bias in the State estimates, and the estimates in general, is the effect of editing and imputation of the summary data. In developing the editing and imputation process for 1999 and subsequent years, the desire was to minimize the amount of editing because of its somewhat subjective nature, and instead let the random imputation process supply any partially missing information. Overall, the percentage of imputed information is quite small for any given substance.

The imputation method is based on a multivariate imputation in which some demographic and other substance use information from the respondent is used to determine a donor who is similar in those characteristics but has supplied data for the drug in question (Grau et al., 2001, 2002, 2003). Often, information also is available from the partial respondent on the recency of drug use. For example, respondents may have indicated that they used the drug in their lifetime or in the past year, but left blank the question about use in the past month. For many of the records, this type of auxiliary information was available. In a small portion of cases, no auxiliary information was available, and a random donor with similar drug use patterns and demographic characteristics was used. For the different substances, the largest differences between the edited and the imputed estimates typically occurred when there was a lot of auxiliary information. For past month use of marijuana, based on the 1999 data, the State with the largest percentage change from edited to imputed data was Alabama, whose edited rate of use of marijuana was 2.1 percent and whose imputed rate of use was 3.1 percent, a relative increase of almost 50 percent.

E.5. SAE Methodology

E.5.1 Background

In response to the need for State-level information on substance abuse problems, SAMHSA began developing and testing SAE methods for the NHSDA in 1994 under a contract with RTI of Research Triangle Park, North Carolina. That developmental work used logistic regression models with data from the combined 1991 to 1993 NHSDAs and local area indicators, such as drug-related arrests, alcohol-related death rates, and block group/tract-level characteristics from the 1990 Census that were found to be associated with substance abuse. In 1996, the results were published for 25 States for which there were sufficient sample data (OAS, 1996). A subsequent report described the methodology in detail and noted areas in which improvements were needed (Folsom & Judkins, 1997).

The increasing need for State-level estimates of substance use led to the decision to expand the NHSDA to provide estimates for all 50 States and the District of Columbia on an annual basis beginning in 1999. It was determined that, with the use of modeling similar to that used with the 1991 to 1993 NHSDA data in conjunction with a sample designed for State-level estimation, a sample of about 67,500 persons would be sufficient to make reasonably precise estimates.

The State-based NHSDA sample design implemented in 1999 through 2001 had the following characteristics:

In preparation for the modeling of the 1999 data, RTI used the data from the combined 1994–1996 NHSDAs to develop an improved methodology that utilized more local area data and produced better estimates of the accuracy of the State estimates (Folsom, Shah, & Vaish, 1999). That effort involved the development of procedures that would validate the results for geographic areas with large samples. This work was reviewed by a panel with SAE expertise.1 They approved of the methodology, but suggested further improvements for the modeling to be used to produce the 1999 State estimates. Those improvements were incorporated into the methodology finally used for the 1999 State estimates. Similar methodology (as described earlier) was used for the 2000 State report and this 2001 State report. The SWHB methodology is described below.

E.5.2 Goals of Modeling

There were several goals underlying the estimation process. The first was to model drug use at the lowest possible level and aggregate over the levels to form the State estimates. The chosen level of aggregation was the 32 age group (12 to 17, 18 to 25, 26 to 34, 35+) by race/ethnicity (white, non-Hispanic; black, non-Hispanic; Hispanic; Other non-Hispanic) by gender cells at the block group level. Estimated population counts were obtained from a private vendor for each block group for each of the 32 cells. This level of aggregation was desired because the NHSDA first stage of sample selection was at the block group level, so that there would be data at this level to fit a model. In addition, there was a great deal of information from the Census at the block group level that could be used as predictors in the models. If prevalence rates could be estimated for each of the 32 cells at the block group level, it would only be necessary to multiply the rates by the estimated population counts and aggregate to the State level.

Another goal of the estimation process was to include the sampling weight in the model in such a way that the small area estimates would converge to the design-based (sample-weighted) estimates when they were aggregated to a sufficient sample size. There was a desire for the estimates to have this characteristic so that there would be consistency with the survey-weighted national estimates based on the entire sample.

A third goal was to include as much local source data as possible, especially data related to each substance use measure. This would help provide a better fit beyond the strictly sociodemographic information. The desire was to use national sources of these data so that there would be consistency of collection and estimation methodology across States.

Recognizing that estimates based solely on these "fixed" effects would not reflect differences across States due to differences in laws, enforcement activities, advertising campaigns, outreach activities, and other such unique State contributions, a fourth goal was to include "random" effects to compensate for these differences. The types of random effects that could be supported by the NHSDA data were a function of the size of sample and the model fit to the sample data. Random effects were included at the State level and for substate regions comprising three FI regions. Although this grouping of the three FI regions was principally motivated by the need to accumulate enough of a sample to support good model fitting for the low-prevalence NHSDA outcomes, it also was reasoned that it would be possible to produce substate hierarchical Bayes (HB) estimates for areas comprised of these FI region groups, once 2 or 3 years of NHSDA data were available, because that would yield substate region samples of at least 400 respondents. For substate areas that do not conform to the substate region boundaries (e.g., counties and large municipalities), HB estimates could be derived from their elemental block group-level contributions, but the design-based data employed in the estimation of the associated substate region effects would not be restricted to the county or city of interest. This mismatch of FI region and county/large municipality boundaries weakens the theoretical appeal of the associated HB estimate. For this reason, substate HB estimates probably should be restricted to areas that can be matched reasonably well to FI region groups.

One of the difficulties of typical SAE has been obtaining good estimates of the accuracy of the SAEs with prediction intervals that give a good representation of the true probability of coverage of the intervals. Therefore, the final major goal was to provide accurate prediction intervals—ones that would approach the usual sample-based intervals as the sample size increases.

E.5.3 Variables Modeled

In the 2001 NHSDA, a set of 19 measures covering a variety of aspects of substance use and abuse was designated for estimation. For the first 12, three estimates have been produced: one set based on pooled 1999 and 2000 NHSDA data, another set based on pooled 2000 and 2001 NHSDA data, and a third set measuring the change between the first two estimates. Estimates of change between two consecutive single years had not been precise enough for annual changes of the observed size to be declared statistically significant. For the next six variables, only estimates based on the pooled 2000 and 2001 data were possible because the definitions of those variables had changed between 1999 and 2000. The final variable, serious mental illness (SMI), was added in 2001. The 19 outcome variables are listed as follows:

  1. past month use of any illicit drug,
  2. past month use of marijuana,
  3. perceptions of great risk of smoking marijuana once a month,
  4. average annual rates of first use of marijuana,
  5. past month use of any illicit drug other than marijuana,
  6. past year use of cocaine,
  7. past month use of alcohol,
  8. past month binge alcohol use,
  9. perceptions of great risk of having five or more drinks of an alcoholic beverage once or twice a week,
  10. past month use of any tobacco product,
  11. past month use of cigarettes,
  12. perceptions of great risk of smoking one or more packs of cigarettes per day,
  13. past year alcohol dependence or abuse,
  14. past year alcohol dependence,
  15. past year any illicit drug dependence or abuse,
  16. past year any illicit drug dependence,
  17. past year dependence or abuse for any illicit drug or alcohol,
  18. past year treatment gap, and
  19. past year serious mental illness.

E.5.4 Predictors Used in Logistic Regression Models

Local area data used as potential predictor variables in the logistic regression models were obtained from several sources, including Claritas, the Census Bureau, the FBI (Uniform Crime Reports [UCR]), the Health Resources and Services Administration (Area Resource File [ARF]), SAMHSA (Uniform Facility Data Set [UFDS]), and the National Center for Health Statistics (mortality data). The following lists give the sources and the specific independent variables that were potential predictors in the models.

Claritas Data
Description Level
% Population aged 0–18 in block group Block group
% Population aged 19–24 in block group Block group
% Population aged 25–34 in block group Block group
% Population aged 35–44 in block group Block group
% Population aged 45–54 in block group Block group
% Population aged 55–64 in block group Block group
% Population aged 65+ in block group Block group
% Blacks in block group Block group
% Hispanics in block group Block group
% Other race in block group Block group
% Whites in block group Block group
% Males in block group Block group
% Females in block group Block group
% American Indian, Eskimo, Aleut in tract Tract
% Asian, Pacific Islander in tract Tract
% Population aged 0–18 in tract Tract
% Population aged 19–24 in tract Tract
% Population aged 25–34 in tract Tract
% Population aged 35–44 in tract Tract
% Population aged 45–54 in tract Tract
% Population aged 55–64 in tract Tract
% Population aged 65+ in tract Tract
% Blacks in tract Tract
% Hispanics in tract Tract
% Other race in tract Tract
% Whites in tract Tract
% Males in tract Tract
% Females in tract Tract
% Population aged 0–18 in county County
% Population aged 19–24 in county County
% Population aged 25–34 in county County
% Population aged 35–44 in county County
% Population aged 45–54 in county County
% Population aged 55–64 in county County
% Population aged 65+ in county County
% Blacks in county County
% Hispanics in county County
% Other race in county County
% Whites in county County
% Males in county County
% Females in county County

1990 Census Data
Description Level
% Population who dropped out of high school Tract
% Housing units built in 1940–1949 Tract
% Persons 16–64 with a work disability Tract
% Hispanics who are Cuban Tract
% Females 16 years or older in labor force Tract
% Females never married Tract
% Females separated/divorced/widowed/other Tract
% One-person households Tract
% Female head of household, no spouse, child <18 Tract
% Males 16 years or older in labor force Tract
% Males never married Tract
% Males separated/divorced/widowed/other Tract
% Housing units built in 1939 or earlier Tract
Average persons per room Tract
% Families below poverty level Tract
% Households with public assistance income Tract
% Housing units rented Tract
% Population 9–12 years of school, no high school diploma Tract
% Population 0–8 years of school Tract
% Population with associate's degree Tract
% Population some college and no degree Tract
% Population with bachelor's, graduate, professional degree Tract
Median rents for rental units Tract
Median value of owner-occupied housing units Tract
Median household income Tract

Uniform Crime Report Data
Description Level
Drug possession arrest rate County
Drug sale/manufacture arrest rate County
Drug violations' arrest rate County
Marijuana possession arrest rate County
Marijuana sale/manufacture arrest rate County
Opium cocaine possession arrest rate County
Opium cocaine sale/manufacture arrest rate County
Other drug possession arrest rate County
Other dangerous non-narcotics arrest rate County
Serious crime arrest rate County
Violent crime arrest rate County
Driving under influence arrest rate¹ County

Other Categorical Data
Description Source Level
=1 if Hispanic, =0 otherwise Sample Person
=1 if non-Hispanic Black, =0 otherwise Sample Person
=1 if non-Hispanic Other, =0 otherwise Sample Person
=1 if male, =0 if female Sample Person
=1 if Northeast region, =0 otherwise 1990 Census State
=1 if Midwest region, =0 otherwise 1990 Census State
=1 if South region, =0 otherwise 1990 Census State
=1 if MSA with 1 million +, =0 otherwise 1990 Census County
=1 if MSA with <1 million, =0 otherwise 1990 Census County
=1 if non-MSA urban, =0 otherwise 1990 Census Tract
=1 if underclass tract Urban Institute Tract
=1 if no Cubans in tract, =0 otherwise 1990 Census Tract
=1 if urban area, =0 if rural area 1990 Census Tract
=1 if no arrests for dangerous non-narcotics, =0 otherwise UCR County

Miscellaneous Data
Variable Description Source Level
Alcohol death rate, direct cause ICD-9 County
Alcohol death rate, indirect cause ICD-9 County
Cigarettes death rate, direct cause ICD-9 County
Cigarettes death rate, indirect cause ICD-9 County
Drug death rate, direct cause ICD-9 County
Drug death rate, indirect cause ICD-9 County
Alcohol treatment rate UFDS County
Alcohol and drug treatment rate UFDS County
Drug treatment rate UFDS County
% Families below poverty level ARF County
Unemployment rate ARF County
Per capita income (in thousands) ARF County
Food stamp participation rate Census Bureau County
Single state agency maintenance of effort² National Association of State Alcohol and Drug Abuse Directors (NASADAD) State
Block grant awards² SAMHSA State
Cost of Services Factor Index (2001–2003)² SAMHSA State
Total Taxable Resources Per Capita Index (1998)² U.S. Department of Treasury State
Average suicide rate (1996–1998, per 10,000)¹ ARF County
¹ Indicates additional predictors used to model serious mental illness for 2001.
² Indicates additional predictors used to model treatment gap for 2000–2001.

E.5.5 Selection of Independent Variables for the Models

For serious mental illness (SMI), modeled using 2001 data alone, independent variables for each age group were identified by a Chi-squared Automatic Interaction Detector (CHAID) algorithm, which does not use sample weights. Prior to this process, all the continuous variables were categorized using deciles and were treated as ordinal in CHAID. Region was treated as a nominal categorical variable in CHAID. Significant (at the 3 percent level) independent variables from each age group model and final nodes in the tree-growing process were identified as predictor variables destined for inclusion at a later step.
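The decile categorization of a continuous predictor can be sketched as follows. The report does not say how the decile cut points were computed, so this sketch assumes unweighted sample deciles from Python's standard library; the function name `decile_categories` and the input values are hypothetical.

```python
import bisect
import statistics


def decile_categories(values):
    """Map each value to an ordinal category from 1 to 10.

    Assumption: cut points are the nine unweighted sample deciles of
    `values`; the report does not specify the exact rule.
    """
    cuts = statistics.quantiles(values, n=10)  # nine cut points
    # bisect_right counts how many cut points are <= the value.
    return [bisect.bisect_right(cuts, v) + 1 for v in values]


# Hypothetical predictor values, e.g., a county-level rate.
example_values = list(range(1, 101))
example_categories = decile_categories(example_values)
```

The resulting ordinal codes are the kind of input that a tree-growing or stepwise procedure could treat as 10-level categorized variables.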

Independently, a SAS stepwise logistic regression model was fit for each age group. The SAS stepwise was used because it was able to quickly run all of the variables for all of the models, although it was recognized that the software would not take into account the complex sample design. The independent variables included all the first-order or linear polynomial trend contrasts across the 10 levels of the categorized variables plus the gender, region, and race variables. Significant variables (at the 3 percent level) were identified from this process. Based on the combined list from CHAID and SAS, a list of variables was created that included the corresponding second- and third-order polynomials and the interaction of the first-order polynomials with the gender, race, and region variables.

Next, the variables were entered into a SAS stepwise logistic model at the 1 percent significance level. Because of past concerns about overfitting of the data in earlier estimation using the 1991 to 1993 NHSDA data, the significance levels were made quite stringent. These variables were then entered into a SUrvey DAta ANalysis (SUDAAN) logistic regression model because the SUDAAN software would adjust for the effects of the weights and other aspects of the complex sample design (RTI, 2001). All variables that were still significant at the 1 percent significance level were entered into the survey-weighted hierarchical Bayes (SWHB) process.

For outcome variables modeled using pooled 2000 and 2001 data, the predictor set was the same one used in the 1999–2000 analyses, which was obtained using the same variable selection method described above for SMI.

E.5.6 General Model Description

The model can be characterized as a complex mixed model (including both fixed and random effects) of the form:

$$\lambda = X\beta + ZU$$

Each of the symbols represents a matrix or vector. The leading term $X\beta$ is the usual (fixed) regression contribution, and $ZU$ represents random effects for the States and field interviewer (FI) region groups that the data will support and for which estimates are desired. Not obvious from the notation is that the model is a logistic model used to estimate dichotomous data. The vector $\lambda$ has elements $\lambda_{ijk} = \log[\pi_{ijk}/(1 - \pi_{ijk})]$, where $\pi_{ijk}$ is the propensity for the $k$th person in the $j$th FI composite region in the $i$th State to engage in the behavior of interest (e.g., to use marijuana in the past month). Also not obvious from the notation is that the model fitting utilizes the final "sample" weights as discussed above. The "sample" weights have been adjusted for nonresponse and poststratified to known Census counts.

The estimate for each State behaves like a "weighted" average of the design-based estimate in that State and the predicted value based on the national regression model. The "weights" in this case are functions of the relative precision of the sample-based estimate for the State and the predicted estimate based on the national model. The eight large States have large samples, and thus more "weight" is given to the sample estimate relative to the model-based regression estimate. The 42 small States and the District of Columbia put relatively more "weight" on the regression estimate because of their smaller samples. The national regression estimate actually uses national parameters that are based on the pooled 2000 and 2001 sample; however, the regression estimate for a specific State is based on applying the national regression parameters to that State's "local" county, block group, and tract-level predictor variables and summing to the State level. Therefore, even the national regression component of the estimate for a State includes "local" State data.
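The "weighted average" behavior described above can be illustrated with a simple precision-weighted composite. This is only an analogy: the hierarchical Bayes procedure produces this behavior through the posterior distribution rather than through an explicit formula, and the function name `composite_estimate` and all numbers are hypothetical.

```python
def composite_estimate(direct, var_direct, model, var_model):
    """Combine a design-based and a model-based estimate.

    The weight on the design-based estimate is its share of the total
    precision (inverse variance). Illustrative only.
    """
    precision_direct = 1.0 / var_direct
    precision_model = 1.0 / var_model
    weight = precision_direct / (precision_direct + precision_model)
    return weight * direct + (1.0 - weight) * model, weight


# Hypothetical large-sample State: precise design-based estimate.
large_estimate, large_weight = composite_estimate(0.08, 0.0001, 0.06, 0.0004)
# Hypothetical small-sample State: imprecise design-based estimate.
small_estimate, small_weight = composite_estimate(0.08, 0.0016, 0.06, 0.0004)
```

With these inputs the large-sample State stays close to its design-based value, while the small-sample State is pulled toward the regression prediction.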

The goal then was to obtain the best estimates of $\beta$ and $U$. These would lead to the best estimates of $\lambda$, which would in turn lead to the best estimates of $\pi$. Once $\pi$ has been estimated for each age/race/gender cell within each block group, the results can be weighted by the projected Census population counts at that level to make estimates for any geographic area larger than a block group.
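The chain from fixed and random effects to log odds to propensity can be sketched as follows. The sketch assumes that the random-effects design matrix simply selects one State effect and one FI region group effect for each person; the function name `propensity` and all coefficient values are hypothetical.

```python
import math


def propensity(x, beta, state_effect, region_effect):
    """Return the propensity implied by a mixed logistic model.

    The log odds are the fixed regression contribution plus the State
    and FI region group random effects; the inverse logit converts the
    log odds to a probability.
    """
    log_odds = sum(xi * bi for xi, bi in zip(x, beta))
    log_odds += state_effect + region_effect
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical person: intercept plus one predictor.
baseline = propensity([1.0, 0.3], [-2.0, 0.5], 0.0, 0.0)
# Same person in a State with a positive random effect.
shifted = propensity([1.0, 0.3], [-2.0, 0.5], 0.4, 0.0)
```

A positive State effect raises the log odds and therefore the estimated propensity for every cell in that State.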

In the model fitting for the pooled 2000 and 2001 data, the few predictor variables updated in 2001 were used in both their 2000 and 2001 versions when they appeared in a model. To produce the 2000–2001 pooled small area estimates, the common fixed and random effects were first employed to form State estimates $\pi_{00}$ and $\pi_{01}$ for 2000 and 2001, respectively. These annualized State estimates then were combined as population-weighted averages of the form

$$\pi_{00\text{-}01} = \frac{N_{00}\,\pi_{00} + N_{01}\,\pi_{01}}{N_{00} + N_{01}},$$

where $N_{00}$ and $N_{01}$ are the 2000 and 2001 population projections obtained from Claritas Inc.
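A population-weighted average of two annual estimates can be computed as in the following minimal sketch. The function name `pooled_estimate` and the example rates and population counts are hypothetical.

```python
def pooled_estimate(rate_2000, pop_2000, rate_2001, pop_2001):
    """Return the population-weighted average of two annual estimates."""
    total_users = pop_2000 * rate_2000 + pop_2001 * rate_2001
    return total_users / (pop_2000 + pop_2001)


# Hypothetical State: the 2001 population is twice the 2000 population,
# so the 2001 estimate receives twice the weight.
pooled = pooled_estimate(0.05, 1000, 0.08, 2000)
```

When the two population counts are nearly equal, as is typical for consecutive years, the pooled value is close to the simple mean of the two annual estimates.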

E.5.7 Implementation of Modeling

The solution to the model equation in Section E.5.6 is not straightforward; it involves a series of iterative steps to generate values of the desired fixed and random effects from the underlying joint distribution. The basic process can be described as follows.

Let $\beta$ denote the matrix of fixed effects, $\eta$ the matrix of State random effects $\eta_i$ ($i = 1, \ldots, 51$), and $\nu$ the matrix of FI composite region effects $\nu_{ij}$ for region $j$ within State $i$. Because the goal is to estimate separate models for four age groups, it is assumed that the random effect vectors are four-variate Normal with null mean vectors and 4×4 covariance matrices $D_\eta$ and $D_\nu$, respectively. To estimate the individual effects, a Bayesian approach is used to represent the joint density function given the data $y$ by $f(\beta, \eta, \nu, D_\eta, D_\nu \mid y)$. According to the Bayes process, this can be estimated once the conditional distributions are known:

$f(\beta \mid y, \eta, \nu, D_\eta, D_\nu)$, $f(D_\eta, D_\nu \mid y, \beta, \eta, \nu)$, and $f(\eta, \nu \mid y, \beta, D_\eta, D_\nu)$.

To generate random draws from these distributions, Markov chain Monte Carlo (MCMC) processes need to be used. MCMC is a body of methods for generating pseudo-random draws from probability distributions via Markov chains. A Markov chain is fully specified by its starting distribution $P(X_0)$, where $X_0$ is the starting point, and its transition kernel $P(X_t \mid X_{t-1})$, where $t$ indexes the current step.

Each MCMC step that involves the vector of binary outcome variables y in the conditioning set needs first to be modified by defining a pseudolikelihood using survey weights. In defining pseudolikelihood, weights are introduced after scaling them to the effective sample size based on a suitable design effect. Note that with the pseudolikelihood, the covariance matrix of the pseudoscore functions is no longer equal to the pseudoinformation matrix; therefore, a sandwich type of covariance matrix was used to compute the design effect. In this process, weights are largely assumed to be noninformative (i.e., unrelated to the outcome variable y). The assumption of noninformative weights is useful in finding tractable expressions for the appropriate information matrix of the pseudoscore functions. The pseudo log-likelihood remains an unbiased estimate of the finite-population log-likelihood regardless of this assumption.
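A weighted pseudo log-likelihood with rescaled weights can be sketched as follows. The report says only that the weights are scaled to the effective sample size based on a suitable design effect; this sketch assumes the common definition of effective sample size as the number of respondents divided by the design effect. The function names and all inputs are hypothetical.

```python
import math


def scale_weights(weights, design_effect):
    """Rescale weights so that they sum to the effective sample size.

    Assumption: effective sample size = number of respondents divided
    by the design effect.
    """
    effective_n = len(weights) / design_effect
    total = sum(weights)
    return [w * effective_n / total for w in weights]


def pseudo_loglik(outcomes, probabilities, scaled_weights):
    """Weighted Bernoulli log-likelihood for binary outcomes."""
    return sum(
        w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
        for y, p, w in zip(outcomes, probabilities, scaled_weights)
    )


# Hypothetical data: four respondents and a design effect of 2.
scaled = scale_weights([100.0, 200.0, 300.0, 400.0], 2.0)
loglik = pseudo_loglik([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5], scaled)
```

Scaling matters because raw weights that sum to the population size would make the pseudolikelihood far too sharply peaked and would overstate the information in the sample.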

Step I: $f(\beta \mid y, \eta, \nu)$ (this does not depend on $D_\eta$, $D_\nu$)

With a flat prior for $\beta_a$, where $a$ denotes a specific age group, the conditional posterior is proportional to the pseudolikelihood function. For large samples, this posterior can be approximated by the multivariate normal distribution with mean vector equal to the pseudomaximum likelihood estimate and with asymptotic covariance matrix having the associated sandwich form. Assuming that the survey weights are noninformative makes the age group-specific $\beta_a$ vectors conditionally independent of each other. Therefore, each $\beta_a$ can be updated separately at each MCMC cycle.

Step II: $f(\eta_i \mid y, \beta, \nu, D_\eta)$ (this does not depend on $D_\nu$)

Here, the conditional posterior is proportional to the product of the prior distribution of $\eta_i$, the pseudolikelihood function of the data given the parameters, and the prior distribution of $\beta$ and $D_\eta$; this last prior can be omitted because it does not involve $\eta_i$. Calculating the denominator (or normalization constant) of the posterior distribution for $\eta_i$ requires multidimensional integration and is numerically intractable. To get around this problem, the Metropolis-Hastings (M-H) algorithm is used, which requires a dominating density convenient for Monte Carlo sampling. For this purpose, the mode and curvature of the conditional posterior distribution are used; these can be obtained simply from its numerator. A Gaussian distribution with matching mode and curvature then defines the dominating density for M-H. As with the age group-specific $\beta_a$ parameters, the State-specific random effect vectors $\eta_i$ are conditionally independent of each other and can be updated separately at each MCMC cycle.
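A Metropolis-Hastings update that draws candidates from a Gaussian matched to the mode and curvature of the target can be sketched as follows. This is a simplified scalar illustration, not the procedure used for the report: it assumes a single random intercept, an unweighted Bernoulli likelihood, and a Normal prior, whereas the report works with four-variate effects and a survey-weighted pseudolikelihood. All function names and inputs are hypothetical.

```python
import math
import random


def log_target(eta, successes, n, prior_var):
    """Unnormalized log posterior of a scalar random intercept.

    Bernoulli likelihood with logit(p) = eta, Normal(0, prior_var) prior.
    """
    loglik = successes * eta - n * math.log1p(math.exp(eta))
    return loglik - eta * eta / (2.0 * prior_var)


def mode_and_curvature(successes, n, prior_var, iterations=50):
    """Find the posterior mode by Newton's method; return (mode, curvature)."""
    eta = 0.0
    for _ in range(iterations):
        p = 1.0 / (1.0 + math.exp(-eta))
        gradient = successes - n * p - eta / prior_var
        curvature = n * p * (1.0 - p) + 1.0 / prior_var  # negative 2nd derivative
        eta += gradient / curvature
    p = 1.0 / (1.0 + math.exp(-eta))
    return eta, n * p * (1.0 - p) + 1.0 / prior_var


def mh_independence(successes, n, prior_var, draws=2000, seed=0):
    """Independence M-H sampler with a mode/curvature-matched Gaussian proposal."""
    rng = random.Random(seed)
    mode, curvature = mode_and_curvature(successes, n, prior_var)
    sd = 1.0 / math.sqrt(curvature)

    def log_proposal(x):
        return -0.5 * ((x - mode) / sd) ** 2  # normalizing constant cancels

    current = mode
    samples = []
    for _ in range(draws):
        candidate = rng.gauss(mode, sd)
        log_ratio = (
            log_target(candidate, successes, n, prior_var)
            - log_target(current, successes, n, prior_var)
            + log_proposal(current)
            - log_proposal(candidate)
        )
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_ratio:
            current = candidate
        samples.append(current)
    return mode, samples


# Hypothetical data: 30 positive responses out of 100.
posterior_mode, posterior_draws = mh_independence(30, 100, 1.0)
```

Only the numerator of the posterior is evaluated, so the intractable normalization constant never has to be computed; it cancels in the acceptance ratio.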

Step III: $f(\nu_{ij} \mid y, \beta, \eta, D_\nu)$ (this does not depend on $D_\eta$)

This step is similar to Step II.

Step IV: $f(D_\eta \mid \eta)$, $f(D_\nu \mid \nu)$ (here, $\eta$ and $\nu$ include all the information from $y$)

Here, the pseudolikelihood involving design weights comes in implicitly through the conditioning parameters $\eta$ and $\nu$ evaluated at the current cycle. An exact conditional posterior distribution is obtained because the inverse Wishart priors for $D_\eta$ and $D_\nu$ are conjugate.
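The conjugate update for a covariance matrix can be sketched as follows. For zero-mean Normal random effects with an inverse Wishart prior on their covariance matrix, the conditional posterior is again inverse Wishart, with the degrees of freedom increased by the number of effect vectors and the scale matrix increased by the sum of their outer products. The report does not give the prior hyperparameters, so the values below are hypothetical, and a 2×2 example stands in for the 4×4 matrices.

```python
def inverse_wishart_update(prior_df, prior_scale, effects):
    """Return posterior (degrees of freedom, scale matrix).

    `effects` is a list of zero-mean random effect vectors, one per
    State or FI region group, evaluated at the current MCMC cycle.
    """
    dim = len(prior_scale)
    scale = [row[:] for row in prior_scale]  # copy; do not mutate the prior
    for effect in effects:
        for r in range(dim):
            for c in range(dim):
                scale[r][c] += effect[r] * effect[c]  # outer product term
    return prior_df + len(effects), scale


# Hypothetical prior (identity scale) and two current effect vectors.
posterior_df, posterior_scale = inverse_wishart_update(
    4, [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 2.0]]
)
```

A draw from the inverse Wishart distribution with these posterior parameters would then supply the covariance matrix for the next cycle.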

E.5.8 Remarks

E.6. References

Eyerman, J., & Bowman, K. (2002, January). 2001 National Household Survey on Drug Abuse: Incentive experiment combined quarter 1 and quarter 2 analysis. Rockville, MD: Substance Abuse and Mental Health Services Administration, Office of Applied Studies. [Available as a PDF at /nhsda/methods/incentive.pdf]

Folsom, R. E., & Judkins, D. R. (1997). Substance abuse in states and metropolitan areas: Model based estimates from the 1991–1993 National Household Surveys on Drug Abuse: Methodology report (DHHS Publication No. SMA 97–3140, Methodology Series M-1). Rockville, MD: Substance Abuse and Mental Health Services Administration, Office of Applied Studies. [Available at /methods.htm#methods]

Folsom, R. E., Shah, B., & Vaish, A. (1999). Substance abuse in states: A methodological report on model based estimates from the 1994–1996 National Household Surveys on Drug Abuse. In Proceedings of the Section on Survey Research Methods of the American Statistical Association (pp. 371–375). Washington, DC: American Statistical Association.

Grau, E. A., Bowman, K. R., Giacoletti, K. E. D., Odom, D. M., & Sathe, N. S. (2001, July). Imputation report. In 1999 National Household Survey on Drug Abuse: Methodological resource book (Vol. 1, Section 4). Rockville, MD: Substance Abuse and Mental Health Services Administration, Office of Applied Studies. [Available as a PDF at ]

Grau, E. A., Barnett-Walker, K., Copello, E., Frechtel, P., Licata, A., Liu, B., & Odom, D. M. (2003, May). Imputation report. In 2001 National Household Survey on Drug Abuse: Methodological resource book (Vol. 1, Section 4). Rockville, MD: Substance Abuse and Mental Health Services Administration, Office of Applied Studies. [Available as a PDF at /analytic.htm]

RTI. (2001). SUDAAN user's manual: Release 8.0. Research Triangle Park, NC: RTI.

Wright, D. (2002a). State estimates of substance use from the 2000 National Household Survey on Drug Abuse: Volume I. Findings (DHHS Publication No. SMA 02–3731, NHSDA Series H-15). Rockville, MD: Substance Abuse and Mental Health Services Administration, Office of Applied Studies. [Available at /states.htm]

Wright, D. (2002b). State estimates of substance use from the 2000 National Household Survey on Drug Abuse: Volume II. Supplementary technical appendices (DHHS Publication No. SMA 02–3732, NHSDA Series H-16). Rockville, MD: Substance Abuse and Mental Health Services Administration, Office of Applied Studies. [Available at /states.htm]

Table E.1 Ratio of Average Widths of Change Between the 1999–2000 Pooled Data and the 2000–2001 Pooled Data (Based on the Underestimated Model-Based Correlations)
State  Age 12–17  Age 18–25  Age 26+  Total
Past Month Use of Marijuana
CA 0.84 0.94 0.77 0.72
FL 0.74 1.02 0.72 0.73
IL 0.87 0.99 0.60 0.71
MI 0.74 1.04 0.88 0.84
NY 0.72 0.94 0.54 0.64
OH 0.75 1.01 0.75 0.80
PA 0.74 1.00 0.86 0.86
TX 0.94 1.01 0.38 0.67
Average 0.79 0.99 0.69 0.75
Past Year Use of Cocaine
CA 0.99 0.83 0.59 0.60
FL 0.64 1.20 0.92 1.05
IL 0.90 0.81 0.32 0.50
MI 0.09 0.96 0.79 0.79
NY 0.48 0.75 0.52 0.61
OH 0.44 1.07 0.69 0.87
PA 0.59 0.77 0.46 0.52
TX 0.86 0.97 0.39 0.67
Average 0.63 0.92 0.59 0.70
Past Month Use of Alcohol
CA 0.98 1.08 1.01 1.00
FL 0.82 0.91 1.03 1.01
IL 0.91 1.00 0.92 0.90
MI 0.96 0.99 1.00 0.95
NY 0.98 0.76 0.96 0.96
OH 0.93 0.87 1.09 1.10
PA 0.96 0.83 0.92 0.90
TX 1.25 1.03 1.10 1.07
Average 0.97 0.93 1.01 0.99
Past Month Use of Cigarettes
CA 1.03 1.14 1.02 0.99
FL 0.97 1.05 1.14 1.13
IL 1.04 1.20 1.10 1.12
MI 0.95 1.05 1.04 1.01
NY 0.81 1.10 1.11 1.08
OH 1.05 1.22 1.02 1.02
PA 1.02 1.05 1.10 1.07
TX 1.11 1.27 1.03 1.02
Average 1.00 1.14 1.07 1.05
Note: Ratio = Average width of model-based PIs of change for substates / Average width of design-based CIs of change for substates
Note: The change measure is defined as the odds ratio {P2/(1-P2)}/{P1/(1-P1)}, where P1 is the pooled 1999–2000 small area estimate and P2 is the pooled 2000–2001 small area estimate.
CI = confidence interval; PI = prediction interval.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.
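The change measure and the width ratio defined in the table notes can be computed as in the following minimal sketch. The function names and all prevalence values and interval endpoints are hypothetical and are not taken from the table.

```python
def odds(p):
    """Return the odds corresponding to a prevalence rate p."""
    return p / (1.0 - p)


def change_odds_ratio(p1, p2):
    """Odds ratio of the later pooled estimate (p2) to the earlier one (p1)."""
    return odds(p2) / odds(p1)


def width_ratio(model_intervals, design_intervals):
    """Average model-based interval width over average design-based width."""
    def average_width(intervals):
        return sum(high - low for low, high in intervals) / len(intervals)

    return average_width(model_intervals) / average_width(design_intervals)


# Hypothetical prevalence rates for two pooled periods.
example_change = change_odds_ratio(0.50, 0.75)
# Hypothetical intervals for the change measure in two substate areas.
example_ratio = width_ratio([(0.8, 1.2), (0.9, 1.3)], [(0.7, 1.2), (0.8, 1.3)])
```

A ratio below 1 indicates that the model-based prediction intervals are, on average, narrower than the design-based confidence intervals.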

Table E.2 Average Correlation Between the 1999–2000 and the 2000–2001 Model-Based and Design-Based Estimates (Based on the Underestimated Model-Based Correlations)
State  Age 12–17 (DB, MB)  Age 18–25 (DB, MB)  Age 26+ (DB, MB)  Total (DB, MB)
Past Month Use of Marijuana
CA 0.3204 0.1217 0.4943 0.1508 0.4107 0.3515 0.3273 0.3701
FL 0.5079 0.1998 0.5020 0.1456 0.3114 0.3024 0.3492 0.3308
IL 0.4133 0.1733 0.4996 0.1649 0.5736 0.3816 0.5988 0.3986
MI 0.3316 0.1322 0.4838 0.1203 0.5615 0.3651 0.5476 0.3843
NY 0.4372 0.2003 0.5343 0.1757 0.4083 0.3752 0.4609 0.3991
OH 0.3827 0.1516 0.6195 0.1514 0.5057 0.3990 0.5723 0.3984
PA 0.4838 0.1611 0.5863 0.1533 0.5799 0.3420 0.6406 0.3549
TX 0.5088 0.1337 0.5064 0.1675 0.3134 0.4462 0.4329 0.4362
Average 0.4346 0.1634 0.5321 0.1540 0.4633 0.3725 0.5094 0.3856
Past Year Use of Cocaine
CA 0.4937 0.1827 0.3807 0.1365 0.4240 0.2833 0.4380 0.3131
FL 0.3228 0.2723 0.4839 0.1286 0.6494 0.2852 0.5982 0.2994
IL 0.6058 0.3017 0.4796 0.1452 0.3945 0.2476 0.4316 0.2724
MI 0.4221 0.2550 0.5056 0.1419 0.5341 0.2935 0.5134 0.3233
NY 0.4502 0.2938 0.4186 0.1903 0.4097 0.2728 0.3996 0.3012
OH 0.5629 0.2872 0.4782 0.1389 0.5790 0.2679 0.5704 0.2887
PA 0.3517 0.2333 0.5553 0.1465 0.4333 0.2681 0.4394 0.2972
TX 0.3932 0.2160 0.3400 0.1274 0.2720 0.2830 0.3627 0.2952
Average 0.4455 0.2633 0.4635 0.1453 0.4662 0.2743 0.4726 0.2972
Past Month Use of Alcohol
CA 0.3987 0.0866 0.5756 0.0821 0.5560 0.1390 0.5808 0.1562
FL 0.4226 0.0998 0.5331 0.1181 0.4971 0.1539 0.5078 0.1659
IL 0.3669 0.1073 0.5651 0.0958 0.4712 0.1379 0.4637 0.1563
MI 0.4200 0.1142 0.4815 0.0836 0.5311 0.1466 0.4978 0.1634
NY 0.4680 0.1147 0.4835 0.1540 0.4485 0.1382 0.4914 0.1571
OH 0.3443 0.1063 0.5001 0.1032 0.4647 0.1207 0.4843 0.1383
PA 0.4636 0.0793 0.6181 0.1300 0.4895 0.1264 0.4856 0.1471
TX 0.6342 0.0738 0.5562 0.1084 0.6464 0.1576 0.6509 0.1700
Average 0.4444 0.0990 0.5351 0.1124 0.5083 0.1401 0.5136 0.1569
Past Month Use of Cigarettes
CA 0.3284 0.0717 0.5193 0.0491 0.5963 0.0760 0.5655 0.0910
FL 0.4907 0.0863 0.5048 0.0912 0.5069 0.0788 0.5184 0.0848
IL 0.4375 0.0827 0.5203 0.0861 0.5016 0.0367 0.5550 0.0577
MI 0.4284 0.0440 0.5433 0.0493 0.4787 0.0555 0.4999 0.0647
NY 0.3974 0.0829 0.5050 0.0715 0.4655 0.0581 0.4643 0.0706
OH 0.4731 0.0688 0.5462 0.0461 0.4433 0.0596 0.4696 0.0714
PA 0.4733 0.0734 0.5898 0.0483 0.4217 0.0558 0.4253 0.0727
TX 0.5882 0.0766 0.6083 0.0544 0.6135 0.1101 0.6321 0.1200
Average 0.4659 0.0735 0.5447 0.0634 0.4931 0.0652 0.5108 0.0778
Note: The design-based (DB) correlation is derived from the SUDAAN sampling variance and covariance calculations for P1 and P2, where P1 is the 1999–2000 pooled small area estimate and P2 is the 2000–2001 pooled small area estimate. SUDAAN uses between-replicate, within-FI (field interviewer) region mean squares and cross products. The DB correlation on the log-odds scale is the same as on the prevalence scale. The model-based (MB) correlations are Bayes posterior correlations for the log-odds calculated from the Markov chain Monte Carlo (MCMC) samples. The MB correlations are underestimated because the software cannot properly account for the sampling covariance resulting from the 2000 data overlap.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.
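The link between the correlation of two estimates and the precision of their difference can be sketched as follows. This uses the standard identities for a correlation and for the variance of a difference; the function names and the variance and covariance values are hypothetical.

```python
import math


def correlation(var1, var2, cov):
    """Correlation implied by two variances and a covariance."""
    return cov / math.sqrt(var1 * var2)


def variance_of_difference(var1, var2, cov):
    """Variance of (estimate 2 minus estimate 1)."""
    return var1 + var2 - 2.0 * cov


# Hypothetical variances and covariance for two pooled estimates.
example_correlation = correlation(4.0, 9.0, 3.0)
variance_with_overlap = variance_of_difference(4.0, 9.0, 3.0)
variance_ignoring_overlap = variance_of_difference(4.0, 9.0, 0.0)
```

Because the covariance enters with a negative sign, an underestimated positive correlation overstates the variance of the change and widens the resulting intervals.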

Table E.3 Ratio of Average Widths of Change Between the 1999–2000 Pooled Data and the 2000–2001 Pooled Data (Based on the Appropriately Estimated Model-Based Correlations)
State  Age 12–17  Age 18–25  Age 26+  Total
Past Month Use of Marijuana
CA 0.65 0.72 0.60 0.56
FL 0.55 0.78 0.58 0.58
IL 0.69 0.73 0.49 0.57
MI 0.58 0.76 0.72 0.68
NY 0.56 0.72 0.46 0.54
OH 0.57 0.71 0.62 0.63
PA 0.57 0.71 0.67 0.66
TX 0.68 0.73 0.34 0.55
Average 0.61 0.73 0.56 0.60
Past Year Use of Cocaine
CA 0.71 0.66 0.53 0.54
FL 0.48 0.91 0.76 0.87
IL 0.69 0.64 0.28 0.42
MI 0.07 0.73 0.71 0.71
NY 0.36 0.63 0.44 0.52
OH 0.33 0.82 0.59 0.73
PA 0.47 0.56 0.39 0.43
TX 0.65 0.76 0.36 0.59
Average 0.47 0.71 0.51 0.60
Past Month Use of Alcohol
CA 0.72 0.76 0.73 0.72
FL 0.59 0.66 0.76 0.74
IL 0.67 0.70 0.65 0.63
MI 0.71 0.70 0.73 0.69
NY 0.70 0.55 0.71 0.71
OH 0.68 0.62 0.77 0.77
PA 0.70 0.58 0.66 0.65
TX 0.88 0.71 0.76 0.72
Average 0.71 0.66 0.72 0.70
Past Month Use of Cigarettes
CA 0.77 0.84 0.72 0.70
FL 0.71 0.78 0.81 0.80
IL 0.79 0.85 0.78 0.80
MI 0.69 0.76 0.81 0.78
NY 0.60 0.80 0.82 0.80
OH 0.78 0.84 0.77 0.76
PA 0.77 0.73 0.81 0.78
TX 0.81 0.90 0.75 0.74
Average 0.74 0.81 0.78 0.77
Note: Ratio = (Average width of model-based PIs of change for substates) / (Average width of design-based CIs of change for substates).
Note: The change measure is defined as the odds ratio {P2/(1-P2)}/{P1/(1-P1)}, where P1 is the pooled 1999–2000 small area estimate and P2 is the pooled 2000–2001 small area estimate.
CI = confidence interval; PI = prediction interval.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.
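
The two quantities defined in the notes to Table E.3, the odds-ratio change measure and the ratio of average interval widths, can be sketched in a few lines of Python. The function names and the sample inputs are hypothetical and chosen only for illustration; the substate-level interval widths are not published in this table.

```python
def odds(p: float) -> float:
    """Odds p / (1 - p) for a prevalence p strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    return p / (1.0 - p)


def odds_ratio_change(p1: float, p2: float) -> float:
    """Change measure {P2/(1-P2)} / {P1/(1-P1)} from the table note."""
    return odds(p2) / odds(p1)


def width_ratio(pi_widths, ci_widths) -> float:
    """Average model-based PI width divided by average design-based CI width."""
    avg_pi = sum(pi_widths) / len(pi_widths)
    avg_ci = sum(ci_widths) / len(ci_widths)
    return avg_pi / avg_ci


# Hypothetical inputs (not taken from the survey data).
no_change = odds_ratio_change(0.05, 0.05)           # identical prevalences
example_ratio = width_ratio([0.3, 0.5], [0.5, 0.5])  # made-up interval widths
```

A change measure of 1 means no change in the odds between the two pooled periods, and a width ratio below 1 means the model-based intervals are narrower on average than the design-based ones.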

Table E.4 Average Correlation Between the 1999–2000 and the 2000–2001 Model-Based and Design-Based Estimates (Based on the Appropriately Estimated Model-Based Correlations)
State 12–17 DB 12–17 MB 18–25 DB 18–25 MB 26+ DB 26+ MB Total DB Total MB
Past Month Use of Marijuana
CA 0.3204 0.4760 0.4943 0.4916 0.4107 0.5962 0.3273 0.6235
FL 0.5079 0.5380 0.5020 0.5025 0.3114 0.5441 0.3492 0.5775
IL 0.4133 0.4812 0.4996 0.5351 0.5736 0.5820 0.5988 0.6067
MI 0.3316 0.4588 0.4838 0.5279 0.5615 0.5752 0.5476 0.5944
NY 0.4372 0.5092 0.5343 0.5221 0.4083 0.5293 0.4609 0.5668
OH 0.3827 0.5138 0.6195 0.5711 0.5057 0.5844 0.5723 0.6269
PA 0.4838 0.4861 0.5863 0.5708 0.5799 0.5904 0.6406 0.6112
TX 0.5088 0.5371 0.5064 0.5606 0.3134 0.5498 0.4329 0.6190
Avg. 0.4346 0.5027 0.5321 0.5400 0.4633 0.5659 0.5094 0.6010
Past Year Use of Cocaine
CA 0.4937 0.5673 0.3807 0.4353 0.4240 0.4077 0.4380 0.4349
FL 0.3228 0.5644 0.4839 0.4814 0.6494 0.4919 0.5982 0.5117
IL 0.6058 0.5783 0.4796 0.4570 0.3945 0.4344 0.4316 0.4747
MI 0.4221 0.5396 0.5056 0.4837 0.5341 0.4272 0.5134 0.4568
NY 0.4502 0.5941 0.4186 0.4262 0.4097 0.4536 0.3996 0.4855
OH 0.5629 0.5787 0.4782 0.4728 0.5790 0.4549 0.5704 0.4816
PA 0.3517 0.4995 0.5553 0.5260 0.4333 0.4738 0.4394 0.5086
TX 0.3932 0.5457 0.3400 0.4495 0.2720 0.3754 0.3627 0.4430
Avg. 0.4455 0.5575 0.4635 0.4700 0.4662 0.4434 0.4726 0.4790
Past Month Use of Alcohol
CA 0.3987 0.4984 0.5756 0.5453 0.5560 0.5487 0.5808 0.5625
FL 0.4226 0.5282 0.5331 0.5375 0.4971 0.5352 0.5078 0.5494
IL 0.3669 0.5149 0.5651 0.5542 0.4712 0.5643 0.4637 0.5815
MI 0.4200 0.5062 0.4815 0.5398 0.5311 0.5416 0.4978 0.5613
NY 0.4680 0.5424 0.4835 0.5507 0.4485 0.5272 0.4914 0.5454
OH 0.3443 0.5266 0.5001 0.5399 0.4647 0.5625 0.4843 0.5791
PA 0.4636 0.5073 0.6181 0.5713 0.4895 0.5436 0.4856 0.5583
TX 0.6342 0.5414 0.5562 0.5711 0.6464 0.5996 0.6509 0.6189
Avg. 0.4444 0.5231 0.5351 0.5519 0.5083 0.5533 0.5136 0.5703
Past Month Use of Cigarettes
CA 0.3284 0.4741 0.5193 0.4868 0.5963 0.5384 0.5655 0.5452
FL 0.4907 0.5036 0.5048 0.5014 0.5069 0.5287 0.5184 0.5369
IL 0.4375 0.4614 0.5203 0.5375 0.5016 0.5109 0.5550 0.5240
MI 0.4284 0.5001 0.5433 0.5026 0.4787 0.4268 0.4999 0.4317
NY 0.3974 0.4829 0.5050 0.5028 0.4655 0.4770 0.4643 0.4851
OH 0.4731 0.4763 0.5462 0.5427 0.4433 0.4707 0.4696 0.4830
PA 0.4733 0.4727 0.5898 0.5377 0.4217 0.4913 0.4253 0.4996
TX 0.5882 0.5017 0.6083 0.5293 0.6135 0.5291 0.6321 0.5345
Avg. 0.4659 0.4852 0.5447 0.5210 0.4931 0.4920 0.5108 0.5005
Note: The design-based (DB) correlation is derived from the SUDAAN sampling variance and covariance calculations for P1 and P2, where P1 is the 1999–2000 pooled small area estimate and P2 is the 2000–2001 pooled small area estimate. SUDAAN uses between-replicate, within-FI (field interviewer) region mean squares and cross products. The DB correlation on the log-odds scale is the same as on the prevalence scale. The model-based (MB) correlations are Bayes posterior correlations for the log-odds calculated from the Markov chain Monte Carlo (MCMC) samples. The MB correlations are adjusted to account for the sampling covariance resulting from the 2000 data overlap.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.

Table E.5 Comparison Between the p Values Obtained from Method 1 and Method 2 for Past Month Use of Marijuana
State 12–17 18–25 26 or Older Total
p Value(1) p Value(2) Ratio p Value(1) p Value(2) Ratio p Value(1) p Value(2) Ratio p Value(1) p Value(2) Ratio
CA1 0.889 0.861   0.900 0.865   0.946 0.931   0.940 0.919  
CA2 0.903 0.872   0.974 0.965   0.780 0.709   0.748 0.671  
CA3 0.284 0.162   0.726 0.658   0.851 0.828   0.964 0.956  
CA4 0.756 0.689   0.067 0.024   0.995 0.994   0.467 0.339  
Average 0.708 0.646 0.91 0.667 0.628 0.94 0.893 0.866 0.97 0.780 0.721 0.92
FL1 0.364 0.219   0.809 0.749   0.627 0.581   0.752 0.716  
FL2 0.338 0.211   0.911 0.886   0.998 0.997   0.859 0.821  
FL3 0.601 0.456   0.820 0.758   0.764 0.704   0.831 0.782  
FL4 0.822 0.787   0.325 0.208   0.668 0.578   0.491 0.369  
Average 0.531 0.418 0.79 0.716 0.650 0.91 0.764 0.715 0.94 0.733 0.672 0.92
IL1 0.636 0.560   0.195 0.075   0.628 0.552   0.352 0.238  
IL2 0.999 0.999   0.298 0.174   0.487 0.395   0.290 0.198  
IL3 0.539 0.443   0.205 0.073   0.373 0.291   0.135 0.072  
IL4 0.970 0.959   0.450 0.335   0.634 0.558   0.452 0.344  
Average 0.786 0.740 0.94 0.287 0.164 0.57 0.531 0.449 0.85 0.307 0.213 0.69
MI1 0.521 0.407   0.402 0.278   0.956 0.945   0.863 0.835  
MI2 0.268 0.131   0.410 0.262   0.682 0.589   0.313 0.172  
MI3 0.317 0.228   0.434 0.275   0.875 0.856   0.479 0.397  
MI4 0.695 0.633   0.691 0.576   0.694 0.642   0.542 0.471  
Average 0.450 0.350 0.78 0.484 0.348 0.72 0.802 0.758 0.95 0.549 0.469 0.85
NY1 0.676 0.596   0.995 0.993   0.981 0.979   0.916 0.903  
NY2 0.460 0.303   0.110 0.038   0.584 0.535   0.197 0.132  
NY3 0.438 0.349   0.590 0.453   0.778 0.742   0.534 0.461  
NY4 0.841 0.802   0.474 0.350   0.667 0.609   0.520 0.440  
Average 0.604 0.513 0.85 0.542 0.459 0.85 0.753 0.716 0.95 0.542 0.484 0.89
OH1 0.761 0.694   0.668 0.555   0.583 0.486   0.448 0.324  
OH2 0.228 0.143   0.680 0.568   0.728 0.703   0.676 0.627  
OH3 0.766 0.685   0.543 0.378   0.646 0.551   0.549 0.415  
OH4 0.947 0.924   0.421 0.253   0.628 0.579   0.398 0.283  
Average 0.676 0.612 0.91 0.578 0.439 0.76 0.646 0.580 0.90 0.518 0.412 0.80
PA1 0.136 0.069   0.578 0.429   0.933 0.916   0.532 0.434  
PA2 0.949 0.937   0.704 0.577   0.819 0.778   0.722 0.642  
PA3 0.692 0.594   0.442 0.291   0.582 0.452   0.388 0.238  
PA4 0.466 0.331   0.570 0.442   0.622 0.543   0.409 0.308  
Average 0.561 0.483 0.86 0.574 0.435 0.76 0.739 0.672 0.91 0.513 0.406 0.79
TX1 0.349 0.222   0.657 0.523   0.656 0.602   0.713 0.638  
TX2 0.636 0.547   0.906 0.870   0.327 0.272   0.337 0.241  
TX3 0.786 0.694   0.912 0.885   0.479 0.451   0.497 0.441  
TX4 0.995 0.993   0.679 0.565   0.367 0.331   0.366 0.265  
Average 0.692 0.614 0.89 0.789 0.711 0.90 0.457 0.414 0.91 0.478 0.396 0.83
Average across substates 0.87     0.80     0.92     0.84
Note: p value(1) represents the Bayes significance level obtained from Method 1.
Note: p value(2) represents the Bayes significance level obtained from Method 2.
Note: In Method 1, an eight age-group model was fit, where age groups 1 to 4 correspond to the pooled 1999–2000 data and age groups 5 to 8 correspond to the pooled 2000–2001 data. The p value for this method was obtained by using the variance of the log-odds produced by fitting this model. In Method 2, a 12 age-group model was fit, where age groups 1 to 4 correspond to the 1999 data, age groups 5 to 8 to the 2000 data, and age groups 9 to 12 to the 2001 data. The p values for this method were obtained using the correlation produced by this model and the variances of the log-odds produced in Method 1.
Note: Ratio = Average p value(2) / Average p value(1).
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.
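
The Ratio column and the role of the correlation in the two methods can be illustrated with a short Python sketch. The ratio function follows the table note directly and is checked against the CA 12–17 rows above. The p value function is only an assumption: it uses a two-sided normal approximation as a stand-in for the Bayes significance level, whose exact computation is not given here, and its inputs (log-odds, variances, correlations) are made up.

```python
import math


def ratio_of_averages(p_method1, p_method2) -> float:
    """Ratio = average p value(2) / average p value(1) across substates."""
    avg1 = sum(p_method1) / len(p_method1)
    avg2 = sum(p_method2) / len(p_method2)
    return avg2 / avg1


def p_value_for_change(logit1, logit2, var1, var2, corr) -> float:
    """ASSUMED normal-approximation p value for a change in log-odds.

    The variance of the difference of two correlated log-odds is
    var1 + var2 - 2 * corr * sqrt(var1 * var2).
    """
    var_diff = var1 + var2 - 2.0 * corr * math.sqrt(var1 * var2)
    z = (logit2 - logit1) / math.sqrt(var_diff)
    # Two-sided standard normal tail probability.
    return math.erfc(abs(z) / math.sqrt(2.0))


# CA substates, ages 12-17, copied from Table E.5.
ca_12_17_m1 = [0.889, 0.903, 0.284, 0.756]
ca_12_17_m2 = [0.861, 0.872, 0.162, 0.689]
ratio = ratio_of_averages(ca_12_17_m1, ca_12_17_m2)

# Hypothetical log-odds and variances; only the correlation differs.
p_low_corr = p_value_for_change(-2.5, -2.4, 0.01, 0.01, 0.3)
p_high_corr = p_value_for_change(-2.5, -2.4, 0.01, 0.01, 0.6)
```

With the variances held fixed, a larger positive correlation shrinks the variance of the difference and therefore the p value, which is consistent with p value(2) being smaller than p value(1) in every row of the table.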

Table E.6 Comparison Between the p Values Obtained from Method 1 and Method 2 for Past Year Use of Cocaine
State 12–17 18–25 26 or Older Total
p Value(1) p Value(2) Ratio p Value(1) p Value(2) Ratio p Value(1) p Value(2) Ratio p Value(1) p Value(2) Ratio
CA1 0.765 0.672   0.831 0.793   0.689 0.664   0.810 0.792  
CA2 0.946 0.931   0.375 0.263   0.936 0.931   0.671 0.645  
CA3 0.989 0.985   0.841 0.799   0.384 0.331   0.380 0.333  
CA4 0.945 0.924   0.575 0.505   0.962 0.957   0.745 0.714  
Average 0.911 0.878 0.96 0.656 0.590 0.90 0.743 0.721 0.97 0.652 0.621 0.95
FL1 0.783 0.713   0.872 0.837   0.979 0.976   0.984 0.982  
FL2 0.956 0.941   0.476 0.398   0.894 0.872   0.913 0.894  
FL3 0.500 0.406   0.856 0.792   0.947 0.936   0.944 0.930  
FL4 0.642 0.555   0.934 0.914   0.972 0.967   0.973 0.969  
Average 0.720 0.654 0.91 0.785 0.735 0.94 0.948 0.938 0.99 0.954 0.944 0.99
IL1 0.823 0.762   0.383 0.210   0.826 0.781   0.601 0.486  
IL2 0.782 0.739   0.698 0.658   0.829 0.803   0.756 0.725  
IL3 0.594 0.508   0.720 0.675   0.650 0.604   0.624 0.582  
IL4 0.643 0.532   0.336 0.203   0.683 0.662   0.445 0.382  
Average 0.711 0.635 0.89 0.534 0.437 0.82 0.747 0.713 0.95 0.607 0.544 0.90
MI1 0.989 0.987   0.367 0.283   0.853 0.839   0.504 0.474  
MI2 0.867 0.831   0.594 0.464   0.929 0.920   0.873 0.858  
MI3 0.985 0.979   0.983 0.978   0.737 0.707   0.775 0.740  
MI4 0.665 0.594   0.483 0.370   0.645 0.611   0.967 0.963  
Average 0.877 0.848 0.97 0.607 0.524 0.86 0.791 0.769 0.97 0.780 0.759 0.97
NY1 0.390 0.306   0.560 0.526   0.773 0.736   0.711 0.681  
NY2 0.939 0.917   0.441 0.357   0.757 0.722   0.545 0.478  
NY3 0.822 0.769   0.766 0.714   0.937 0.926   0.870 0.843  
NY4 0.670 0.538   0.527 0.428   0.933 0.925   0.886 0.867  
Average 0.705 0.633 0.90 0.574 0.506 0.88 0.850 0.827 0.97 0.753 0.717 0.95
OH1 0.962 0.953   0.512 0.411   0.737 0.700   0.557 0.500  
OH2 0.836 0.778   0.899 0.870   0.811 0.790   0.763 0.732  
OH3 0.943 0.927   0.653 0.562   0.829 0.806   0.989 0.987  
OH4 0.847 0.797   0.267 0.156   0.981 0.977   0.622 0.542  
Average 0.897 0.864 0.96 0.583 0.500 0.86 0.840 0.818 0.97 0.733 0.690 0.94
PA1 0.721 0.677   0.269 0.127   0.663 0.623   0.349 0.269  
PA2 0.618 0.564   0.402 0.252   0.758 0.701   0.555 0.453  
PA3 0.846 0.811   0.829 0.793   0.723 0.665   0.658 0.603  
PA4 0.936 0.909   0.699 0.585   0.795 0.770   0.698 0.653  
Average 0.780 0.740 0.95 0.550 0.439 0.80 0.735 0.690 0.94 0.565 0.495 0.88
TX1 0.750 0.671   0.668 0.532   0.900 0.893   0.683 0.636  
TX2 0.422 0.296   0.614 0.496   0.970 0.967   0.939 0.930  
TX3 0.654 0.563   0.885 0.873   0.879 0.869   0.973 0.971  
TX4 0.943 0.926   0.549 0.475   0.964 0.963   0.770 0.747  
Average 0.692 0.614 0.89 0.679 0.594 0.87 0.928 0.923 0.99 0.841 0.821 0.98
Average across substates 0.93     0.87     0.97     0.94
Note: p value(1) represents the Bayes significance level obtained from Method 1.
Note: p value(2) represents the Bayes significance level obtained from Method 2.
Note: In Method 1, an eight age-group model was fit, where age groups 1 to 4 correspond to the pooled 1999–2000 data and age groups 5 to 8 correspond to the pooled 2000–2001 data. The p value for this method was obtained by using the variance of the log-odds produced by fitting this model. In Method 2, a 12 age-group model was fit, where age groups 1 to 4 correspond to the 1999 data, age groups 5 to 8 to the 2000 data, and age groups 9 to 12 to the 2001 data. The p values for this method were obtained using the correlation produced by this model and the variances of the log-odds produced in Method 1.
Note: Ratio = Average p value(2) / Average p value(1).
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.

Table E.7 Comparison Between the p Values Obtained from Method 1 and Method 2 for Past Month Use of Alcohol
State 12–17 18–25 26 or Older Total
p Value(1) p Value(2) Ratio p Value(1) p Value(2) Ratio p Value(1) p Value(2) Ratio p Value(1) p Value(2) Ratio
CA1 0.317 0.175   0.801 0.707   0.239 0.103   0.238 0.096  
CA2 0.788 0.723   0.486 0.324   0.743 0.651   0.627 0.502  
CA3 0.819 0.758   0.150 0.049   0.267 0.129   0.157 0.055  
CA4 0.657 0.546   0.496 0.336   0.956 0.939   0.842 0.780  
Average 0.645 0.551 0.85 0.483 0.354 0.73 0.551 0.456 0.83 0.466 0.358 0.77
FL1 0.306 0.153   0.287 0.155   0.157 0.052   0.193 0.074  
FL2 0.386 0.239   0.806 0.734   0.826 0.761   0.819 0.748  
FL3 0.733 0.624   0.308 0.154   0.875 0.837   0.982 0.976  
FL4 0.872 0.829   0.603 0.465   0.987 0.982   0.943 0.923  
Average 0.574 0.461 0.80 0.501 0.377 0.75 0.711 0.658 0.93 0.734 0.680 0.93
IL1 0.883 0.847   0.819 0.744   0.668 0.526   0.654 0.507  
IL2 0.512 0.342   0.920 0.889   0.491 0.337   0.491 0.330  
IL3 0.863 0.819   0.711 0.593   0.689 0.602   0.747 0.668  
IL4 0.714 0.622   0.891 0.843   0.501 0.332   0.531 0.365  
Average 0.743 0.658 0.88 0.835 0.767 0.92 0.587 0.449 0.77 0.606 0.468 0.77
MI1 0.879 0.839   0.978 0.970   0.332 0.180   0.366 0.211  
MI2 0.655 0.549   0.383 0.210   0.121 0.034   0.096 0.020  
MI3 0.464 0.331   0.479 0.325   0.631 0.524   0.531 0.402  
MI4 0.716 0.622   0.097 0.016   0.273 0.130   0.179 0.059  
Average 0.679 0.585 0.86 0.484 0.380 0.79 0.339 0.217 0.64 0.293 0.173 0.59
NY1 0.853 0.780   0.567 0.427   0.896 0.859   0.956 0.939  
NY2 0.441 0.261   0.904 0.873   0.966 0.955   0.953 0.937  
NY3 0.763 0.693   0.950 0.932   0.483 0.326   0.458 0.297  
NY4 0.834 0.784   0.995 0.993   0.464 0.337   0.493 0.362  
Average 0.723 0.630 0.87 0.854 0.806 0.94 0.702 0.619 0.88 0.715 0.634 0.89
OH1 0.697 0.580   0.579 0.437   0.361 0.187   0.411 0.233  
OH2 0.438 0.337   0.327 0.156   0.359 0.219   0.271 0.133  
OH3 0.950 0.930   0.393 0.245   0.877 0.823   0.793 0.701  
OH4 0.831 0.753   0.128 0.037   0.853 0.789   0.684 0.558  
Average 0.729 0.650 0.89 0.357 0.219 0.61 0.613 0.505 0.82 0.540 0.406 0.75
PA1 0.358 0.203   0.420 0.244   0.896 0.863   0.769 0.694  
PA2 0.457 0.301   0.585 0.432   0.852 0.798   0.764 0.679  
PA3 0.317 0.173   0.855 0.803   0.340 0.168   0.305 0.138  
PA4 0.454 0.320   0.216 0.073   0.931 0.905   0.761 0.675  
Average 0.397 0.249 0.63 0.519 0.388 0.75 0.755 0.684 0.91 0.650 0.547 0.84
TX1 0.528 0.392   0.991 0.987   0.747 0.649   0.827 0.753  
TX2 0.970 0.958   0.757 0.662   0.419 0.253   0.381 0.206  
TX3 0.864 0.796   0.773 0.684   0.888 0.842   0.839 0.769  
TX4 0.667 0.539   0.704 0.568   0.181 0.037   0.171 0.031  
Average 0.757 0.671 0.89 0.806 0.725 0.90 0.559 0.445 0.80 0.555 0.440 0.79
Average across substates 0.84     0.80     0.82     0.79
Note: p value(1) represents the Bayes significance level obtained from Method 1.
Note: p value(2) represents the Bayes significance level obtained from Method 2.
Note: In Method 1, an eight age-group model was fit, where age groups 1 to 4 correspond to the pooled 1999–2000 data and age groups 5 to 8 correspond to the pooled 2000–2001 data. The p value for this method was obtained by using the variance of the log-odds produced by fitting this model. In Method 2, a 12 age-group model was fit, where age groups 1 to 4 correspond to the 1999 data, age groups 5 to 8 to the 2000 data, and age groups 9 to 12 to the 2001 data. The p values for this method were obtained using the correlation produced by this model and the variances of the log-odds produced in Method 1.
Note: Ratio = Average p value(2) / Average p value(1).
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.

Table E.8 Comparison Between the p Values Obtained from Method 1 and Method 2 for Past Month Use of Cigarettes
State 12–17 18–25 26 or Older Total
p Value(1) p Value(2) Ratio p Value(1) p Value(2) Ratio p Value(1) p Value(2) Ratio p Value(1) p Value(2) Ratio
CA1 0.648 0.554   0.810 0.742   0.498 0.321   0.482 0.305  
CA2 0.831 0.780   0.454 0.291   0.978 0.969   0.892 0.849  
CA3 0.965 0.954   0.383 0.247   0.809 0.733   0.917 0.883  
CA4 0.541 0.407   0.678 0.577   0.572 0.443   0.586 0.457  
Average 0.746 0.674 0.90 0.581 0.464 0.80 0.714 0.617 0.86 0.719 0.624 0.87
FL1 0.304 0.178   0.943 0.926   0.469 0.294   0.425 0.246  
FL2 0.344 0.199   0.771 0.687   0.823 0.770   0.849 0.801  
FL3 0.791 0.700   0.797 0.730   0.695 0.588   0.656 0.536  
FL4 0.393 0.261   0.113 0.029   0.977 0.966   0.861 0.797  
Average 0.458 0.335 0.73 0.656 0.593 0.90 0.741 0.655 0.88 0.698 0.595 0.85
IL1 0.535 0.423   0.326 0.172   0.532 0.377   0.467 0.300  
IL2 0.554 0.401   0.757 0.670   0.443 0.277   0.444 0.282  
IL3 0.566 0.445   0.758 0.659   0.479 0.332   0.575 0.439  
IL4 0.134 0.071   0.812 0.736   0.557 0.412   0.438 0.277  
Average 0.447 0.335 0.75 0.663 0.559 0.84 0.503 0.350 0.70 0.481 0.325 0.67
MI1 0.191 0.052   0.541 0.404   0.266 0.148   0.176 0.078  
MI2 0.893 0.856   0.942 0.919   0.414 0.295   0.435 0.314  
MI3 0.951 0.932   0.833 0.773   0.457 0.361   0.425 0.334  
MI4 0.690 0.601   0.257 0.115   0.606 0.495   0.508 0.380  
Average 0.681 0.610 0.90 0.643 0.553 0.86 0.436 0.325 0.75 0.386 0.277 0.72
NY1 0.153 0.040   0.647 0.546   0.373 0.249   0.283 0.164  
NY2 0.207 0.092   0.176 0.063   0.857 0.811   0.840 0.787  
NY3 0.472 0.373   0.451 0.311   0.247 0.106   0.346 0.193  
NY4 0.485 0.354   0.682 0.560   0.416 0.276   0.438 0.296  
Average 0.329 0.215 0.65 0.489 0.370 0.76 0.473 0.361 0.76 0.477 0.360 0.76
OH1 0.684 0.572   0.473 0.280   0.420 0.281   0.541 0.411  
OH2 0.953 0.941   0.928 0.899   0.639 0.537   0.633 0.529  
OH3 0.069 0.015   0.251 0.102   0.417 0.272   0.231 0.103  
OH4 0.617 0.500   0.488 0.317   0.889 0.853   0.942 0.923  
Average 0.581 0.507 0.87 0.535 0.400 0.75 0.591 0.486 0.82 0.587 0.492 0.84
PA1 0.574 0.463   0.827 0.758   0.312 0.181   0.306 0.176  
PA2 0.586 0.450   0.774 0.681   0.979 0.971   0.933 0.908  
PA3 0.247 0.129   0.438 0.261   0.677 0.563   0.668 0.552  
PA4 0.672 0.582   0.660 0.526   0.593 0.472   0.580 0.456  
Average 0.520 0.406 0.78 0.675 0.557 0.82 0.640 0.547 0.85 0.622 0.523 0.84
TX1 0.910 0.882   0.982 0.974   0.827 0.760   0.813 0.739  
TX2 0.989 0.986   0.496 0.327   0.749 0.657   0.658 0.545  
TX3 0.317 0.157   0.783 0.705   0.914 0.888   0.804 0.742  
TX4 0.717 0.591   0.727 0.612   0.949 0.928   0.992 0.988  
Average 0.733 0.654 0.89 0.747 0.655 0.88 0.860 0.808 0.94 0.817 0.754 0.92
Average across substates 0.81     0.83     0.82     0.81
Note: p value(1) represents the Bayes significance level obtained from Method 1.
Note: p value(2) represents the Bayes significance level obtained from Method 2.
Note: In Method 1, an eight age-group model was fit, where age groups 1 to 4 correspond to the pooled 1999–2000 data and age groups 5 to 8 correspond to the pooled 2000–2001 data. The p value for this method was obtained by using the variance of the log-odds produced by fitting this model. In Method 2, a 12 age-group model was fit, where age groups 1 to 4 correspond to the 1999 data, age groups 5 to 8 to the 2000 data, and age groups 9 to 12 to the 2001 data. The p values for this method were obtained using the correlation produced by this model and the variances of the log-odds produced in Method 1.
Note: Ratio = Average p value(2) / Average p value(1).
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.

Table E.9 Relative Absolute Bias for Change Between Pooled 1999–2000 Data and Pooled 2000–2001 Data for Past Month Marijuana Use
State Age 12–17 Age 18–25 Age 26+ Total
CA (design-based) 1.10 1.06 1.08 1.08
Average across 4 substates 1.06 1.06 1.01 1.03
Relative Absolute Bias 3.04 0.29 6.74 4.18
FL (design-based) 1.21 0.98 0.93 0.99
Average across 4 substates 1.10 1.02 0.99 1.02
Relative Absolute Bias 9.20 3.88 5.65 2.87
IL (design-based) 0.97 1.22 1.33 1.21
Average across 4 substates 1.01 1.16 1.13 1.12
Relative Absolute Bias 3.88 4.27 15.06 7.72
MI (design-based) 1.26 1.05 1.01 1.06
Average across 4 substates 1.14 1.05 1.05 1.06
Relative Absolute Bias 9.96 0.58 4.31 0.10
NY (design-based) 1.14 1.10 1.43 1.22
Average across 4 substates 1.04 1.11 1.07 1.07
Relative Absolute Bias 8.71 1.24 25.53 12.21
OH (design-based) 1.13 1.04 1.05 1.06
Average across 4 substates 1.06 1.05 1.10 1.07
Relative Absolute Bias 5.79 1.11 4.42 1.49
PA (design-based) 1.24 1.15 1.07 1.11
Average across 4 substates 1.12 1.08 1.07 1.08
Relative Absolute Bias 9.67 5.40 0.62 2.87
TX (design-based) 1.02 0.99 1.37 1.11
Average across 4 substates 1.08 1.01 1.18 1.08
Relative Absolute Bias 6.10 1.65 13.55 2.62
Average Relative Absolute Bias 7.04 2.30 9.48 4.26
Note: Relative absolute bias = 100*abs(Average model-based change over 4 substates - Large State design-based change) / Large State design-based change.
Note: The change measure is defined as the odds ratio {P2/(1-P2)}/{P1/(1-P1)}, where P1 is the pooled 1999–2000 small area estimate and P2 is the pooled 2000–2001 small area estimate.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.
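
The relative absolute bias formula in the note above can be written as a small Python function. This is a minimal sketch: the function name is hypothetical, and the substate values in the example are invented because the individual substate changes are not listed in the table.

```python
def relative_absolute_bias(substate_values, state_design_based: float) -> float:
    """100 * |average over substates - state design-based value| / design-based value."""
    average = sum(substate_values) / len(substate_values)
    return 100.0 * abs(average - state_design_based) / state_design_based


# Invented odds-ratio changes for four substates and a design-based
# State-level change of 1.00 (not survey values).
example_bias = relative_absolute_bias([1.0, 1.1, 1.2, 1.1], 1.0)
```

Recomputing a table entry from the rounded figures shown will generally not reproduce the printed bias exactly. For CA ages 12–17, for instance, the rounded values 1.06 and 1.10 give about 3.64 rather than the printed 3.04, presumably because the published figures were computed from unrounded estimates.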

Table E.10 Relative Absolute Bias for Change Between Pooled 1999–2000 Data and Pooled 2000–2001 Data for Past Year Use of Cocaine
State Age 12–17 Age 18–25 Age 26+ Total
CA (design-based) 0.98 1.12 1.31 1.21
Average across 4 substates 0.97 1.08 1.10 1.08
Relative Absolute Bias 0.51 3.92 16.27 10.46
FL (design-based) 0.84 0.79 0.79 0.80
Average across 4 substates 0.93 0.95 1.02 0.99
Relative Absolute Bias 11.54 19.50 29.31 24.10
IL (design-based) 0.71 1.29 1.43 1.33
Average across 4 substates 0.90 1.16 1.10 1.10
Relative Absolute Bias 26.64 10.71 23.10 17.29
MI (design-based) 1.26 0.94 0.65 0.83
Average across 4 substates 1.02 1.03 0.93 0.97
Relative Absolute Bias 18.44 10.07 43.20 17.23
NY (design-based) 0.82 1.27 1.23 1.21
Average across 4 substates 0.90 1.13 1.04 1.06
Relative Absolute Bias 9.71 10.81 15.46 12.36
OH (design-based) 1.10 0.85 0.92 0.90
Average across 4 substates 0.97 1.00 0.97 0.98
Relative Absolute Bias 11.51 17.03 5.74 9.10
PA (design-based) 1.14 1.27 1.17 1.20
Average across 4 substates 1.00 1.15 1.10 1.10
Relative Absolute Bias 12.02 10.08 6.23 8.12
TX (design-based) 0.87 0.98 1.11 1.00
Average across 4 substates 0.91 0.97 1.00 0.97
Relative Absolute Bias 4.14 0.93 9.99 3.00
Average Relative Absolute Bias 11.81 10.38 18.66 12.71
Note: Relative absolute bias = 100*abs(Average model-based change over 4 substates - Large State design-based change) / Large State design-based change.
Note: The change measure is defined as the odds ratio {P2/(1-P2)}/{P1/(1-P1)}, where P1 is the pooled 1999–2000 small area estimate and P2 is the pooled 2000–2001 small area estimate.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.

Table E.11 Relative Absolute Bias for Change Between Pooled 1999–2000 Data and Pooled 2000–2001 Data for Past Month Use of Alcohol
State Age 12–17 Age 18–25 Age 26+ Total
CA (design-based) 0.94 1.07 1.01 1.02
Average across 4 substates 0.96 1.08 1.08 1.07
Relative Absolute Bias 1.84 0.82 6.44 5.02
FL (design-based) 1.14 0.96 1.09 1.07
Average across 4 substates 1.09 0.95 1.05 1.04
Relative Absolute Bias 4.97 1.17 3.41 3.18
IL (design-based) 1.03 0.99 1.08 1.05
Average across 4 substates 1.04 1.00 1.02 1.02
Relative Absolute Bias 1.70 0.87 4.73 3.46
MI (design-based) 1.06 1.10 1.11 1.09
Average across 4 substates 1.05 1.10 1.11 1.10
Relative Absolute Bias 1.43 0.18 0.44 0.27
NY (design-based) 1.03 1.00 0.96 0.97
Average across 4 substates 1.04 1.02 1.00 1.00
Relative Absolute Bias 0.91 1.69 3.62 3.03
OH (design-based) 1.04 1.08 1.01 1.02
Average across 4 substates 1.04 1.08 1.06 1.06
Relative Absolute Bias 0.21 0.14 5.30 4.00
PA (design-based) 1.16 1.15 1.07 1.08
Average across 4 substates 1.12 1.08 1.04 1.04
Relative Absolute Bias 3.77 6.13 3.00 3.25
TX (design-based) 0.99 1.01 1.01 1.01
Average across 4 substates 1.00 1.03 1.06 1.05
Relative Absolute Bias 1.73 1.23 5.15 4.04
Average Relative Absolute Bias 2.07 1.53 4.01 3.28
Note: Relative absolute bias = 100*abs(Average model-based change over 4 substates - Large State design-based change) / Large State design-based change.
Note: The change measure is defined as the odds ratio {P2/(1-P2)}/{P1/(1-P1)}, where P1 is the pooled 1999–2000 small area estimate and P2 is the pooled 2000–2001 small area estimate.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.

Table E.12 Relative Absolute Bias for Change Between Pooled 1999–2000 Data and Pooled 2000–2001 Data for Past Month Use of Cigarettes
State Age 12–17 Age 18–25 Age 26+ Total
CA (design-based) 0.96 1.00 1.02 1.02
Average across 4 substates 1.00 0.98 0.97 0.97
Relative Absolute Bias 3.44 1.87 5.20 4.23
FL (design-based) 0.99 1.07 0.96 0.97
Average across 4 substates 0.96 1.03 0.96 0.97
Relative Absolute Bias 2.43 3.02 0.47 0.25
IL (design-based) 0.87 1.03 1.00 0.99
Average across 4 substates 0.89 1.02 1.00 1.00
Relative Absolute Bias 2.02 0.84 0.51 0.30
MI (design-based) 0.95 1.00 1.02 1.01
Average across 4 substates 0.94 1.01 0.99 0.99
Relative Absolute Bias 1.35 1.22 3.50 2.51
NY (design-based) 1.02 1.10 0.88 0.92
Average across 4 substates 1.01 1.07 0.91 0.94
Relative Absolute Bias 0.38 2.64 2.63 1.41
OH (design-based) 0.93 0.95 1.03 1.01
Average across 4 substates 0.91 0.97 1.01 1.00
Relative Absolute Bias 1.80 2.32 1.30 0.81
PA (design-based) 0.90 1.04 1.01 1.00
Average across 4 substates 0.91 1.03 1.00 1.00
Relative Absolute Bias 1.50 0.96 0.65 0.55
TX (design-based) 0.96 0.99 1.01 1.00
Average across 4 substates 0.96 0.98 0.99 0.99
Relative Absolute Bias 0.36 0.37 1.69 1.33
Average Relative Absolute Bias 1.66 1.65 1.99 1.42
Note: Relative absolute bias = 100*abs(Average model-based change over 4 substates - Large State design-based change) / Large State design-based change.
Note: The change measure is defined as the odds ratio {P2/(1-P2)}/{P1/(1-P1)}, where P1 is the pooled 1999–2000 small area estimate and P2 is the pooled 2000–2001 small area estimate.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.

Table E.13 Relative Absolute Bias for Past Month Use of Marijuana Based on Pooled 1999 and 2000 Data
State Age 12–17 Age 18–25 Age 26+ Total
CA (design-based) 7.60 13.94 4.16 5.86
Average across 4 substates 7.47 13.45 3.77 5.49
Relative Absolute Bias 1.75 3.54 9.28 6.35
FL (design-based) 6.33 13.31 3.39 4.73
Average across 4 substates 6.80 13.28 3.52 4.87
Relative Absolute Bias 7.39 0.19 3.77 3.02
IL (design-based) 8.57 14.31 2.51 4.70
Average across 4 substates 7.69 14.45 2.75 4.81
Relative Absolute Bias 10.24 1.01 9.66 2.44
MI (design-based) 7.77 16.64 3.53 5.68
Average across 4 substates 8.01 16.92 3.40 5.64
Relative Absolute Bias 3.08 1.69 3.71 0.68
NY (design-based) 6.32 16.77 2.02 4.26
Average across 4 substates 7.08 15.38 2.62 4.63
Relative Absolute Bias 12.08 8.26 29.53 8.69
OH (design-based) 6.07 14.31 2.49 4.40
Average across 4 substates 6.68 13.98 2.44 4.38
Relative Absolute Bias 10.03 2.31 2.17 0.50
PA (design-based) 5.83 14.16 2.79 4.42
Average across 4 substates 6.81 13.91 2.75 4.45
Relative Absolute Bias 16.90 1.75 1.63 0.71
TX (design-based) 6.00 10.41 1.34 3.22
Average across 4 substates 5.84 10.59 1.77 3.55
Relative Absolute Bias 2.65 1.79 32.35 10.19
Average Relative Absolute Bias 8.01 2.57 11.51 4.07
Note: Relative absolute bias = 100 × abs(Average small area estimate over 4 substates - Large State design-based estimate) / Large State design-based estimate.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999 and 2000.
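
The "Average Relative Absolute Bias" row appears to be the simple mean of the eight State-level biases in each column. The Python sketch below checks that reading against the printed values of Table E.13; the unweighted mean is an assumption inferred from the numbers, not a definition given in the source.

```python
# State-level relative absolute biases copied from Table E.13,
# in the order CA, FL, IL, MI, NY, OH, PA, TX.
bias_18_25 = [3.54, 0.19, 1.01, 1.69, 8.26, 2.31, 1.75, 1.79]
bias_total = [6.35, 3.02, 2.44, 0.68, 8.69, 0.50, 0.71, 10.19]


def average_bias(state_biases) -> float:
    """Unweighted mean of the State-level relative absolute biases (assumed)."""
    return sum(state_biases) / len(state_biases)


avg_18_25 = average_bias(bias_18_25)  # table prints 2.57
avg_total = average_bias(bias_total)  # table prints 4.07
```

Both recomputed means agree with the printed row to within rounding of the displayed values.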

Table E.14 Relative Absolute Bias for Past Year Use of Cocaine Based on Pooled 1999 and 2000 Data
State Age 12–17 Age 18–25 Age 26+ Total
CA (design-based) 2.05 4.79 1.29 1.85
Average across 4 substates 2.00 4.76 1.17 1.75
Relative Absolute Bias 2.19 0.65 9.24 5.36
FL (design-based) 1.52 5.96 1.18 1.73
Average across 4 substates 1.55 4.86 1.20 1.63
Relative Absolute Bias 2.18 18.43 1.60 5.78
IL (design-based) 0.96 3.93 1.11 1.47
Average across 4 substates 1.41 4.36 1.08 1.55
Relative Absolute Bias 47.60 10.97 2.88 5.43
MI (design-based) 1.02 5.04 0.86 1.42
Average across 4 substates 1.41 4.72 1.13 1.62
Relative Absolute Bias 38.23 6.34 30.39 14.07
NY (design-based) 1.18 3.87 1.01 1.37
Average across 4 substates 1.46 4.30 1.10 1.53
Relative Absolute Bias 23.10 11.10 9.67 11.32
OH (design-based) 0.78 4.98 0.92 1.43
Average across 4 substates 1.32 4.68 1.07 1.57
Relative Absolute Bias 69.04 6.15 17.00 9.45
PA (design-based) 1.18 4.39 1.00 1.41
Average across 4 substates 1.47 4.57 1.02 1.48
Relative Absolute Bias 25.11 4.01 1.71 4.45
TX (design-based) 2.66 6.10 0.83 1.82
Average across 4 substates 2.32 5.54 1.17 1.95
Relative Absolute Bias 12.90 9.07 41.21 7.17
Average Relative Absolute Bias 27.54 8.34 14.21 7.88
Note: Relative absolute bias = 100 × abs(Average small area estimate over 4 substates - Large State design-based estimate) / Large State design-based estimate.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999 and 2000.

Table E.15 Relative Absolute Bias for Past Month Binge Alcohol Use Based on Pooled 1999 and 2000 Data
State Age 12–17 Age 18–25 Age 26+ Total
CA (design-based) 9.12 32.46 18.58 19.42
Average across 4 substates 9.16 32.16 18.55 19.36
Relative Absolute Bias 0.40 0.93 0.18 0.32
FL (design-based) 7.93 35.02 17.72 18.67
Average across 4 substates 8.79 34.68 17.60 18.62
Relative Absolute Bias 10.94 0.97 0.68 0.28
IL (design-based) 11.53 41.83 21.43 23.13
Average across 4 substates 11.00 41.62 21.00 22.72
Relative Absolute Bias 4.60 0.50 2.00 1.77
MI (design-based) 10.88 42.23 19.08 21.23
Average across 4 substates 10.84 41.12 19.55 21.44
Relative Absolute Bias 0.38 2.64 2.47 1.00
NY (design-based) 10.14 39.47 18.61 20.33
Average across 4 substates 9.92 38.99 18.74 20.35
Relative Absolute Bias 2.25 1.22 0.72 0.11
OH (design-based) 9.97 41.73 20.32 22.04
Average across 4 substates 10.42 41.67 19.95 21.79
Relative Absolute Bias 4.48 0.15 1.84 1.13
PA (design-based) 9.30 42.13 20.55 21.97
Average across 4 substates 10.20 41.92 19.67 21.35
Relative Absolute Bias 9.67 0.50 4.26 2.84
TX (design-based) 11.07 35.62 20.08 21.31
Average across 4 substates 10.78 36.06 20.15 21.39
Relative Absolute Bias 2.59 1.24 0.35 0.39
Average Relative Absolute Bias 4.41 1.02 1.56 0.98
Note: Relative absolute bias = 100 × abs(Average small area estimate over 4 substates - Large State design-based estimate) / Large State design-based estimate.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999 and 2000.

Table E.16 Relative Absolute Bias for Past Month Use of Cigarettes Based on Pooled 1999 and 2000 Data
State Age 12–17 Age 18–25 Age 26+ Total
CA (design-based) 8.73 29.62 22.07 21.62
Average across 4 substates 9.16 30.65 21.53 21.41
Relative Absolute Bias 4.93 3.49 2.41 0.98
FL (design-based) 10.92 34.60 25.16 24.85
Average across 4 substates 11.59 35.52 24.92 24.82
Relative Absolute Bias 6.16 2.65 0.96 0.13
IL (design-based) 15.61 43.44 25.57 26.93
Average across 4 substates 15.16 42.31 25.29 26.51
Relative Absolute Bias 2.87 2.60 1.11 1.53
MI (design-based) 15.68 43.81 25.14 26.57
Average across 4 substates 15.91 42.82 26.38 27.42
Relative Absolute Bias 1.45 2.27 4.93 3.17
NY (design-based) 12.28 36.29 23.95 24.31
Average across 4 substates 12.19 36.30 24.08 24.40
Relative Absolute Bias 0.76 0.03 0.54 0.38
OH (design-based) 15.83 45.66 28.21 29.21
Average across 4 substates 16.06 44.67 27.70 28.71
Relative Absolute Bias 1.45 2.16 1.83 1.71
PA (design-based) 16.21 42.32 24.97 26.14
Average across 4 substates 16.36 41.74 25.32 26.37
Relative Absolute Bias 0.97 1.39 1.44 0.87
TX (design-based) 12.73 34.49 23.12 23.57
Average across 4 substates 12.39 35.11 23.37 23.80
Relative Absolute Bias 2.74 1.79 1.07 0.98
Average Relative Absolute Bias 2.67 2.05 1.79 1.22
Note: Relative absolute bias = 100 × abs(Average small area estimate over 4 substates - Large State design-based estimate) / Large State design-based estimate.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999 and 2000.
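
The relative absolute bias defined in the note above can be recomputed from the printed values. The following is a minimal sketch; the function name is illustrative and not part of the report. It uses the rounded CA figures for ages 12–17 from Table E.16, so a recomputed value can differ from the printed one in the last digit, presumably because the table was computed from unrounded estimates.

```python
def relative_absolute_bias(substate_average, design_based):
    """Return 100 * |substate average - design-based estimate| / design-based estimate."""
    return 100.0 * abs(substate_average - design_based) / design_based


# Rounded values printed in Table E.16 for CA, ages 12-17.
ca_design_based = 8.73      # large State design-based estimate
ca_substate_average = 9.16  # average small area estimate over 4 substates

rab = relative_absolute_bias(ca_substate_average, ca_design_based)
print(round(rab, 2))  # 4.93, matching the printed table entry
```

Applying the same function to the CA 18–25 column (30.65 against 29.62) gives about 3.48 rather than the printed 3.49, which illustrates the rounding caveat.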

Table E.17 Ratio of Average Widths for Pooled 1999 and 2000 Data
State Age 12–17 Age 18–25 Age 26+ Total
Past Month Use of Marijuana
CA 0.76 0.71 0.75 0.76
FL 0.72 0.76 0.77 0.81
IL 0.62 0.70 0.79 0.74
MI 0.72 0.81 0.73 0.80
NY 0.79 0.70 0.91 0.85
OH 0.67 0.64 0.62 0.67
PA 0.71 0.65 0.65 0.71
TX 0.72 0.72 0.67 0.75
Average 0.71 0.71 0.74 0.76
Past Year Use of Cocaine
CA 0.70 0.66 0.52 0.58
FL 0.53 0.60 0.60 0.64
IL 0.65 0.66 0.46 0.54
MI 0.54 0.58 0.59 0.65
NY 0.46 0.71 0.75 0.79
OH 0.60 0.62 0.61 0.68
PA 0.61 0.59 0.50 0.57
TX 0.62 0.65 0.72 0.71
Average 0.59 0.63 0.59 0.65
Past Month Binge Alcohol Use
CA 0.82 0.76 0.77 0.81
FL 0.71 0.63 0.72 0.73
IL 0.64 0.66 0.70 0.69
MI 0.69 0.75 0.71 0.71
NY 0.74 0.60 0.76 0.77
OH 0.85 0.60 0.75 0.72
PA 0.75 0.59 0.70 0.69
TX 0.79 0.71 0.70 0.72
Average 0.75 0.66 0.73 0.73
Past Month Use of Cigarettes
CA 0.82 0.84 0.65 0.66
FL 0.71 0.74 0.86 0.86
IL 0.67 0.83 0.69 0.69
MI 0.79 0.71 0.73 0.72
NY 0.64 0.76 0.82 0.82
OH 0.72 0.81 0.75 0.75
PA 0.72 0.69 0.81 0.78
TX 0.72 0.74 0.68 0.66
Average 0.72 0.77 0.75 0.74
Note: Ratio = Average width of model-based prediction intervals for substates / Average width of design-based confidence intervals for substates.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999 and 2000.
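
The ratio in the note above compares average interval widths across substates. The sketch below shows one way to compute it, assuming each interval is available as a (lower, upper) pair. The function names and the interval values are hypothetical and are not NHSDA data; the report does not publish the substate intervals in this table.

```python
def average_width(intervals):
    """Mean width of a list of (lower, upper) intervals."""
    return sum(upper - lower for lower, upper in intervals) / len(intervals)


def width_ratio(model_intervals, design_intervals):
    """Average model-based interval width divided by average design-based width."""
    return average_width(model_intervals) / average_width(design_intervals)


# Hypothetical intervals (in percent) for four substates of one State.
model_based = [(4.0, 7.0), (3.5, 6.5), (5.0, 8.0), (4.5, 7.5)]
design_based = [(3.0, 7.0), (3.0, 7.0), (4.0, 8.0), (4.0, 8.0)]

ratio = width_ratio(model_based, design_based)
print(ratio)  # 0.75
```

A ratio below 1, as in every cell of Table E.17, means the model-based prediction intervals are on average narrower than the design-based confidence intervals.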

Table E.18 1999 NHSDA Weighted Screening and Interview Response Rates, by State
State Screening Response Rate Interview Response Rate Overall Response Rate State Screening Response Rate Interview Response Rate Overall Response Rate
Total 89.63 68.55 61.44 Missouri 91.32 73.59 67.21
Alabama 92.60 71.36 66.08 Montana 92.76 76.39 70.86
Alaska 91.07 77.20 70.31 Nebraska 89.99 72.05 64.84
Arizona 94.43 65.87 62.21 Nevada 79.89 63.05 50.37
Arkansas 95.71 80.45 77.00 New Hampshire 85.36 69.87 59.65
California 87.47 64.12 56.08 New Jersey 89.65 65.24 58.48
Colorado 91.62 65.84 60.32 New Mexico 96.12 77.77 74.75
Connecticut 85.62 58.60 50.17 New York 84.28 59.98 50.55
Delaware 87.13 58.36 50.85 North Carolina 92.87 71.84 66.72
District of Columbia 93.35 79.93 74.61 North Dakota 89.89 77.48 69.65
Florida 89.94 68.20 61.33 Ohio 90.35 67.78 61.24
Georgia 90.47 66.97 60.59 Oklahoma 91.58 67.79 62.08
Hawaii 89.11 67.61 60.25 Oregon 85.20 71.57 60.98
Idaho 92.93 75.45 70.11 Pennsylvania 92.34 68.99 63.71
Illinois 87.35 63.74 55.68 Rhode Island 86.68 66.72 57.83
Indiana 91.68 73.06 66.98 South Carolina 91.96 65.92 60.61
Iowa 92.44 69.69 64.41 South Dakota 94.35 76.14 71.84
Kansas 90.59 72.89 66.03 Tennessee 90.92 67.70 61.56
Kentucky 92.36 73.75 68.12 Texas 92.57 75.12 69.54
Louisiana 94.81 76.97 72.98 Utah 93.16 81.70 76.11
Maine 89.96 75.18 67.63 Vermont 90.26 74.49 67.24
Maryland 87.78 64.66 56.76 Virginia 89.84 66.28 59.55
Massachusetts 80.59 61.82 49.82 Washington 86.49 75.06 64.92
Michigan 88.21 66.54 58.70 West Virginia 95.59 74.31 71.03
Minnesota 89.46 77.72 69.53 Wisconsin 90.19 73.05 65.89
Mississippi 94.51 82.77 78.23 Wyoming 93.79 72.62 68.11
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999.
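
The overall response rates printed in this table are consistent with the product of the screening and interview response rates. That relationship is inferred from the printed values rather than quoted from this appendix, so the sketch below is a consistency check under that assumption, with an illustrative function name.

```python
def overall_response_rate(screening_pct, interview_pct):
    """Overall rate (percent) as the product of two rates given in percent."""
    return screening_pct * interview_pct / 100.0


# Printed 1999 national figures from Table E.18.
total_1999 = overall_response_rate(89.63, 68.55)
print(round(total_1999, 2))  # 61.44, matching the printed overall rate
```

Because the inputs are rounded to two decimals, a recomputed rate for an individual State may differ from the printed value in the last digit.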

Table E.19 2000 NHSDA Weighted Screening and Interview Response Rates, by State
State Screening Response Rate Interview Response Rate Overall Response Rate State Screening Response Rate Interview Response Rate Overall Response Rate
Total 92.84 73.93 68.64 Missouri 92.25 70.80 65.31
Alabama 95.50 77.98 74.47 Montana 94.91 80.21 76.13
Alaska 95.43 80.24 76.58 Nebraska 93.13 74.58 69.46
Arizona 92.99 73.78 68.61 Nevada 92.08 74.44 68.54
Arkansas 97.19 81.00 78.73 New Hampshire 92.41 75.12 69.42
California 90.99 69.50 63.24 New Jersey 91.96 66.56 61.21
Colorado 94.84 75.26 71.37 New Mexico 97.43 80.80 78.72
Connecticut 89.83 71.36 64.10 New York 88.78 73.73 65.46
Delaware 92.91 68.25 63.42 North Carolina 94.51 73.19 69.17
District of Columbia 93.50 85.56 80.00 North Dakota 94.43 79.46 75.03
Florida 94.64 75.73 71.67 Ohio 94.89 75.79 71.92
Georgia 92.95 69.76 64.84 Oklahoma 93.06 74.85 69.66
Hawaii 91.95 78.45 72.14 Oregon 91.87 73.91 67.90
Idaho 93.94 74.45 69.94 Pennsylvania 94.37 73.50 69.36
Illinois 88.71 65.59 58.19 Rhode Island 91.26 74.11 67.63
Indiana 92.62 73.87 68.42 South Carolina 94.69 77.84 73.71
Iowa 94.78 80.00 75.83 South Dakota 95.15 76.67 72.95
Kansas 92.28 73.45 67.79 Tennessee 90.25 72.45 65.39
Kentucky 95.79 84.14 80.59 Texas 94.72 78.12 74.00
Louisiana 95.04 80.81 76.80 Utah 95.11 83.44 79.36
Maine 92.39 78.46 72.49 Vermont 92.62 80.80 74.83
Maryland 94.88 76.88 72.94 Virginia 91.44 75.18 68.75
Massachusetts 89.77 66.45 59.65 Washington 93.59 75.45 70.61
Michigan 93.19 73.18 68.20 West Virginia 95.19 78.17 74.41
Minnesota 94.66 80.62 76.32 Wisconsin 94.33 75.06 70.81
Mississippi 93.60 79.14 74.07 Wyoming 95.41 76.61 73.09
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 2000.

Table E.20 2001 NHSDA Weighted Screening and Interview Response Rates, by State
State Screening Response Rate Interview Response Rate Overall Response Rate State Screening Response Rate Interview Response Rate Overall Response Rate
Total 91.86 73.31 67.34 Missouri 93.12 78.34 72.95
Alabama 92.20 73.31 67.59 Montana 95.08 77.50 73.68
Alaska 96.03 79.62 76.46 Nebraska 94.04 76.47 71.91
Arizona 93.50 76.41 71.44 Nevada 95.32 75.37 71.84
Arkansas 96.70 75.36 72.88 New Hampshire 92.35 76.00 70.18
California 92.46 71.83 66.42 New Jersey 87.52 70.28 61.51
Colorado 94.78 70.64 66.95 New Mexico 97.07 80.81 78.45
Connecticut 92.16 69.79 64.32 New York 84.33 68.67 57.91
Delaware 92.03 69.07 63.57 North Carolina 92.76 72.11 66.89
District of Columbia 86.40 78.30 67.65 North Dakota 94.38 77.62 73.25
Florida 91.15 72.34 65.94 Ohio 93.46 76.51 71.51
Georgia 91.53 70.84 64.84 Oklahoma 93.07 74.69 69.51
Hawaii 91.13 68.17 62.12 Oregon 93.40 77.36 72.25
Idaho 93.83 76.75 72.01 Pennsylvania 93.65 74.97 70.21
Illinois 85.85 64.39 55.28 Rhode Island 90.97 69.70 63.41
Indiana 92.29 69.68 64.31 South Carolina 94.46 71.52 67.55
Iowa 94.00 77.52 72.87 South Dakota 94.13 80.36 75.64
Kansas 94.35 77.32 72.96 Tennessee 94.37 74.43 70.24
Kentucky 94.76 76.62 72.61 Texas 93.00 77.77 72.33
Louisiana 94.47 74.21 70.11 Utah 96.19 80.23 77.18
Maine 90.69 84.36 76.51 Vermont 93.00 80.29 74.67
Maryland 92.45 79.19 73.21 Virginia 91.50 75.20 68.81
Massachusetts 89.99 67.51 60.76 Washington 93.67 74.07 69.38
Michigan 91.28 73.71 67.28 West Virginia 94.34 70.06 66.10
Minnesota 93.10 79.88 74.36 Wisconsin 92.85 70.98 65.91
Mississippi 95.62 73.73 70.50 Wyoming 94.44 76.73 72.46
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 2001.

Table E.21 Total Number of Respondents in the Incentive Experiment, by State, for 2001
State $0 $20 $40 State $0 $20 $40
Total 4,233 2,489 2,878 Missouri 50 31 40
Alabama 79 45 53 Montana 65 38 69
Alaska 18 10 9 Nebraska 74 23 38
Arizona 63 41 22 Nevada 51 29 75
Arkansas 29 24 10 New Hampshire 91 67 44
California 144 94 93 New Jersey 86 29 30
Colorado 63 54 37 New Mexico 122 25 65
Connecticut 136 66 115 New York 336 209 224
Delaware 120 62 60 North Carolina 26 21 9
District of Columbia 80 54 35 North Dakota 22 17 11
Florida 216 93 142 Ohio 208 106 176
Georgia 28 8 17 Oklahoma 74 58 50
Hawaii 5 11 1 Oregon 68 46 68
Idaho 39 28 23 Pennsylvania 196 103 119
Illinois 313 209 233 Rhode Island 80 48 35
Indiana 7 8 17 South Carolina 71 58 48
Iowa 49 31 29 South Dakota 35 31 41
Kansas 76 42 77 Tennessee 35 36 74
Kentucky 43 25 32 Texas 203 133 90
Louisiana 49 20 17 Utah 80 40 54
Maine 103 42 41 Vermont 21 10 10
Maryland 19 8 15 Virginia 0 0 0
Massachusetts 96 50 55 Washington 75 65 66
Michigan 187 109 157 West Virginia 49 28 39
Minnesota 53 36 24 Wisconsin 0 0 0
Mississippi 43 21 29 Wyoming 57 47 60
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 2001.

Table E.22 Total Number of Respondents, by State, for 1999, 2000, and 2001
State 1999 2000 2001 State 1999 2000 2001
Total 66,706 71,764 68,929 Missouri 903 893 882
Alabama 826 936 885 Montana 899 914 896
Alaska 879 833 951 Nebraska 847 906 920
Arizona 824 927 964 Nevada 756 925 944
Arkansas 926 960 911 New Hampshire 791 883 913
California 4,681 5,022 3,729 New Jersey 933 1,200 1,069
Colorado 865 911 886 New Mexico 830 874 872
Connecticut 768 891 1,055 New York 2,669 3,589 4,023
Delaware 883 928 893 North Carolina 1,167 1,043 852
District of Columbia 776 918 877 North Dakota 951 896 883
Florida 3,096 3,478 3,502 Ohio 3,234 3,678 3,706
Georgia 1,164 1,145 940 Oklahoma 858 973 862
Hawaii 895 945 887 Oregon 915 864 880
Idaho 943 894 936 Pennsylvania 3,460 3,997 3,734
Illinois 3,201 3,660 3,558 Rhode Island 789 950 895
Indiana 1,044 1,061 915 South Carolina 832 855 891
Iowa 907 921 961 South Dakota 936 855 931
Kansas 886 897 922 Tennessee 938 947 921
Kentucky 969 1,018 911 Texas 3,951 4,020 3,604
Louisiana 934 939 909 Utah 1,280 1,031 895
Maine 856 901 896 Vermont 802 981 926
Maryland 887 967 961 Virginia 946 1,047 929
Massachusetts 762 1,002 933 Washington 1,070 1,006 911
Michigan 3,109 3,576 3,768 West Virginia 910 950 876
Minnesota 1,019 893 883 Wisconsin 1,066 1,119 883
Mississippi 955 917 885 Wyoming 918 828 913
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999, 2000, and 2001.

1 The panel included William Bell of the U.S. Bureau of the Census; Partha Lahiri of the Joint Program in Survey Methodology, who was also Interim Director of the University of Maryland Statistics Consortium; Balgobin Nandram of Worcester Polytechnic Institute; Wesley Schaible, formerly Associate Commissioner for Research and Evaluation at the Bureau of Labor Statistics; J.N.K. Rao of Carleton University; and Alan Zaslavsky of Harvard University. Other attendees involved in the development or discussion were Ralph Folsom, Judith Lessler, Avinash Singh, and Akhil Vaish of RTI and Joe Gfroerer and Doug Wright of SAMHSA.

This page was last updated on June 03, 2008.