Methodology Used to Estimate the Bayes
Significance Level for the Null Hypothesis of No Change between
the 2022‑2023 and 2023‑2024 State Population Percentages

This methodology document is a companion piece to the National Surveys on Drug Use and Health: Comparison of the 2022‑2023 and 2023‑2024 Population Percentages (50 States and the District of Columbia) tables, which can be found on the NSDUH State Releases web page. The tables present the 2022‑2023 and 2023‑2024 National Surveys on Drug Use and Health (NSDUHs) state estimates together with a p value indicating the statistical significance of the difference (change) between them. They were produced for outcomes that were consistently defined across the two time periods and for which both 2022‑2023 and 2023‑2024 state‐level small area estimates were available (see endnote 1). The moving average state estimates for the overlapping 2022‑2023 and 2023‑2024 time periods were obtained from independent applications of the survey‐weighted hierarchical Bayes (SWHB) methodology; that is, the 2023‑2024 models were fit independently of the previously fitted 2022‑2023 models. This independent analysis approach was followed because there was no desire to revise the previously published 2022‑2023 estimates. The methodology used to test the statistical significance of differences between the 2022‑2023 and 2023‑2024 state population percentages is described here.

Let $\pi_{1sa}$ and $\pi_{2sa}$ denote the 2022‑2023 and 2023‑2024 population percentages, respectively, for state $s$ and age group $a$. The difference between $\pi_{1sa}$ and $\pi_{2sa}$ is defined in terms of the log‐odds ratio ($\mathrm{lor}_{sa}$) rather than the simple difference between the 2022‑2023 and 2023‑2024 prevalence rates ($\pi_{2sa} - \pi_{1sa}$) because the posterior distribution of $\mathrm{lor}_{sa}$ is closer to Gaussian than the posterior distribution of the simple difference. With $\ln$ denoting the natural logarithm, $\mathrm{lor}_{sa}$ is defined as follows:

$$\mathrm{lor}_{sa} = \ln\left[\frac{\pi_{2sa}/(1-\pi_{2sa})}{\pi_{1sa}/(1-\pi_{1sa})}\right]. \tag{1}$$

The p value given in the above‐referenced tables is computed to test the null hypothesis of no difference (i.e., $\pi_{2sa} = \pi_{1sa}$ or, equivalently, $\mathrm{lor}_{sa} = 0$). An estimate of $\mathrm{lor}_{sa}$ is given by

$$\widehat{\mathrm{lor}}_{sa} = \ln\left[\frac{\hat{\pi}_{2sa}/(1-\hat{\pi}_{2sa})}{\hat{\pi}_{1sa}/(1-\hat{\pi}_{1sa})}\right], \tag{2}$$

where $\hat{\pi}_{1sa}$ and $\hat{\pi}_{2sa}$ are small area estimates of $\pi_{1sa}$ and $\pi_{2sa}$, respectively.
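As an illustration only, the estimate defined above can be computed as the difference of two logits. The sketch below assumes the estimates are supplied as proportions strictly between 0 and 1 (a percentage of 10.0 would be entered as 0.10); the function names and the input values are hypothetical and are not taken from the NSDUH tables.

```python
import math

def logit(p: float) -> float:
    """Natural log of the odds p / (1 - p), for a proportion strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))

def log_odds_ratio(p1: float, p2: float) -> float:
    """Estimated log-odds ratio: ln of (period 2 odds) divided by (period 1 odds)."""
    return logit(p2) - logit(p1)

# Hypothetical small area estimates for one state and age group.
lor_hat = log_odds_ratio(0.10, 0.12)
print(lor_hat)
```

A positive value indicates a higher estimate in the later period, and a value of zero corresponds to no change.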

Let $\hat{\theta}_1 = \hat{\pi}_{1sa}/(1-\hat{\pi}_{1sa})$ and $\hat{\theta}_2 = \hat{\pi}_{2sa}/(1-\hat{\pi}_{2sa})$, where the subscript $sa$ has been dropped from $\hat{\theta}_1$ and $\hat{\theta}_2$ to simplify the notation. An estimate of the posterior variance of $\mathrm{lor}_{sa}$ is given by the following formula:

$$v(\widehat{\mathrm{lor}}_{sa}) = v[\ln\hat{\theta}_1] + v[\ln\hat{\theta}_2] - 2\,\mathrm{cov}[\ln\hat{\theta}_1, \ln\hat{\theta}_2], \tag{3}$$

where $\mathrm{cov}[\ln\hat{\theta}_1, \ln\hat{\theta}_2]$ denotes the covariance between $\ln\hat{\theta}_1$ and $\ln\hat{\theta}_2$. This covariance is defined in terms of the associated correlation as follows:

$$\mathrm{cov}[\ln\hat{\theta}_1, \ln\hat{\theta}_2] = \mathrm{corr}[\ln\hat{\theta}_1, \ln\hat{\theta}_2]\,\sqrt{v[\ln\hat{\theta}_1]\,v[\ln\hat{\theta}_2]}. \tag{4}$$

Note that $v[\ln\hat{\theta}_1]$ and $v[\ln\hat{\theta}_2]$, used here to calculate $v(\widehat{\mathrm{lor}}_{sa})$, are the same posterior variances used in calculating the 2022‑2023 and 2023‑2024 Bayesian confidence intervals, respectively.
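The variance calculation above can be sketched in a few lines. This is a minimal illustration, assuming the two posterior variances (on the log-odds scale) and the correlation are already available; the function name and the numeric inputs are hypothetical.

```python
import math

def lor_variance(var1: float, var2: float, rho: float) -> float:
    """Variance of the estimated log-odds ratio.

    var1, var2: posterior variances of ln(theta1_hat) and ln(theta2_hat).
    rho: correlation between ln(theta1_hat) and ln(theta2_hat).
    """
    if var1 < 0.0 or var2 < 0.0:
        raise ValueError("variances must be nonnegative")
    if not -1.0 <= rho <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    # Covariance expressed through the correlation.
    cov = rho * math.sqrt(var1 * var2)
    # Sum of the two variances minus twice the covariance.
    return var1 + var2 - 2.0 * cov

# Hypothetical inputs for one state and age group.
print(lor_variance(0.010, 0.012, 0.5))
```

With a correlation of zero the result is simply the sum of the two variances; a positive correlation reduces it.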

The correlation between $\ln\hat{\theta}_1$ and $\ln\hat{\theta}_2$ was obtained by simultaneously modeling the 2022, 2023, and 2024 NSDUH data. This simultaneous modeling approach was adopted based on the results of the validation study (see endnote 2) conducted for measuring change in the 1999‑2000 and 2000‑2001 state population percentages. For this simultaneous model, 12 subpopulation‐specific models were fitted, one for each combination of four age groups (12 to 17, 18 to 25, 26 to 34, and 35 or older) and 3 years (2022, 2023, and 2024), each with its own set of fixed and random effects. These models used the same predictors (fixed effects) employed in the 2023‑2024 small area estimation models for all 3 years. The general covariance matrices for the state and within‐state random effects were 12 × 12 matrices corresponding to the 12‐element (age group × year) vectors of random effects. Note that the survey‐weighted, Bernoulli‐type log likelihood employed in the SWHB methodology was appropriate for this simultaneous model because the 12 (age group × year) subpopulations were nonoverlapping. The correlation between $\ln\hat{\theta}_1$ and $\ln\hat{\theta}_2$ was approximated by the correlation calculated using the posterior distributions of $\ln[\pi_{1sa}/(1-\pi_{1sa})]$ and $\ln[\pi_{2sa}/(1-\pi_{2sa})]$ from the simultaneous model.
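The text does not specify how the correlation was calculated from the posterior distributions. One plausible reading, offered here only as an assumption, is a Pearson correlation between paired posterior draws of the two log-odds. The sketch below follows that reading; the simulated draws are stand-ins generated for illustration and are not output from the SWHB models, and all function names are hypothetical.

```python
import math
import random

def logit(p):
    """Natural log of the odds for a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def pearson_correlation(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def logit_correlation(draws1, draws2):
    """Correlation between the log-odds of paired draws of two proportions."""
    return pearson_correlation([logit(p) for p in draws1],
                               [logit(p) for p in draws2])

def simulate_paired_draws(n, seed=0):
    """Stand-in for paired posterior draws (assumption: not real model output).

    A shared normal component on the log-odds scale induces positive correlation.
    """
    rng = random.Random(seed)
    draws1, draws2 = [], []
    for _ in range(n):
        shared = rng.gauss(0.0, 0.1)
        eta1 = -2.2 + shared + rng.gauss(0.0, 0.1)
        eta2 = -2.0 + shared + rng.gauss(0.0, 0.1)
        draws1.append(1.0 / (1.0 + math.exp(-eta1)))
        draws2.append(1.0 / (1.0 + math.exp(-eta2)))
    return draws1, draws2

draws1, draws2 = simulate_paired_draws(2000)
rho = logit_correlation(draws1, draws2)
print(rho)
```

The correlation is computed on the log-odds scale rather than on the proportions themselves because that is the scale on which the variances and covariance in the test are defined.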

Note that for four outcomes (see endnote 3), the above‐mentioned model did not converge. For these outcomes, the correlations between the 2022‑2023 and 2023‑2024 state estimates were obtained from a different model that simultaneously models the 2022‑2023 and 2023‑2024 data, so that the 2023 data are included in both time periods. This overlapping year model simultaneously fits eight subpopulation‐specific models (i.e., four age groups × two overlapping time points [2022‑2023 and 2023‑2024]) instead of 12 subpopulation‐specific models. Previous validation studies have shown that this model underestimates the correlations (see endnote 4), which makes the tests more conservative; that is, fewer significant differences may have been detected for these outcomes.
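To see why an underestimated correlation makes the test more conservative, the following sketch holds the estimate and the two variances fixed and lowers the correlation. It assumes a positive correlation between the two periods; all numbers and names are hypothetical and chosen only for illustration.

```python
import math

def lor_std_dev(var1, var2, rho):
    """Posterior standard deviation of the estimated log-odds ratio."""
    cov = rho * math.sqrt(var1 * var2)
    return math.sqrt(var1 + var2 - 2.0 * cov)

# Hypothetical inputs: same estimate and variances, decreasing correlation.
LOR_HAT = 0.20
VAR1 = VAR2 = 0.010

for rho in (0.6, 0.4, 0.2):
    sd = lor_std_dev(VAR1, VAR2, rho)
    # A smaller correlation gives a larger standard deviation, so the
    # estimate is a smaller multiple of its standard deviation.
    print(f"rho={rho:.1f}  sd={sd:.4f}  estimate/sd={LOR_HAT / sd:.2f}")
```

As the correlation falls, the standard deviation grows and the ratio of the estimate to its standard deviation shrinks, so a given change is less likely to be declared significant.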

To calculate the p value for testing the null hypothesis of no difference ($\mathrm{lor}_{sa} = 0$), it is assumed that the posterior distribution of $\mathrm{lor}_{sa}$ is normal with estimated mean $\widehat{\mathrm{lor}}_{sa}$ and variance $v(\widehat{\mathrm{lor}}_{sa})$. The Bayesian p value or significance level for the null hypothesis of no difference, $\mathrm{lor}_{sa} = 0$, is $p\ \text{value} = 2\,P[Z \ge |z|]$, where $Z$ is a standard normal random variate, $z = \widehat{\mathrm{lor}}_{sa} / \sqrt{v(\widehat{\mathrm{lor}}_{sa})}$, and $|z|$ denotes the absolute value of $z$. This Bayesian significance level (or p value) for the null value of $\mathrm{lor}_{sa}$, say $\mathrm{lor}_0$, is defined following Rubin (1987; see endnote 5) as the posterior probability for the collection of the $\mathrm{lor}_{sa}$ values that are less likely or have smaller posterior density, $d(\mathrm{lor}_{sa})$, than the null (no change) value, $\mathrm{lor}_0$. That is,

$$p\ \text{value}(\mathrm{lor}_0) = \Pr\left[d(\mathrm{lor}_{sa}) \le d(\mathrm{lor}_0)\right]. \tag{5}$$

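When the posterior distribution of $\mathrm{lor}_{sa}$ is approximately normal, its density is symmetric about $\widehat{\mathrm{lor}}_{sa}$ and decreases with distance from it, so the values with posterior density no greater than $d(\mathrm{lor}_0)$ are those at least as far from $\widehat{\mathrm{lor}}_{sa}$ as $\mathrm{lor}_0$ is. For $\mathrm{lor}_0 = 0$, the p value of $\mathrm{lor}_0$ is therefore the two‐sided normal tail probability $2\,P[Z \ge |z|]$ given above.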
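Under the normal approximation, the p value can be computed directly from the estimated log-odds ratio and its variance. The sketch below is one way to do so using only the Python standard library; the function name and the inputs are hypothetical. It relies on the identity that, for a standard normal variate, twice the upper tail probability at |z| equals the complementary error function evaluated at |z| divided by the square root of 2.

```python
import math

def bayes_p_value(lor_hat: float, variance: float) -> float:
    """Two-sided p value for the null hypothesis that the log-odds ratio is zero.

    lor_hat: estimated log-odds ratio.
    variance: estimated posterior variance of the log-odds ratio (must be positive).
    """
    if variance <= 0.0:
        raise ValueError("variance must be positive")
    z = lor_hat / math.sqrt(variance)
    # For Z ~ N(0, 1): 2 * P(Z >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical inputs for one state and age group.
print(bayes_p_value(0.20, 0.011))
```

The result is the same for an estimate of a given size in either direction, which reflects the two-sided nature of the test.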

Endnotes

1 For details, see Section B in 2023‑2024 National Surveys on Drug Use and Health: Guide to State Tables and Summary of Small Area Estimation Methodology on the NSDUH State Releases web page.

2 See Appendix E, Section E.2, of the following report: Wright, D. (2003). State estimates of substance use from the 2001 National Household Survey on Drug Abuse: Volume II. Individual state tables and technical appendices (HHS Publication No. SMA 03‑3826, NHSDA Series H‑20). Substance Abuse and Mental Health Services Administration, Office of Applied Studies.

3 The outcomes were cocaine use in the past year (Table 6), heroin use in the past year (Table 8), hallucinogen use in the past year (Table 10), and methamphetamine use in the past year (Table 11).

4 See Appendix E, Section E.1, of the following report: Wright, D. (2003). State estimates of substance use from the 2001 National Household Survey on Drug Abuse: Volume II. Individual state tables and technical appendices (HHS Publication No. SMA 03‑3826, NHSDA Series H‑20). Substance Abuse and Mental Health Services Administration, Office of Applied Studies.

5 See the following reference: Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys (Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics). John Wiley & Sons.

Long Descriptions for Equations

Long description, Equation 1: The log‐odds ratio, lor sub s and a, is defined as the natural logarithm of the ratio of two quantities. The numerator of the ratio is pi 2 sub s and a divided by 1 minus pi 2 sub s and a. The denominator of the ratio is pi 1 sub s and a divided by 1 minus pi 1 sub s and a.

Long description, Equation 2: The estimate of the log‐odds ratio, lor hat sub s and a, is defined as the natural logarithm of the ratio of two quantities. The numerator of the ratio is pi hat 2 sub s and a divided by 1 minus pi hat 2 sub s and a. The denominator of the ratio is pi hat 1 sub s and a divided by 1 minus pi hat 1 sub s and a, where pi hat 1 sub s and a represents the 2022‑2023 state estimates and pi hat 2 sub s and a represents the 2023‑2024 state estimates.

Long description, Equation 3: Variance v of the estimate of the log‐odds ratio, lor hat sub s and a, is a function of three quantities: q1, q2, and q3. It is expressed as the sum of q1 and q2 minus q3. Quantity q1 is the variance v of the natural logarithm of Theta 1 hat, quantity q2 is the variance v of the natural logarithm of Theta 2 hat, and quantity q3 is 2 times the covariance between the natural logarithm of Theta 1 hat and the natural logarithm of Theta 2 hat.

Long description, Equation 4: The covariance between the natural logarithm of Theta 1 hat and the natural logarithm of Theta 2 hat is equal to the correlation between the natural logarithm of Theta 1 hat and the natural logarithm of Theta 2 hat multiplied by the square root of the product of the variance v of the natural logarithm of Theta 1 hat and variance v of the natural logarithm of Theta 2 hat.

Long description, Equation 5: The p value of log‐odds ratio lor sub zero is equal to the posterior probability that d of the log‐odds ratio lor sub s and a is less than or equal to d of the log‐odds ratio lor sub zero.