
Risk and Protective Factors for Adolescent Drug Use:
Findings from the 1999 National Household Survey on Drug Abuse

Chapter 4. Prediction of Past Year Substance Use Using Multiple Regression Models

4.1 Introduction

Earlier chapters, using descriptive statistics and simple odds ratios (ORs), presented the prevalence of risk and protective factors and the associations of those factors with past year marijuana use. This chapter presents the strength of the relationship between risk and protective factors and marijuana use using multiple logistic regression models, in which the associations with past year marijuana use are adjusted for both demographic variables and other risk and protective factors included in the models. This chapter addresses the following issues:

The word "prediction" is used not to imply that events have occurred in a certain sequence, but to describe a statistical question: "How well does statistical information about one characteristic improve one's ability to guess what happened to a different characteristic?" For example, if knowing the employment status of each person in a group would improve how well early initiation of marijuana use could be estimated, employment status would be called a "predictor," without necessarily meaning that employment status came first. Moreover, there are statistical methods for determining just how strong a predictor employment status may prove to be in any given group of people. When a number of predictors are used together in a statistical analysis of this kind, the combination of predictors is referred to as a "prediction model."

Because of the complex survey design of the National Household Survey on Drug Abuse (NHSDA), the regression analyses were performed using the LOGISTIC procedure in SUrvey DAta ANalysis (SUDAAN), a statistical program employing variance estimation calculations that take into account this complexity (Shah, Barnwell, & Bieler, 1998). Note that the initial analyses use simple individual (person-level) logistic regression models that adjust for the effects of clustering on the estimates but otherwise ignore the true hierarchical structure of the data, namely, the fact that youths aged 12 to 17 are nested within families that are, in turn, nested in neighborhoods. Therefore, these analyses treat variables at the higher levels of hierarchy as being individual (youth) variables.17 Analyses presented later in the chapter address the hierarchical structure of the data and the utility of including this structure in prediction models.

Multiple logistic regression determines the importance of individual predictor variables by testing whether these factors account for a statistically significant amount of variation in the dependent variable after controlling for other predictor variables included in the model. Multiple logistic regression can also determine the relative importance of groups of variables by measuring how much additional variation in the dependent variable one group of predictor variables can explain beyond another group of variables. The lack of statistical significance of a predictor variable does not imply that the variable is unimportant in the epidemiology of substance use. For example, the variable may have a significant indirect relationship to the dependent variable through another independent variable in a path analysis. Other analysis techniques, such as structural equation modeling, may be more appropriate for analyzing those relationships.

First, results are presented for individual-level models predicting past year use of marijuana. This involves a comparison of the explained variation of each of the four domains as well as a "full model" that contains a set of demographic variables and factors from all four domains. Second, results are presented for individual-level models predicting past year use of cigarettes and alcohol. Third, simple hierarchical models are used to highlight the difference between hierarchical models and ordinary least squares models.

4.2 Past Year Use of Marijuana

4.2.1 Comparisons Between Domains

In this section, three separate multiple regression models of past year marijuana use are presented for each of the four domains discussed in Chapter 1.18 The first regression model (Model 1) includes only a set of demographic variables: race/ethnicity, gender, age, number of parents in the home, household income, geographic region, and county type. The second model (Model 2) includes all the risk and protective factors that comprise the domain. The third model (Model 3) includes both the set of demographic variables as well as the risk and protective factors that comprise the domain. Comparisons of Model 2 with Model 1 assess whether the set of factors that make up each domain are more or less predictive of past year marijuana use than the set of demographic variables. Comparisons of Model 3 with Model 2 assess the extent to which the addition of the set of demographic factors improves the predictiveness of the set of risk and protective factors that comprise the domain.

The results of these models are presented in Tables 4.1 through 4.4, with each domain presented in a separate table. For each model, these tables present the regression coefficient (or Beta) and OR for each predictor, a significance test for each predictor, and two measures that summarize the explanatory power for the model as a whole. Both the regression coefficient and the OR describe the strength and direction of the relationship between a predictor and past year marijuana use, but the OR is easier to interpret. For example, in Table 4.1, the OR for gender indicates that the odds of past year marijuana use were 1.18 times as high for males as for females, after controlling for other demographic variables.19 The p value for this OR is less than 0.05, indicating that gender is a significant variable in Model 1 after controlling for the other demographic variables. With the exception of the comparison between Hispanic youths and white youths, there were significant associations between each demographic variable and past year marijuana use in Model 1.
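The link between the regression coefficient and the OR can be made concrete: in a logistic regression, the OR is the exponential of the coefficient. The following is a minimal sketch and is not taken from the report. The function names are illustrative, and the coefficient is back-calculated from the reported OR of 1.18 rather than read from Table 4.1.

```python
import math

def odds_ratio(beta):
    """Convert a logistic regression coefficient (log odds) to an odds ratio."""
    return math.exp(beta)

def coefficient(odds_ratio_value):
    """Convert an odds ratio back to a logistic regression coefficient."""
    return math.log(odds_ratio_value)

# Back-calculated from the OR of 1.18 reported for gender; illustrative only.
beta_gender = coefficient(1.18)
print(round(beta_gender, 3))              # coefficient on the log-odds scale
print(round(odds_ratio(beta_gender), 2))  # recovers the odds ratio
```

An OR above 1 corresponds to a positive coefficient and an OR below 1 to a negative coefficient, which is why the two measures always agree on the direction of an association.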

The summary measures in Table 4.1 indicate that the set of community domain factors (Model 2) accounted for significantly more variance (R2 = 0.17; RN2 = 0.31) than the demographic variables in Model 1 (R2 = 0.09; RN2 = 0.15).20 The addition of the demographic variables to the model with the community domain factors (Model 3) resulted in only a slight improvement in explanatory power (R2 = 0.19; RN2 = 0.34) compared with Model 2. The results were similar for the peer/individual domain (Table 4.3) and the school domain (Table 4.4). In both of these domains, the risk and protective factors that comprised the domains (Model 2) accounted for significantly more variance than the demographic variables (Model 1), and the addition of the demographic variables to the factors in these domains (Model 3) did little to improve the model. In the family domain (Table 4.2), the set of risk and protective factors accounted for a similar amount of variance (R2 = 0.10; RN2 = 0.17) compared with the set of demographics. In addition, the model that included the set of family risk and protective factors and the set of demographic variables accounted for significantly more variation (R2 = 0.15; RN2 = 0.25) than the model that included only the set of family domain factors.

Of the four domains, the factors in the peer/individual domain accounted for the most variation in past year marijuana use by youths (R2 = 0.30; RN2 = 0.53). Following this were the community domain (R2 = 0.19; RN2 = 0.34) and the school domain (R2 = 0.18; RN2 = 0.32). The family domain accounted for the least amount of variation in past year marijuana use (R2 = 0.15; RN2 = 0.25). However, it should be noted that these estimates of relative contribution are based only on the items used to measure these constructs and the methodology of the 1999 NHSDA. It could be that other measures of these constructs, or other research methodologies, would result in different relative contributions for these domains.
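The two summary measures can be illustrated with a short calculation. The sketch below assumes that the unadjusted R2 is of the Cox and Snell type and that RN2 is the Nagelkerke rescaling of it; this section does not confirm the first assumption. The log-likelihoods and sample size are hypothetical values, not NHSDA results, and the calculation ignores the survey weights and design.

```python
import math

def cox_snell_r2(ll_null, ll_model, n):
    """Cox and Snell R2 from the null and fitted model log-likelihoods."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)

def nagelkerke_r2(ll_null, ll_model, n):
    """Rescale the Cox and Snell R2 so that its maximum possible value is 1."""
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_r2

# Hypothetical log-likelihoods for 1,000 respondents (not NHSDA values).
n = 1000
ll_null = -400.0
ll_model = -300.0
print(round(cox_snell_r2(ll_null, ll_model, n), 2))
print(round(nagelkerke_r2(ll_null, ll_model, n), 2))
```

Because the rescaling divides by a maximum that is less than 1 for a dichotomous outcome, the Nagelkerke value is always the larger of the two, which matches the pattern of the R2 and RN2 pairs reported in this chapter.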

4.2.2 Full Model, Across Domains

In this section, tables are presented in which certain risk and protective factors from all four domains were combined into a single model. Table 4.5 presents a "combined reduced" model that includes the set of demographic variables as well as all of the risk and protective factors that were significant predictors of past year marijuana use in Model 3 of Tables 4.1 through 4.4. Collectively, this set of variables accounted for more variation in past year marijuana use (R2 = 0.33; RN2 = 0.56) than any of the domains individually. However, this combined reduced model improved only slightly on the variance accounted for by the model that contained only demographics and the factors in the peer/individual domain (see Model 3 of Table 4.3).

The combined reduced model presented in Table 4.5 included all of the risk and protective factors that were significant in the test of the different domains. As a result, some of the factors in the combined reduced model were not significant. In an effort to obtain a more parsimonious model, a "final" model was created that included the set of demographic variables as well as the risk and protective factors that were significant in the combined reduced model (Table 4.6). This final model accounted for the same amount of variation as the combined reduced model. These results indicate that the variables in this model accounted for a significant percentage of the total variation in whether a youth used marijuana in the past year. To the extent that the model includes risk and protective factors that have been demonstrated in well-designed prevention programs to reduce marijuana use, application of such programs has the potential of reducing youth marijuana use. By contrast, if the prevention factors had only accounted for a small percentage of the total variation, this could raise concern that programs aimed at reducing the levels of the variables in the model might not reduce usage of marijuana among youths in a significant way. It is worth emphasizing that the NHSDA is an annual cross-sectional survey that provides a snapshot of the relationship between these risk and protective factors and marijuana use for youths who have been surveyed at some point during 1999. A number of youths aged 12 to 17 reported that they used marijuana in the past year and indicated the presence of various risk or protective factors. However, the use of marijuana may have preceded the presence of the risk factor for some youths, resulting in a somewhat "inflated" RN2. Therefore, one should be cautious in drawing conclusions about youth marijuana use from the set of risk and protective factors reported in the NHSDA.

In the final model, the strongest associations with past year marijuana use were found with the risk factors in the peer/individual domain; youths were more likely to have used marijuana in the past year if they reported higher levels of antisocial behavior (OR = 2.13), had friends who used marijuana (OR = 2.07), perceived low risks from marijuana use (OR = 1.79), and had more positive individual attitudes toward marijuana use (OR = 1.71) (Table 4.6). Among the protective factors, youths were less likely to have used marijuana in the past year if they listed their parents as a source of social support (OR = 0.71) and if they had been exposed to prevention messages in the media (OR = 0.81).

There were some variables that had ORs that were counterintuitive. One reason this can occur is the cross-sectional nature of the survey. For example, the final model indicated that youths were more likely to have used marijuana in the past year if their parents had talked with them about the dangers of substance use in the past year (OR = 1.55). This association does not necessarily indicate that parental communication with youths about the dangers of substance use increases the likelihood that they will use marijuana; it is possible that this association is the result of increased communication about the dangers of substance use among parents who know or suspect that their children are using, or are in danger of using, marijuana. Another reason for ORs that are counterintuitive to expectations is that the association between a given variable and marijuana use can be affected by the inclusion of other variables in the model. For example, Model 1 in Table 4.1 indicated that males were more likely to have used marijuana in the past year (OR = 1.18) compared with females. The final model, however, indicated that after controlling for risk and protective factors from all domains, males were less likely to have used marijuana in the past year (OR = 0.85) than females.

A small number of the risk factors in the final model were highly correlated with each other (see Tables A.9 to A.11 in Appendix A for intercorrelations between factors). For example, friends' use of marijuana was highly correlated (r = 0.67) with perceived prevalence of marijuana at school. This type of "multicollinearity" of predictors can be problematic, as it can reduce the ability of each individual predictor to make a unique contribution to explained variation in the outcome measure (Cohen & Cohen, 1983). To test whether these high intercorrelations had a sizable effect on the final model, the model was repeated after eliminating three variables: friends' use of marijuana, friends' attitude toward marijuana use, and perceived prevalence of marijuana use at school. Eliminating these three variables acted to eliminate all correlations higher than r = 0.50 from the set of predictors. The removal of these factors, all of which were significant predictors of past year marijuana use in the final model (Table 4.6), had little effect; the adjusted R2 of this reduced model was only slightly lower (RN2 = 0.52) compared with the final model (RN2 = 0.57). In addition, the fact that all three of these variables were significant in the final model, in which all predictors were adjusted for the other predictors in the model, suggests that each does account for unique variation in past year marijuana use among youths.
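The screening step described above, in which pairs of predictors correlated above r = 0.50 are identified, can be sketched as follows. This is an illustrative sketch only: the function names, variable names, and toy scores are hypothetical and are not drawn from the NHSDA or from Tables A.9 to A.11.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def highly_correlated_pairs(columns, threshold=0.50):
    """Return (name, name, r) for every pair whose |r| exceeds the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson_r(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, round(r, 2)))
    return flagged

# Toy scores for five hypothetical respondents (not NHSDA data).
toy = {
    "friends_use": [1, 2, 3, 4, 5],
    "school_prevalence": [1, 3, 2, 5, 4],
    "parent_support": [1, 2, 3, 2, 1],
}
print(highly_correlated_pairs(toy))
```

In practice, one variable from each flagged pair would be dropped and the model refit, as was done for the reduced model described above.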

4.3 Past Year Use of Cigarettes and Alcohol

Models predicting past year use of cigarettes are presented in Tables 4.7 through 4.9. Table 4.7 presents the results of four models; each model contained the risk and protective factors from one domain,21 in addition to the set of demographic variables. The factors that were significant in these models, along with the demographic variables, were then included in the combined reduced model (Table 4.8). The risk and protective factors that were significant in the combined reduced model were then included in the final model (Table 4.9). Similar models predicting any past year use of alcohol are presented in Tables 4.10 through 4.12.22

In terms of explained variation as measured by the Nagelkerke R2, the final models for past year cigarette use (R2 = 0.29; RN2 = 0.43) and past year alcohol use (R2 = 0.34; RN2 = 0.46) accounted for less variation than the final model for past year marijuana use (R2 = 0.33; RN2 = 0.56).23 As was the case for the final model of past year use of marijuana, the strongest predictors of past year cigarette and alcohol use were the peer/individual risk factors. Friends' use of cigarettes and friends' use of alcohol were the strongest predictors in these models.

4.4 Hierarchical Models

The following discussion provides some general background to hierarchical modeling and some simple models. Raudenbush and Bryk (2002) provide further information on the diversity and advantages of hierarchical models.

4.4.1 Background

Hierarchical modeling has been described under a variety of names historically: mixed-effects models, random-effects models, random-coefficient regression models, and covariance components models. Raudenbush and Bryk (2002, pp. 5–6) give the following description for these types of mixed models:

The models discussed in this book appear in diverse literatures under a variety of titles. In sociological research, they are often referred to as multilevel linear models (cf. Goldstein, 1995; Mason et al., 1983). In biometric applications, the terms mixed-effects models and random-effects models are common (cf. Elston & Grizzle, 1962; Laird & Ware, 1982; Singer, 1998). They are also called random-coefficient regression models in the econometrics literature (cf. Rosenberg, 1973; Longford, 1993) and in the statistical literature have been referred to as covariance components models (cf. Dempster, Rubin, & Tsutakawa, 1981; Longford, 1987).

In this report, the above models are referred to collectively as hierarchical models in order to emphasize the nested and clustered nature of the data that has a direct impact on assumptions about dependence of observations within and across hierarchical levels. There has been a significant amount of analysis in areas such as education (Bock, 1989; Bryk, Thum, Easton, & Luppescu, 1998; Morris, 1995). In elementary and secondary education, one typical structure consists of students nested within classrooms, which are in turn nested within schools, which are nested within school districts. Another type of structure is repeated measures, where observations over time are nested within an individual. The focus of much of that analysis has been on the effects of school administration and quality of teachers, or teaching, on student achievement. Although there has been some application of these models to the field of substance use (e.g., Duncan, Duncan, Hops, & Alpert, 1997; Kreft, 1994; Novak & Clayton, 2001), their application has not been as prevalent in the field of substance use as in the field of education.

In the current study, the focus regarding hierarchical models is the effect of family and community characteristics on the use of marijuana by youths aged 12 to 17. The prevention literature includes numerous risk and protective factors for youth substance use that are a function of family or community characteristics; those included in the 1999 NHSDA are listed in Tables A.1 and A.2 in Appendix A.

Historically, analyses in a variety of areas treated clustered observations as independent—failing to account for the fact that units within the same cluster tend to be more similar to each other than to units outside the cluster.24 For example, members of the same family or persons in the same neighborhood tend to share characteristics that make them more similar to each other than to other persons. One result of assuming independent observations at the person level when that is not true is that the researcher may conclude that certain explanatory variables are significant (i.e., significantly different from 0), when in fact, they are not. Because within-cluster correlation tends to be positive, a realistic effective sample size is typically smaller than the nominal sample size. Hence, variances estimated under the independence assumption tend to be too small. Another result of assuming independent observations at the person level is that it has "fostered an impoverished conceptualization, discouraging the formation of explicit multilevel models with hypotheses about effects occurring at each level and across levels" (Raudenbush & Bryk, 2002, p. 5). In the case of continuous data, the classical assumptions are that the observations are independently normally distributed and the model residuals have a common mean and variance. It is not necessary, however, to make these restrictive assumptions if they are unrealistic.
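The effect of positive within-cluster correlation on the effective sample size can be illustrated with a widely used approximation for equal-sized clusters, in which the design effect equals 1 + (m - 1) x rho, where m is the cluster size and rho is the intraclass correlation. The report does not give this formula; it is supplied here only as a sketch, and the sample size, cluster size, and correlation below are hypothetical.

```python
import math

def design_effect(cluster_size, icc):
    """Approximate variance inflation from clustering (equal-sized clusters)."""
    return 1.0 + (cluster_size - 1) * icc

def effective_sample_size(n, cluster_size, icc):
    """Nominal sample size deflated by the design effect."""
    return n / design_effect(cluster_size, icc)

def standard_error_inflation(cluster_size, icc):
    """Factor by which a standard error computed under independence is too small."""
    return math.sqrt(design_effect(cluster_size, icc))

# Hypothetical values: 1,000 respondents in clusters of 10, correlation of 0.05.
print(round(effective_sample_size(1000, 10, 0.05)))
print(round(standard_error_inflation(10, 0.05), 2))
```

Even a small positive correlation reduces the effective sample size noticeably when clusters are large, which is why significance tests that assume independence tend to overstate significance.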

In the case of a model in which the dependent variable of interest is dichotomous (e.g., used or did not use marijuana in the past year), the observations are conditionally Bernoulli distributed (a special case of the binomial) given the explanatory variables, and the predicted probabilities of "success" are typically transformed by taking the log of the odds (the logit function). However, in this form there are difficulties in describing how much of the total variation in the dependent variable has been "explained" by the model because the measures of variance and explained variation are also in the log odds metric. Some of the issues involved in accurately estimating the parameters of a hierarchical model when the dependent variable is binary are discussed in Rodriguez and Goldman (1995) and Goldstein and Rasbash (1996).
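The logit transformation and its inverse can be written in a few lines. The following sketch is illustrative only; the function names are not taken from the report, and the probability used is a hypothetical value.

```python
import math

def logit(p):
    """Transform a probability into log odds."""
    return math.log(p / (1.0 - p))

def inverse_logit(log_odds):
    """Transform log odds back into a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical probability of past year use.
p = 0.10
log_odds = logit(p)
print(round(log_odds, 3))                 # value on the log-odds scale
print(round(inverse_logit(log_odds), 2))  # back on the probability scale
```

The log-odds scale is unbounded, which suits a linear model, but variances expressed on that scale do not translate directly into variances of the observed dichotomous outcome.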

In a nested hierarchical design, when the original data are normally distributed, the total variation in the dependent variable can be broken down into components at each level of the hierarchy. For example, if the dependent variable were the student math achievement score on a test and those scores followed a normal distribution, the total variation could be partitioned into the part deriving from student variation (within schools) and the part from school variation (between schools). The first part would be determined by the variation among students within a school, averaged over all schools. The second part would be characterized by the variation in the average student score between schools. The percentage of total variation that is between schools then is an indication of the magnitude of influence in student scores that is determined by school characteristics. The interest then might be in identifying what those school characteristics are that lead to higher math achievement scores given the same set of students.
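The partition into within-school and between-school variation can be sketched with a simple sums-of-squares calculation. This is only an illustration of the idea: a hierarchical model estimates variance components rather than computing them in this direct way, and the school names and scores below are hypothetical.

```python
def variance_partition(groups):
    """Split the total (population) variance into within- and between-group parts."""
    all_scores = [s for scores in groups.values() for s in scores]
    n = len(all_scores)
    grand_mean = sum(all_scores) / n
    total = sum((s - grand_mean) ** 2 for s in all_scores) / n
    within = 0.0
    between = 0.0
    for scores in groups.values():
        group_mean = sum(scores) / len(scores)
        # Variation of students around their own school's mean.
        within += sum((s - group_mean) ** 2 for s in scores) / n
        # Variation of the school mean around the overall mean.
        between += len(scores) * (group_mean - grand_mean) ** 2 / n
    return total, within, between

# Hypothetical math scores for two schools.
toy_scores = {"school_a": [60, 70, 80], "school_b": [75, 85, 95]}
total, within, between = variance_partition(toy_scores)
print(round(100 * between / total))  # percentage of variation between schools
```

The within and between parts sum to the total, so the between-school percentage can be read as the share of variation associated with school membership in this toy example.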

When the dependent variable is dichotomous, as it is for past year use of marijuana, and the predicted probabilities of "success" have been transformed into the log odds metric, the predicted probabilities of success can be retransformed into the original metric, which can be used in predicting prevalences.25 From such analysis, the overall variance for past year use of marijuana can be partitioned into three parts corresponding to variation accounted for by the person level, the family level, and the neighborhood level. The person level refers to the individual choices that a youth makes to either use or not use a substance. The family level refers to the degree of influence the family with whom a youth lives has on the youth's substance use. The neighborhood level refers to the degree of influence the neighborhood in which a youth lives has on the youth's substance use. The partitioning described in this report assumes a nested structure in which youths live in households (referred to as families), and the households are situated in neighborhoods (defined by groups of contiguous Census blocks, which are the first stage of sampling for the NHSDA). Analyses using the 1999 NHSDA have indicated that the person level accounts for 78 percent of the total variation in past year marijuana use among youths, the family accounts for 16 percent of the total, and the neighborhood accounts for the remaining 6 percent.26 One way to interpret this information is that youth reports about using marijuana in the past year appear to be mostly influenced by their own choices (78 percent) and not by the family (16 percent) or neighborhood (6 percent). Experience with these percentages for different NHSDA years confirms that the percentages have remained fairly constant.

Another way to better understand this information is to consider what the estimates would have been under other circumstances. If youths in each neighborhood (group of contiguous Census blocks) included in the survey reported the same percentage of marijuana use in the past year (e.g., 10 percent of youths in every sampled neighborhood reported using marijuana in the past year), the variation accounted for by the neighborhood level would have been 0 percent. If, on the other hand, there was a large amount of variation between neighborhoods in the youth reports of marijuana use (e.g., a small percentage of youths in some of the sampled neighborhoods reported use whereas a large percentage of youths in other sampled neighborhoods reported use), the neighborhood level would have accounted for a large percentage of the total variation. At the family level, if youth marijuana use was completely controlled by factors that exist within the household in which a youth lives (e.g., the influence of parents and siblings), all youths living in the same household would report the same level of marijuana use. In this case, the total variation in youth marijuana use accounted for by the family level would be larger, and variation accounted for by the person level would be smaller.

It is important to state that the contributions of the family and neighborhood presented in this report are overall results for the United States for youths aged 12 to 17. It is likely that the actual impact by the family (e.g., the impact of parents) or the neighborhood differs for different demographic groups within the overall youth population. For example, some cross-sectional research has suggested that the influence of parents on the behavior of youths decreases as youths get older (Kandel, 1996; Krosnick & Judd, 1982). To the extent that this is true, family-level variables may account for more variation in the substance use of youths aged 12 to 14 than youths aged 15 to 17. Because of this perception of greater parental influence during early adolescence, and because most youths do not initiate substance use before age 12 (Gfroerer, Wu, & Penne, 2002), most family-based prevention programs labeled as "model programs" by the Center for Substance Abuse Prevention (CSAP, 2001) are targeted toward youths in their preteen or early-teenage years.

4.4.2 Models

To simplify the discussion of the advantages of hierarchical modeling, the analysis presented below focuses on a continuous measure of perceived risk of marijuana use (RSKMJUSE) rather than the dichotomous measure of past year marijuana use that was employed in previous models. The use of a scaled continuous variable that is assumed to be normally distributed simplifies the discussion by rendering the interpretation of explained variation easier to understand. Perceived risk is a scaled variable based on the average of responses to two questions: "How much do people risk harming themselves physically and in other ways when they smoke marijuana once a month?" and "How much do you think people risk harming themselves physically and in other ways when they smoke marijuana once or twice a week?" The response options for both questions are (1) great risk, (2) moderate risk, (3) slight risk, and (4) no risk. Perceived risk of marijuana use is typically closely associated with marijuana use among youths. For example, the 1999 NHSDA indicated that 52.2 percent of youths who perceived no risk of using marijuana once a month had used marijuana in the past year compared with 24.7 percent among those who perceived slight risk, 9.0 percent among those who perceived moderate risk, and only 4.6 percent among those who perceived great risk.

A series of models were fit in which three covariates that might reasonably be expected to have explanatory power at the community, family, and individual levels were introduced sequentially. The definition of community used in the present study is the segment, which is a Census block or group of contiguous Census blocks (where the blocks are those defined by the U.S. Bureau of the Census). The community-level variable was a dichotomous measure (yes/no) asking whether the youth had been approached by a drug seller in the past 30 days. As a perceived community-level variable, the responses to this question were "averaged up" to the community (segment) level. Put another way, the mean value of all respondents in a given segment was calculated, and all respondents in that segment were assigned this mean value for this variable. The family-level variable was how often parents had helped the youth with homework during the past 12 months. The response options for this question were (1) never, (2) seldom, (3) sometimes, or (4) always. A maximum of two youths from the same family could be included in the 1999 NHSDA; in cases where two youths from the same family were interviewed, each youth was assigned the average of the responses for the two youths in the family. The person-level variable was a scaled score measuring the youths' attitude toward youth substance use, assessed using three questions asking "How do you feel about someone your age trying (marijuana/hashish once or twice) (smoking one or two packs of cigarettes per day) (having one or more drinks of an alcoholic beverage nearly every day)?" The response options for each question were (1) strongly disapprove, (2) somewhat disapprove, or (3) neither approve nor disapprove.
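The "averaging up" step, in which every respondent in a segment (or family) is assigned the mean response for that group, can be sketched as follows. The function name, field names, and records are hypothetical and are used only for illustration.

```python
from collections import defaultdict

def assign_group_means(records, group_key, value_key):
    """Attach the group mean of value_key to every record in the same group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        totals[rec[group_key]] += rec[value_key]
        counts[rec[group_key]] += 1
    mean_key = value_key + "_group_mean"
    return [
        {**rec, mean_key: totals[rec[group_key]] / counts[rec[group_key]]}
        for rec in records
    ]

# Hypothetical respondents: 1 = approached by a drug seller, 0 = not approached.
toy = [
    {"segment": "A", "approached": 1},
    {"segment": "A", "approached": 0},
    {"segment": "B", "approached": 0},
]
averaged = assign_group_means(toy, "segment", "approached")
print([rec["approached_group_mean"] for rec in averaged])
```

The same function could be applied with a family identifier as the grouping key to reproduce the averaging of the homework item across two youths in the same family.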

Exhibit 4.1 presents the estimates of variance components for each level, the total variance estimates, the fixed effects estimates, and the standard errors of five models involving these variables. For completeness, both fixed effects and random effects are included. The discussion centers on the estimates of variance components because these illustrate the main points of interest. The model assumptions are summarized below as model equations of the form $Y_{ijk}$ = Fixed effects + Random effects, where the random effects are assumed to be independently normally distributed. The Y variable is perceived risk of marijuana use (RSKMJUSE). The analysis does not use the sample weights and is focused on a few simple models to assess whether the hierarchical modeling represents an improvement over a strictly person-level model. The software used was MLwiN (Version 1.1).

Exhibit 4.1 Estimates of Variance Components, Estimates of Fixed Effects, and Standard Errors for Hierarchical Models for Perceived Risk of Marijuana Use as Functions of Community-Level, Family-Level, and Person-Level Explanatory Variables: 1999

Estimates of Variance Components

| Model | Community Level (SE) | Family Level (SE) | Person Level (SE) | Total¹ (SE) |
| --- | --- | --- | --- | --- |
| 1. Random Effects (RE) Only | .026 (.004) | .125 (.011) | .556 (.011) | .707 (.016) |
| 2. (RE) & Community (C) | .013 (.003) | .121 (.011) | .557 (.011) | .691 (.016) |
| 3. (RE) & (C) & Family (F) | .011 (.003) | .092 (.011) | .551 (.011) | .654 (.016) |
| 4. (RE) & (C) & (F) & Person (P) | .011 (.003) | .075 (.009) | .491 (.010) | .577 (.016) |
| 5. Fixed Effects (FE) Only | | | .576 (.005) | .576 (.005) |

Estimates of Fixed Effects

| Model | Community Level (SE) | Family Level (SE) | Person Level (SE) | Intercept (SE) |
| --- | --- | --- | --- | --- |
| 1. Random Effects (RE) Only | | | | 1.831 (.006) |
| 2. (RE) & Community (C) | .247 (.011) | | | 1.830 (.006) |
| 3. (RE) & (C) & Family (F) | .209 (.011) | .204 (.006) | | 1.830 (.006) |
| 4. (RE) & (C) & (F) & Person (P) | .138 (.010) | .098 (.006) | .300 (.005) | 1.830 (.005) |
| 5. Fixed Effects (FE) Only | .139 (.010) | .098 (.006) | .302 (.005) | 1.829 (.005) |

Legend: Model 1 (random effects only) has no fixed effects but includes random effects at the person, family, and community level. Model 2 includes the same three random effects, as well as a community-level fixed effect (C) (approached by someone selling drugs). Model 3 has the same effects as Model 2, as well as a family-level fixed effect (F) (parents helped with homework during the past year). Model 4 has the same effects as Model 3, as well as a person-level fixed effect (P) (favorable attitude toward drug use). Model 5 includes only the fixed effects for the community, family, and person levels.

1 The total column is the sum of the community-, family-, and person-level columns and indicates the total variation left unexplained by the model. The total variance in Model 1 (.707), which contains only random effects, represents the total unexplained variation in the perceived risk of marijuana use. For Models 2 to 4, the total column indicates how much of the explainable variation in the perceived risk of marijuana use is left unexplained after adding fixed effects to the random effects. For example, Model 4, which includes the random effects of Model 1 plus three fixed effects (one variable for each of the levels), indicates that 82 percent (.577/.707 x 100) of the total variation is still unexplained; however, the hierarchical model indicates how much is unexplained (equal to 1 minus the percentage of explained variation) at each of the three levels. Model 5 is a single-level (person-level) model that treats each of the fixed effect variables as person-level variables. The total unexplained variation in Model 5 is the same (approximately) as that for Model 4, but in Model 5 there is no information about the variance components at each of the levels. Also, the standard error of the total variance in Model 5 is understated because the clustering of persons is not taken into account.

Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999.

Model 1 simply contained a constant and a random effect for each level of the hierarchy. This model can be represented using the following notation:

RSKMJUSEijk = B0 + v0k + u0jk + e0ijk .

In this notation, RSKMJUSEijk denotes the perceived risk of marijuana use for the ith individual in the jth family in the kth neighborhood (segment), B0 is the fixed intercept, v0 is the random effect of the neighborhood, u0 is the random effect of the family, and e0 is the random effect of the individual. The random effects are assumed to be mutually statistically independent with zero means and variances var(v0), var(u0), and var(e0). The total variation to be explained, determined by summing across the three levels, was .026 + .125 + .556 = .707 (Exhibit 4.1). Most of the variation is at the person level: .556 / .707 = 79 percent of the total variation. The second largest component is at the family level, .125 (18 percent of the total). The remaining variation, .026 (about 4 percent of the total), is at the neighborhood level.
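As a quick check on the arithmetic above, the following Python sketch recomputes the total variance and each level's share from the Model 1 variance components reported in Exhibit 4.1. The dictionary and function names are illustrative and are not part of the original analysis.

```python
# Model 1 variance components as reported in Exhibit 4.1.
MODEL1_COMPONENTS = {
    "neighborhood": 0.026,
    "family": 0.125,
    "person": 0.556,
}


def variance_shares(components):
    """Return the total variance and each level's percentage share of it."""
    total = sum(components.values())
    shares = {level: 100.0 * value / total for level, value in components.items()}
    return total, shares


total, shares = variance_shares(MODEL1_COMPONENTS)
print(f"total variance: {total:.3f}")
for level, pct in shares.items():
    print(f"{level}: {pct:.1f} percent of total")
```

Because the inputs are the rounded values from the exhibit, the shares computed this way can differ slightly from shares based on unrounded estimates.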

Model 2 contained the random effects included in Model 1, and also included the fixed effect for the community-level variable (COMMUNITY) asking about being approached by a drug seller in the past 30 days. The model then became

RSKMJUSEijk = B0 + B1k * COMMUNITY + v0k + u0jk + e0ijk .

The error terms are now residual variances. The results indicate that compared with Model 1, the community-level variation remaining (to be explained) dropped by half from .026 to .013. The family variation dropped slightly, from .125 to .121. The person-level variation was similar to Model 1's.

Model 3 contained the effects included in Model 2 (random effects and the fixed effect for the community-level variable), as well as the fixed effect for the family-level variable (FAMILY) asking how often parents help youths with their homework. The model then became

RSKMJUSEijk = B0 + B1k * COMMUNITY + B2jk * FAMILY + v0k + u0jk + e0ijk .

Compared with Model 2, the family-level variation dropped from .121 to .092. The community-level variation dropped slightly from .013 to .011. The person-level variation remaining dropped slightly from .557 to .551.

Model 4 contained the effects included in Model 3 (random effects as well as fixed effects at the community and family levels), and it also included the fixed effect for the person-level variable (PERSON) asking about positive attitudes toward drug use. The model then became

RSKMJUSEijk = B0 + B1k * COMMUNITY + B2jk * FAMILY + B3ijk * PERSON + v0k + u0jk + e0ijk .

Compared with Model 3, the person-level variation fell from .551 to .491. The family-level variation also dropped slightly from .092 to .075. The community-level variation remained unchanged.

In Model 4, approximately 18 percent of the total variation ([1 - (.577 / .707)] x 100 = 18.4 percent) has been explained. Among the variables at the different levels, approximately 12 percent of the person-level variation has been explained ([1 - (.491 / .556)] x 100 = 11.7 percent); 40 percent of the family-level variation has been explained ([1 - (.075 / .125)] x 100 = 40.0 percent); and 58 percent of the community-level variation has been explained ([1 - (.011 / .026)] x 100 = 57.7 percent).
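The percentages in the preceding paragraph can be reproduced with a short Python sketch. It uses only the Model 1 and Model 4 variance components quoted in the text and Exhibit 4.1; the names MODEL1, MODEL4, and percent_explained are illustrative.

```python
# Variance components (community, family, person) as reported in Exhibit 4.1.
MODEL1 = {"community": 0.026, "family": 0.125, "person": 0.556}
MODEL4 = {"community": 0.011, "family": 0.075, "person": 0.491}


def percent_explained(baseline, residual):
    """Percentage of the baseline variance no longer left unexplained."""
    return 100.0 * (1.0 - residual / baseline)


by_level = {lvl: percent_explained(MODEL1[lvl], MODEL4[lvl]) for lvl in MODEL1}
overall = percent_explained(sum(MODEL1.values()), sum(MODEL4.values()))

for lvl, pct in by_level.items():
    print(f"{lvl}: {pct:.1f} percent explained")
print(f"overall: {overall:.1f} percent explained")
```

The same function applied to Model 2 or Model 3 residuals would show how much each added fixed effect contributes at each level.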

Model 5, for comparison purposes, contained only the individual-level regression model (i.e., fixed effects for the community, family, and person-level variables). This model can be represented using the following notation:

RSKMJUSEijk = B0 + B1k * COMMUNITY + B2jk * FAMILY + B3ijk * PERSON + e0ijk .

Exhibit 4.1 indicates that the overall total variation is similar (.576 for Model 5 and .577 for Model 4), but Model 5 does not include information on how much of the variation has been explained at each level. In addition, the standard errors for the estimates of the fixed effects B0, B1, B2, and B3 from Model 5 would typically be somewhat smaller (underestimates) than those reported in Models 2 to 4 because they would assume independence within the family and within the neighborhood. However, there is little difference between these standard errors in this case because of the magnitude of the individual-level variation (relative to the family and neighborhood components) and the large overall sample size.
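The general point about understated standard errors can be illustrated with a small simulation. This is a simplified sketch, not the NHSDA analysis: it uses two levels instead of three, estimates a mean rather than regression coefficients, and deliberately gives the cluster level a much larger share of the variance than was found in Exhibit 4.1 so that the effect is visible. All function names and parameter values are hypothetical.

```python
import random
import statistics


def simulate_clustered(n_clusters, cluster_size, cluster_sd, person_sd, seed=0):
    """Simulate outcomes with a shared random effect per cluster."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_clusters):
        cluster_effect = rng.gauss(0.0, cluster_sd)
        data.append([cluster_effect + rng.gauss(0.0, person_sd)
                     for _ in range(cluster_size)])
    return data


def naive_se(data):
    """Standard error of the mean, treating every observation as independent."""
    flat = [y for cluster in data for y in cluster]
    return statistics.stdev(flat) / len(flat) ** 0.5


def cluster_se(data):
    """Standard error of the mean based on variation between cluster means."""
    means = [statistics.fmean(cluster) for cluster in data]
    return statistics.stdev(means) / len(means) ** 0.5


data = simulate_clustered(n_clusters=200, cluster_size=10,
                          cluster_sd=1.0, person_sd=1.0)
print(f"naive SE:   {naive_se(data):.4f}")
print(f"cluster SE: {cluster_se(data):.4f}")
```

With equal-sized clusters the overall mean equals the mean of the cluster means, so the second function gives a standard error that respects the clustering. Shrinking cluster_sd relative to person_sd moves the two estimates closer together, which mirrors the situation described above, where the person-level component dominates.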

4.4.3 Comments

The examples above are meant to clarify some of the differences between hierarchical models and ordinary least squares individual-level regression models, especially the incorporation of the correct assumptions about dependence among observations and the improved understanding of explained variation based on multiple levels of variation. It should be noted that there are numerous additional advantages to hierarchical modeling, such as the ability to build separate regression models at each level of the hierarchy and to further relax assumptions so that both the individual coefficients (slopes) and the variances can vary across units at the same level (Raudenbush & Bryk, 2002).

Table 4.1 Results of Logistic Regression Models Predicting Past Year Marijuana Use with Demographics and Community Domain Risk and Protective Factors among Youths Aged 12 to 17: 1999

  Model 1: Demographics Model 2: Community Risk/Protective Factors Model 3: Demographics + Community Risk/Protective Factors
Beta OR1 95% CI p value Beta OR1 95% CI p value Beta OR1 95% CI p value
Intercept -9.06 <.0001 -6.67 <.0001 -10.50 <.0001
Demographics                        
     Race/ethnicity                        
     Black vs. white -0.43 0.65 (0.55, 0.78) <.0001 -0.72 0.49 (0.40, 0.60) <.0001
     Hispanic vs. white -0.10 0.90 (0.76, 1.06) .2155 -0.10 0.90 (0.75, 1.09) .2767
     Other vs. white -0.39 0.67 (0.50, 0.91) .0095 -0.03 0.97 (0.68, 1.37) .5120
Gender - male vs. female 0.16 1.18 (1.07, 1.29) .0008 0.15 1.16 (1.05, 1.29) .0480
Age (continuous - 12 to 17) 0.52 1.68 (1.63, 1.72) <.0001 0.30 1.35 (1.30, 1.40) <.0001
Number of parents in home (2 vs. others) -0.67 0.51 (0.46, 0.57) <.0001 -0.44 0.65 (0.57, 0.73) <.0001
Economic deprivation (household income under $20,000) -0.16 0.85 (0.74, 0.98) .0242 -0.25 0.78 (0.65, 0.92) .0038
Geographic region                        
     Northeast vs. West -0.20 0.82 (0.70, 0.96) .0119 -0.11 0.89 (0.75, 1.06) .2007
     North Central vs. West -0.17 0.84 (0.73, 0.96) .0127 -0.13 0.88 (0.75, 1.03) .1161
     South vs. West -0.26 0.77 (0.68, 0.88) .0001 -0.15 0.86 (0.75, 1.00) .0442
County type                        
     Large MSA vs. non-MSA 0.19 1.21 (1.07, 1.36) .0023 0.16 1.17 (1.03, 1.34) .0190
     Small MSA vs. non-MSA 0.21 1.24 (1.09, 1.41) .0008 0.17 1.19 (1.03, 1.36) .0149
Community Domain2                        
Community disorganization and crime -0.15 0.86 (0.79, 0.95) .0017 0.00 1.00 (0.91, 1.10) .9587
Neighborhood cohesiveness 0.01 1.01 (0.94, 1.09) .7262 0.03 1.03 (0.96, 1.11) .4024
Community attitudes toward marijuana use 0.37 1.44 (1.33, 1.56) <.0001 0.29 1.34 (1.23, 1.45) <.0001
Community norms toward marijuana use 1.00 2.71 (2.48, 2.97) <.0001 0.99 2.70 (2.46, 2.96) <.0001
Availability of marijuana 0.82 2.26 (2.12, 2.41) <.0001 0.68 1.97 (1.84, 2.12) <.0001
Exposed to prevention messages in the media -0.25 0.78 (0.67, 0.90) .0006 -0.26 0.77 (0.67, 0.89) .0006
Sample size 25,357 23,031 23,031
R2 (see footnote 3) 0.09 0.17 0.19
RN2 (see footnote 4) 0.15 0.31 0.34

OR = odds ratio; CI = confidence interval; MSA = metropolitan statistical area.
1 ORs are derived from multiple logistic regression models and adjusted for other variables included in each model. ORs > 1.0 indicate that the odds of past year marijuana use increased with each unit increase in the predictor. For risk factors, each unit increase in the predictor generally indicates an increased risk of marijuana use. For protective factors, each unit increase in the predictor generally indicates a higher level of protection against marijuana use.
2 The questions used to measure each of the factors are provided in Appendix A (Table A.1). The coding and distribution of the responses for each factor are provided in Table 2.1.
3 Cox and Snell (1989) R2 is a measure of the fit of the model, defined as R2 = 1 - [L(0) / L(full)]^(2/n), where L(0) is the likelihood of the intercept-only model, L(full) is the likelihood of the full model, and n is the sample size.
4 Recognizing that the Cox and Snell R2 reaches a maximum of less than 1 for discrete models, and that this maximum depends on the value of the estimated percentage, Nagelkerke (1991) proposed dividing the Cox and Snell measure by its maximum. In this sense, RN2 measures the absolute percentage of variation explained by the model.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999.
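The two fit measures described in footnotes 3 and 4 can be sketched in Python as follows. The functions work on log-likelihoods to avoid numerical underflow, and the maximum of the Cox and Snell measure is taken to be 1 - L(0)^(2/n), the standard form of Nagelkerke's rescaling. The log-likelihood values in the example are hypothetical and are not NHSDA results; survey software such as SUDAAN may compute these quantities from weighted likelihoods, which this sketch does not attempt.

```python
import math


def cox_snell_r2(loglik_null, loglik_full, n):
    """Cox and Snell R2: 1 - [L(0) / L(full)]^(2/n), computed on the log scale."""
    return 1.0 - math.exp(2.0 * (loglik_null - loglik_full) / n)


def cox_snell_max(loglik_null, n):
    """Largest value the Cox and Snell R2 can reach: 1 - L(0)^(2/n)."""
    return 1.0 - math.exp(2.0 * loglik_null / n)


def nagelkerke_r2(loglik_null, loglik_full, n):
    """Nagelkerke rescaling: the Cox and Snell R2 divided by its maximum."""
    return cox_snell_r2(loglik_null, loglik_full, n) / cox_snell_max(loglik_null, n)


# Hypothetical log-likelihoods for illustration only (not NHSDA values).
ll_null, ll_full, n = -100.0, -80.0, 200
print(round(cox_snell_r2(ll_null, ll_full, n), 3))
print(round(nagelkerke_r2(ll_null, ll_full, n), 3))
```

Because the maximum is below 1, the rescaled measure is always at least as large as the Cox and Snell value, which is consistent with RN2 exceeding R2 in every column of the tables in this chapter.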

Table 4.2 Results of Logistic Regression Models Predicting Past Year Marijuana Use with Demographics and Family Domain Risk and Protective Factors among Youths Aged 12 to 17: 1999

  Model 1: Demographics Model 2: Family Risk/Protective Factors Model 3: Demographics + Family Risk/Protective Factors
Beta OR1 95% CI p value Beta OR1 95% CI p value Beta OR1 95% CI p value
Intercept -9.06 <.0001 -3.59 <.0001 -9.25 <.0001
Demographics                        
Race/ethnicity                        
     Black vs. white -0.43 0.65 (0.55, 0.78) <.0001 -0.34 0.71 (0.58, 0.88) .0014
     Hispanic vs. white -0.10 0.90 (0.76, 1.06) .2155 0.09 1.09 (0.89, 1.34) .3856
     Other vs. white -0.39 0.67 (0.50, 0.91) .0095 -0.39 0.67 (0.47, 0.96) .0312
Gender - male vs. female 0.16 1.18 (1.07, 1.29) .0008 0.13 1.14 (1.02, 1.28) .0189
Age (continuous - 12 to 17) 0.52 1.68 (1.63, 1.72) <.0001 0.44 1.56 (1.50, 1.61) <.0001
Number of parents in home (2 vs. others) -0.67 0.51 (0.46, 0.57) <.0001 -0.56 0.57 (0.50, 0.65) <.0001
Economic deprivation (household income under $20,000) -0.16 0.85 (0.74, 0.98) .0242 -0.21 0.81 (0.68, 0.97) .0205
Geographic region                        
     Northeast vs. West -0.20 0.82 (0.70, 0.96) .0119 -0.14 0.87 (0.72, 1.04) .1285
     North Central vs. West -0.17 0.84 (0.73, 0.96) .0127 -0.16 0.85 (0.73, 1.00) .0481
     South vs. West -0.26 0.77 (0.68, 0.88) .0001 -0.17 0.85 (0.73, 0.98) .0282
County type                        
     Large MSA vs. non-MSA 0.19 1.21 (1.07, 1.36) .0023 0.20 1.22 (1.06, 1.41) .0057
     Small MSA vs. non-MSA 0.21 1.24 (1.09, 1.41) .0008 0.13 1.14 (0.98, 1.32) .0925
Family Domain2                        
Parental monitoring 0.77 2.16 (1.96, 2.38) <.0001 0.50 1.65 (1.49, 1.83) <.0001
Parental encouragement -0.21 0.81 (0.75, 0.87) <.0001 -0.21 0.81 (0.75, 0.87) <.0001
Parental attitudes toward marijuana use 0.95 2.57 (2.35, 2.82) <.0001 0.88 2.42 (2.20, 2.67) <.0001
Parents communicate about substance use 0.45 1.57 (1.39, 1.77) <.0001 0.40 1.50 (1.32, 1.70) <.0001
Parents are source of social support -0.67 0.51 (0.45, 0.58) <.0001 -0.67 0.51 (0.45, 0.58) <.0001
Sample size 25,357 18,896 18,896
R2 (see footnote 3) 0.09 0.10 0.15
RN2 (see footnote 4) 0.15 0.17 0.25

OR = odds ratio; CI = confidence interval; MSA = metropolitan statistical area.
1 ORs are derived from multiple logistic regression models and adjusted for other variables included in each model. ORs > 1.0 indicate that the odds of past year marijuana use increased with each unit increase in the predictor. For risk factors, each unit increase in the predictor generally indicates an increased risk of marijuana use. For protective factors, each unit increase in the predictor generally indicates a higher level of protection against marijuana use.
2 The questions used to measure each of the factors are provided in Appendix A (Table A.2). The coding and distribution of the responses for each factor are provided in Table 2.2.
3 Indicates X2 comparison -2log-likelihood of Model 2 vs. Model 3 is significant.
3 Cox and Snell (1989) R2 is a measure of the fit of the model, defined as R2 = 1 - [L(0) / L(full)]^(2/n), where L(0) is the likelihood of the intercept-only model, L(full) is the likelihood of the full model, and n is the sample size.
4 Recognizing that the Cox and Snell R2 reaches a maximum of less than 1 for discrete models, and that this maximum depends on the value of the estimated percentage, Nagelkerke (1991) proposed dividing the Cox and Snell measure by its maximum. In this sense, RN2 measures the absolute percentage of variation explained by the model.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999.
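The Beta and OR columns in Table 4.2 are linked by the usual logistic regression identity OR = exp(Beta). The Python sketch below checks this for three Model 2 rows and also backs out an approximate standard error from each reported confidence interval. The standard-error step assumes a symmetric normal-based interval on the log-odds scale with a multiplier of 1.96; the intervals in the table may have been computed with design-based degrees of freedom, so those values are rough approximations only. Function and variable names are illustrative.

```python
import math


def odds_ratio(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)


def approx_se_from_ci(lower, upper, z=1.96):
    """Approximate SE of beta, assuming a symmetric normal-based CI on the log-odds scale."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


# (label, beta, reported OR, reported 95% CI) from Model 2 of Table 4.2.
ROWS = [
    ("Parental monitoring", 0.77, 2.16, (1.96, 2.38)),
    ("Parental encouragement", -0.21, 0.81, (0.75, 0.87)),
    ("Parents are source of social support", -0.67, 0.51, (0.45, 0.58)),
]

for label, beta, reported_or, (lo, hi) in ROWS:
    print(f"{label}: exp(beta) = {odds_ratio(beta):.2f} "
          f"(reported {reported_or:.2f}), approx. SE = {approx_se_from_ci(lo, hi):.3f}")
```

Small discrepancies between exp(Beta) and the reported OR are expected elsewhere in the tables because both columns are rounded to two decimals.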

Table 4.3 Results of Logistic Regression Models Predicting Past Year Marijuana Use with Demographics and Peer/Individual Domain Risk and Protective Factors among Youths Aged 12 to 17: 1999

  Model 1: Demographics Model 2: Peer/Individual Risk/Protective Factors Model 3: Demographics + Peer/Individual Risk/Protective Factors
Beta OR1 95% CI p value Beta OR1 95% CI p value Beta OR1 95% CI p value
Intercept -9.06 <.0001 -8.04 <.0001 -12.37 <.0001
Demographics                        
Race/ethnicity                        
     Black vs. white -0.43 0.65 (0.55, 0.78) <.0001 -0.31 0.73 (0.59, 0.91) .0044
     Hispanic vs. white -0.10 0.90 (0.76, 1.06) .2155 -0.04 0.96 (0.77, 1.19) .7172
     Other vs. white -0.39 0.67 (0.50, 0.91) .0095 -0.03 0.97 (0.69, 1.38) .8716
Gender - male vs. female 0.16 1.18 (1.07, 1.29) .0008 -0.24 0.79 (0.69, 0.90) .0003
Age (continuous - 12 to 17) 0.52 1.68 (1.63, 1.72) <.0001 0.31 1.36 (1.30, 1.43) <.0001
Number of parents in home (2 vs. others) -0.67 0.51 (0.46, 0.57) <.0001 -0.37 0.69 (0.60, 0.79) <.0001
Economic deprivation (household income under $20,000) -0.16 0.85 (0.74, 0.98) .0242 -0.24 0.79 (0.65, 0.95) .0115
Geographic region                        
     Northeast vs. West -0.20 0.82 (0.70, 0.96) .0119 -0.24 0.78 (0.63, 0.98) .0315
     North Central vs. West -0.17 0.84 (0.73, 0.96) .0127 -0.02 0.98 (0.81, 1.17) .8103
     South vs. West -0.26 0.77 (0.68, 0.88) .0001 -0.07 0.94 (0.78, 1.12) .4642
County type                
     Large MSA vs. non-MSA 0.19 1.21 (1.07, 1.36) .0023 0.09 1.09 (0.93, 1.27) .2792
     Small MSA vs. non-MSA 0.21 1.24 (1.09, 1.41) .0008 0.06 1.06 (0.90, 1.24) .4840
Peer/Individual Domain2                        
Antisocial behavior 0.60 1.82 (1.50, 2.21) <.0001 0.82 2.26 (1.83, 2.80) <.0001
Individual attitudes toward marijuana use 0.58 1.79 (1.63, 1.97) <.0001 0.56 1.74 (1.58, 1.92) <.0001
Friends' attitudes toward marijuana use 0.38 1.46 (1.34, 1.59) <.0001 0.35 1.42 (1.31, 1.55) <.0001
Friends' marijuana use 1.15 3.17 (2.90, 3.46) <.0001 1.03 2.79 (2.55, 3.06) <.0001
Perceived risk of marijuana use 0.55 1.74 (1.61, 1.88) <.0001 0.54 1.72 (1.59, 1.86) <.0001
Risk-taking proclivity 0.29 1.34 (1.22, 1.47) <.0001 0.33 1.38 (1.25, 1.53) <.0001
Participation in two or more extracurricular activities -0.09 0.91 (0.80, 1.04) .1773 -0.08 0.92 (0.81, 1.05) .2384
Religiosity -0.11 0.90 (0.82, 0.98) .0149 -0.06 0.95 (0.87, 1.04) .2279
Sample size 25,357 23,487 23,487
R2 (see footnote 3) 0.09 0.29 0.30
RN2 (see footnote 4) 0.15 0.51 0.53

OR = odds ratio; CI = confidence interval; MSA = metropolitan statistical area.
1 ORs are derived from multiple logistic regression models and adjusted for other variables included in each model. ORs > 1.0 indicate that the odds of past year marijuana use increased with each unit increase in the predictor. For risk factors, each unit increase in the predictor generally indicates an increased risk of marijuana use. For protective factors, each unit increase in the predictor generally indicates a higher level of protection against marijuana use.
2 The questions used to measure each of the factors are provided in Appendix A (Table A.3). The coding and distribution of the responses for each factor are provided in Table 2.3.
3 Cox and Snell (1989) R2 is a measure of the fit of the model, defined as R2 = 1 - [L(0) / L(full)]^(2/n), where L(0) is the likelihood of the intercept-only model, L(full) is the likelihood of the full model, and n is the sample size.
4 Recognizing that the Cox and Snell R2 reaches a maximum of less than 1 for discrete models, and that this maximum depends on the value of the estimated percentage, Nagelkerke (1991) proposed dividing the Cox and Snell measure by its maximum. In this sense, RN2 measures the absolute percentage of variation explained by the model.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999.
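Footnote 1 describes each OR as the change in odds per unit increase in the predictor. For a predictor entered as a continuous variable, such as age in Table 4.3, a change of several units multiplies the odds by exp(Beta x units). The Python sketch below works this through for the age coefficients in Models 1 and 3; it assumes the effect is linear on the log-odds scale across ages 12 to 17, which is what the fitted model implies. The function and constant names are illustrative.

```python
import math


def odds_multiplier(beta, units):
    """Multiplicative change in odds for a change of `units` in the predictor."""
    return math.exp(beta * units)


AGE_BETA_MODEL1 = 0.52  # Table 4.3, Model 1 (demographics only)
AGE_BETA_MODEL3 = 0.31  # Table 4.3, Model 3 (demographics + peer/individual factors)

one_year = odds_multiplier(AGE_BETA_MODEL1, 1)
five_years_m1 = odds_multiplier(AGE_BETA_MODEL1, 5)  # age 17 versus age 12
five_years_m3 = odds_multiplier(AGE_BETA_MODEL3, 5)
print(f"Model 1: {one_year:.2f} per year, {five_years_m1:.1f} over five years")
print(f"Model 3: {five_years_m3:.1f} over five years")
```

Comparing the two five-year multipliers shows how much of the age association in the demographics-only model is absorbed once the peer and individual factors are included.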

Table 4.4 Results of Logistic Regression Models Predicting Past Year Marijuana Use with Demographics and School Domain Risk and Protective Factors among Youths Aged 12 to 17: 1999

  Model 1: Demographics Model 2: School Risk/Protective Factors Model 3: Demographics + School Risk/Protective Factors
Beta OR1 95% CI p value Beta OR1 95% CI p value Beta OR1 95% CI p value
Intercept -9.06 <.0001 -3.93 <.0001 -8.46 <.0001
Demographics                        
Race/ethnicity                        
     Black vs. white -0.43 0.65 (0.55, 0.78) <.0001 -0.66 0.52 (0.41, 0.65) <.0001
     Hispanic vs. white -0.10 0.90 (0.76, 1.06) .2155 -0.05 0.95 (0.78, 1.16) .6092
     Other vs. white -0.39 0.67 (0.50, 0.91) .0095 -0.16 0.85 (0.60, 1.21) .3656
Gender - male vs. female 0.16 1.18 (1.07, 1.29) .0008 0.11 1.12 (0.99, 1.26) .0695
Age (continuous - 12 to 17) 0.52 1.68 (1.63, 1.72) <.0001 0.34 1.41 (1.35, 1.47) <.0001
Number of parents in home (2 vs. others) -0.67 0.51 (0.46, 0.57) <.0001 -0.55 0.57 (0.50, 0.66) <.0001
Economic deprivation (household income under $20,000) -0.16 0.85 (0.74, 0.98) .0242 -0.11 0.90 (0.75, 1.07) .2349
Geographic region                        
     Northeast vs. West -0.20 0.82 (0.70, 0.96) .0119 -0.20 0.82 (0.67, 0.99) .0371
     North Central vs. West -0.17 0.84 (0.73, 0.96) .0127 -0.15 0.86 (0.73, 1.02) .0757
     South vs. West -0.26 0.77 (0.68, 0.88) .0001 -0.18 0.84 (0.71, 0.99) .0323
County type                        
     Large MSA vs. non-MSA 0.19 1.21 (1.07, 1.36) .0023 0.12 1.13 (0.98, 1.31) .1039
     Small MSA vs. non-MSA 0.21 1.24 (1.09, 1.41) .0008 0.02 1.03 (0.88, 1.20) .7517
School Domain2                        
Commitment to school -0.39 0.68 (0.62, 0.74) <.0001 -0.37 0.69 (0.63, 0.76) <.0001
Sanctions against marijuana use at school -0.11 0.89 (0.75, 1.07) .2087 -0.11 0.90 (0.75, 1.07) .2255
Perceived prevalence of marijuana use 1.43 4.17 (3.82, 4.55) <.0001 1.26 3.52 (3.20, 3.87) <.0001
Academic performance 0.32 1.38 (1.29, 1.48) <.0001 0.33 1.39 (1.30, 1.50) <.0001
Exposed to prevention messages in school -0.23 0.79 (0.69, 0.90) .0006 -0.14 0.87 (0.76, 1.00) .0511
Sample size 25,357 17,679 17,679
R2 (see footnote 3) 0.09 0.16 0.18
RN2 (see footnote 4) 0.15 0.27 0.32

OR = odds ratio; CI = confidence interval; MSA = metropolitan statistical area.
1 ORs are derived from multiple logistic regression models and adjusted for other variables included in each model. ORs > 1.0 indicate that the odds of past year marijuana use increased with each unit increase in the predictor. For risk factors, each unit increase in the predictor generally indicates an increased risk of marijuana use. For protective factors, each unit increase in the predictor generally indicates a higher level of protection against marijuana use.
2 The questions used to measure each of the factors are provided in Appendix A (Table A.4). The coding and distribution of the responses for each factor are provided in Table 2.4.
3 Cox and Snell (1989) R2 is a measure of the fit of the model, defined as R2 = 1 - [L(0) / L(full)]^(2/n), where L(0) is the likelihood of the intercept-only model, L(full) is the likelihood of the full model, and n is the sample size.
4 Recognizing that the Cox and Snell R2 reaches a maximum of less than 1 for discrete models, and that this maximum depends on the value of the estimated percentage, Nagelkerke (1991) proposed dividing the Cox and Snell measure by its maximum. In this sense, RN2 measures the absolute percentage of variation explained by the model.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999.

Table 4.5 Odds Ratios and 95 Percent Confidence Intervals of Combined Reduced Model of Demographics and Risk and Protective Factors Predicting Past Year Marijuana Use among Youths Aged 12 to 17: 1999

  Beta OR1 95% CI p value
Intercept -14.85 <.0001
Demographics        
Race/ethnicity        
     Black vs. white -0.56 0.57 (0.44, 0.74) <.0001
     Hispanic vs. white -0.01 0.99 (0.77, 1.28) .9264
     Other vs. white 0.01 1.01 (0.65, 1.58) .9619
Gender - male vs. female -0.15 0.86 (0.74, 1.00) .0494
Age (continuous - 12 to 17) 0.25 1.29 (1.22, 1.36) <.0001
Number of parents in home (2 vs. others) -0.28 0.76 (0.64, 0.90) .0017
Economic deprivation (household income under $20,000) -0.21 0.81 (0.65, 1.02) .0762
Geographic region        
     Northeast vs. West -0.18 0.83 (0.65, 1.06) .1377
     North Central vs. West -0.01 0.99 (0.80, 1.23) .9084
     South vs. West 0.02 1.03 (0.83, 1.27) .8171
County type        
     Large MSA vs. non-MSA 0.15 1.16 (0.96, 1.40) .1274
     Small MSA vs. non-MSA 0.00 1.00 (0.82, 1.22) .9836
Community Domain2        
Community attitudes toward marijuana use -0.13 0.88 (0.79, 0.99) .0285
Community norms toward marijuana use 0.35 1.42 (1.25, 1.60) <.0001
Availability of marijuana 0.25 1.28 (1.18, 1.39) .0001
Exposed to prevention messages in the media -0.21 0.81 (0.67, 0.99) .0434
Family Domain2        
Parental monitoring 0.10 1.11 (0.97, 1.26) .1345
Parental encouragement -0.02 0.98 (0.88, 1.09) .7285
Parental attitudes toward marijuana use 0.17 1.19 (1.03, 1.38) .0197
Parents communicate about substance use 0.47 1.59 (1.35, 1.88) <.0001
Parents are source of social support -0.32 0.73 (0.62, 0.85) .0001
Peer/Individual Domain2        
Antisocial behavior 0.75 2.11 (1.63, 2.73) <.0001
Individual attitudes toward marijuana use 0.54 1.71 (1.53, 1.91) .0001
Friends' attitudes toward marijuana use 0.34 1.40 (1.26, 1.55) <.0001
Friends' marijuana use 0.73 2.07 (1.80, 2.37) <.0001
Perceived risk of marijuana use 0.58 1.78 (1.63, 1.95) <.0001
Risk-taking proclivity 0.24 1.27 (1.12, 1.44) .0002
School Domain2        
Commitment to school 0.35 1.42 (1.25, 1.61) <.0001
Perceived prevalence of marijuana use 0.35 1.42 (1.24, 1.62) <.0001
Academic performance 0.18 1.20 (1.09, 1.32) .0003
Sample size 16,402
R2 (see footnote 3) 0.33
RN2 (see footnote 4) 0.56

OR = odds ratio; CI = confidence interval; MSA = metropolitan statistical area.
1 ORs are derived from a single multiple logistic regression model that included the set of demographic variables as well as all of the risk and protective factors included in the table. ORs > 1.0 indicate that the odds of past year marijuana use increased with each unit increase in the predictor. For risk factors, each unit increase in the predictor generally indicates an increased risk of marijuana use. For protective factors, each unit increase in the predictor generally indicates a higher level of protection against marijuana use.
2 The questions used to measure each of the factors are provided in Appendix A (Tables A.1 to A.4). The coding and distribution of the responses for each factor are provided in Tables 2.1 to 2.4.
3 Cox and Snell (1989) R2 is a measure of the fit of the model, defined as R2 = 1 - [L(0) / L(full)]^(2/n), where L(0) is the likelihood of the intercept-only model, L(full) is the likelihood of the full model, and n is the sample size.
4 Recognizing that the Cox and Snell R2 reaches a maximum of less than 1 for discrete models, and that this maximum depends on the value of the estimated percentage, Nagelkerke (1991) proposed dividing the Cox and Snell measure by its maximum. In this sense, RN2 measures the absolute percentage of variation explained by the model.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999.

Table 4.6 Odds Ratios and 95 Percent Confidence Intervals of Final Model of Demographics and Risk and Protective Factors Predicting Past Year Marijuana Use among Youths Aged 12 to 17: 1999

  Beta OR1 95% CI p value
Intercept -14.75 <.0001
Demographics        
Race/ethnicity        
     Black vs. white -0.57 0.57 (0.44, 0.74) <.0001
     Hispanic vs. white -0.01 0.99 (0.77, 1.28) .9312
     Other vs. white 0.01 1.01 (0.65, 1.58) .9502
Gender - male vs. female -0.16 0.85 (0.73, 0.99) .0405
Age (continuous - 12 to 17) 0.26 1.30 (1.22, 1.37) <.0001
Number of parents in home (2 vs. others) -0.28 0.75 (0.64, 0.90) .0014
Economic deprivation (household income under $20,000) -0.21 0.81 (0.65, 1.02) .0749
Geographic region        
     Northeast vs. West -0.18 0.84 (0.66, 1.07) .1517
     North Central vs. West -0.01 0.99 (0.80, 1.23) .9575
     South vs. West 0.03 1.03 (0.84, 1.27) .7656
County type        
     Large MSA vs. non-MSA 0.15 1.16 (0.96, 1.40) .1263
     Small MSA vs. non-MSA 0.00 1.00 (0.82, 1.21) .9725
Community Domain2        
Community attitudes toward marijuana use -0.12 0.88 (0.79, 0.99) .0323
Community norms toward marijuana use 0.35 1.42 (1.25, 1.61) <.0001
Availability of marijuana 0.25 1.28 (1.18, 1.40) <.0001
Exposed to prevention messages in the media -0.21 0.81 (0.66, 0.99) .0423
Family Domain2        
Parental attitudes toward marijuana use 0.18 1.19 (1.03, 1.38) .0186
Parents communicate about substance use 0.44 1.55 (1.32, 1.82) <.0001
Parents are source of social support -0.34 0.71 (0.61, 0.83) <.0001
Peer/Individual Domain2        
Antisocial behavior 0.76 2.13 (1.65, 2.75) <.0001
Individual attitudes toward marijuana use 0.54 1.71 (1.53, 1.91) <.0001
Friends' attitudes toward marijuana use 0.34 1.40 (1.26, 1.56) <.0001
Friends' marijuana use 0.73 2.07 (1.81, 2.38) <.0001
Perceived risk of marijuana use 0.58 1.79 (1.64, 1.95) <.0001
Risk-taking proclivity 0.24 1.27 (1.12, 1.44) .0002
School Domain2        
Commitment to school 0.33 1.39 (1.22, 1.58) <.0001
Perceived prevalence of marijuana use 0.35 1.42 (1.25, 1.63) <.0001
Academic performance 0.18 1.20 (1.09, 1.32) .0004
Sample size 16,411
R2 (see footnote 3) 0.33
RN2 (see footnote 4) 0.56

OR = odds ratio; CI = confidence interval; MSA = metropolitan statistical area.
1 ORs are derived from a single multiple logistic regression model that included the set of demographic variables as well as all of the risk and protective factors included in the table. ORs > 1.0 indicate that the odds of past year marijuana use increased with each unit increase in the predictor. For risk factors, each unit increase in the predictor generally indicates an increased risk of marijuana use. For protective factors, each unit increase in the predictor generally indicates a higher level of protection against marijuana use.
2 The questions used to measure each of the factors are provided in Appendix A (Tables A.1 to A.4). The coding and distribution of the responses for each factor are provided in Tables 2.1 to 2.4.
3 Cox and Snell (1989) R2 is a measure of the fit of the model, defined as R2 = 1 - [L(0) / L(full)]^(2/n), where L(0) is the likelihood of the intercept-only model, L(full) is the likelihood of the full model, and n is the sample size.
4 Recognizing that the Cox and Snell R2 reaches a maximum of less than 1 for discrete models, and that this maximum depends on the value of the estimated percentage, Nagelkerke (1991) proposed dividing the Cox and Snell measure by its maximum. In this sense, RN2 measures the absolute percentage of variation explained by the model.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999.

Table 4.7 Results of Logistic Regression Models Predicting Past Year Cigarette Use with Demographics and Risk and Protective Factors, by Domain, among Youths Aged 12 to 17: 1999

  Demographics1 + Risk/Protective Factors2
Beta OR3 95% CI p value
Community Domain2 + Demographics1        
Community disorganization and crime 0.05 1.05 (0.98, 1.13) .1439
Neighborhood cohesiveness 0.01 1.01 (0.95, 1.07) .7155
Community attitudes toward cigarette use 0.39 1.47 (1.40, 1.56) <.0001
Community norms toward cigarette use 0.70 2.00 (1.87, 2.14) <.0001
Exposed to prevention messages in the media -0.22 0.80 (0.72, 0.89) .0001
Family Domain2 + Demographics1        
Parental monitoring 0.37 1.45 (1.32, 1.60) <.0001
Parental encouragement -0.19 0.83 (0.77, 0.89) <.0001
Parental attitudes toward cigarette use 0.76 2.14 (1.96, 2.34) <.0001
Parents communicate about substance use 0.30 1.34 (1.22, 1.48) <.0001
Parents are source of social support -0.76 0.47 (0.42, 0.52) <.0001
Peer/Individual Domain2 + Demographics1        
Antisocial behavior 0.68 1.97 (1.66, 2.34) <.0001
Individual attitudes toward cigarette use 0.45 1.56 (1.47, 1.67) <.0001
Friends' attitudes toward cigarette use 0.21 1.24 (1.16, 1.32) <.0001
Friends' cigarette use 0.92 2.52 (2.34, 2.71) <.0001
Perceived risk of cigarette use 0.18 1.19 (1.12, 1.27) <.0001
Risk-taking proclivity 0.46 1.59 (1.47, 1.71) <.0001
Participation in two or more extracurricular activities -0.17 0.85 (0.76, 0.94) .0015
Religiosity -0.17 0.85 (0.79, 0.91) <.0001
School Domain2 + Demographics1        
Commitment to school -0.42 0.65 (0.60, 0.71) <.0001
Sanctions against cigarette use at school -0.23 0.79 (0.73, 0.86) <.0001
Perceived prevalence of cigarette use 0.74 2.10 (1.93, 2.29) <.0001
Academic performance 0.41 1.51 (1.43, 1.60) <.0001
Exposed to prevention messages in school -0.12 0.88 (0.78, 1.00) .0542

OR = odds ratio; CI = confidence interval.
Note: No question was asked about availability of cigarettes.
1 Demographic variables included in models were race/ethnicity, gender, age, number of parents in home, household income, geographic region, and county type.
2 The questions used to measure each of the factors are provided in Appendix A (Tables A.1 to A.4). The coding and distribution of the responses for each factor are provided in Tables 2.1 to 2.4.
3 ORs are derived from multiple logistic regression models, run separately for each domain, and adjusted for the demographic variables as well as the other factors within each domain. ORs > 1.0 indicate that the odds of past year cigarette use increased with each unit increase in the predictor. For risk factors, each unit increase in the predictor generally indicates an increased risk of cigarette use. For protective factors, each unit increase in the predictor generally indicates a higher level of protection against cigarette use.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999.

Table 4.8 Results of Logistic Regression Combined Reduced Model Predicting Past Year Cigarette Use with Demographics and Risk and Protective Factors among Youths Aged 12 to 17: 1999

  Beta OR1 95% CI p value
Intercept -9.90 <.0001
Demographics        
Race/ethnicity        
     Black vs. white -0.57 0.56 (0.46, 0.69) <.0001
     Hispanic vs. white -0.30 0.74 (0.61, 0.90) .0021
     Other vs. white -0.22 0.80 (0.59, 1.09) .1594
Gender - male vs. female -0.26 0.77 (0.68, 0.87) <.0001
Age (continuous - 12 to 17) 0.25 1.29 (1.24, 1.34) <.0001
Number of parents in home (2 vs. others) -0.24 0.79 (0.69, 0.90) .0003
Economic deprivation (household income under $20,000) -0.22 0.80 (0.68, 0.95) .0119
Geographic region        
     Northeast vs. West -0.11 0.89 (0.73, 1.10) .2918
     North Central vs. West 0.00 1.00 (0.84, 1.18) .9626
     South vs. West 0.09 1.09 (0.93, 1.29) .2910
County type        
     Large MSA vs. non-MSA -0.10 0.90 (0.78, 1.04) .1612
     Small MSA vs. non-MSA -0.01 0.99 (0.85, 1.15) .8923
Community Domain2        
Community attitudes toward cigarette use 0.03 1.03 (0.95, 1.11) .4500
Community norms toward cigarette use 0.11 1.12 (1.02, 1.23) .0161
Exposed to prevention messages in the media -0.02 0.98 (0.85, 1.13) .7816
Family Domain2        
Parental monitoring 0.09 1.10 (0.97, 1.23) .1290
Parental encouragement -0.01 0.99 (0.91, 1.08) .8356
Parental attitudes toward cigarette use 0.25 1.29 (1.16, 1.43) <.0001
Parents communicate about substance use 0.38 1.46 (1.30, 1.64) <.0001
Parents are source of social support -0.49 0.62 (0.54, 0.70) <.0001
Peer/Individual Domain2        
Antisocial behavior 0.53 1.69 (1.39, 2.06) <.0001
Individual attitudes toward cigarette use 0.46 1.58 (1.46, 1.70) <.0001
Friends' attitudes toward cigarette use 0.13 1.14 (1.05, 1.24) .0015
Friends' cigarette use 0.82 2.28 (2.08, 2.49) <.0001
Perceived risk of cigarette use 0.17 1.18 (1.09, 1.28) <.0001
Risk-taking proclivity 0.39 1.48 (1.34, 1.63) <.0001
Participation in two or more extracurricular activities -0.11 0.90 (0.78, 1.03) .1298
Religiosity -0.17 0.85 (0.78, 0.92) .0001
School Domain2        
Commitment to school 0.03 1.03 (0.93, 1.14) .5952
Sanctions against cigarette use at school 0.01 1.01 (0.92, 1.11) .8681
Perceived prevalence of cigarette use 0.16 1.17 (1.06, 1.31) .0029
Academic performance 0.17 1.19 (1.11, 1.28) <.0001

OR = odds ratio; CI = confidence interval; MSA = metropolitan statistical area.
Note: No question was asked about availability of cigarettes.
1 ORs are derived from a single multiple logistic regression model that included the set of demographic variables as well as all of the risk and protective factors included in the table. ORs > 1.0 indicate that the odds of past year cigarette use increased with each unit increase in the predictor. For risk factors, each unit increase in the predictor generally indicates an increased risk of cigarette use. For protective factors, each unit increase in the predictor generally indicates a higher level of protection against cigarette use.
2 The questions used to measure each of the factors are provided in Appendix A (Tables A.1 to A.4). The coding and distribution of the responses for each factor are provided in Tables 2.1 to 2.4.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999.
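
The table reports a 95% CI and a p value for each OR but not the standard error of Beta. The sketch below shows one way to back out an approximate standard error from a printed CI and to recompute an approximate p value. It assumes a normal approximation with a critical value of 1.96; the survey software used for the report bases its tests on the complex sample design, so the recomputed p value is expected to be close to, but not identical to, the printed one. The function names are hypothetical.

```python
import math

def se_from_ci(lower: float, upper: float, z_crit: float = 1.96) -> float:
    """Approximate standard error of Beta from the CI limits of the OR.

    Assumes the CI was formed as exp(Beta +/- z_crit * SE).
    """
    return (math.log(upper) - math.log(lower)) / (2.0 * z_crit)

def two_sided_p_normal(beta: float, se: float) -> float:
    """Two-sided p value for Beta / SE under a normal approximation."""
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2.0))

# Row "Hispanic vs. white" from Table 4.8: Beta = -0.30, 95% CI (0.61, 0.90).
se = se_from_ci(0.61, 0.90)
p = two_sided_p_normal(-0.30, se)
print(f"approximate SE = {se:.3f}, approximate p = {p:.4f}")
```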

Table 4.9 Results of Logistic Regression Final Model Predicting Past Year Cigarette Use with Demographics and Risk and Protective Factors among Youths Aged 12 to 17: 1999

  Beta OR1 95% CI p value
Intercept -9.81 <.0001
Demographics        
Race/ethnicity        
     Black vs. white -0.58 0.56 (0.46, 0.68) <.0001
     Hispanic vs. white -0.25 0.78 (0.64, 0.94) .0079
     Other vs. white -0.25 0.78 (0.58, 1.05) .1031
Gender - male vs. female -0.26 0.77 (0.68, 0.87) <.0001
Age (continuous - 12 to 17) 0.26 1.30 (1.25, 1.35) <.0001
Number of parents in home (2 vs. others) -0.25 0.78 (0.68, 0.88) .0001
Economic deprivation (household income under $20,000) -0.22 0.80 (0.68, 0.95) .0115
Geographic region        
     Northeast vs. West -0.12 0.89 (0.72, 1.09) .2506
     North Central vs. West -0.02 0.98 (0.84, 1.16) .8539
     South vs. West 0.09 1.09 (0.93, 1.28) .2826
County type        
     Large MSA vs. non-MSA -0.11 0.90 (0.78, 1.03) .1316
     Small MSA vs. non-MSA -0.02 0.98 (0.85, 1.14) .8305
Community Domain2        
Community norms toward cigarette use 0.13 1.14 (1.04, 1.25) .0052
Family Domain2        
Parental attitudes toward cigarette use 0.28 1.32 (1.19, 1.46) <.0001
Parents communicate about substance use 0.36 1.43 (1.29, 1.60) <.0001
Parents are source of social support -0.50 0.61 (0.54, 0.69) <.0001
Peer/Individual Domain2        
Antisocial behavior 0.55 1.72 (1.42, 2.09) <.0001
Individual attitudes toward cigarette use 0.45 1.57 (1.46, 1.70) <.0001
Friends' attitudes toward cigarette use 0.15 1.16 (1.07, 1.26) .0003
Friends' cigarette use 0.82 2.28 (2.09, 2.49) <.0001
Perceived risk of cigarette use 0.17 1.18 (1.09, 1.28) <.0001
Risk-taking proclivity 0.38 1.46 (1.33, 1.60) <.0001
Religiosity -0.19 0.83 (0.77, 0.90) <.0001
School Domain2        
Perceived prevalence of cigarette use 0.15 1.17 (1.05, 1.29) .0035
Academic performance 0.18 1.20 (1.12, 1.28) <.0001
Sample size 17,410
R2 (see footnote 3) 0.29
RN2 (see footnote 4) 0.43

OR = odds ratio; CI = confidence interval; MSA = metropolitan statistical area.
Note: No question was asked about availability of cigarettes.
1 ORs are derived from a single multiple logistic regression model that included the set of demographic variables as well as all of the risk and protective factors included in the table. ORs > 1.0 indicate that the odds of past year cigarette use increased with each unit increase in the predictor. For risk factors, each unit increase in the predictor generally indicates an increased risk of cigarette use. For protective factors, each unit increase in the predictor generally indicates a higher level of protection against cigarette use.
2 The questions used to measure each of the factors are provided in Appendix A (Tables A.1 to A.4). The coding and distribution of the responses for each factor are provided in Tables 2.1 to 2.4.
3 Cox and Snell (1989) R2 is a measure of the fit of the model, defined as $R^2 = 1 - \left[ L(0) / L(\hat{\beta}) \right]^{2/n}$, where $L(0)$ is the likelihood of the intercept-only model, $L(\hat{\beta})$ is the likelihood of the full model, and $n$ is the sample size.
4 Recognizing that the Cox and Snell R2 cannot reach 1 and that its maximum, $1 - L(0)^{2/n}$, depends on the value of the estimated percentage, Nagelkerke (1991) proposed dividing the Cox and Snell measure by that maximum. In this sense, RN2 measures the absolute percentage of variation explained by the model.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999.
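
The two fit measures in footnotes 3 and 4 can be sketched in a few lines. This is an illustration under stated assumptions, not the report's computation: the log-likelihoods and sample size in the example are hypothetical values chosen only to exercise the formulas, and the function names are invented for this sketch. Working with log-likelihoods rather than likelihoods avoids numerical underflow for large n.

```python
import math

def cox_snell_r2(ll_null: float, ll_full: float, n: int) -> float:
    """Cox and Snell R2 = 1 - [L(0) / L(full)]^(2/n), from log-likelihoods."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_full) / n)

def nagelkerke_r2(ll_null: float, ll_full: float, n: int) -> float:
    """Cox and Snell R2 divided by its maximum, 1 - L(0)^(2/n)."""
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_null, ll_full, n) / max_r2

# Hypothetical log-likelihoods and sample size (not taken from the report).
n = 1000
ll_null, ll_full = -600.0, -450.0
print(f"Cox and Snell R2 = {cox_snell_r2(ll_null, ll_full, n):.3f}")
print(f"Nagelkerke RN2   = {nagelkerke_r2(ll_null, ll_full, n):.3f}")

# The ratio of the two values printed in Table 4.9 gives the implied maximum
# of the Cox and Snell measure for that model (subject to rounding).
implied_max = 0.29 / 0.43
print(f"implied maximum for Table 4.9 = {implied_max:.2f}")
```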

Table 4.10 Results of Logistic Regression Models Predicting Past Year Alcohol Use with Demographics and Risk and Protective Factors, by Domain, among Youths Aged 12 to 17: 1999

  Demographics1 + Risk/Protective Factors2
Beta OR3 95% CI p value
Community Domain2 + Demographics1        
Community disorganization and crime -0.05 0.95 (0.89, 1.02) .1610
Neighborhood cohesiveness 0.01 1.01 (0.96, 1.07) .6212
Community attitudes toward alcohol use 0.27 1.31 (1.24, 1.39) <.0001
Community norms toward alcohol use 0.92 2.51 (2.35, 2.68) <.0001
Exposed to prevention messages in the media -0.07 0.93 (0.84, 1.02) .1336
Family Domain2 + Demographics1        
Parental monitoring 0.49 1.64 (1.51, 1.77) <.0001
Parental encouragement -0.13 0.88 (0.82, 0.93) <.0001
Parental attitudes toward alcohol use 0.65 1.91 (1.73, 2.11) <.0001
Parents communicate about substance use 0.27 1.31 (1.19, 1.44) <.0001
Parents are source of social support -0.65 0.52 (0.47, 0.58) <.0001
Peer/Individual Domain2 + Demographics1        
Antisocial behavior 0.50 1.65 (1.35, 2.01) <.0001
Individual attitudes toward alcohol use 0.38 1.46 (1.37, 1.57) <.0001
Friends' attitudes toward alcohol use 0.08 1.08 (1.01, 1.16) .0310
Friends' alcohol use 0.99 2.69 (2.50, 2.89) <.0001
Perceived risk of alcohol use 0.19 1.21 (1.14, 1.30) <.0001
Risk-taking proclivity 0.64 1.89 (1.76, 2.03) <.0001
Participation in two or more extracurricular activities 0.11 1.12 (1.01, 1.24) .0292
Religiosity -0.26 0.77 (0.72, 0.82) <.0001
School Domain2 + Demographics1        
Commitment to school -0.46 0.63 (0.58, 0.68) <.0001
Sanctions against alcohol use at school 0.11 1.11 (0.99, 1.25) .0676
Perceived prevalence of alcohol use 0.96 2.62 (2.42, 2.83) <.0001
Academic performance 0.21 1.23 (1.16, 1.30) <.0001
Exposed to prevention messages in school -0.03 0.97 (0.87, 1.08) .5494

OR = odds ratio; CI = confidence interval.
Note: No question was asked about availability of alcohol.
1 Demographic variables included in models were race/ethnicity, gender, age, number of parents in home, household income, geographic region, and county type.
2 The questions used to measure each of the factors are provided in Appendix A (Tables A.1 to A.4). The coding and distribution of the responses for each factor are provided in Tables 2.1 to 2.4.
3 ORs are derived from multiple logistic regression models, run separately for each domain, and adjusted for the demographic variables as well as the other factors within each domain. ORs > 1.0 indicate that the odds of past year alcohol use increased with each unit increase in the predictor. For risk factors, each unit increase in the predictor generally indicates an increased risk of alcohol use. For protective factors, each unit increase in the predictor generally indicates a higher level of protection against alcohol use.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999.
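
Footnote 3 describes ORs as multiplicative changes in the odds, which are not the same as changes in probability. The sketch below converts an OR into a shifted probability for a given starting point. The baseline probability of 0.20 is a hypothetical value chosen for illustration only, and the function name is invented for this sketch; the ORs are taken from the Peer/Individual Domain rows of Table 4.10.

```python
def shifted_probability(baseline_p: float, odds_ratio: float, units: float = 1.0) -> float:
    """Probability after multiplying the baseline odds by odds_ratio ** units."""
    baseline_odds = baseline_p / (1.0 - baseline_p)
    new_odds = baseline_odds * odds_ratio ** units
    return new_odds / (1.0 + new_odds)

baseline = 0.20  # hypothetical baseline probability of past year alcohol use

# ORs as printed in Table 4.10.
for label, or_value in [("Friends' alcohol use", 2.69), ("Religiosity", 0.77)]:
    p_new = shifted_probability(baseline, or_value)
    print(f"{label}: {baseline:.2f} -> {p_new:.3f} after a one-unit increase")
```

The same OR produces a different change in probability at a different baseline, so these figures describe only the assumed starting point.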

Table 4.11 Results of Logistic Regression Combined Reduced Model Predicting Past Year Alcohol Use with Demographics and Risk and Protective Factors among Youths Aged 12 to 17: 1999

  Beta OR1 95% CI p value
Intercept -10.73 <.0001
Demographics        
Race/ethnicity        
     Black vs. white -0.63 0.53 (0.44, 0.65) <.0001
     Hispanic vs. white 0.00 1.00 (0.83, 1.22) .9681
     Other vs. white -0.28 0.75 (0.56, 1.01) .0567
Gender - male vs. female -0.41 0.67 (0.60, 0.74) <.0001
Age (continuous - 12 to 17) 0.36 1.43 (1.38, 1.48) <.0001
Number of parents in home (2 vs. others) -0.18 0.83 (0.74, 0.94) .0021
Economic deprivation (household income under $20,000) -0.17 0.84 (0.72, 0.99) .0329
Geographic region        
     Northeast vs. West 0.01 1.01 (0.85, 1.20) .8829
     North Central vs. West -0.06 0.94 (0.81, 1.09) .4278
     South vs. West -0.03 0.97 (0.83, 1.13) .6693
County type        
     Large MSA vs. non-MSA 0.09 1.09 (0.96, 1.24) .1935
     Small MSA vs. non-MSA 0.08 1.08 (0.95, 1.24) .2480
Community Domain2        
Community attitudes toward alcohol use -0.06 0.94 (0.87, 1.02) .1238
Community norms toward alcohol use 0.27 1.31 (1.19, 1.43) <.0001
Family Domain2        
Parental monitoring 0.16 1.17 (1.07, 1.29) .0012
Parental encouragement 0.03 1.03 (0.96, 1.11) .4043
Parental attitudes toward alcohol use 0.17 1.18 (1.03, 1.35) .0139
Parents communicate about substance use 0.26 1.30 (1.17, 1.46) <.0001
Parents are source of social support -0.34 0.71 (0.63, 0.80) <.0001
Peer/Individual Domain2        
Antisocial behavior 0.52 1.69 (1.32, 2.16) <.0001
Individual attitudes toward alcohol use 0.41 1.50 (1.38, 1.63) <.0001
Friends' attitudes toward alcohol use -0.01 0.99 (0.91, 1.08) .8207
Friends' alcohol use 0.85 2.34 (2.12, 2.59) <.0001
Perceived risk of alcohol use 0.22 1.24 (1.15, 1.35) <.0001
Risk-taking proclivity 0.59 1.81 (1.65, 1.98) <.0001
Participation in two or more extracurricular activities 0.04 1.04 (0.92, 1.19) .5205
Religiosity -0.24 0.79 (0.73, 0.85) <.0001
School Domain2        
Commitment to school 0.07 1.07 (0.97, 1.18) .1598
Perceived prevalence of alcohol use 0.13 1.14 (1.03, 1.27) .0118
Academic performance 0.03 1.03 (0.97, 1.11) .3341

OR = odds ratio; CI = confidence interval; MSA = metropolitan statistical area.
Note: No question was asked about availability of alcohol.
1 ORs are derived from a single multiple logistic regression model that included the set of demographic variables as well as all of the risk and protective factors included in the table. ORs > 1.0 indicate that the odds of past year alcohol use increased with each unit increase in the predictor. For risk factors, each unit increase in the predictor generally indicates an increased risk of alcohol use. For protective factors, each unit increase in the predictor generally indicates a higher level of protection against alcohol use.
2 The questions used to measure each of the factors are provided in Appendix A (Tables A.1 to A.4). The coding and distribution of the responses for each factor are provided in Tables 2.1 to 2.4.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999.

Table 4.12 Results of Logistic Regression Final Model Predicting Past Year Alcohol Use with Demographics and Risk and Protective Factors among Youths Aged 12 to 17: 1999

  Beta OR1 95% CI p value
Intercept -10.06 <.0001
Demographics        
Race/ethnicity        
     Black vs. white -0.61 0.54 (0.45, 0.66) <.0001
     Hispanic vs. white 0.01 1.01 (0.84, 1.21) .9453
     Other vs. white -0.28 0.76 (0.57, 1.01) .0580
Gender - male vs. female -0.40 0.67 (0.61, 0.75) <.0001
Age (continuous - 12 to 17) 0.35 1.41 (1.36, 1.47) <.0001
Number of parents in home (2 vs. others) -0.20 0.82 (0.73, 0.92) .0007
Economic deprivation (household income under $20,000) -0.18 0.83 (0.71, 0.97) .0203
Geographic region        
     Northeast vs. West 0.02 1.02 (0.86, 1.21) .8179
     North Central vs. West -0.06 0.94 (0.81, 1.09) .4085
     South vs. West -0.03 0.97 (0.83, 1.13) .7133
County type        
     Large MSA vs. non-MSA 0.06 1.06 (0.93, 1.21) .3661
     Small MSA vs. non-MSA 0.05 1.05 (0.92, 1.20) .4733
Community Domain2        
Community norms toward alcohol use 0.27 1.31 (1.20, 1.43) <.0001
Family Domain2        
Parental monitoring 0.13 1.14 (1.04, 1.25) .0038
Parental attitudes toward alcohol use 0.16 1.17 (1.04, 1.33) .0114
Parents communicate about substance use 0.26 1.30 (1.17, 1.45) <.0001
Parents are source of social support -0.33 0.72 (0.64, 0.81) <.0001
Peer/Individual Domain2        
Antisocial behavior 0.48 1.61 (1.29, 2.03) <.0001
Individual attitudes toward alcohol use 0.39 1.48 (1.38, 1.58) <.0001
Friends' alcohol use 0.84 2.31 (2.09, 2.54) <.0001
Perceived risk of alcohol use 0.21 1.24 (1.14, 1.33) <.0001
Risk-taking proclivity 0.56 1.75 (1.60, 1.90) <.0001
Religiosity -0.24 0.79 (0.73, 0.85) <.0001
School Domain2        
Perceived prevalence of alcohol use 0.16 1.17 (1.06, 1.29) .0020
Sample size 17,265
R2 (see footnote 3) 0.34
RN2 (see footnote 4) 0.46

OR = odds ratio; CI = confidence interval; MSA = metropolitan statistical area.
Note: No question was asked about availability of alcohol.
1 ORs are derived from a single multiple logistic regression model that included the set of demographic variables as well as all of the risk and protective factors included in the table. ORs > 1.0 indicate that the odds of past year alcohol use increased with each unit increase in the predictor. For risk factors, each unit increase in the predictor generally indicates an increased risk of alcohol use. For protective factors, each unit increase in the predictor generally indicates a higher level of protection against alcohol use.
2 The questions used to measure each of the factors are provided in Appendix A (Tables A.1 to A.4). The coding and distribution of the responses for each factor are provided in Tables 2.1 to 2.4.
3 Cox and Snell (1989) R2 is a measure of the fit of the model, defined as $R^2 = 1 - \left[ L(0) / L(\hat{\beta}) \right]^{2/n}$, where $L(0)$ is the likelihood of the intercept-only model, $L(\hat{\beta})$ is the likelihood of the full model, and $n$ is the sample size.
4 Recognizing that the Cox and Snell R2 cannot reach 1 and that its maximum, $1 - L(0)^{2/n}$, depends on the value of the estimated percentage, Nagelkerke (1991) proposed dividing the Cox and Snell measure by that maximum. In this sense, RN2 measures the absolute percentage of variation explained by the model.
Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse, 1999.

This page was last updated on July 17, 2008.