1997 National Household Survey on Drug Abuse
Field Data Collection Period and Data Preparation
The 1997 NHSDA retained the quarterly data collection schedule that had been used for the 1992, 1993, 1994, 1995, and 1996 NHSDAs. Year-round data collection provides for a more immediate and continuous picture of the Nation's drug problem and eliminates seasonal effects on NHSDA estimates.
Interviewer recruitment, selection, and training. Field interviewers for the 1997 NHSDA were selected from prior NHSDAs, especially the 1996 survey, the contractor's national interviewer file, other survey organizations, and local government employment agencies. Initially, 171 field interviewers were hired and sent to training; throughout the entire survey period, a total of 411 interviewers were trained. This number of staff was required because of two main factors. One, beginning with quarter 2 data collection, a decision was made to increase the total sample size by oversampling the States of Arizona and California. Thus, additional field interviewers were needed to cover the increased sample in these two States. Two, a natural consequence of the year-long data collection period was attrition of staff who had to be replaced.
Of the 411 interviewers ultimately trained for the 1997 NHSDA, 63 (15%) were black and 56 (14%) were Hispanic. A total of 76 (18%) of those recruited were bilingual in Spanish and English. A majority of the field interviewers were experienced: 143 of the initial 171 hired had worked on the 1996 NHSDA.
The 1997 NHSDA began with eight field supervisors managing the field interviewers. With the addition of the oversample in Arizona and California, the final number of field supervisors managing the 1997 NHSDA rose to 12.
Unlike the 1996 NHSDA, which used virtually the same segments surveyed in the 1995 NHSDA, the 1997 NHSDA surveyed a largely new segment sample. Approximately 95% of the 1997 sample, or 1,844 segments, consisted of the previously unused units of the pairwise segment sample that was selected at the same time the 1995 and 1996 segment samples were selected. The remaining 5%, or 96 segments, overlapped with the 1996 survey year. Of the new national segments, 786 were selected from noncertainty PSUs. Additionally, for the Arizona/California supplement, a total of 756 segments were selected within the two States, with 162 of these segments selected from 18 new noncertainty PSUs.
Territorial assignments were allocated roughly equally to the field supervisors. The assignments were based primarily on geographical location and previous territorial field assignments. Before the interviewing portion of data collection began, the field supervisors recruited the interviewer staff, trained the interviewers for counting and listing activities, prepared and monitored interviewers' assignments, and assisted with interviewer training on data collection methods.
All interviewers working on the 1997 NHSDA participated in a comprehensive training program. Veteran interviewers (those who had worked on the 1996 NHSDA) were trained using an extensive home study program. The home study package consisted of a copy of the 1997 NHSDA Field Interviewer Manual, a memo from the national field director reviewing the few study changes from 1996 to 1997, and a workbook with questions from the interviewer manual, screening exercises, and questions about the questionnaire and answer sheet changes. Certification to work on the 1997 NHSDA required successful completion of all parts of the home study process. All 143 veteran interviewers were certified to work on the 1997 NHSDA.
New interviewer staff attended an in-person training session in either Raleigh, North Carolina, or Los Angeles, California. Prior to attending the session, each trainee received the following items:
The in-person training session consisted of 4 days of project-specific training. Bilingual staff participated in an extra half-day of training to familiarize them with administering the Spanish-language version of the data collection instrument. Field interviewers who were new to survey research arrived at training 1 day early to attend a general orientation on field interviewing. All sessions were conducted by the contractor's senior survey operations staff, assisted by the regional supervisors and field supervisors for the area who had received training earlier for their roles in training the field interviewers.
Fieldwork: preliminary activities. Before the initial fieldwork of counting and listing segments began, segment kits were prepared for each of the 2,600 segments and mailed to field interviewers. Initially, 1,844 segments were prepared for the national sample; an additional 756 were required for the Arizona/California supplement.
Upon receiving their counting and listing assignments, interviewers listed the address or description of up to 400 dwelling units in each segment, and then returned the segment kit to the contractor. Sample dwelling units (SDUs) were selected from segment listings using a routine designed by the sampling statisticians. A label containing study identification information and housing unit address was printed for each SDU and attached to a screening form. Printed on the form were the different person-selection procedures the interviewer was to follow, depending on the type of SDU and ages of the residents. The screening forms were sent to the field supervisors for assignment to field interviewers.
Field interviewers made initial contact with SDUs by mailing an introductory letter from the study director to each residence 1 week before their first visit. The letter provided a brief description of the study and its methods, informed the recipient that participation was voluntary, and assured confidentiality.
Fieldwork: interviewing. Interviewers had received training in introducing themselves and the study to SDU residents, answering questions, and soliciting cooperation. They also had received training in completing the screening form, including rostering household members aged 12 or older, and in the person-selection procedures used to select respondents randomly from the age-race/ethnicity strata appropriate for the dwelling unit. When the sampled respondent was available and cooperative, the interview was conducted immediately following screening and person selection. Interviewers were required to make at least five callbacks to an SDU to complete screening and interviewing. In practice, however, callbacks continued without limit as long as the field supervisor believed there was a reasonable chance the screening or the interview could be completed. In particular, repeated visits were made to interview sampled respondents. Similarly, initial refusals were not simply accepted but were assigned to other interviewers and sometimes even to the field supervisors for conversion.
After each completed interview, the respondent was asked to complete a verification form by adding his or her name, address, and telephone number so that the field interviewer's work could be verified. This form was sealed in a preaddressed envelope separate from the envelope used for mailing the interview data collection forms to the contractor. Upon receipt by the contractor, these forms were filed according to interviewer. When verification forms did not have a telephone number but did have an address, verification by mail was attempted. Discrepancies were identified, and the appropriate field supervisor was notified by electronic mail for resolution; all discrepancies were satisfactorily resolved. Verification interviews, follow-up letters, and records of any discrepancies and their resolution were filed with the respondents' original verification forms.
Imputations. The questionnaire items on the 1997 NHSDA screening and interview instrument that were used during the imputation procedures (i.e., completeness and replacement) were nearly identical to the questions used during the 1996 NHSDA imputation procedures. Beginning in quarter 2 of the 1997 NHSDA, the sample in Arizona and California was expanded; thus, a State (Arizona, California, and the remainder of the United States) by quarter indicator was included among the sorting and explanatory variables used during the statistical imputation process. Due to its increasing popularity, cigar use was added to the list of variables for which missing responses were statistically imputed. Unlike other drugs, data on cigar use were collected in a "non-core" section of the questionnaire. Thus, imputations for cigar use differed slightly from those for other drugs. Population estimates are based on either the total sample or all cases in a subgroup, including cases where missing data for some recency-of-use and frequency-of-use variables were replaced with logically or statistically imputed values. The interview classification "minimally complete" (a status necessary for a case to be included in the database) requires that data on the recency of use of alcohol, marijuana, and cocaine be present.
Logical imputations. To determine case completeness, an editing procedure is employed to replace missing data for those substances based on information supplied by the respondent elsewhere in the questionnaire. After this editing, case completeness is determined. When necessary, additional logical imputation also is done to replace other inconsistent, missing, or otherwise faulty data.
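The order of operations described above (logical editing first, completeness check second) can be illustrated with a minimal sketch. The field names and the single editing rule shown here are hypothetical; the text does not list the actual NHSDA variables or editing rules.

```python
# Hypothetical field names; the actual NHSDA variable names are not given in the text.
CORE_SUBSTANCES = ("alcohol", "marijuana", "cocaine")

def logically_edit(record):
    """Fill a missing recency value from other answers in the same record."""
    edited = dict(record)
    for drug in CORE_SUBSTANCES:
        recency_key = f"{drug}_recency"
        days_key = f"{drug}_days_past_month"
        # Assumed rule, for illustration only: a respondent who reports days of
        # use in the past month must have a recency of "past month".
        if edited.get(recency_key) is None and (edited.get(days_key) or 0) > 0:
            edited[recency_key] = "past month"
    return edited

def is_minimally_complete(record):
    """A case is minimally complete when all three recency items are present."""
    return all(record.get(f"{drug}_recency") is not None for drug in CORE_SUBSTANCES)

raw = {
    "alcohol_recency": None,
    "alcohol_days_past_month": 4,
    "marijuana_recency": "never",
    "cocaine_recency": "never",
}
edited = logically_edit(raw)
```

In this example the raw record fails the completeness check, while the edited record passes because the alcohol recency could be inferred from another answer.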
Statistical imputations. For selected variables of interest that still have missing values after the application of logical imputation, statistical imputation is used to replace missing responses with statistically imputed responses. Two types of statistical imputations are used. A technique known as "hot-deck imputation" involves the replacement of a missing value by the last encountered nonmissing response from another respondent who is "similar" and has complete data. Logistic regression models also are used to determine replacement values for some variables. In general, analysts are advised to use the statistically imputed data when creating tabular summaries and other descriptive analyses for population subgroups of interest for trend data analysis. When forming population estimates, statistical imputations often will reduce the bias associated with the estimate. For example, data with missing responses that are not statistically imputed will necessarily underestimate totals, and the estimates consequently will be biased, because the total population will not be accounted for in the estimates. Also, the bias in per-unit type estimates, such as the estimate of a population mean, can be reduced with statistical imputation, particularly when the imputation procedure accounts for differential nonresponse patterns and differential reporting patterns among subgroups of the population. Although the statistically imputed data will not totally eliminate the bias associated with estimates produced from these data, the imputation should help reduce the bias depending on the type of analyses.
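The point about underestimated totals can be shown with a small worked example. The weights and responses below are invented for illustration and are not NHSDA data; the imputed values are simply assumed rather than produced by either imputation method.

```python
# Illustrative respondents: (analysis weight, past-year user?). None = missing response.
sample = [(1000, True), (1500, False), (1200, None), (800, True), (900, None)]

def estimated_total(records):
    """Weighted count of users; a missing response contributes nothing."""
    return sum(weight for weight, used in records if used is True)

def covered_population(records):
    """Sum of weights for records with a usable (nonmissing) response."""
    return sum(weight for weight, used in records if used is not None)

# Assume imputation assigns one missing response as a user and one as a nonuser.
imputed = [(1000, True), (1500, False), (1200, True), (800, True), (900, False)]

total_without_imputation = estimated_total(sample)   # 1800
total_with_imputation = estimated_total(imputed)     # 3000
```

Without imputation, only 3,300 of the 5,400 weighted persons are represented, so the estimated total of users cannot exceed what those records contribute.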
For analyses of relationships involving multiple data items, use of the variables revised by statistical imputation may not be appropriate. Usually, these analyses span data items that were not jointly used in defining the imputation procedure. In this situation, use of nonimputed data items may be best. In summary, statistically imputed responses were created for the following variables: most of the drug recency-of-use items, the past year frequency-of-use items, cigar use, age, race, gender, the Hispanic origin items, marital status, work status, education, high school graduate indicator, total and private health insurance, and the personal earnings and family income items.
In the 1997 NHSDA, imputations for all imputation-revised variables (except for the personal earnings and family income variables and frequency of use for alcohol, marijuana, and cocaine) were constructed using hot-deck imputation. The first step in this procedure was to sort the data file with a progressive sorting series, using data on recency of use of alcohol, marijuana, and cocaine, together with age, gender, Hispanic origin, race, and a State (Arizona, California, and the remainder of the United States) by quarter indicator. The second step of the hot-deck imputation procedure was to replace the missing item(s) on a particular record with the last encountered nonmissing response from an adjacent record (except alcohol, marijuana, and cocaine recency of use) on the sorted database. The hot-deck imputation procedure was appropriate for the recency-of-use and demographic variables because these variables' levels of item nonresponse were low.
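The two steps of the hot-deck procedure can be sketched as follows. This is a simplified illustration, not the survey's actual routine: it uses an ordinary lexicographic sort in place of the "progressive sorting series" (whose details the text does not give), and the field names and records are hypothetical.

```python
def hot_deck(records, sort_keys, target):
    """Sequential hot-deck: sort, then carry the last nonmissing value forward."""
    # Step 1: sort the file on the chosen variables (plain lexicographic sort here).
    ordered = sorted(records, key=lambda r: tuple(r[k] for k in sort_keys))
    last_seen = None
    completed = []
    for rec in ordered:
        rec = dict(rec)  # copy so the input records are left unchanged
        if rec[target] is None:
            # Step 2: take the last encountered nonmissing response as the donor value.
            # A missing value at the very top of the file has no donor and stays missing.
            if last_seen is not None:
                rec[target] = last_seen
                rec[target + "_imputed"] = True
        else:
            last_seen = rec[target]
        completed.append(rec)
    return completed

# Hypothetical records; recency and age are coded as small integers.
records = [
    {"id": 1, "alcohol_recency": 1, "age_group": 2, "gender": "F", "marital_status": "married"},
    {"id": 2, "alcohol_recency": 1, "age_group": 2, "gender": "F", "marital_status": None},
    {"id": 3, "alcohol_recency": 4, "age_group": 1, "gender": "M", "marital_status": "never married"},
    {"id": 4, "alcohol_recency": 4, "age_group": 1, "gender": "M", "marital_status": None},
]
SORT_KEYS = ("alcohol_recency", "age_group", "gender")
completed = hot_deck(records, SORT_KEYS, "marital_status")
```

Because sorting places similar respondents next to one another, each missing value is filled from a donor who matches on the sort variables whenever such a donor precedes it in the file.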
Missing data for all personal and family income variables and for the frequency of use of alcohol, marijuana, and cocaine were statistically imputed using a regression-based method of imputation. This imputation procedure involved estimating a logistic regression model using item respondent data. After the model parameters were estimated, the resulting model was used to predict a categorical response for each item nonrespondent. Demographic variables, including a State by quarter indicator, served as the explanatory variables in the income models. In addition to these variables, the frequency-of-use models included recency of use (four levels) of alcohol, marijuana, and cocaine. Because the income and frequency-of-use variables have a large number of response categories, the regression-based model method was used to first impute collapsed response categories, then the collapsed categories were expanded to more levels using the hot-deck method. The model-based imputation procedure was appropriate for these variables for two reasons: (a) the level of nonresponse to these questions was larger than observed for the recency-of-use and demographic items; and (b) the model-based imputation procedure allows a greater number of statistically significant explanatory variables to affect an imputed response than is possible with the hot-deck method.
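The first stage of the regression-based method (fit a model on item respondents, then predict a collapsed category for each item nonrespondent) can be sketched as below. Several simplifications are assumptions rather than details from the survey: the response is collapsed to two categories, the two explanatory variables and the data are invented, the model is fitted by plain gradient descent, and a nonrespondent is assigned the more probable category (the text does not say how predictions were converted to responses). The second stage, expanding collapsed categories with the hot-deck method, is not shown.

```python
import math

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit a binary logistic regression by batch gradient descent."""
    w = [0.0] * (len(X[0]) + 1)  # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_prob(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def predict_prob(w, x):
    """Predicted probability of the higher collapsed category."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def impute_category(w, x):
    """Assign the more probable collapsed category (an assumed decision rule)."""
    return 1 if predict_prob(w, x) >= 0.5 else 0

# Item respondents: hypothetical indicators (older_age_group, employed)
# and a collapsed income category (1 = higher, 0 = lower).
X = [(1, 1), (1, 1), (1, 1), (1, 1), (1, 0), (1, 0),
     (0, 1), (0, 1), (0, 0), (0, 0), (0, 0), (0, 0)]
y = [1, 1, 1, 0, 1, 0,
     1, 0, 0, 0, 0, 1]

weights = fit_logistic(X, y)

# Item nonrespondents: explanatory variables known, income missing.
nonrespondents = [(1, 1), (0, 0)]
imputed = [impute_category(weights, x) for x in nonrespondents]
```

Unlike the hot-deck sketch, every explanatory variable in the model contributes to each predicted response, which is the advantage the text attributes to the model-based procedure.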
This page was last updated on December 30, 2008.