
Documentation

Editing and imputation of race and Hispanic origin data in the NHIS

Race and Hispanic origin data are edited in the Division of Health Interview Statistics (DHIS) on a quarterly basis to produce the variables that appear on our data files. When the raw data are received by DHIS from the U.S. Bureau of the Census (which administers the NHIS), responses that do not initially match any of the existing categories are back-coded where possible. After a basic check for valid responses, race and Hispanic origin recodes are created based on specifications developed by NHIS staff. Figures 1-3 [PDF - 22 KB] illustrate how the raw data variable names are created from the questionnaire items (Figure 1), how the Hispanic origin recodes are derived (Figure 2), and how the race recodes are derived (Figure 3). Additional details about the race and Hispanic origin editing can be found in Appendix II [PDF - 1.3 MB] of the 2004 Survey Description Document.

The NHIS made major changes to its editing procedures in the 2003 data year. Beginning with the 2003 NHIS, "Other race" is no longer coded as a separate race response. Any responses that fall into this category are treated as missing, and the race is imputed if this is the only race response (see the paragraph on imputation below). In cases where "Other race" is mentioned along with one or more OMB race groups, the "Other race" response is dropped and the OMB race group information is retained. These procedures are consistent with the methods used to edit the Modified Race Data summary file (MRD), created by the U.S. Bureau of the Census. More information on the MRD can be found below under "Links to U.S. Bureau of the Census information".
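The "Other race" edit described above can be sketched in a few lines. This is only an illustration of the stated rule, not the DHIS production code: the function name `edit_race_responses`, the string labels, and the list-based representation of a person's race responses are all assumptions made for the example.

```python
# Hypothetical sketch of the 2003+ "Other race" edit rule.
# Labels and data layout are assumptions, not NHIS variable names.
OTHER_RACE = "Other race"


def edit_race_responses(responses):
    """Apply the "Other race" edit to one person's race responses.

    Returns a tuple (retained, needs_imputation):
      retained         -- the responses with "Other race" dropped
      needs_imputation -- True when no race group remains, so race
                          is treated as missing and must be imputed
    """
    retained = [r for r in responses if r != OTHER_RACE]
    needs_imputation = len(retained) == 0
    return retained, needs_imputation


if __name__ == "__main__":
    # "Other race" alone: treated as missing, flagged for imputation.
    print(edit_race_responses(["Other race"]))
    # "Other race" with an OMB group: only the OMB group is retained.
    print(edit_race_responses(["Other race", "Asian"]))
```

In this sketch a person with no race response at all is also flagged for imputation, which matches treating the response as missing.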

In the 2000 survey year, the NHIS began using hot-deck imputation of race and Hispanic origin in order to improve the overall quality of the data. As with the editing procedures, the imputation is done in DHIS, and the procedures used are based on methods developed by the U.S. Bureau of the Census. Race and Hispanic origin are first imputed from data on other household members, if available. If not, race and Hispanic origin are imputed from data on members of other households within a small geographic area. Figure 4 [PDF - 18 KB] illustrates how the imputation of race and Hispanic origin is done in the NHIS.
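The two-stage donor search described above (household first, then nearby households) might look roughly like the following. This is a simplified sketch under stated assumptions, not the actual NHIS or Census Bureau procedure: the `Person` record, the `area_id` field standing in for "a small geographic area", and the choice of the first available donor in file order are all hypothetical. Figure 4 documents the real procedure.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Person:
    # All field names are hypothetical, chosen for this example only.
    person_id: int
    household_id: int
    area_id: int          # stands in for "a small geographic area"
    race: Optional[str]   # None means the value is missing


def impute_race(people: List[Person]) -> Dict[int, Tuple[Optional[str], str]]:
    """Map person_id -> (race, source).

    source is "reported", "household", "area", or "unresolved".
    Assumption: the donor is the first reported record found in file
    order; only reported (never imputed) values serve as donors.
    """
    donors = [p for p in people if p.race is not None]
    result: Dict[int, Tuple[Optional[str], str]] = {}
    for p in people:
        if p.race is not None:
            result[p.person_id] = (p.race, "reported")
            continue
        # Stage 1: look for a donor in the same household.
        same_household = [d for d in donors if d.household_id == p.household_id]
        # Stage 2: look for a donor in another household in the same area.
        same_area = [
            d for d in donors
            if d.area_id == p.area_id and d.household_id != p.household_id
        ]
        if same_household:
            result[p.person_id] = (same_household[0].race, "household")
        elif same_area:
            result[p.person_id] = (same_area[0].race, "area")
        else:
            result[p.person_id] = (None, "unresolved")
    return result
```

The `source` value plays a role similar to an imputation flag: it lets an analyst separate reported values from imputed ones. The same two-stage search would apply to Hispanic origin.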

 

Using NHIS race and Hispanic origin data in analysis

In addition to the information on this site, Appendix II of the NHIS Survey Description Document (SDD) provides material each year to aid analysts in using the NHIS race and Hispanic origin data. This material includes a description of the variables on the public use file for that survey year, as well as detailed information on the editing and imputation procedures used in processing the data. Appendix II of the SDD also includes sample SAS code for merging race variables across data years and using imputation flags. The 2005 SDD [PDF - 544 KB] is currently available, and we will update the links to the 2006 SDD as soon as it is available. Users are advised to review the documentation carefully, since variable names sometimes change. This site will also alert data users to any major changes in the data.

Because of NCHS data confidentiality rules, we cannot release data for small population groups, such as Native Hawaiian and Other Pacific Islander or the multiple race groups, on our public use data files. Analysts who wish to include these groups in their data analyses may submit a proposal to use the NCHS Research Data Center (RDC).

To aid users in tracking how variable names have changed over time, a table [PDF - 24 KB] is available that shows variable name changes for the 1997-2006 survey years, as well as a brief description of the reason for the change. Also, some of the terminology used on this site and in the tables and figures may be unfamiliar to our data users. A glossary of terms is included to assist users in understanding both the terms that are used and the context in which they are used in relation to NHIS race and Hispanic origin data. Some frequently asked questions and their answers are also provided for our users.

We hope that users will provide feedback on this site, and we will regularly update the FAQs to reflect data changes, NCHS policy changes, and input from our users.
