2016 Data Release
Updated October 16, 2017: Inaccuracies were identified in the sampling weights for the 2016 NHIS, and a revised set of sampling weights for 2016 is now available for download. The original data files remain on this website with the inaccurate weights. All of the 2016 data files will be updated at a later date to include the corresponding revised weights. For most analyses, the conclusions are unlikely to change when using the revised weights. To be sure, however, data users should rerun analyses with the revised weights.
Data Files
- Family file
- Household file
- Injury Episode File
- Person file
- Sample Child file
- Sample Adult file
Imputed Income Files
To increase the usability of NHIS income data, the 2016 NHIS imputed income files contain income dollar amount point estimates rather than income dollar intervals. Because respondent confidentiality must be balanced against providing more detailed information, the variables containing the dollar amounts for personal earnings and family income have been top-coded to the 95th percentile of the appropriate distribution. The 95th percentile was calculated separately for each of the 5 imputed family income/personal earnings datasets and then a weighted average of the 5 individual 95th percentile amounts was calculated. The weighted average was rounded to the nearest $1,000 and this weighted average was used to top-code all 5 supplemental imputed personal earnings and family income datasets. The same procedure was used for family income and personal earnings dollar amounts. For all observations which were top-coded, the family income or personal earnings dollar amounts were replaced with the top-coded value.
Also, since the 1997 NHIS, poverty ratio intervals have been provided on the NHIS public-use data files. The poverty ratio is a ratio of the family’s income to the appropriate Federal poverty threshold. In the 2016 NHIS imputed income files, the Federal poverty ratio has been calculated using top-coded family income and the final calculated poverty ratio value is truncated to 3 decimal places.
Note that all top-coded point estimates contained in the 2016 NHIS Imputed Family Income/Personal Earnings files are analogous to the top-coded point estimates contained in the 1997-2008 Supplemental Imputed Family Income/Personal Earnings files and the 2009-2016 Imputed Family Income/Personal Earnings file. No analogue of the 1997-2008 Imputed Family Income/Personal Earnings files (with categorical income data) was produced in 2009-2016.
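The top-coding and poverty ratio steps described above can be illustrated with a short Python sketch. It is not the procedure NCHS actually ran: the function names are hypothetical, the weights used to average the five percentiles are not stated on this page (equal weights are assumed by default), and the dollar figures in the example are made up.

```python
from decimal import Decimal, ROUND_DOWN
import math


def top_code_threshold(percentiles, weights=None):
    """Average the per-dataset 95th percentiles and round to the nearest $1,000."""
    if weights is None:
        # Assumption: equal weights. The page does not say which weights were used.
        weights = [1.0] * len(percentiles)
    average = sum(p * w for p, w in zip(percentiles, weights)) / sum(weights)
    # floor(x + 0.5) rounds halves up, avoiding Python's round-half-to-even.
    return math.floor(average / 1000 + 0.5) * 1000


def top_code(amounts, threshold):
    """Replace every dollar amount above the threshold with the threshold."""
    return [min(amount, threshold) for amount in amounts]


def poverty_ratio(top_coded_income, poverty_threshold):
    """Ratio of top-coded family income to the poverty threshold,
    truncated (not rounded) to 3 decimal places.
    Pass whole-dollar integers so the Decimal arithmetic stays exact."""
    ratio = Decimal(top_coded_income) / Decimal(poverty_threshold)
    return ratio.quantize(Decimal("0.001"), rounding=ROUND_DOWN)


# Made-up 95th percentiles from five imputed datasets.
threshold = top_code_threshold([118000, 121500, 119750, 120250, 122000])
print(threshold)                                       # 120000
print(top_code([50000, 120000, 250000], threshold))    # [50000, 120000, 120000]
print(poverty_ratio(20000, 30000))                     # 0.666
```

Because one threshold is applied to all five datasets, a top-coded observation carries the same dollar value in every imputed file.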
Multiple imputation is a technique that allows analysts to incorporate the extra variability due to imputation into their analyses. This is accomplished by analyzing each of the five completed data sets separately using methods and software that are appropriate for survey data, and then combining the estimates and standard errors using the combining rules described in Section 2.2 and Appendix A of the document available via the Technical Documentation link below. The extra variability due to imputation cannot be incorporated by simply analyzing a single completed data set as if the imputed values were true values. Moreover, analysts should not create a single completed data set using the average of the five sets of imputed values. Examples of correct data analyses using SAS-callable SUDAAN and SAS-callable IVEware are provided in Section 4 of the document available via the Technical Documentation link below; the document also provides information on the procedures used to create the imputations.
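As a rough illustration of the combining step, the following Python sketch applies the standard multiple-imputation combining rules (often called Rubin's rules). It assumes the rules in Section 2.2 and Appendix A of the Technical Documentation take this standard form; that document remains the authority. The function name is hypothetical, and the per-dataset estimates and standard errors are assumed to come from survey-appropriate software.

```python
import math
from statistics import mean, variance


def combine_estimates(estimates, std_errors):
    """Combine per-dataset estimates and standard errors from m completed data sets.

    Returns (combined_estimate, combined_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching estimates and standard errors, m >= 2")
    combined = mean(estimates)                      # average of the m estimates
    within = mean([se ** 2 for se in std_errors])   # average within-imputation variance
    between = variance(estimates)                   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between          # total variance
    return combined, math.sqrt(total)


# Made-up estimates from the five completed data sets.
estimate, std_error = combine_estimates([10, 12, 11, 13, 14], [1, 1, 1, 1, 1])
print(estimate, round(std_error, 6))
```

The between-imputation term is what a single completed data set cannot supply, which is why analyzing one file alone, or an average of the five, understates the standard error.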
The Dataset Documentation link below opens to a document containing both the file layout description and the frequency counts (on the last page) of the variables in the data sets containing imputed values for the 2016 survey year. Users interested in data for several years should note that to date, multiple imputation has been carried out for the 1997-2016 NHIS, and that the file layout description is identical for years 1997-2003. Since 2004, there have been several changes to the imputed income file layout:
- Beginning with 2004, the person number variable changed to FPX, which is unique within each family.
- For the years 2007-2008, the variables INCGRP_F, INCGRP_I, RAT_CATF, and ERNYRG_I changed to INCGRPF2, INCGRPI2, RATCATF2, and ERNYRGI2, respectively, due to questionnaire and response category changes.
- Beginning with 2009, the variables INCGRPF2, INCGRPI2, RATCATF2, RAT_CATI, and ERNYRGI2 were removed and the variables FAMINCF2, TCINCM_F, FAMINCI2, POVRATI2, TCEARN_F, and ERNYR_I2 were added, replacing family income and personal earnings dollar amount intervals with dollar amount point estimates.
- Beginning with 2010, the variable POVRATI2 changed to POVRATI3, adding a third decimal place to the poverty ratio estimate.
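For users combining several survey years, the changes listed above can be captured in a small lookup. The Python sketch below is a hypothetical helper, not part of any NHIS tooling. It covers only the income, earnings, and poverty ratio variables named in the list (the files contain other variables), and it assumes RAT_CATI was present under that name in every year before 2009, which the list implies but does not state outright.

```python
def imputed_income_variables(year):
    """Return the income-related variable names named on this page for a survey year."""
    if not 1997 <= year <= 2016:
        raise ValueError("multiple imputation is described here for 1997-2016 only")
    if year <= 2006:
        # Original categorical variables (RAT_CATI inferred, see lead-in).
        return ["INCGRP_F", "INCGRP_I", "RAT_CATF", "RAT_CATI", "ERNYRG_I"]
    if year <= 2008:
        # Renamed after questionnaire and response category changes.
        return ["INCGRPF2", "INCGRPI2", "RATCATF2", "RAT_CATI", "ERNYRGI2"]
    # From 2009 on: dollar amount point estimates replace the intervals.
    # POVRATI2 (2009) became POVRATI3 (2010 onward) with a third decimal place.
    poverty = "POVRATI2" if year == 2009 else "POVRATI3"
    return ["FAMINCF2", "TCINCM_F", "FAMINCI2", poverty, "TCEARN_F", "ERNYR_I2"]


for survey_year in (2003, 2008, 2009, 2016):
    print(survey_year, imputed_income_variables(survey_year))
```

A lookup like this makes it harder to request a variable that does not exist in a given year's file when looping over years.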
Users are also encouraged to check the NHIS website for updates and to subscribe to the NHIS Listserv to receive notices of any corrections/updates.
Paradata File
The Paradata File Description Document gives an overview of the 2016 Paradata File, including information about the sample design, weighting, and variables found on the file. Appendix I of this Description Document contains an example of SAS code that can be used to link the 2016 Paradata File with the 2016 regular health data files.
ASCII and CSV data sets containing paradata for the 2016 survey year (PARADATA.ZIP, PARADATAcsv.ZIP) can be downloaded via the Dataset links below.
Dataset documentation for the Paradata File consists of a variable summary, variable layout and variable frequencies. Sample input programs are also provided.
Users are encouraged to check the NHIS website for updates and to subscribe to the NHIS Listserv to receive notices of any corrections/updates.
- Paradata File Description Document [PDF – 99 KB]
- Variable Summary [PDF – 62 KB]
- Variable Layout [PDF – 234 KB]
- Variable Frequencies [PDF – 56 KB]
- ASCII data [ZIP – 1.1 MB]
- CSV data [ZIP – 1.2 MB]
- Sample SAS Statements [SAS – 25 KB]
- Sample SPSS Statements [SPS – 20 KB]
- Sample Stata Statements [DO – 21 KB]
Functioning and Disability File
- Functional Disability File Description Document [PDF – 75 KB]
- Variable Summary [PDF – 40 KB]
- Variable Layout [PDF – 294 KB]
- Variable Frequencies [PDF – 41 KB]
- ASCII data [ZIP – 223 KB]
- CSV data [ZIP – 243 KB]
- Sample SAS Statements [SAS – 14 KB]
- Sample SPSS Statements [SPS – 15 KB]
- Sample Stata Statements [DO – 12 KB]
Family Disability Questions File
- Family Disability Questions File Description Document [PDF – 129 KB]
- Variable Summary [PDF – 26 KB]
- Variable Layout [PDF – 106 KB]
- Variable Frequencies [PDF – 24 KB]
- ASCII data [ZIP – 330 KB]
- CSV data [ZIP – 317 KB]
- Sample SAS Statements [SAS – 7 KB]
- Sample SPSS Statements [SPS – 4 KB]
- Sample Stata Statements [DO – 5 KB]
- Page last reviewed: September 20, 2017
- Page last updated: September 20, 2017