Key Concepts About Missing Data in NHANES

Almost all variables in NHANES environmental chemical data contain some missing values, especially for persons who did not provide blood or urine specimens. As a result, the sample available for analysis is generally smaller than the subsample eligible to be tested. Because missing values may distort your results, you must evaluate the extent of missing data to determine whether the data are usable without additional reweighting for item non-response.

When you check missing values in NHANES environmental chemical data, as a general rule, if 10% or less of the data for a variable of interest is missing from your analytic dataset, it is usually acceptable to continue your analysis without further evaluation or adjustment.
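The 10% rule of thumb can be checked with a simple tally. The following is a minimal sketch in Python, not an NHANES-provided tool: the function names and the sample values are hypothetical, and `None` (or NaN) stands in for the missing-value code.

```python
import math

def is_missing(value):
    """Treat None and NaN as missing (stand-ins for the '.' code)."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def percent_missing(values):
    """Return the percentage of entries in `values` that are missing."""
    if not values:
        raise ValueError("no records to evaluate")
    n_missing = sum(1 for v in values if is_missing(v))
    return 100.0 * n_missing / len(values)

# Hypothetical analyte measurements; None marks a missing value.
analyte = [1.2, None, 0.8, 2.5, 3.1, 0.4, 1.9, 2.2, None, 1.1]

pct = percent_missing(analyte)
# Rule of thumb from the text: 10% or less missing is usually acceptable.
needs_review = pct > 10.0
print(f"{pct:.1f}% missing; further evaluation needed: {needs_review}")
```

Note that this is an unweighted count over the analytic dataset; it only indicates whether the threshold is crossed, not why values are missing.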

Warning: If more than 10% of the data for a variable is missing, you may need to determine whether the missing values are distributed equally across socio-demographic characteristics, and then decide whether imputation of the missing values or use of adjusted weights is necessary. (See the Analytic and Reporting Guidelines for more information.)
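One way to see whether missing values are spread evenly is to tabulate the missing rate within each socio-demographic group. The sketch below is illustrative only: the record layout, the `age_group` and `analyte` keys, and the data are all hypothetical, and the tabulation is unweighted.

```python
from collections import defaultdict

def missing_rate_by_group(records, group_key, value_key):
    """Return {group: percent of records in that group with a missing value}."""
    counts = defaultdict(lambda: [0, 0])  # group -> [n_missing, n_total]
    for rec in records:
        tally = counts[rec[group_key]]
        tally[1] += 1
        if rec[value_key] is None:
            tally[0] += 1
    return {group: 100.0 * m / t for group, (m, t) in counts.items()}

# Hypothetical records; None marks a missing analyte value.
records = [
    {"age_group": "12-19", "analyte": 1.2},
    {"age_group": "12-19", "analyte": None},
    {"age_group": "12-19", "analyte": 0.9},
    {"age_group": "12-19", "analyte": None},
    {"age_group": "20+", "analyte": 2.1},
    {"age_group": "20+", "analyte": 1.7},
    {"age_group": "20+", "analyte": None},
    {"age_group": "20+", "analyte": 3.0},
]

rates = missing_rate_by_group(records, "age_group", "analyte")
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.1f}% missing")
```

Large differences between groups suggest that the missing values are not distributed equally, which is the situation in which the guidelines referenced above become relevant.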

Data coded as “missing” in the dataset are completely unavailable for analysis. In the codebooks of current NHANES environmental chemical data, missing values for numeric variables are coded as a period (.).

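Note that SAS treats a missing numeric value (.) as smaller than any number. Therefore, when you recode data or create variables in SAS, a condition that uses the less than (<) or less than or equal to (<=) operator without a lower bound will erroneously include missing values. For example, a test of whether a variable is less than 5 is also true when the variable is missing, unless the condition also requires the variable to be non-missing.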

Warning: To identify all participants who were both selected and eligible to be tested for an environmental chemical analyte, look in the dataset that contains the analyte of interest. All records with a non-missing sample weight in that dataset were eligible to be tested for that analyte.
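The eligibility rule above amounts to filtering on the sample weight before counting missing results. The following sketch assumes a hypothetical record layout with `id`, `weight`, and `analyte` keys; actual NHANES weight and analyte variable names differ by dataset and are not shown here.

```python
def eligible_records(records, weight_key="weight"):
    """Keep only records with a non-missing sample weight."""
    return [r for r in records if r[weight_key] is not None]

# Hypothetical records; None marks a missing value.
records = [
    {"id": 1, "weight": 35000.5, "analyte": 1.4},
    {"id": 2, "weight": None, "analyte": None},     # not eligible for this analyte
    {"id": 3, "weight": 41250.0, "analyte": None},  # eligible, but no result
    {"id": 4, "weight": 28900.2, "analyte": 0.7},
    {"id": 5, "weight": None, "analyte": None},     # not eligible for this analyte
]

eligible = eligible_records(records)
tested = [r for r in eligible if r["analyte"] is not None]

# Item non-response is measured among eligible participants only.
pct_missing = 100.0 * (len(eligible) - len(tested)) / len(eligible)
print(f"eligible: {len(eligible)}, with results: {len(tested)}, "
      f"missing among eligible: {pct_missing:.1f}%")
```

Restricting the denominator to eligible records keeps participants who were never selected for the analyte from inflating the apparent amount of missing data.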
