Task 1: Key Concepts about Merging Data in NHANES

For each 2-year cycle, NHANES data files are organized into five types: Demographic, Dietary, Examination, Laboratory, and Questionnaire. To allow for timelier releases, individual data files are released at different times as they are completed. Combining variables from these different data files into a single dataset is called merging. This is similar to adding columns to a table.

To merge data, the records in each file must be linked by a unique identifier. Because almost all analyses are conducted with individuals as the unit of analysis, the most frequently used unique identifier in NHANES is SEQN, the sequence number that identifies each participant in the sample. Whenever you conduct an analysis with individuals, SEQN is the variable you must use to merge data files.

In contrast, because the dietary supplement data files contain data about the supplements themselves as well as about individuals, the unique identifier can be the SEQN, the supplement ID number, or the ingredient ID number, depending on which files you wish to merge. 

Before merging data, you need to sort each data file by SEQN or by whichever other unique identifier you are merging on. This ensures that the records are ordered in the same way in each data file. Use PROC SORT in SAS to sort the data. After sorting the data files, you can proceed with the merge.
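The sort-then-merge sequence described above can be sketched as follows. This is a minimal illustration, not the tutorial's own program: the dataset names (demo, bpx, demo_bpx) and the data values are made up for the example, and real NHANES files would be read from the downloaded data files rather than typed in.

```sas
/* Hypothetical demographic records (values are invented) */
data demo;
  input SEQN RIAGENDR RIDAGEYR;
  datalines;
3 1 45
1 2 30
2 1 62
;
run;

/* Hypothetical examination records (values are invented) */
data bpx;
  input SEQN BPXSY1;
  datalines;
2 128
3 116
1 110
;
run;

/* Sort each file by the unique identifier before merging */
proc sort data=demo;
  by SEQN;
run;

proc sort data=bpx;
  by SEQN;
run;

/* Merge the sorted files on SEQN: one record per participant */
data demo_bpx;
  merge demo bpx;
  by SEQN;
run;
```

The BY statement in the DATA step is what matches records on SEQN. Without it, SAS would pair records by position rather than by participant.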

An important point to remember is that dietary supplement data and dietary recall data in the Individual Foods Files contain multiple records per person. This will become apparent when you check the results of your merge statements, because the merged file may contain many more records than there are people in your sample.
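One way to see the multiple-records-per-person structure is to compare the number of records with the number of distinct SEQN values. The sketch below assumes a small made-up dataset named foods standing in for an Individual Foods File. The dataset name and the values are hypothetical.

```sas
/* Hypothetical food-level records: several rows per participant */
data foods;
  input SEQN DR1IKCAL;
  datalines;
1 250
1 120
2 400
2 90
2 310
;
run;

/* Compare the record count with the number of distinct participants */
proc sql;
  select count(*)             as n_records,
         count(distinct SEQN) as n_persons
  from foods;
quit;
```

With these five example rows, the query reports five records but only two participants, which is the pattern to expect after merging a person-level file with a food-level or supplement-level file.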

After you have merged the data files, check the contents again to make sure that the files merged correctly. Use PROC CONTENTS to list all variable names and labels, and use PROC MEANS to check the number of observations, as well as the number of missing values and the minimum and maximum values, for each variable.
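These post-merge checks might look like the following sketch. The dataset demo_bpx and its values are hypothetical stand-ins for a merged file. The N, NMISS, MIN, and MAX keywords ask PROC MEANS for the non-missing count, the missing count, and the range of each numeric variable.

```sas
/* Hypothetical merged file with one missing examination value */
data demo_bpx;
  input SEQN RIDAGEYR BPXSY1;
  datalines;
1 30 110
2 62 128
3 45 .
;
run;

/* List variable names, types, and labels */
proc contents data=demo_bpx;
run;

/* Check observation counts, missing values, and ranges */
proc means data=demo_bpx n nmiss min max;
run;
```

If no VAR statement is given, PROC MEANS summarizes every numeric variable in the dataset, including SEQN, so the minimum and maximum of SEQN offer a quick check that the expected range of participants is present.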

 
