NHANES II Web Tutorial: Keep & Merge Data: Keeping Data

Key Concepts About Keeping Data in NHANES II

NHANES II data files were released by collection method or component, such as Physician's Examination or Hematology and Biochemistry. These files are very large and contain many variables. Keeping is the process of retaining only the variables you need in your analytic dataset and removing the rest. As described in the Data Structure and Contents module, all of the NHANES II data files contain demographic, sample design, and weighting variables. It is highly recommended that you take these variables from the Health History Questionnaire files: one file covers ages 6 months-11 years and the other covers ages 12-74 years. These files include all survey participants who completed the household interview.

WARNING

When keeping NHANES II variables, always include the SEQN variable. SEQN stands for sequence number and is a unique identifier for each observation (participant) in NHANES II. Every time you extract variables from an NHANES II data file, include SEQN in your selection. Failing to do so will lead to problems if you want to sort or merge your data files at a later time.
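
As an illustration of the keeping step, here is a minimal Python sketch. It assumes the data file has already been read into a list of dictionaries, one per participant. The function name keep_variables, the sample records, and the variable names AGE_YEARS, SEX, SYSTOLIC_BP, and PULSE are all hypothetical and are not actual NHANES II variable names; only SEQN comes from the text above. The sketch always places SEQN first in the kept data, whether or not it was requested.

```python
ID_VAR = "SEQN"  # unique participant identifier described in the warning above


def keep_variables(records, wanted):
    """Return new records holding only the wanted variables plus SEQN."""
    # Always keep SEQN, and avoid listing it twice if the caller asked for it.
    columns = [ID_VAR] + [name for name in wanted if name != ID_VAR]
    kept = []
    for record in records:
        missing = [name for name in columns if name not in record]
        if missing:
            raise KeyError(f"variables not found in record: {missing}")
        kept.append({name: record[name] for name in columns})
    return kept


# Hypothetical records; the variable names other than SEQN are made up.
exam_records = [
    {"SEQN": 101, "AGE_YEARS": 34, "SEX": 1, "SYSTOLIC_BP": 118, "PULSE": 72},
    {"SEQN": 102, "AGE_YEARS": 58, "SEX": 2, "SYSTOLIC_BP": 141, "PULSE": 80},
]

analytic = keep_variables(exam_records, ["AGE_YEARS", "SYSTOLIC_BP"])
for row in analytic:
    print(row)
```

Because SEQN is added inside the function rather than left to the caller, a later sort or merge on SEQN remains possible even if the identifier was omitted from the requested list.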

After keeping the variables of interest, you will need to check the results. You should check to see that all your variables were included and that any variables you renamed or recoded are correct.
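
The checks described above could be sketched in Python as follows. This is only one possible approach, under the same assumption that the kept data is a list of dictionaries. The helper names check_kept_data and crosstab, and the variables AGE_YEARS and AGE_GROUP (a made-up recode: 1 for under 40, 2 for 40 and over), are hypothetical. The first helper reports missing variables and duplicate SEQN values; the second counts each (original, recoded) pair so a recode can be inspected by eye.

```python
from collections import Counter


def check_kept_data(records, expected_vars, id_var="SEQN"):
    """Return a list of problems found in the kept data (empty if none)."""
    problems = []
    for position, record in enumerate(records):
        absent = [name for name in expected_vars if name not in record]
        if absent:
            problems.append(f"record {position} is missing {absent}")
    # Each participant should appear once, so identifiers must not repeat.
    ids = [record.get(id_var) for record in records]
    if len(set(ids)) != len(ids):
        problems.append(f"{id_var} values are not unique")
    return problems


def crosstab(records, original_var, recoded_var):
    """Count each (original value, recoded value) pair to review a recode."""
    return Counter((r[original_var], r[recoded_var]) for r in records)


# Hypothetical kept data; AGE_GROUP is an illustrative recode of AGE_YEARS.
analytic = [
    {"SEQN": 101, "AGE_YEARS": 34, "AGE_GROUP": 1},
    {"SEQN": 102, "AGE_YEARS": 58, "AGE_GROUP": 2},
    {"SEQN": 103, "AGE_YEARS": 41, "AGE_GROUP": 2},
]

problems = check_kept_data(analytic, ["SEQN", "AGE_YEARS", "AGE_GROUP"])
print(problems if problems else "no problems found")

pairs = crosstab(analytic, "AGE_YEARS", "AGE_GROUP")
for (age, group), count in sorted(pairs.items()):
    print(age, group, count)
```

Listing original and recoded values side by side makes a wrong cut point easy to spot, for example an age of 41 assigned to group 1.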
