NHANES Dietary Web Tutorial: Estimating Population-Level Distributions of Usual Dietary Intake: Task 2

Task 2: How to Estimate Distributions of Usual Intake for a Single Ubiquitously-consumed Dietary Constituent with a Few Days of 24-hour Recalls for Subpopulations using a Covariate

The following example shows how the distribution of calcium intake from foods and beverages can be estimated for women by age group (age 50 years or younger vs. older than 50 years).

This example uses the demoadv dataset (download at Sample Code and Datasets). The variables w0304_0 to w0304_16 are the weights used in the analysis of 2003-2004 dietary data, which require Balanced Repeated Replication (BRR) to calculate standard errors: w0304_0 is the dietary weight, and w0304_1 to w0304_16 are the BRR weights. The model is run 17 times: once with w0304_0 and once with each of the 16 BRR weights (see Module 18, Task 4 for more information).
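The mapping between the 17 runs and the weight variables can be sketched with a short macro loop. This is only an illustration: the macro name list_brr_weights and its nrep parameter are hypothetical and are not part of the NCI macros. It writes one line per run to the SAS log and does not fit any model.

```sas
/* Hypothetical helper: lists which weight variable each run would use. */
/* Run 0 uses the dietary weight; runs 1 to &nrep use the BRR weights.  */
%macro list_brr_weights(nrep=16);
  %local i;
  %do i = 0 %to &nrep;
    %if &i = 0 %then %put Run &i: weight w0304_&i (dietary weight);
    %else %put Run &i: weight w0304_&i (BRR replicate weight);
  %end;
%mend list_brr_weights;

%list_brr_weights(nrep=16);
```

In the actual analysis the body of the loop would call the model-fitting code with the corresponding weight, which is the role the BRR202 macro plays in this example.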

IMPORTANT NOTE

Note: If 4 years of NHANES data are used, 32 BRR runs are required. Additional weights are found in the demoadv dataset.

The effect of the sequence of the 24-hour recall (Day 1 or Day 2) is removed from the estimated nutrient intake distribution. An adjustment is also made for the day of the week on which the 24-hour recall was collected, dichotomized as weekend (Friday-Sunday) or weekday (Monday-Thursday). (See Module 18, Task 3 for more information on covariate adjustment.)

A SAS macro is a useful technique for rerunning a block of code when you want to change only a few variables; the macro BRR202 is created and called in this example. The BRR202 macro calls the MIXTRAN macro and the DISTRIB macro, and calculates BRR standard errors of the parameter estimates. The MIXTRAN macro obtains preliminary estimates for the values of the parameters in the model, and then fits the model using PROC NLMIXED. It also produces summary reports of the model fit.

Modeling the complex survey structure of NHANES requires procedures that account for both differential weighting of individuals and the correlation among sample persons within a cluster. The SAS procedure NLMIXED can account for differential weighting by using the replicate statement. The use of BRR to calculate standard errors accounts for the correlation among sample persons in a cluster. Therefore, NLMIXED (or any SAS procedure that incorporates differential weighting) may be used with BRR to produce standard errors that are suitable for NHANES data without using specialized survey procedures. The DISTRIB macro estimates the distribution of usual intake, producing estimates of percentiles and the percent of the population below a cutpoint.
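For reference, the BRR standard error described above is commonly written in the following general form. The notation is introduced here for illustration and does not come from the macros: \hat{\theta}_0 is the estimate from the run using w0304_0, \hat{\theta}_r is the estimate from the run using BRR weight w0304_r, R is the number of BRR runs (16 in this example), and k is the Fay coefficient (k = 0 gives standard BRR). The value of k that applies to these replicate weights is not stated in this section, so it is left as a symbol.

```latex
\[
\widehat{\mathrm{SE}}_{\mathrm{BRR}}\bigl(\hat{\theta}\bigr)
  = \sqrt{\frac{1}{R\,(1-k)^{2}}
          \sum_{r=1}^{R} \bigl(\hat{\theta}_{r} - \hat{\theta}_{0}\bigr)^{2}}
\]
```

The same expression applies to any quantity estimated in each run, such as a model parameter, a percentile of usual intake, or the percent of the population below a cutpoint.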

IMPORTANT NOTE

Note that the DISTRIB macro currently requires that at least 2 cutpoints be requested in order to calculate the percent of the population below a cutpoint.

The MIXTRAN and DISTRIB macros used in this example were downloaded from the NCI website. Version 1.1 of the macros was used. Check this website for macro updates before starting any analysis. Additional details regarding the macros and additional examples also may be found on the website.

Step 1: Create a dataset so that each row corresponds to a single person day and define variables if necessary

Statements

Explanation

data demoadv;
  format sel sel.;
  set nh.demoadv;
  if w0304_0 ne .;  /* keeps only those with dietary data */
  if ridageyr ge 51 then sel = 1;
  else sel = 2;
run;

First, select only those people with dietary data by keeping records whose dietary weight (w0304_0) is not missing. The variable sel is created to identify the subgroups of interest: sel=1 for those who are older than 50 years (51 and over), and sel=2 for those who are 50 years old or younger.
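The FORMAT statement in this step refers to a user-defined format named sel., whose definition is not shown in this section. A minimal sketch of such a definition follows. The label text is an assumption chosen to match the age split described above. The definition would need to be run before the data step that uses the format.

```sas
/* Assumed labels for the sel. format; the actual tutorial labels may differ. */
proc format;
  value sel
    1 = 'Older than 50 years'
    2 = '50 years or younger';
run;
```

Because only adult women ages 19 years and older are kept later in this step, sel=2 ends up describing women ages 19 to 50 in the analysis dataset.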

data day1;
  set demoadv;
  if riagendr = 2 and ridageyr >= 19;
  DRTCALC = DR1TCALC;
  day = 1;
run;

data day2;
  set demoadv;
  if riagendr = 2 and ridageyr >= 19;
  DRTCALC = DR2TCALC;
  day = 2;
run;

The variables DR1TCALC and DR2TCALC are NHANES variables representing total calcium consumed on days 1 and 2, respectively, from all foods and beverages (other than water).

To create a dataset with 2 records per person, the demoadv dataset is read twice to create 2 datasets, one where day=1 and one where day=2. The same variable name, DRTCALC, is used for calcium on both days: it is set equal to DR1TCALC for day 1 and to DR2TCALC for day 2. Only adult women ages 19 years and older are selected for analysis.
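The same two-records-per-person layout can also be produced in a single data step by writing two explicit OUTPUT statements per input record. The sketch below is an alternative shown for illustration, not the tutorial's method, and the dataset name calcium_alt is hypothetical.

```sas
/* Alternative sketch: one pass over demoadv, two output rows per woman. */
data calcium_alt;                        /* hypothetical dataset name */
  set demoadv;
  if riagendr = 2 and ridageyr >= 19;    /* adult women, as in day1/day2 */
  day = 1; DRTCALC = DR1TCALC; output;   /* Day 1 recall record */
  day = 2; DRTCALC = DR2TCALC; output;   /* Day 2 recall record */
run;
```

The rows come out interleaved by person rather than as all Day 1 records followed by all Day 2 records, so a sort may be needed if later code depends on row order.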

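data calcium;
  set day1 day2;
  if DAY_WK in (1, 6, 7) then weekend = 1;  /* weekend: Friday-Sunday */
  else if DAY_WK in (2, 3, 4, 5) then weekend = 0;  /* weekday: Monday-Thursday */
run;

Finally, the day1 and day2 datasets are stacked into a single dataset, calcium, with one row per person day. The indicator variable weekend is set to 1 when the recall was collected on a Friday, Saturday, or Sunday, and to 0 when it was collected on a Monday through Thursday.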
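Before moving on, it can be useful to confirm that the weekend recode behaves as intended. The check below is an optional addition, not part of the tutorial code. It cross-tabulates DAY_WK against weekend in the calcium dataset and includes missing values, so any record left without a weekend value is visible.

```sas
/* Optional check: each DAY_WK value should map to exactly one weekend value. */
proc freq data=calcium;
  tables DAY_WK*weekend / missing list;
run;
```

Records where DAY_WK is missing would appear with a missing weekend value, because neither branch of the IF/ELSE IF assigns one.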