NHANES Dietary Web Tutorial: Estimating Population-Level Distributions of Usual Dietary Intake: Task 1

Task 1: How to Estimate the Distribution of Usual Intake for a Single Ubiquitously-consumed Dietary Constituent for One Population or Subpopulation using the NCI Method

The following example shows how the distribution of calcium from foods and beverages can be estimated for women ages 19 years and older.

This example uses the demoadv dataset (download at Sample Code and Datasets). The variables w0304_0 to w0304_16 are the weights used to analyze 2003-2004 dietary data when Balanced Repeated Replication (BRR) is needed to calculate standard errors: w0304_0 is the dietary weight, and w0304_1 to w0304_16 are the BRR weights. The model is run 17 times: once with w0304_0 and 16 times with the BRR weights, one run per weight (see Module 18, Task 4 for more information).

IMPORTANT NOTE

Note: If 4 years of NHANES data are used, 32 BRR runs are required. Additional weights are found in the demoadv dataset.

A SAS macro is a convenient way to rerun a block of code when only a few variables change; this example creates and calls the macro BRR201. The BRR201 macro calls the MIXTRAN macro and the DISTRIB macro, and then calculates BRR standard errors of the parameter estimates. The MIXTRAN macro obtains preliminary estimates of the model parameters, fits the model using PROC NLMIXED, and produces summary reports of the model fit.
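The idea of rerunning one block of code once per weight can be sketched with a small macro loop. This is only an illustration, not the BRR201 macro itself: the macro name list_weights and its parameters prefix and nreps are hypothetical, and the loop writes the weight variable for each run to the log instead of fitting a model.

```sas
/* Hypothetical helper, not part of the NCI macros: lists the weight   */
/* variable that each run would use. Run 0 uses the dietary weight;    */
/* runs 1 to &nreps use the BRR weights.                               */
%macro list_weights(prefix=w0304_, nreps=16);
  %local run;
  %do run = 0 %to &nreps;
    %if &run = 0 %then %put Run &run (point estimate): weight variable &prefix.&run;
    %else %put Run &run (BRR replicate): weight variable &prefix.&run;
  %end;
%mend list_weights;

/* 17 runs for 2003-2004 data: w0304_0 through w0304_16 */
%list_weights(prefix=w0304_, nreps=16)
```

In a real analysis the body of the loop would call the model-fitting code with the current weight variable; with 4 years of data, nreps would be 32 and prefix would be whatever the corresponding weights are called in the dataset.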

The DISTRIB macro estimates the distribution of usual intake, producing estimates of percentiles and the percent of the population below a cutpoint.

Modeling the complex survey structure of NHANES requires procedures that account for both differential weighting of individuals and the correlation among sample persons within a cluster. The SAS procedure NLMIXED can account for differential weighting by using the replicate statement. The use of BRR to calculate standard errors accounts for the correlation among sample persons in a cluster. Therefore, NLMIXED (or any SAS procedure that incorporates differential weighting) may be used with BRR to produce standard errors that are suitable for NHANES data without using specialized survey procedures.
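For orientation, the general textbook form of a BRR standard error is shown below. It is a sketch under stated assumptions, not a transcription of the BRR201 code: the symbols are introduced here for illustration, and the value of the Fay coefficient k used to construct the weights is not stated in this section and should be taken from the BRR documentation (Module 18, Task 4).

```latex
% \hat{\theta}_0 : estimate from the run with the dietary weight (w0304_0)
% \hat{\theta}_r : estimate from the run with BRR weight r (w0304_r)
% R              : number of BRR runs (16 here; 32 with 4 years of data)
% k              : Fay coefficient, 0 <= k < 1 (k = 0 gives standard BRR); assumed notation
\[
\widehat{\mathrm{SE}}_{\mathrm{BRR}}\bigl(\hat{\theta}\bigr)
  = \sqrt{\frac{1}{R\,(1-k)^{2}}
          \sum_{r=1}^{R} \bigl(\hat{\theta}_{r} - \hat{\theta}_{0}\bigr)^{2}}
\]
```

Because each replicate estimate comes from refitting the whole model with a different weight, the spread of the replicate estimates around the full-sample estimate reflects the clustering in the sample design.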

IMPORTANT NOTE

Note that the DISTRIB macro currently requires that at least 2 cutpoints be requested in order to calculate the percent of the population below a cutpoint.

The effect of the sequence of the 24-hour recall (Day 1 or Day 2) is removed from the estimated nutrient intake distribution. An adjustment is also made for the day of the week on which the 24-hour recall was collected, dichotomized as weekend (Friday-Sunday) or weekday (Monday-Thursday). (See Module 18, Task 3 for more information on covariate adjustment.) BRR (Module 18, Task 4) is used to calculate standard errors.
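The weekend/weekday dichotomy described above can be sketched as a 0/1 indicator. This is a minimal illustration on made-up rows, not the tutorial's own code: the dataset name recall_days and the variables person_id, day_of_week, and weekend are hypothetical, and the coding 1 = Sunday through 7 = Saturday is an assumption that should be checked against the codebook for the day-of-week variable actually used.

```sas
/* Toy rows for illustration only; person_id and day_of_week are       */
/* hypothetical names. Assumed coding: 1 = Sunday, ..., 7 = Saturday.  */
data recall_days;
  input person_id day_of_week;
  /* Friday (6), Saturday (7), and Sunday (1) count as weekend.        */
  /* A missing day_of_week evaluates to 0 here, so check for missing   */
  /* values before relying on the indicator.                           */
  weekend = (day_of_week in (1, 6, 7));
  datalines;
101 1
102 3
103 6
104 7
;
run;

proc print data=recall_days noobs;
run;
```

The same expression would be applied to each person-day record, so that a Day 1 recall and a Day 2 recall from the same person can fall into different categories.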

The MIXTRAN and DISTRIB macros used in this example were downloaded from the NCI website. Version 1.1 of the macros was used. Check this website for macro updates before starting any analysis. Additional details regarding the macros and additional examples also may be found on the website.

Step 1: Create a dataset in which each row corresponds to a single person-day, and define variables if necessary

Statements

Explanation

data demoadv;
  set nh.demoadv;
  if w0304_0 ne .;  /* keep only records with a non-missing weight */
run;

First, select only those people with dietary data by keeping the records for which the weight w0304_0 is not missing.

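data day1;
  set demoadv;
  if riagendr = 2 and ridageyr >= 19;  /* women ages 19 years and older */
  DRTCALC = DR1TCALC;                  /* Day 1 calcium */
  day = 1;
run;

data day2;
  set demoadv;
  if riagendr = 2 and ridageyr >= 19;  /* women ages 19 years and older */
  DRTCALC = DR2TCALC;                  /* Day 2 calcium */
  day = 2;
run;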

The variables DR1TCALC and DR2TCALC are NHANES variables representing total calcium consumed on days 1 and 2, respectively, from all foods and beverages (other than water).

To create a dataset with 2 records per person, the demoadv dataset is set 2 times to create 2 datasets, one where day=1 and one where day=2. The same variable name, DRTCALC, is used for calcium on both days. It is created by setting it equal to DR1TCALC for day 1 and DR2TCALC for day 2. Adult women ages 19 years and older are selected for analysis.

data calcium;