Task 1: How to Evaluate the Effects of Covariates on Usual Intake of a Single Ubiquitously-Consumed Dietary Constituent

In this example, the association of two covariates (race/ethnicity and age) with calcium intake from food and beverages in adult women ages 19 years and older is modeled.

This example uses the demoadv dataset (download at Sample Code and Datasets).  The variables w0304_0 to w0304_16 are the weights (the integer dietary weight and the Balanced Repeated Replication [BRR] weights) used in the analysis of 2003-2004 dietary data; BRR is required to calculate correct standard errors.  The model is run 17 times: once with w0304_0 to obtain the point estimates, and 16 times with the BRR weights w0304_1 to w0304_16 (see Module 18, "Model Usual Intake Using Dietary Recall Data", Task 4, for more information).

 

IMPORTANT NOTE

Note: if 4 years of NHANES data are used, 32 BRR runs are required.

 

A SAS macro is a useful technique for rerunning a block of code when the analyst only wants to change a few variables; the macro BRR191 is created and called in this example. The BRR191 macro calls the MIXTRAN macro, and calculates BRR standard errors of the parameter estimates.  The MIXTRAN macro obtains preliminary estimates for the values of the parameters in the model, and then fits the model using PROC NLMIXED. It also produces summary reports of the model fit. 

Recall that modeling the complex survey structure of NHANES requires procedures that account for both differential weighting of individuals and the correlation among sample persons within a cluster.  The SAS procedure NLMIXED can account for differential weighting by using the replicate statement.  The use of BRR to calculate standard errors accounts for the correlation among sample persons in a cluster.  Therefore, NLMIXED (or any SAS procedure that incorporates differential weighting) may be used with BRR to produce standard errors that are suitable for NHANES data without using specialized survey procedures.

The MIXTRAN macro used in this example was downloaded from the NCI website.  Version 1.1 of the macro was used.  We recommend that you check this website for macro updates before starting any analysis.  Additional details regarding the macro and additional examples also may be found on the website and in the users’ guide.

 

Step 1: Create a dataset so that each row corresponds to a single person day and define indicator variables if necessary

First, select only those people with dietary data by keeping the records whose dietary weight (w0304_0) is not missing.

data demoadv;
set nh.demoadv;
if w0304_0 ne . ;  
run ;

 

The variables DR1TCALC and DR2TCALC are NHANES variables representing total calcium (mg) consumed on days 1 and 2 respectively from all foods and beverages (other than water).  To create a dataset with 2 records per person, the demoadv dataset is set 2 times to create 2 datasets, one where day=1 and one where day=2.  The same variable name, DRTCALC, is used for calcium on both days.  This variable is created by setting it equal to DR1TCALC for day 1 and DR2TCALC for day 2.  The datasets also select women ages 19 and older.

 

data day1;
set demoadv;
if riagendr= 2 and ridageyr>= 19 ;
DRTCALC=DR1TCALC;
day= 1 ;
run ;

 

data day2;
set demoadv;
if riagendr= 2 and ridageyr>= 19 ;
DRTCALC=DR2TCALC;
day= 2 ;
run ;

 

Finally, these data sets are appended, and dummy variables are created.  To use the NLMIXED procedure, dummy variables must be created (there is no CLASS statement to create dummy variables as in other SAS procedures).  In this example, the following code was used:

data calcium;
set day1 day2;
eth1=(ridreth1= 1 );
eth2=(ridreth1= 2 );
eth3=(ridreth1= 3 );
eth4=(ridreth1= 4 );
run ;

 

Because ridreth1 (race/ethnicity) has 5 levels, 4 dummy variables are needed.  Each statement creates one dummy variable; for example, eth1 is coded as 1 if ridreth1 is equal to 1 and as 0 otherwise.

IMPORTANT NOTE

Note: If the variable you are using has missing values, the code above will code them as zero in every dummy variable. Additional code is needed to set the dummy variables to missing in that case.  Also, if you use the “<” operator in SAS to create a dummy variable, note that SAS treats a missing numeric value as smaller than any number, so a comparison such as x < 0 is true for missing values, and the dummy variable is coded as 1 rather than missing unless you handle missing values explicitly.
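One way to handle the missing-value cases described in this note is sketched below. The dataset name calcium_chk, the variable age_lt51, and the age cutoff of 51 are hypothetical and used only for illustration; they are not part of the tutorial's analysis.

```sas
/* Sketch only: create dummy variables that stay missing when the source is missing. */
data calcium_chk;                    /* hypothetical dataset name */
  set day1 day2;
  array eth{4} eth1-eth4;
  do k = 1 to 4;
    if missing(ridreth1) then eth{k} = .;   /* keep missing as missing */
    else eth{k} = (ridreth1 = k);           /* 1 if level k, 0 otherwise */
  end;

  /* A "<" comparison is true for missing values, so test for missing first. */
  /* The cutoff 51 is hypothetical. */
  if missing(ridageyr) then age_lt51 = .;
  else age_lt51 = (ridageyr < 51);

  drop k;
run;
```

For a variable with no missing values, this produces the same eth1 to eth4 coding as the shorter code shown in Step 1.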

 

Step 2: Sort the dataset by respondent and day

It is important to sort the dataset by respondent and by day of intake (day 1 and day 2) before fitting the model, because the NLMIXED procedure uses this ordering to estimate the model parameters.
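A minimal sketch of this sort is shown here, assuming the dataset calcium from Step 1 and using seqn as the respondent identifier and day as the recall day (the same variables passed later as subject and repeat in the macro call).

```sas
/* Sort by respondent, then by recall day within respondent. */
proc sort data=calcium;
  by seqn day;
run;
```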

 

Step 3: Create the BRR191 macro

The BRR191 macro calls the MIXTRAN macro and computes standard errors of parameter estimates.  After creating this macro and running it one time, it may be called multiple times, each time changing the macro variables.

 

Create the BRR191 Macro
Statements Explanation

%macro BRR191(data, response, foodtype, subject, repeat, covars_prob, covars_amt, outlib, modeltype, lambda, seq, weekend, vargroup, numvargroups, subgroup, start_val1, start_val2, start_val3, vcontrol, nloptions, titles, printlevel, final);

The start of the BRR191 macro is defined.  All of the terms inside the parentheses are the macro variables that are used in the macro.

%MIXTRAN  (data=&data, response=&response, foodtype= &foodtype, subject= &subject, repeat=&repeat, covars_prob=&covars_prob, covars_amt= &covars_amt, outlib=&outlib, modeltype=&modeltype, lambda=&lambda, replicate_var=w0304_0, seq=&seq, weekend=&weekend, vargroup= &vargroup, numvargroups=&numvargroups, subgroup=&subgroup,                  

start_val1=&start_val1, start_val2=&start_val2, start_val3= &start_val3, vcontrol=&vcontrol, nloptions=&nloptions, titles= &titles, printlevel=&printlevel) 

Within the BRR191 macro the MIXTRAN macro is called.  All of the variables preceded by “&” will be defined by the BRR191 macro call.  The only variable without an “&” is the replicate_var macro variable; it is set to w0304_0 for the first run.

data _null_;

format old varA $255. ;

%let I=1;

%let varamtu= %upcase (INTERCEPT &covars_amt);

%do %until ( %qscan (&varamtu,&I, %str ( ))= %str ());

%let varb&I= %qscan (&varamtu,&I, %str ( ));

%if %eval (&i) le 9 %then %let znum = "0";

  %else %let znum='';

num= %eval (&i);

varA= strip( 'A' ||strip(&znum)||strip(num)|| '_' || strip( "&&varb&i." ));

old =  trim(old)|| ' ' ||trim(varA);

%let I= %eval (&I+1);

%end ;

%let cnt= %eval (&I-1);

%if &covars_amt= %str () %then %let cnt=1;

call symput( 'old' ,old);

run;

This data step defines macro variables that will be used in the next step of the macro.

This code recreates the way that the MIXTRAN macro names its parameters and builds a list of the parameter names stored in the dataset _param_unc_&foodtype; the list is saved in the macro variable &old.  It also counts the number of parameters (&cnt).
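To make the naming pattern concrete, the standalone sketch below builds the same kind of list for the covariates used later in this example (ridageyr eth1 eth2 eth3 eth4). It is not part of the BRR191 macro: it uses the Z2. format and the CATS and CATX functions instead of the macro's znum logic, and it only writes the result to the log.

```sas
/* Sketch only: reproduce the A01_, A02_, ... naming pattern in a plain data step. */
data _null_;
  length covars old $255 varA $40;
  covars = upcase('INTERCEPT ridageyr eth1 eth2 eth3 eth4');
  cnt = countw(covars, ' ');                 /* number of parameters */
  do i = 1 to cnt;
    /* 'A' + two-digit position + '_' + variable name */
    varA = cats('A', put(i, z2.), '_', scan(covars, i, ' '));
    old  = catx(' ', old, varA);             /* space-separated list */
  end;
  put old= cnt=;                             /* write the list and count to the log */
run;
```

For these covariates, the list begins with A01_INTERCEPT and A02_RIDAGEYR, and the count is 6.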

data parms;

set &outlib.._param_unc_&foodtype;

array old (&cnt) &old;

array new (&cnt) &varamtu;

do k= 1 to dim(new);

new[k]=old[k];

end;

keep &varamtu;

run;

The dataset _param_unc_&foodtype is defined in the MIXTRAN macro.  This data step sets the dataset _param_unc_&foodtype and renames the parameters to their variable names.

data _null_;

set &outlib.._param_unc_&foodtype;

call symput ( 'lamb' ,a_lambda);

run;

Lambda is fixed in the BRR runs.  The lambda value from the first run is saved in a macro variable called &lamb.

%do run= 1 %to 16 ;     

This code starts a loop to run the 16 BRR runs.

%MIXTRAN (data=&data, response=&response, foodtype=&foodtype, subject= &subject, repeat=&repeat, covars_prob=&covars_prob, covars_amt= &covars_amt, outlib=&outlib, modeltype=&modeltype, lambda=&lamb, replicate_var=w0304_&run, seq=&seq, weekend=&weekend, vargroup= &vargroup, numvargroups=&numvargroups, subgroup=&subgroup, start_val1=&start_val1, start_val2=&start_val2, start_val3= &start_val3, vcontrol=&vcontrol, nloptions=&nloptions, titles=&titles, printlevel= 2 )

Within the BRR191 macro, the MIXTRAN macro is called again for each BRR run.  The macro variable replicate_var is set to w0304_&run, where &run takes the values 1 to 16, and printlevel is fixed at 2.  All other variables preceded by “&” are defined by the BRR191 macro call, except lambda, which is fixed at &lamb, the value saved from the first run.

data _null_;

format old varA var new $255. ;

%let I=1;

%do %until ( %qscan (&varamtu,&I, %str ( ))= %str ());

%let varb&I= %qscan (&varamtu,&I, %str ( ));

%if %eval (&i) le 9 %then %let znum = "0";

%else %let znum='';

num= %eval (&i);

varA= strip( 'A' ||strip(&znum)||strip(num)|| '_' || strip( "&&varb&i." ));

old =  trim(old)|| ' ' ||trim(varA);

var= strip(strip( "&&varb&i." )|| '_' ||strip( "&run" ));

new = trim(new)|| ' ' ||trim(var);

%let I= %eval (&I+1);

%end ;

call symput( 'old' ,old);

call symput( 'new' ,new);

run;

This data step defines macro variables that will be used in the next step of the macro.

As before, this code recreates the way that the MIXTRAN macro names its parameters and builds a list of the parameter names stored in the dataset _param_unc_&foodtype (saved in &old).  It also creates a list of the intercept and the other variables in the model with the BRR run number appended (saved in &new).

data parmsbrr;

set &outlib.._param_unc_&foodtype;

array old (&cnt) &old;

array new (&cnt) &new;

do k= 1 to dim(new);

new[k]=old[k];

end;

keep &new;

run;

The dataset _param_unc_&foodtype is produced by the MIXTRAN macro.  This data step reads the dataset _param_unc_&foodtype and renames the parameters to their variable names with the BRR run number at the end.

data parms;

merge parms parmsbrr;

run;

The point estimates of the parameters are merged with the BRR runs.

proc datasets nolist; delete parmsbrr;

After merging, the dataset parmsbrr can be deleted.

%end ;

The end of the BRR runs.

%let I=1;

  %do %until ( %qscan (&varamtu,&I, %str ( ))= %str ());

    %let varb&I= %qscan (&varamtu,&I, %str ( ));

This code starts a loop where the following code is evaluated for the intercept and the other variables in the model one at a time until all variables are evaluated.

data _null_;

format var call $255. ;

  set parms;

  call= "" ;

   %do r= 1 %to 16 ;

   var = strip(strip( "&&varb&i." )|| '_' ||strip( "&r" ));

   call = strip(strip(call)|| ' ' ||strip(var));

   %end ;

  call symput ( 'call' ,call);

run;

This code creates a macro variable with the BRR run number appended to the variable name.

data brr;

format variable $32. ;

set parms;

 array reps ( 16 ) &call;

   do m= 1 to 16 ;

    reps[m] = reps[m] - &&varb&i;

   end;

estimate=&&varb&i;

brrse=sqrt(uss(of &call)/( 16 * .49 ));

variable= "&&varb&i" ;

keep variable estimate brrse;

run;

For the 16 BRR runs, the value of the point estimate is subtracted from the estimate of the parameter from the BRR run.  The standard error is calculated.
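The calculation in this data step can be written compactly as follows. The notation is introduced here for illustration and does not appear in the macro: \hat{\theta}_0 is the point estimate from the run with w0304_0, and \hat{\theta}_r is the estimate from BRR run r.

```latex
\[
\widehat{\mathrm{SE}}\bigl(\hat{\theta}\bigr)
  = \sqrt{\frac{1}{16 \times 0.49}\sum_{r=1}^{16}\bigl(\hat{\theta}_r - \hat{\theta}_0\bigr)^{2}}
\]
```

The constant 0.49 is taken directly from the code. It equals (1 - 0.3)^2, which would be consistent with BRR weights constructed using Fay's method with a perturbation factor of 0.3; that reading is an inference and is not stated in this section.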

proc append base=allvars data=brr;

The datasets for each variable are appended to the dataset allvars.

proc datasets nolist; delete brr;

run;

The dataset brr is deleted.

%let I= %eval (&I+1);

%end ;

The variable I is incremented, and the end of the variable loop is defined.

data &final;

 format pvalue 6.4 ;

 set allvars;

 t=estimate/brrse;

 pvalue= 2 *( 1 -probt(abs(t), 15 ));

The final dataset is defined, and p-values are calculated.
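In formula form, the test statistic and two-sided p-value computed in this step are sketched below. The symbol F_{15} is notation assumed here for the cumulative distribution function of a t distribution with 15 degrees of freedom, which is what probt(abs(t), 15) evaluates.

```latex
\[
t = \frac{\hat{\theta}_0}{\widehat{\mathrm{SE}}\bigl(\hat{\theta}\bigr)},
\qquad
p = 2\,\bigl[\,1 - F_{15}\bigl(|t|\bigr)\bigr]
\]
```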

proc print; var variable estimate brrse t pvalue;

run;

The final dataset is printed.

proc datasets nolist; delete parms;

run;

The dataset parms is deleted.

%mend BRR191;

The end of the BRR191 macro is indicated.

 

 

Step 4: Run the BRR191 macro to obtain parameter estimates for the covariates of interest from the model used in the NCI method

Use the BRR191 macro to obtain parameter estimates.  It is possible to call the BRR191 macro several times, varying the values of the parameters each time. For example, the variables of interest could be changed.  This merely requires calling the macro again (using a call similar to that below), not redefining the macro each time.

 

Run the BRR191 Macro
Statements Explanation

%BRR191(data=calcium, response=DRTCALC, foodtype=Calcium, subject=seqn, repeat=day, covars_amt=ridageyr eth1 eth2 eth3 eth4, outlib=work, modeltype=amount, titles= 1 , printlevel= 2 , final=nh.m19task1)  

This code calls the BRR191 macro.  The dataset calcium defined in Step 1 is used; the macro variable response names the variable whose distribution you want to model, DRTCALC.  The macro variable foodtype is used to label the param dataset (here, _param_unc_Calcium).  The variable seqn identifies the subject, and the macro variable repeat names the variable that identifies the repeated observations on each subject, which is day.  The covariates ridageyr eth1 eth2 eth3 eth4 are included in the model.

The macro variable outlib specifies the library where the output datasets are stored.  In this case, the temporary SAS library work was used.

Because this is a ubiquitously consumed dietary constituent, modeltype= amount is specified.  This fits the amount model.

The macro variable titles saves one line for a title supplied by the user.  The printlevel is 2, which prints the output from the NLMIXED runs and the summary.

The variable final specifies the name of the final dataset produced.
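As an illustration of calling the macro again with different covariates, the sketch below refits the model with age only. The covariate choice and the output name work.m19task1_age are hypothetical. The macro as listed appends each variable's results to the dataset allvars and never deletes it, so this sketch deletes allvars first to keep results from an earlier call out of the new final dataset.

```sas
/* Remove results left over from an earlier call of BRR191. */
proc datasets nolist;
  delete allvars;
run;
quit;

/* Second call: age as the only covariate (hypothetical example). */
%BRR191(data=calcium, response=DRTCALC, foodtype=Calcium, subject=seqn,
        repeat=day, covars_amt=ridageyr, outlib=work, modeltype=amount,
        titles=1, printlevel=2, final=work.m19task1_age)
```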

 

Step 5: Interpret parameter estimates for the covariates of interest

 
