Task 1: How to Set Up a T-Test in NHANES Using SUDAAN

In this task, you will use SUDAAN to calculate a t-statistic and assess whether mean calcium intake differs significantly between males and females ages 20 and older.

 

Step 1: Sort Data

Before running any SUDAAN procedure, sort the data by strata and PSUs, using the PROC SORT procedure. 

 

Step 2: Compute Properly Weighted Estimated Means

Use the PROC DESCRIPT procedure to generate means, and specify the sample design with the DESIGN=WR (with replacement) option.  Use the NEST statement with the strata and PSU variables to account for the design effects, and the WEIGHT statement to account for unequal probabilities of selection and non-response.  Use the SUBPOPN statement to select the population of interest.  For accurate estimates of the standard error, select the subgroup with SUBPOPN in SUDAAN rather than subsetting the data file in SAS beforehand.  Use a CLASS statement to define the categorical variables in the analysis, with the NOFREQ option to suppress frequencies, and the VAR statement to specify the continuous variable (here, calcium intake) whose mean is estimated.

 

Calculate Mean Calcium Intake, in Milligrams, among Males and Females Ages 20 Years and Older Using SUDAAN

Sample Code

*-------------------------------------------------------------------------;
* Use the PROC SORT procedure to sort the data files by strata and PSU.   ;
* Data must always be sorted before running a SUDAAN procedure.           ;
*                                                                         ;
* Use the PROC DESCRIPT procedure to estimate the mean dietary calcium    ;
* intake (DR1TCALC) by gender (RIAGENDR) in males and females ages 20     ;
* and older.                                                              ;
*-------------------------------------------------------------------------;

proc sort data =CALCMILK;
      by SDMVSTRA SDMVPSU;
run ;

proc descript data=CALCMILK design=wr;
      nest SDMVSTRA SDMVPSU;     
      weight WTDRD1;
      subpopn RIDAGEYR >= 20 ;
      class RIAGENDR/nofreq;
      var DR1TCALC;
      print nsum mean semean/style=nchs;
      rformat RIAGENDR GENDER. ;
      rtitle "Mean dietary calcium intake in males and females >= 20 years"
      "of age"
;
run ;
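
The RFORMAT statement in the sample code refers to a format named GENDER., which this page does not define. A minimal sketch of such a format follows, assuming the standard NHANES coding of RIAGENDR (1 = male, 2 = female) and labels chosen to match the output shown on this page; the definition actually used for the tutorial may differ. It would need to be run before the PROC DESCRIPT step.

```sas
* Hypothetical definition of the GENDER. format used by RFORMAT.          ;
* Assumes the NHANES coding of RIAGENDR: 1 = male, 2 = female.            ;
proc format;
      value GENDER
            1 = 'Male'
            2 = 'Female';
run ;
```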

 

Output of Program


Mean dietary calcium intake in males and females >= 20 years of age


Number of observations read    :   9034    Weighted count :286222757           
Number of observations skipped :   1088                                        
(WEIGHT variable nonpositive)                                                  
Observations in subpopulation  :   4448    Weighted count:205284669            
Denominator degrees of freedom :     15                                        
                                                                               
                                                                             
Variance Estimation Method: Taylor Series (WR)                                 
For Subpopulation: RIDAGEYR >= 20                                              
Mean dietary calcium intake in males and females >= 20 years of age       
by: Variable, Gender - Adjudicated.                                            
                                                                               
---------------------------------------------------------                      
Variable                                                                       
   Gender -            Sample                                                  
     Adjudicated       Size             Mean      SE Mean                      
---------------------------------------------------------                      
Calcium (mg)                                                                   
   Total                   4448       880.13        16.72                      
   Male                    2135       998.36        21.81                      
   Female                  2313       770.73        15.29                      
---------------------------------------------------------                      
  

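Highlights from the output include the following.  Among the 4,448 sample persons ages 20 and older in the subpopulation, the estimated mean calcium intake is 998.36 mg (standard error 21.81) for males and 770.73 mg (standard error 15.29) for females.  The overall mean is 880.13 mg (standard error 16.72), and variances are estimated by Taylor series linearization with 15 denominator degrees of freedom.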

 

Step 3: Use a t-test to Test for Significance

In this case, a t-test is used to test whether the mean calcium intake of males is statistically different from the mean calcium intake of females.  Note that the program below and the program presented in Step 2 are identical except for the CONTRAST statement and the statistics requested in the PRINT statement.  The CONTRAST statement is used to test the hypothesis that the difference in the means is equal to 0.  In other words, the null hypothesis is that the mean calcium intake of males is equal to that of females.
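
The hypothesis being tested can be written out explicitly. The notation below is introduced here for illustration and is not taken from the SUDAAN documentation: the symbols stand for the weighted mean calcium intake of males (M) and females (F), and the contrast coefficients are 1 for males and -1 for females.

```latex
\begin{aligned}
H_0 &: \mu_M - \mu_F = 0 \qquad \text{versus} \qquad H_A : \mu_M - \mu_F \neq 0 \\[4pt]
t &= \frac{(1)\,\bar{y}_M + (-1)\,\bar{y}_F}{\widehat{\mathrm{SE}}\!\left(\bar{y}_M - \bar{y}_F\right)}
\end{aligned}
```

The statistic is compared with a t distribution; it is assumed here that the reference degrees of freedom are the denominator degrees of freedom reported in the output (15 in this example).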

 

T-test for Association between Calcium Intake and Gender

Sample Code

*-------------------------------------------------------------------------;
* Use the PROC SORT procedure to sort the data files by strata and PSU.   ;
* Data must always be sorted before running a SUDAAN procedure.           ;
*                                                                         ;
* Use the PROC DESCRIPT procedure and the CONTRAST statement to perform a ;
* t-test.  This will test whether the mean dietary calcium intake         ;
* (DR1TCALC) in males and females is significantly different.             ;
*-------------------------------------------------------------------------;

proc sort data =CALCMILK;
      by SDMVSTRA SDMVPSU;
run ;

proc descript data=CALCMILK design=wr;
      nest SDMVSTRA SDMVPSU;   
      weight WTDRD1;
      subpopn RIDAGEYR >= 20 ;
      class RIAGENDR/nofreq;
      var DR1TCALC;
      contrast RIAGENDR = ( 1 -1 )/name = "Males vs. Females" ;
      print nsum t_mean p_mean/style=nchs;
      rformat RIAGENDR GENDER. ;
      rtitle "Mean dietary calcium intake in males and females >= 20 years"
      "of age"
;
run ;

 

Output of Program


Number of observations read    :   9034    Weighted count :286222757           
Number of observations skipped :   1088                                        
(WEIGHT variable nonpositive)                                                  
Observations in subpopulation  :   4448    Weighted count:205284669            
Denominator degrees of freedom :     15                                        
                                                                               
Variance Estimation Method: Taylor Series (WR)                                 
For Subpopulation: RIDAGEYR >= 20                                              
Mean dietary calcium intake in males and females >= 20 years                   
of age                                                                         
by: Variable, One, Contrast.                                                   
                                                                               
for: Variable = Calcium (mg).                                                  
                                                                               
-------------------------------------------------------                        
One                                            P-value                         
   Contrast                       T-Test       T-Test                          
                       Sample     Cont.Mean-   Cont.                           
                       Size       =0           Mean=0                          
-------------------------------------------------------                        
Total                                                                          
   Males vs. Females       4448        13.03     0.0000                        
1                                                                              
   Males vs. Females       4448        13.03     0.0000                        
-------------------------------------------------------                        
 

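Highlights from the output include the following.  The t-statistic for the contrast of males versus females is 13.03, and the p-value is printed as 0.0000, that is, less than 0.0001.  The null hypothesis of equal means is therefore rejected: mean calcium intake differs significantly between males and females ages 20 and older.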
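
As a rough check, the t-statistic can be related to the means reported in Step 2. The figures below are back-calculated from the rounded values printed in the two outputs, so they are approximate, and the implied standard error is an inference rather than a value SUDAAN reports here.

```latex
\begin{aligned}
\bar{y}_M - \bar{y}_F &= 998.36 - 770.73 = 227.63 \ \text{mg} \\[4pt]
\widehat{\mathrm{SE}}\!\left(\bar{y}_M - \bar{y}_F\right) &\approx \frac{227.63}{13.03} \approx 17.47 \\[4pt]
\sqrt{21.81^2 + 15.29^2} &= \sqrt{709.46} \approx 26.64 \\[4pt]
\mathrm{Var}\!\left(\bar{y}_M - \bar{y}_F\right) &= \mathrm{Var}(\bar{y}_M) + \mathrm{Var}(\bar{y}_F) - 2\,\mathrm{Cov}(\bar{y}_M, \bar{y}_F)
\end{aligned}
```

The implied standard error of the difference (about 17.47) is smaller than the value obtained by treating the two means as independent (about 26.64). This is consistent with a positive covariance between the two estimates, which come from the same strata and PSUs, and it is why the test should be taken from the CONTRAST statement rather than computed by hand from the two standard errors. With 15 degrees of freedom, the two-sided 5 percent critical value is about 2.131, far below 13.03.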

 
