Task 4a: How to Set Up a T-Test in NHANES Using SUDAAN

In this task, you will use SUDAAN to calculate a t-statistic and assess whether mean systolic blood pressure differs significantly between calcium supplement users and non-users ages 20 years and older.

 

Step 1: Identify variables

This example uses the demoadv dataset (download at Sample Code and Datasets).  The t-test is used with one continuous variable and one dichotomous variable; here the continuous variable is mean systolic blood pressure and the dichotomous variable is calcium supplement use.  The dataset contains a created variable called anycalsup that has a value of 1 for those who report calcium supplement use, and a value of 2 for those who do not.  A participant was considered a non-user if the daily average amount of calcium from supplements was zero; otherwise, the participant was considered a supplement user (see Supplement Code under Sample Code and Module 9, Task 4 for more information).  Blood pressure is measured in the MEC; therefore, MEC weights are used in the analysis.  The demoadv dataset for this example includes only participants with a positive MEC weight (wtmec2yr>0):

data demoadv;

  set nh.demoadv;

    if wtmec2yr> 0 ;

run ;

 

Step 2: Sort Data

Before running any SUDAAN procedure, sort the data by strata and PSUs using PROC SORT.

proc sort data=demoadv ;
by sdmvstra sdmvpsu;
run ;

 

Step 3: Compute Properly Weighted Estimated Means

Code to Generate Weighted Means by Calcium Supplement Use

Statements Explanation

proc descript data=demoadv design=wr;

Use the proc descript procedure to generate means and specify the sample design using the design option wr (with replacement). 

nest sdmvstra sdmvpsu;

 

Use the nest statement with the strata (sdmvstra) and PSU (sdmvpsu) variables to account for the design effects.

weight wtmec2yr;

 

Use the weight statement to account for the unequal probability of sampling and non-response. 

subpopn ridageyr >= 20 ;

 

Use the subpopn statement to select the population of interest.  For accurate estimates of the standard error, select the subgroup with the subpopn statement in SUDAAN rather than by subsetting the data in a SAS DATA step when preparing the data file; subsetting beforehand can discard design information (strata and PSUs) that the variance estimate needs.

class anycalsup/nofreq;

Use a class statement to define the categorical variables in the analysis and the nofreq option to suppress frequencies.

var mean_sbp;

Use the var statement to choose the continuous variable for mean systolic blood pressure.

print nsum mean semean/style=nchs;

Use the print statement to obtain the N, mean, and standard error of the mean and to display the output in the style of your choice.

rformat anycalsup yesnos. ;

 

Use the rformat statement to read the SAS formats into SUDAAN.

rtitle "Mean systolic blood pressure by calcium supplement use in males and females >= 20 years of age" ;

Use the rtitle statement to title the output.
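
For reference, the statements in the table above can be assembled into a single program, sketched below.  The proc format step is an assumption: the yesnos. format is referenced by rformat but not defined on this page, so the labels shown here are hypothetical placeholders.  The closing run statement is also added here and does not appear in the table.

```sas
/* Hypothetical definition of the yesnos. format used by rformat;  */
/* the actual labels in the tutorial's sample code may differ.     */
proc format;
  value yesnos
    1 = "Calcium supplement user"
    2 = "Non-user";
run;

/* Weighted mean systolic blood pressure by calcium supplement use */
/* among adults 20 years and older (data sorted as in Step 2).     */
proc descript data=demoadv design=wr;
  nest sdmvstra sdmvpsu;                 /* strata and PSUs          */
  weight wtmec2yr;                       /* MEC exam weight          */
  subpopn ridageyr >= 20;                /* subgroup of interest     */
  class anycalsup / nofreq;              /* 1 = user, 2 = non-user   */
  var mean_sbp;                          /* analysis variable        */
  print nsum mean semean / style=nchs;   /* N, mean, SE of the mean  */
  rformat anycalsup yesnos.;
  rtitle "Mean systolic blood pressure by calcium supplement use in males and females >= 20 years of age";
run;
```

The full demoadv dataset is passed to the procedure, and the age restriction is applied only through subpopn, in line with the note on standard errors above.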

 

Step 4: Interpret Results

 Highlights from the output include:

 

Step 5: Use a t-test to Test for Significance

A t-test is used to test whether the mean systolic blood pressure in calcium supplement users is statistically different from the mean systolic blood pressure in non-users.  Note that the program below and the program presented in Step 3 are identical except for the contrast statement and the statistics requested in the print statement.  The contrast statement is used to test the hypothesis that the difference in the means is equal to 0.  In other words, it is used to test whether the mean systolic blood pressure in calcium supplement users is the same as the mean systolic blood pressure in non-users.
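
As a rough sketch of the hypothesis test described above, the t-statistic for the contrast can be written as follows.  The symbols are notation introduced here for illustration and are not SUDAAN output labels; the standard error in the denominator is estimated from the survey design (strata, PSUs, and weights), so the result will generally differ from a t-test that assumes simple random sampling.

```latex
H_0:\ \mu_{\text{users}} - \mu_{\text{non-users}} = 0
\qquad
t = \frac{\bar{x}_{\text{users}} - \bar{x}_{\text{non-users}}}
         {\operatorname{SE}\!\left(\bar{x}_{\text{users}} - \bar{x}_{\text{non-users}}\right)}
```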

T-test for Association between Calcium Supplements and Systolic Blood Pressure

Statements Explanation

proc sort data =demoadv;

by sdmvstra sdmvpsu;

run ;

Use the SAS procedure, proc sort, to sort the data by strata and primary sampling units (PSU) before running the procedure.

proc descript data=demoadv design=wr;

Use the proc descript procedure to generate means and specify the sample design using the design option WR (with replacement). 

nest SDMVSTRA SDMVPSU;

Use the nest statement with the strata (SDMVSTRA) and PSU (SDMVPSU) variables to account for the design effects.

weight wtmec2yr;

Use the weight statement to account for the unequal probability of sampling and non-response. 

subpopn ridageyr >= 20 ;

Use the subpopn statement to select the population of interest.  For accurate estimates of the standard error, select the subgroup with the subpopn statement in SUDAAN rather than by subsetting the data in a SAS DATA step when preparing the data file; subsetting beforehand can discard design information (strata and PSUs) that the variance estimate needs.

class anycalsup/nofreq;

Use a class statement to define the categorical variables in the analysis and the nofreq option to suppress frequencies.

var mean_sbp;

Use the var statement to choose the continuous variable for mean systolic blood pressure.

contrast anycalsup = (1 -1) / name = "calcium supp user vs. non user";

Use the contrast statement to compare the differences between the groups. In this case, we are comparing calcium supplement users and non-users.

print nsum t_mean p_mean/style=nchs;

Use the print statement to obtain the N, the t-statistic for the contrast (t_mean), and its p-value (p_mean).

rformat anycalsup yesnos. ;

Use the rformat statement to read the SAS formats into SUDAAN.

rtitle "Mean systolic blood pressure by calcium supplement use in males and females >= 20 years of age" ;

Use the rtitle statement to title the output.
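
Put together, the t-test statements in the table above might look like the following sketch.  It assumes the yesnos. format has already been defined in the SAS session (it is not defined on this page), and the closing run statement is added here.  The contrast coefficients 1 and -1 apply to the two levels of anycalsup in order (1 = user, 2 = non-user), so the contrast estimates the user mean minus the non-user mean.

```sas
/* Sort by strata and PSU before calling any SUDAAN procedure.   */
proc sort data=demoadv;
  by sdmvstra sdmvpsu;
run;

/* t-test of the difference in mean systolic blood pressure      */
/* between calcium supplement users and non-users, ages 20+.     */
proc descript data=demoadv design=wr;
  nest sdmvstra sdmvpsu;                 /* strata and PSUs         */
  weight wtmec2yr;                       /* MEC exam weight         */
  subpopn ridageyr >= 20;                /* subgroup of interest    */
  class anycalsup / nofreq;              /* 1 = user, 2 = non-user  */
  var mean_sbp;                          /* analysis variable       */
  contrast anycalsup = (1 -1) / name = "calcium supp user vs. non user";
  print nsum t_mean p_mean / style=nchs; /* N, t-statistic, p-value */
  rformat anycalsup yesnos.;             /* assumes yesnos. exists  */
  rtitle "Mean systolic blood pressure by calcium supplement use in males and females >= 20 years of age";
run;
```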

 Highlights from the output include:

 
