Task 2a: How to Use SUDAAN Code to Specify Sampling Parameters in NHANES III

Once the data are sorted in SAS, SUDAAN statements specify the sampling design parameters. This example uses the SUDAAN procedure proc descript with a dataset named BP_analysis_Data. Proc descript serves only as a generic example; the same statements apply to all SUDAAN procedures.

Step 1: Sorting in SAS

For SUDAAN to apply the appropriate design option to NHANES data, BP_analysis_Data must be sorted by stratum first and then by PSU within stratum, unless the data are already in that order. The SAS proc sort step must precede the SUDAAN statements.

WARNING

Data must always be sorted in SAS before doing analyses in SUDAAN.

Step 2: Use proc statement in SUDAAN

The proc statement immediately follows the sort step. In this example, the proc descript statement is used. The data option names BP_analysis_Data as the SAS dataset to analyze, and the design option specifies with replacement (WR) as the sample design.

Step 3: Use nest statement in SUDAAN

The nest statement lists the variables that identify the strata and the PSUs. It is required for SUDAAN to apply the appropriate design option for NHANES III.

As in the sort statement, the nest statement lists the stratum variable (sdpstra6) first, followed by the PSU variable (sdppsu6).

The variable SDPPHASE designates the phase of data collection in which the sample person was chosen. A value of 1 designates phase 1 and a value of 2 designates phase 2.

In NHANES III, the stratum and PSU variables for each phase and for the combined sample are:

Phase 1 (1988-1991): sdpstra1 for stratum and sdppsu1 for PSU

Phase 2 (1991-1994): sdpstra2 for stratum and sdppsu2 for PSU

Combined (1988-1994): sdpstra6 for stratum and sdppsu6 for PSU
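
As a sketch, the nest statement for the combined six-year sample could be written as follows. This is a fragment that belongs inside a SUDAAN procedure step, not a complete program, and it assumes the data were already sorted by sdpstra6 and sdppsu6.

```sas
/* Fragment: place inside a SUDAAN procedure step such as proc descript. */
/* Stratum variable first, then PSU variable, matching the proc sort order. */
nest sdpstra6 sdppsu6;
```

For an analysis restricted to a single phase, the phase-specific pair listed above (for example, sdpstra1 and sdppsu1 for phase 1) would presumably replace sdpstra6 and sdppsu6 in both the sort and the nest statement.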

Step 4: Use weight statement in SUDAAN

In NHANES III, a sample weight is assigned to each sample participant. The sample weight measures the number of individuals in the target population that the sampled individual represents. Sample weights are needed to obtain unbiased estimates of population parameters when sample participants are chosen with unequal probabilities. (See the module on weighting for more details.)

The weight statement is required in SUDAAN survey procedures for all NHANES analyses; it identifies the sample weight variable. This example uses the MEC + home exam weight for 6 years of data (wtpfhx6).
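
To put the weight statement in context, a minimal sketch of a complete proc descript step might look like the following. The var statement names the analysis variable. Here bp_var is a hypothetical placeholder, not a variable from the NHANES III files, and should be replaced with the variable being analyzed.

```sas
/* Sort by stratum, then PSU, before calling SUDAAN */
proc sort data=BP_analysis_Data;
  by sdpstra6 sdppsu6;
run;

/* SAS-callable SUDAAN step using the with-replacement (WR) design */
proc descript data=BP_analysis_Data design=WR;
  nest sdpstra6 sdppsu6;   /* stratum variable, then PSU variable */
  weight wtpfhx6;          /* MEC + home exam weight, 6 years of data */
  var bp_var;              /* hypothetical analysis variable (assumption) */
run;
```

The same nest and weight statements would carry over unchanged to other SUDAAN procedures; only the procedure name and the analysis statements differ.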

Summary: Sample SUDAAN code for sorting and specifying sampling design parameters

The following table shows how to combine the statements described above to properly sort the data, and specify the sample design, design parameters, and sample weights. The procedure proc descript is being used as an example, but the design, nest and weight statements can be used in the same manner for all SUDAAN procedures. Additionally, other procedure options can be added to these statements to customize the analysis and output. Consult the SUDAAN manual for specifications on the options for each SUDAAN procedure.

SUDAAN descript Procedure

Statements

Explanation

proc sort data=BP_analysis_Data;
  by sdpstra6 sdppsu6;
run;

Use the SAS procedure, proc sort, to sort the data by the design parameters, strata (sdpstra6) and primary sampling units (sdppsu6), before running the procedure in SUDAAN.

proc descript data=BP_analysis_Data design=WR;

Use the proc statement to specify the SUDAAN procedure being used (proc descript here), the dataset (BP_analysis_Data), and the sample design (with replacement, WR).