Task 1a: How to Estimate the Prevalence of Supplement Use with Proportions in SUDAAN

In this example, to determine the prevalence of calcium supplement use among older adults in the U.S., you will identify women and men 50 years and older who report calcium supplement use during the household interview.

 

Step 1: Determine variables of interest

This example uses the demoadv dataset (may be downloaded at Sample Code and Datasets). This dataset contains a created variable called anycalsup that has a value of 1 for those who report calcium supplement use, and a value of 2 for those who do not. A participant was considered not to have any calcium supplement use if the daily average amount of calcium supplement use was zero; otherwise, a participant was considered a supplement user (see Supplement Code under Sample Code and Module 9, Task 4 for more information).

 

Step 2: Sort data

The data from demoadv must be sorted by strata first and then PSU (unless the data have already been sorted by PSU within strata). The SAS proc sort statement must precede the SUDAAN statements.

IMPORTANT NOTE

The design variables sdmvstra and sdmvpsu are provided in the NHANES demographic data files and are used to calculate variance estimates. Before you call SUDAAN from SAS, the data must be sorted by these variables.

 

Step 3: Use proc descript to generate proportions

In this example, you will use proc descript in SUDAAN to generate proportions. The dataset contains a categorical variable, anycalsup, to indicate whether or not a person reported supplement use. That categorical variable will be identified in the procedure and the weighted percent (prevalence) of sample persons with the value anycalsup=1 (calcium supplement use) will be estimated along with the standard error.

You can code the variable in this example in one of two ways. In the first, persons who report calcium supplement use are assigned a value of 1 and all other sample persons are assigned a value of 2; the catlevel statement in SUDAAN then selects the level of interest. The weighted percentage of sample persons with a value equal to 1 is an estimate of the prevalence of calcium supplement use in the U.S. In the second, persons who report calcium supplement use are assigned a value of 100 and persons who do not report calcium supplement use are assigned a value of 0. The weighted mean of this 100/0 variable, which is expressed as a percent, is likewise an estimate of the prevalence of calcium supplement use in the U.S.
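The second coding approach can be sketched in a short SAS data step. This is a minimal illustration and not part of the original sample code: the variable name calsup100 and the output dataset name demoadv2 are hypothetical, and the sketch assumes anycalsup takes only the values 1 and 2 (any other value leaves the new variable missing).

```sas
/* Sketch only: calsup100 and demoadv2 are hypothetical names. */
data demoadv2;
    set demoadv;
    /* 100 = reports calcium supplement use, 0 = reports no use */
    if anycalsup = 1 then calsup100 = 100;
    else if anycalsup = 2 then calsup100 = 0;
    /* any other value of anycalsup leaves calsup100 missing */
run;
```

With this coding, the weighted mean of calsup100 can be read directly as a percent.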

The SUDAAN procedure, proc descript, is used to generate percents and standard errors.  You request those estimates on the print statement along with the sample size (nsum). The general program for obtaining weighted percents and standard errors is shown below.

 

Generate Proportions in SUDAAN
Statements Explanation

PROC SORT DATA=demoadv;
BY sdmvstra sdmvpsu;
RUN;

Use the proc sort procedure in SAS to sort the dataset by strata (sdmvstra) and PSU (sdmvpsu). The data option names the dataset, demoadv.

PROC DESCRIPT data=demoadv design=wr;

Use the proc descript procedure in SUDAAN to generate the weighted percents, and specify the sample design using the design option wr (with replacement).

subpopn ridageyr >= 50;

Use the subpopn statement to select people 50 years and older (ridageyr >= 50) because only those individuals are of interest in this example. Please note that for accurate variance estimates, it is preferable to use subpopn in SUDAAN to select a subpopulation of interest for analysis, rather than selecting the study population in the SAS data step while preparing the analysis data file.

nest sdmvstra sdmvpsu;

Use the nest statement with strata (sdmvstra) and PSU (sdmvpsu) to account for the design effects.

weight wtint2yr;

Use the weight statement to account for the unequal probability of sampling and non-response.  In this example, the interview weight for 2 years of data (wtint2yr) is used.

subgroup riagendr;

Use the subgroup statement to list the categorical variables for which statistics are requested. This example specifies gender (riagendr) as a categorical variable. This variable will also appear in the table statement.

levels 2;

Use the levels statement to define the number of categories for each of the subgroup variables. The level must be an integer greater than 0. This example uses two levels for gender (the "2" on the levels statement corresponds to riagendr on the subgroup statement).

var anycalsup;

Use the var statement to name the variable(s) to be analyzed. In this example, the calcium supplement use variable (anycalsup) is used.

catlevel 1;  

Use the catlevel statement to indicate that the variable(s) on the var statement are categorical and to select the level of each variable to be analyzed. This example indicates that the variable anycalsup is categorical and that the level of interest is anycalsup=1, i.e., persons who report calcium supplement use.

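IMPORTANT NOTE

Note that the catlevel statement may be omitted if you code the anycalsup variable as 100 for persons reporting calcium supplement use and 0 for persons reporting no use. In that case, the prevalence is obtained as a weighted mean, so request the mean and its standard error (mean and semean) on the print statement in place of percent and sepercent.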

table riagendr;

Use the table statement to specify the cross-tabulations for which estimates are requested. This example requests estimates by gender (riagendr).

print nsum="Sample Size" percent="Percent" sepercent="SE" /
nohead notime style=NCHS nsumfmt=f8.0 percentfmt=f8.4 sepercentfmt=f8.4;

 

Use the print statement to assign names, format the statistics desired, and view the output. If the statement print is used alone, all of the default statistics are printed with default labels and formats.

In this example, sample size (nsum), percent (percent), and standard error of the percent (sepercent) are requested.  The percent represents the proportion of persons with anycalsup=1.

Note: For a complete list of statistics that can be requested on the print statement see the SUDAAN Users’ Manual (http://www.rti.org/sudaan/).

Use the style option equal to NCHS to produce output which parallels a table style used at NCHS.

rtitle "Prevalence of SPs age 50 and older who report calcium supplement use" ;

run ;

Use the rtitle statement to assign a heading for each page of output.
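Assembled into a single program, the statements described above might look like the following sketch. It only collects the statements from the table and adds comments; it assumes that demoadv is available in the current SAS session and that SAS-callable SUDAAN is installed.

```sas
/* Sort by strata and PSU before calling SUDAAN */
proc sort data=demoadv;
    by sdmvstra sdmvpsu;
run;

/* Weighted percent of calcium supplement users, ages 50+, by gender */
proc descript data=demoadv design=wr;
    subpopn ridageyr >= 50;      /* subpopulation of interest */
    nest sdmvstra sdmvpsu;       /* strata and PSU */
    weight wtint2yr;             /* 2-year interview weight */
    subgroup riagendr;           /* categorical variable */
    levels 2;                    /* two levels for riagendr */
    var anycalsup;               /* analysis variable */
    catlevel 1;                  /* level of interest: anycalsup=1 */
    table riagendr;              /* estimates by gender */
    print nsum="Sample Size" percent="Percent" sepercent="SE" /
          nohead notime style=NCHS
          nsumfmt=f8.0 percentfmt=f8.4 sepercentfmt=f8.4;
    rtitle "Prevalence of SPs age 50 and older who report calcium supplement use";
run;
```

Keeping the subpopn statement inside proc descript, rather than subsetting demoadv beforehand, follows the recommendation given for the subpopn statement above.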

 

 

Step 4: Review Output

The percents in the output are the estimated proportions of persons 50 years and older in the target population who report calcium supplement use.

 
