Task 3b: How to Create an Appropriate Subset of Your Data for NHANES Analyses in SAS Survey Procedures

The following text explains the critical code needed to subset your data appropriately for analyses with SAS Survey procedures. For examples of complete SAS Survey procedure code, please see the Logistic Regression module.

Example used throughout this task: You are interested in analyzing only females aged 20-49 years who were tested for total cholesterol in a 2-year dataset.

 

Step 1: Create Dataset

First, you determine that you will include all MEC-examined individuals in your dataset.

The ridstatr variable in your demographic file has a value of 1 for participants who were interviewed only and a value of 2 for participants who were both interviewed and examined. Therefore, in the SAS data step, you use the condition ridstatr=2 to create a MEC-examined subset of the data.
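
A minimal sketch of this data step follows. The dataset names demo (the demographic file read into SAS) and mec_examined (the resulting subset) are hypothetical and are used only for illustration.

```sas
/* Keep only participants who were both interviewed and MEC examined. */
/* The dataset names demo and mec_examined are assumptions, not part  */
/* of the NHANES files themselves.                                    */
data mec_examined;
    set demo;
    if ridstatr = 2;   /* subsetting IF: rows that fail the condition are dropped */
run;
```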

 

 

Step 2: Specify correct weight in program

Next, you specify the correct weight for the SAS Survey procedure by using a weight statement. Because you are analyzing MEC-examined participants from a single 2-year cycle, use the wtmec2yr variable.
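
As a rough illustration, the weight statement might sit in a procedure call like the one below. The dataset name mec_examined is hypothetical, and the design variables sdmvstra (stratum) and sdmvpsu (PSU) are assumed here because this task does not name them; check your own files before relying on them.

```sas
/* Sketch only: mec_examined is a hypothetical dataset name.          */
/* sdmvstra and sdmvpsu are assumed to be the design variables on     */
/* your demographic file.                                             */
proc surveymeans data=mec_examined nobs mean stderr;
    strata sdmvstra;    /* sampling strata                  */
    cluster sdmvpsu;    /* primary sampling units           */
    weight wtmec2yr;    /* 2-year MEC examination weight    */
    var lbxtc;          /* total cholesterol                */
run;
```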

 

 

Step 3: Include selected subset

Suppose you want to analyze participants who are aged 20 through 49 years, are female, and have a valid measure for the total cholesterol variable lbxtc. You then need to identify that subpopulation for the procedure while keeping all MEC-examined observations available for variance estimation.

SAS Survey procedures have no subpopn statement. Instead, most SAS 9.2 Survey procedures use a domain statement for domain analysis, also known as subgroup analysis or subpopulation analysis. In SAS 9.1, proc surveymeans, proc surveyreg, proc surveyfreq, and proc surveylogistic each use a different method for selecting a subpopulation.

 

IMPORTANT NOTE

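You should not use a where clause or by-group processing to analyze a subpopulation with SAS Survey procedures. Both approaches remove observations outside the subpopulation from the calculation, so the procedure no longer sees the full sample design and the variance estimates can be incorrect.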

Methods for Subpopulation Analysis in SAS 9.1 Survey Procedures

proc surveymeans

proc surveymeans has a domain statement for domain or subpopulation analysis. Syntax details are in the SAS OnlineDoc:

http://support.sas.com/onlinedoc/913/getDoc/en/statug.hlp/surveymeans_sect6.htm
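
One way to apply the domain statement to the example in this task is sketched below. The indicator variable sel and the dataset names are hypothetical. The variables riagendr (assumed coded 2 for female), ridageyr (age in years), sdmvstra, and sdmvpsu are assumed from the NHANES demographic file and are not named in this task, so verify them against your data.

```sas
/* Sketch only: flag the subpopulation instead of deleting other rows. */
/* sel, mec_examined, and analysis are hypothetical names.             */
data analysis;
    set mec_examined;
    if riagendr = 2 and 20 <= ridageyr <= 49 and not missing(lbxtc)
        then sel = 1;   /* in the subpopulation of interest */
    else sel = 2;       /* everyone else stays in the file  */
run;

proc surveymeans data=analysis nobs mean stderr;
    strata sdmvstra;
    cluster sdmvpsu;
    weight wtmec2yr;
    domain sel;         /* estimates are reported for each level of sel */
    var lbxtc;
run;
```

In the output, the rows for sel=1 are the estimates for females aged 20-49 years with a valid total cholesterol value.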

 

proc surveyreg

You can use the %sregsub macro available on the SAS website at:

http://support.sas.com/ctx/samples/index.jsp?sid=483

A domain statement is being added to proc surveyreg in SAS 9.2.

 

proc surveyfreq

You can perform a domain analysis by including your domain variable(s) in the tables statement. Details are at: http://support.sas.com/onlinedoc/913/getDoc/en/statug.hlp/surveyfreq_sect12.htm
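
A brief sketch of this approach follows. The dataset analysis, the domain indicator sel, and the categorical variable hichol (a high cholesterol flag) are all hypothetical names introduced for illustration, and the design variables sdmvstra and sdmvpsu are assumed rather than taken from this task.

```sas
/* Sketch only: sel (domain indicator) and hichol (a categorical      */
/* outcome) are hypothetical variables assumed to exist in analysis.  */
proc surveyfreq data=analysis;
    strata sdmvstra;
    cluster sdmvpsu;
    weight wtmec2yr;
    tables sel*hichol / row;   /* row percents give the distribution within each domain */
run;
```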

 

proc surveylogistic

To get an approximate domain analysis, you assign a near-zero weight to observations that do not belong to your current domain. You cannot make the weight exactly zero because the procedure excludes any observation with zero weight. For example, suppose your domain variable is gender, a character variable with the values male and female, and you want to analyze males. In a data step, you specify:

     if gender='male' then newweight=weight;

     else newweight=1e-6;

You could then perform the logistic regression by naming the newweight variable in the weight statement:

     weight newweight;
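
Applied to the example in this task, the near-zero weight approach might look like the sketch below. The dataset names, the outcome hichol, and the predictor choice are hypothetical. The variables riagendr (assumed coded 2 for female), ridageyr, sdmvstra, and sdmvpsu are assumed from the NHANES demographic file rather than taken from this task.

```sas
/* Sketch only: mec_examined, analysis, and hichol are hypothetical.  */
data analysis;
    set mec_examined;
    /* Full weight inside the domain, near-zero weight outside it. */
    if riagendr = 2 and 20 <= ridageyr <= 49 and not missing(lbxtc)
        then newweight = wtmec2yr;
    else newweight = 1e-6;
run;

proc surveylogistic data=analysis;
    strata sdmvstra;
    cluster sdmvpsu;
    weight newweight;
    model hichol(event='1') = ridageyr;   /* hypothetical binary outcome and predictor */
run;
```

Because observations outside the domain keep a tiny positive weight, they remain in the design for variance estimation while contributing almost nothing to the point estimates.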

 

SAS hopes to add a domain statement for proc surveylogistic in future releases, although no timetable has been set.

 

Reference:

SAS Technical Support.

 
