Task 2b: How to Use SAS 9.2 Survey Procedures to Perform Linear Regression

In this example, you will assess the association between high density lipoprotein (HDL) cholesterol and selected covariates in NHANES 1999-2002. These covariates include gender (riagendr), race/ethnicity (ridreth1), age (ridageyr), body mass index (bmxbmi), smoking (smoker, derived from SMQ020 and SMQ040; smoker =1 if non-smoker, 2 if past smoker and 3 if current smoker) and education (dmdeduc).

 

Step 1: Create Variable to Subset Population

To subset the data in the SAS Survey Procedures, create a variable that identifies the population of interest. Do not use a where clause or by-group processing to analyze a subpopulation with the SAS Survey Procedures: both drop observations outside the subpopulation, and the procedures need the full sample design to estimate variances correctly.

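In this example, restrict the analysis to adults aged 20 and older who have a positive MEC weight and complete data for all the variables used in the final multiple regression model. The statement below creates an indicator variable, eligible, that equals 1 for these individuals and 2 for everyone else. This variable is then named in the domain statement to specify the population of interest.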

if (LBDHDL^=. and RIAGENDR^=. and  RIDRETH1^=. and SMOKER^=. and DMDEDUC^=. and BMXBMI^=.) and WTMEC4YR>0 and (RIDAGEYR>=20)

then ELIGIBLE=1;   else ELIGIBLE=2;
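As a minimal sketch, the statement above could sit inside a DATA step like the one below. The output name analysis_data matches the dataset used later in this task; the input name work.nhanes_raw is a hypothetical placeholder for a dataset that already contains the NHANES variables and the derived SMOKER variable.

```sas
/* Sketch only: work.nhanes_raw is an assumed input dataset name. */
data analysis_data;
    set work.nhanes_raw;

    /* Flag adults 20+ with a positive MEC weight and complete data */
    if (LBDHDL ^= . and RIAGENDR ^= . and RIDRETH1 ^= . and SMOKER ^= .
        and DMDEDUC ^= . and BMXBMI ^= .)
       and WTMEC4YR > 0 and RIDAGEYR >= 20
    then ELIGIBLE = 1;
    else ELIGIBLE = 2;
run;

/* Unweighted count of records in each domain, as a quick sanity check */
proc freq data=analysis_data;
    tables ELIGIBLE / missing;
run;
```

Every record stays in analysis_data; the ineligible records are simply labeled 2 so that the survey procedure can still use them when it estimates variances.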

 

Step 2: Recode Discrete Variables

To change the reference level for a discrete variable, recode the variable so that the desired reference category has the highest level.

The variable riagendr was recoded to make men the reference category. The name of the recoded variable is sex.

If  RIAGENDR EQ 1 then SEX=2;

Else if RIAGENDR EQ 2 THEN SEX=1;

The variable ridreth1 was recoded to make non-Hispanic Whites the reference group. The recoded variable is ethn.

ETHN= RIDRETH1;

If RIDRETH1 eq 3 then ETHN=5;

Else if RIDRETH1 eq 4 then ETHN=2;

Else if RIDRETH1 eq 2 then ETHN=3;

Else if RIDRETH1 eq 5 then ETHN=4;

Body mass index (bmxbmi) was categorized so that normal weight is the reference group. The recoded variable is bmicatf.

if 0 le BMXBMI lt 18.5 then BMICATF=1;

else if 18.5 le BMXBMI lt 25 then BMICATF=4;

else if 25 le BMXBMI lt 30 then BMICATF=2;

else if BMXBMI ge 30 then BMICATF=3;
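One way to confirm that each recode maps every original category to exactly one new level is to cross-tabulate the original and recoded variables. The sketch below assumes the recodes were run in a DATA step that produced analysis_data, the dataset name used later in this task.

```sas
/* Sketch only: assumes SEX, ETHN and BMICATF already exist in analysis_data. */

/* Each original code should pair with exactly one recoded level */
proc freq data=analysis_data;
    tables RIAGENDR*SEX RIDRETH1*ETHN / list missing;
run;

/* The BMI range within each BMICATF level should match the cut points */
proc means data=analysis_data n nmiss min max;
    class BMICATF / missing;
    var BMXBMI;
run;
```

If two original codes share a recoded level, or the reference category does not carry the highest level, the recode needs another look before modeling.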

 

Step 3: Set up SAS Survey Procedures for Simple Linear Regression

The dependent variable should be a continuous variable and will always appear on the left hand side of the equation. The variables on the right hand side of the equation are the independent variables and may be discrete or continuous.

When interactions are included in the model, they are denoted with an asterisk, *, between the two variables. An interaction can occur between a discrete and a continuous variable, or between two discrete variables. An interaction term will always appear on the right hand side of the equation.
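To illustrate the notation, the sketch below adds an interaction between a discrete variable (sex) and a continuous variable (bmxbmi). It is a hypothetical model and not one fitted in this tutorial. It assumes that sex and eligible have already been created in analysis_data, and it uses the same design statements that are explained in the table that follows.

```sas
/* Sketch only: an illustrative interaction model, not part of the tutorial's results. */
proc surveyreg data=analysis_data nomcar;
    strata sdmvstra;
    cluster sdmvpsu;
    weight wtmec4yr;
    class sex;                 /* discrete variable; the highest level is the reference */
    domain eligible;
    /* Dependent variable on the left; main effects and the interaction on the right */
    model lbdhdl = sex bmxbmi sex*bmxbmi / solution clparm vadjust=none;
run;
```

The solution option requests the parameter estimates when a class statement is present, so the coefficient for sex*bmxbmi appears in the output.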

The summary table below provides steps for performing linear regression analyses using SAS Survey procedures.

 

IMPORTANT NOTE

These programs use variable formats listed in the Tutorial Formats page. You may need to format the variables in your dataset the same way to reproduce results presented in the tutorial.

 

Option 1. Use SAS Survey Procedures for Simple Linear Regression
Statements Explanation
PROC SURVEYREG DATA=analysis_data nomcar;

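Use the SAS Survey procedure, proc surveyreg, to fit the linear regression model and test the significance of its coefficients. Use the nomcar option so that missing values are not assumed to be missing completely at random; observations with missing values are then kept for variance estimation instead of being dropped.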

STRATA sdmvstra;

Use the strata statement to specify the strata (sdmvstra) and account for design effects of stratification.

CLUSTER sdmvpsu;

Use the cluster statement to specify PSU (sdmvpsu) to account for design effects of clustering.

WEIGHT wtmec4yr;

Use the weight statement to account for the unequal probability of sampling and non-response.  In this example, the MEC weight for 4 years of data (wtmec4yr) is used.

DOMAIN eligible;

Use the domain statement to restrict the analysis to individuals with complete data for all the variables used in the final multiple regression model.

WARNING

When using proc surveyreg, use a domain statement to select the population of interest. Do not use a where or by-group statement to analyze subpopulations with the SAS Survey Procedures.

MODEL lbdhdl= bmxbmi/CLPARM VADJUST=none;