NHANES - Physical Activity Tutorial - Preparing

Key Concepts about Defining Formats and Labeling Variables

Formats and labels are user-defined tools that provide a convenient way to describe variables and their values in SAS or SUDAAN output. Formatting and labeling are optional, but investigators often rely on them because they help keep track of frequently used variables and make program output easier to read.

Formatting assigns descriptive text to the numeric or character values of a variable. For example, you can create a format named “YESNO” in which “Yes” represents a value of 1 and “No” represents a value of 2. You can then apply this format to any variable in the dataset that has the same response categories (i.e., 1 and 2). As a result, in your output, every value of 1 for that variable is displayed as “Yes” and every value of 2 is displayed as “No.”
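The YESNO example above might look like the following in SAS. This is only a minimal sketch: the dataset DEMO and the variable HADEXERCISE are hypothetical names made up for illustration and are not NHANES variables.

```sas
/* Define a user-defined format named YESNO: 1 = Yes, 2 = No. */
proc format;
  value yesno
    1 = 'Yes'
    2 = 'No';
run;

/* Hypothetical demo dataset; HADEXERCISE is an illustrative variable
   coded 1/2 and is not an actual NHANES variable. */
data demo;
  input id hadexercise;
  datalines;
1 1
2 2
3 1
;
run;

/* Apply the format. The frequency table displays Yes/No
   in place of the stored values 1/2. */
proc freq data=demo;
  tables hadexercise;
  format hadexercise yesno.;
run;
```

Note the period after the format name in the FORMAT statement (yesno.). The stored values remain 1 and 2; only the way they are displayed changes.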

Labeling, on the other hand, allows you to assign a descriptive title to a variable name. Variable names are abbreviated series of letters, or letters and numbers (e.g., CVDFITLV). Labeling is a way to flesh out this shorthand with some explanatory detail. For example, the variable “CVDFITLV” can be given the label “Cardiovascular Fitness Level.”
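A label such as the one described above could be assigned with a LABEL statement. The sketch below assumes a small made-up dataset named DEMO; the data values are illustrative only and are not taken from NHANES.

```sas
/* Hypothetical demo dataset; the values below are made up. */
data demo;
  input seqn cvdfitlv;
  /* Attach a descriptive label to the variable name. */
  label cvdfitlv = 'Cardiovascular Fitness Level';
  datalines;
1 1
2 3
;
run;

/* PROC CONTENTS lists each variable together with its label. */
proc contents data=demo;
run;
```

Because the LABEL statement appears in a DATA step, the label is stored with the dataset and carries over to later procedures that use it.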

The distinction between formatting and labeling is that formats are applied to the values of a variable, whereas labels are applied to the variable name. Formats must be explicitly defined in the code before they can be applied, so this step is usually done at the beginning of a program or at the beginning of a new sequence within a program. Labels can be added at any point in the code.
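One possible way to picture this ordering is sketched below: the format is defined first, and the label is added later, inside a procedure step. The dataset DEMO, the variable ANSWER, and the label text are all hypothetical and chosen only for illustration.

```sas
/* Step 1: define the format before any step that uses it. */
proc format;
  value yesno
    1 = 'Yes'
    2 = 'No';
run;

/* Hypothetical demo dataset with an illustrative variable ANSWER. */
data demo;
  input answer;
  datalines;
1
2
;
run;

/* Step 2: the format is applied to the values, and a label is
   attached to the variable name. The LABEL option on PROC PRINT
   asks for labels as column headings. */
proc print data=demo label;
  format answer yesno.;
  label answer = 'Hypothetical yes/no item';
run;
```

FORMAT and LABEL statements placed in a procedure step, as here, apply only to that step. Placing them in a DATA step stores them with the dataset instead.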