Task 1c: How to Check Frequency Distribution and Normality in Stata

A frequency distribution can be presented as a table or as a graph. In this task, you will learn how to use the standard Stata commands summarize, histogram, graph box, and tabstat to generate these representations of a data distribution. The resulting statistics and plots also help you decide whether parametric tests (which assume a normal distribution) or non-parametric tests are appropriate for your analysis. As noted in the Clean & Recode Data module, it is advisable to check for extreme weights and outliers before starting any analysis.
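
As a quick orientation, the four commands named above can be sketched as follows. This is a minimal, unweighted illustration only: it assumes the small auto example dataset that ships with Stata and its variable mpg, neither of which is part of NHANES, and it does not yet incorporate survey weights.

```stata
* Load an example dataset bundled with Stata
* (an assumption for illustration; this is not NHANES data)
sysuse auto, clear

* Detailed summary statistics: percentiles, variance, skewness, kurtosis
summarize mpg, detail

* Histogram with a normal density curve overlaid for visual comparison
histogram mpg, normal

* Box plot to help spot potential outliers
graph box mpg

* Compact table of selected statistics
tabstat mpg, statistics(n mean sd p50 skewness kurtosis)
```

For a roughly normal variable, the mean and median (p50) should be close, skewness should be near 0, and kurtosis should be near 3.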

 

WARNING

There are several things you should be aware of while analyzing NHANES data with Stata. Please see the Stata Tips page to review them before continuing.


Step 1: Use the summarize command to generate weighted summary statistics for a population subset

The Stata command summarize generates descriptive and summary statistics that are useful in describing the characteristics of a distribution. Because the SVY series of commands does not include summarize, you will need to use the standard summarize command and tell Stata to incorporate the weights. Below are instructions on how to write these commands and interpret the output.

 

This command has the general structure: