Lesson 4: Displaying Public Health Data
Section 2: Tables
A table is a set of data arranged in rows and columns. Almost any quantitative information can be organized into a table. Tables are useful for demonstrating patterns, exceptions, differences, and other relationships. In addition, tables usually serve as the basis for preparing additional visual displays of data, such as graphs and charts, in which some of the details may be lost.
Tables designed to present data to others should be as simple as possible.(1) Two or three small tables, each focusing on a different aspect of the data, are easier to understand than a single large table that contains many details or variables.
A table in a printed publication should be self-explanatory. If a table is taken out of its original context, it should still convey all the information necessary for the reader to understand the data. To create a table that is self-explanatory, follow the guidelines below.
More About Constructing Tables
- Use a clear and concise title that describes person, place and time — what, where, and when — of the data in the table. Precede the title with a table number.
- Label each row and each column and include the units of measurement for the data (for example, years, mm Hg, mg/dl, rate per 100,000).
- Show totals for rows and columns, where appropriate. If you show percentages (%), also give their total (always 100).
- Identify missing or unknown data either within the table (for example, Table 4.11) or in a footnote below the table.
- Explain any codes, abbreviations, or symbols in a footnote (for example, Syphilis P&S = primary and secondary syphilis).
- Note exclusions in a footnote (e.g., 1 case and 2 controls with unknown family history were excluded from this analysis).
- Note the source of the data below the table or in a footnote if the data are not original.
One-variable tables
In descriptive epidemiology, the most basic table is a simple frequency distribution with only one variable, such as Table 4.1a, which displays the number of reported syphilis cases in the United States in 2002 by age group.(2) (Frequency distributions are discussed in Lesson 2.) In this type of frequency distribution table, the first column shows the values or categories of the variable represented by the data, such as age or sex. The second column shows the number of persons or events that fall into each category. In constructing any table, the choice of columns depends on the interpretation to be made. In Table 4.1a, the point the analyst wishes to make is the role of age as a risk factor for syphilis. Thus, age group is chosen as column 1 and case count as column 2.
To create a frequency distribution from a data set in Analysis Module:
Select frequencies, then choose variable under Frequencies of.
(Since Epi Info 3 is the recommended version, only commands for this version are provided in the text; corresponding commands for Epi Info 6 are offered at the end of the lesson.)
Often, an additional column lists the percentage of persons or events in each category (see Table 4.1b). The percentages shown in Table 4.1b actually add up to 99.9% rather than 100.0% due to rounding to one decimal place. Rounding that results in totals of 99.9% or 100.1% is common in tables that show percentages. Nonetheless, the total percentage should be displayed as 100.0%, and a footnote explaining that the difference is due to rounding should be included.
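The percent column and the rounding footnote can be reproduced with a few lines of code. The following Python sketch is only an illustration, not part of Epi Info; the variable names are our own, and the counts are copied from Table 4.1a.

```python
# Counts copied from Table 4.1a (reported primary and secondary syphilis
# cases, United States, 2002). Keys are plain-text versions of the age groups.
cases_by_age = {
    "<=14": 21, "15-19": 351, "20-24": 842, "25-29": 895, "30-34": 1097,
    "35-39": 1367, "40-44": 1023, "45-54": 982, ">=55": 284,
}

total_cases = sum(cases_by_age.values())

# Percent of all cases in each age group, rounded to one decimal place.
percent_by_age = {
    group: round(100 * count / total_cases, 1)
    for group, count in cases_by_age.items()
}

# Sum of the rounded percentages; rounding can push this slightly off 100.0.
rounded_sum = round(sum(percent_by_age.values()), 1)

for group, count in cases_by_age.items():
    print(f"{group:>6} {count:>6,} {percent_by_age[group]:>6.1f}")

# The total is displayed as 100.0 and flagged with a footnote marker.
print(f"{'Total':>6} {total_cases:>6,} {100.0:>6.1f}*")
if rounded_sum != 100.0:
    print(f"* Rounded percentages sum to {rounded_sum:.1f}% because of rounding.")
```

Computing each percentage from the raw count, rather than adjusting a value so that the column sums to exactly 100.0, keeps every row accurate and leaves the discrepancy to the footnote.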
The addition of percent to a table shows the relative burden of illness; for example, in Table 4.1b, we see that the largest contribution to illness for any single age category is from persons aged 35–39 years. The subsequent addition of cumulative percent (e.g., Table 4.1c) allows the public health analyst to illustrate the impact of a targeted intervention. Here, any intervention effective at preventing syphilis among young people and young adults (under age 35) would prevent almost half of the cases in this population.
The one-variable table can be further modified to show cumulative frequency and/or cumulative percentage, as in Table 4.1c. From this table, you can see at a glance that 46.7% of the primary and secondary syphilis cases occurred in persons younger than age 35 years, meaning that over half of the syphilis cases occurred in persons age 35 years or older. Note that the choice of age-groupings will affect the interpretation of your data.(3)
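A cumulative percent column can be derived from the same counts. The Python sketch below is a minimal illustration with our own variable names; it accumulates the counts first and converts to percentages afterward, which is one reasonable approach rather than the only one.

```python
from itertools import accumulate

# Age groups and counts copied from Table 4.1a, in ascending age order.
age_groups = ["<=14", "15-19", "20-24", "25-29", "30-34",
              "35-39", "40-44", "45-54", ">=55"]
case_counts = [21, 351, 842, 895, 1097, 1367, 1023, 982, 284]

total = sum(case_counts)

# Running total of cases up to and including each age group.
cumulative_counts = list(accumulate(case_counts))

# Convert each running total to a percentage of all cases.
cumulative_percent = [round(100 * c / total, 1) for c in cumulative_counts]
cumulative_by_group = dict(zip(age_groups, cumulative_percent))

for group, pct in cumulative_by_group.items():
    print(f"{group:>6} {pct:>6.1f}")
```

Accumulating counts before rounding avoids compounding the rounding error of the individual percentages. This is why the cumulative value for ages 40–44 comes out as 81.6, even though adding the rounded percentages in Table 4.1b one by one would give 81.5.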
Table 4.1a Reported Cases of Primary and Secondary Syphilis by Age — United States, 2002
Age Group (years) | Number of Cases |
---|---|
≤14 | 21 |
15–19 | 351 |
20–24 | 842 |
25–29 | 895 |
30–34 | 1,097 |
35–39 | 1,367 |
40–44 | 1,023 |
45–54 | 982 |
≥55 | 284 |
Total | 6,862 |
Data Source: Centers for Disease Control and Prevention. Sexually Transmitted Disease Surveillance 2002. Atlanta: U.S. Department of Health and Human Services; 2003.
Table 4.1b Reported Cases of Primary and Secondary Syphilis by Age — United States, 2002
Age Group (years) | Number of Cases | Percent |
---|---|---|
Total | 6,862 | 100.0* |
≤14 | 21 | 0.3 |
15–19 | 351 | 5.1 |
20–24 | 842 | 12.3 |
25–29 | 895 | 13.0 |
30–34 | 1,097 | 16.0 |
35–39 | 1,367 | 19.9 |
40–44 | 1,023 | 14.9 |
45–54 | 982 | 14.3 |
≥55 | 284 | 4.1 |
* Actual total of percentages for this table is 99.9% and does not add to 100.0% due to rounding error.
Data Source: Centers for Disease Control and Prevention. Sexually Transmitted Disease Surveillance 2002. Atlanta: U.S. Department of Health and Human Services; 2003.
Table 4.1c Reported Cases of Primary and Secondary Syphilis by Age — United States, 2002
Age Group (years) | Number of Cases | Percent | Cumulative Percent |
---|---|---|---|
Total | 6,862 | 100.0* | 100.0* |
≤14 | 21 | 0.3 | 0.3 |
15–19 | 351 | 5.1 | 5.4 |
20–24 | 842 | 12.3 | 17.7 |
25–29 | 895 | 13.0 | 30.7 |
30–34 | 1,097 | 16.0 | 46.7 |
35–39 | 1,367 | 19.9 | 66.6 |
40–44 | 1,023 | 14.9 | 81.6 |
45–54 | 982 | 14.3 | 95.9 |
≥55 | 284 | 4.1 | 100.0 |
* Percentages do not add to 100.0% due to rounding error.
Data Source: Centers for Disease Control and Prevention. Sexually Transmitted Disease Surveillance 2002. Atlanta: U.S. Department of Health and Human Services; 2003.
Two- and three-variable tables
Tables 4.1a, 4.1b, and 4.1c show case counts (frequency) by a single variable, e.g., age. Data can also be cross-tabulated to show counts by an additional variable. Table 4.2 shows the number of syphilis cases cross-classified by both age group and sex of the patient.
Table 4.2 Reported Cases of Primary and Secondary Syphilis by Age and Sex — United States, 2002
Age Group (years) | Male Cases | Female Cases | Total Cases |
---|---|---|---|
Total | 5,268 | 1,594 | 6,862 |
≤14 | 9 | 12 | 21 |
15–19 | 135 | 216 | 351 |
20–24 | 533 | 309 | 842 |
25–29 | 668 | 227 | 895 |
30–34 | 877 | 220 | 1,097 |
35–39 | 1,121 | 246 | 1,367 |
40–44 | 845 | 178 | 1,023 |
45–54 | 825 | 157 | 982 |
≥55 | 255 | 29 | 284 |
Data Source: Centers for Disease Control and Prevention. Sexually Transmitted Disease Surveillance 2002. Atlanta: U.S. Department of Health and Human Services; 2003.
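When a cross-tabulation is assembled by hand, it is worth confirming that the row and column totals agree with the cell counts. The Python sketch below does this for Table 4.2; the data structure and names are our own choices for illustration.

```python
# Cell counts copied from Table 4.2: age group -> (male cases, female cases).
cases_by_age_sex = {
    "<=14": (9, 12), "15-19": (135, 216), "20-24": (533, 309),
    "25-29": (668, 227), "30-34": (877, 220), "35-39": (1121, 246),
    "40-44": (845, 178), "45-54": (825, 157), ">=55": (255, 29),
}

# Row totals: cases in each age group, both sexes combined.
row_totals = {group: m + f for group, (m, f) in cases_by_age_sex.items()}

# Column totals: all male cases and all female cases.
male_total = sum(m for m, _ in cases_by_age_sex.values())
female_total = sum(f for _, f in cases_by_age_sex.values())

# The grand total must be the same whether summed by row or by column.
grand_total = male_total + female_total
assert grand_total == sum(row_totals.values())

print(f"Male: {male_total:,}  Female: {female_total:,}  Total: {grand_total:,}")
```

The row totals produced here match the one-variable counts in Table 4.1a, which is a useful consistency check between the two tables.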
To create a two-variable table from a data set in Analysis Module:
Select tables, then choose the exposure variable under Exposure Variable and the outcome variable under Outcome Variable. Output shows table with row and column percentages, plus chi-square and p-value. For a two-by-two table, output also provides odds ratio, risk ratio, risk difference and confidence intervals. Note that for a cohort study, the row percentage in cells of ill patients is the attack proportion, sometimes called the attack rate.
A two-variable table with data categorized jointly by those two variables is known as a contingency table. When each of the two variables has only two categories, the contingency table is called a two-by-two table, a format that is a favorite among epidemiologists. Two-by-two tables are convenient for comparing persons with and without the exposure and those with and without the disease. From these data, epidemiologists can assess the relationship, if any, between the exposure and the disease. Table 4.3 is a two-by-two table that shows one of the key findings from an investigation of carbon monoxide poisoning following an ice storm and prolonged power failure in Maine.(4) In the table, the exposure variable, location of power generator, has two categories: inside or outside the home. Similarly, the outcome variable, carbon monoxide poisoning, has two categories: cases (persons who became ill) and controls (persons who did not become ill).
Table 4.3 Generator Location and Risk of Carbon Monoxide Poisoning After an Ice Storm — Maine, 1998
Generator Location | Number of Cases | Number of Controls | Total |
---|---|---|---|
Total | 27 | 162 | 189 |
Inside home or attached structure | 23 | 23 | 46 |
Outside home | 4 | 139 | 143 |
Data Source: Daley RW, Smith A, Paz-Argandona E, Mallilay J, McGeehin M. An outbreak of carbon monoxide poisoning after a major ice storm in Maine. J Emerg Med 2000;18:87–93.
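One measure commonly calculated from a case-control two-by-two table is the odds ratio, the cross-product of the cells. The Python sketch below applies that formula to the counts in Table 4.3. It is offered as an illustration of the arithmetic only; it is not taken from the cited investigation, and the variable names are our own.

```python
# Cell counts copied from Table 4.3.
cases_inside = 23       # cases, generator inside home or attached structure
controls_inside = 23    # controls, generator inside home or attached structure
cases_outside = 4       # cases, generator outside home
controls_outside = 139  # controls, generator outside home

# Odds ratio as the cross-product ratio: (a * d) / (b * c).
odds_ratio = (cases_inside * controls_outside) / (controls_inside * cases_outside)

# Percentage of cases and of controls whose generator was inside.
total_cases = cases_inside + cases_outside
total_controls = controls_inside + controls_outside
pct_cases_inside = round(100 * cases_inside / total_cases, 1)
pct_controls_inside = round(100 * controls_inside / total_controls, 1)

print(f"Odds ratio: {odds_ratio:.2f}")
print(f"Generator inside: {pct_cases_inside}% of cases, "
      f"{pct_controls_inside}% of controls")
```

Because the table compares cases with controls rather than following a defined population, attack rates cannot be read directly from its rows; the odds ratio is the usual summary in that situation.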
Table 4.4 illustrates a generic format and standard notation for a two-by-two table. Disease status (e.g., ill versus well, sometimes denoted cases vs. controls if a case-control study) is usually designated along the top of the table, and exposure status (e.g., exposed versus not exposed) is designated along the side. The letters a, b, c, and d within the 4 cells of the two-by-two table refer to the number of persons with the disease status indicated above and the exposure status indicated to its left. For example, in Table 4.4, “c” represents the number of persons in the study who are ill but who did not have the exposure being studied. Note that the “Hi” represents horizontal totals; H1 and H0 represent the total number of exposed and unexposed persons, respectively. The “Vi” represents vertical totals; V1 and V0 represent the total number of ill and well persons (or cases and controls), respectively. The total number of subjects included in the two-by-two table is represented by the letter T (or N).
Table 4.4 General Format and Notation for a Two-by-Two Table
Exposure Status | Ill | Well | Total | Attack Rate (Risk) |
---|---|---|---|---|
Total | a + c = V1 | b + d = V0 | T | V1 ⁄ T |
Exposed | a | b | a + b = H1 | a ⁄ (a + b) |
Unexposed | c | d | c + d = H0 | c ⁄ (c + d) |
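The notation in Table 4.4 maps directly onto a small data structure. The Python sketch below is a minimal illustration; the class name `TwoByTwo`, its method names, and the example counts are hypothetical and invented here, not drawn from any study in this lesson.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cell counts of a two-by-two table in the a/b/c/d notation of Table 4.4."""
    a: int  # exposed and ill
    b: int  # exposed and well
    c: int  # unexposed and ill
    d: int  # unexposed and well

    @property
    def h1(self) -> int:
        """Horizontal total: all exposed persons."""
        return self.a + self.b

    @property
    def h0(self) -> int:
        """Horizontal total: all unexposed persons."""
        return self.c + self.d

    @property
    def v1(self) -> int:
        """Vertical total: all ill persons."""
        return self.a + self.c

    @property
    def v0(self) -> int:
        """Vertical total: all well persons."""
        return self.b + self.d

    @property
    def t(self) -> int:
        """Total number of subjects in the table."""
        return self.a + self.b + self.c + self.d

    def attack_rate_exposed(self) -> float:
        return self.a / self.h1

    def attack_rate_unexposed(self) -> float:
        return self.c / self.h0

    def attack_rate_overall(self) -> float:
        return self.v1 / self.t


# Hypothetical cohort counts, invented purely for illustration.
example = TwoByTwo(a=30, b=70, c=10, d=90)
print(example.h1, example.h0, example.v1, example.v0, example.t)
print(example.attack_rate_exposed(), example.attack_rate_unexposed())
```

The attack rate methods are meaningful only when the table describes a cohort, in which the row totals represent everyone at risk.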
When producing a table for display in print or by projection, it is generally best to limit the number of variables to one or two. One exception to this rule occurs when a third variable modifies the effect (technically, produces an interaction) of the first two. Table 4.5 is intended to convey the way in which race/ethnicity may modify the effect of age and sex on incidence of syphilis. Because three-way tables are often hard to understand, they should be used only when ample explanation and discussion are possible.
Table 4.5 Number of Reported Cases of Primary and Secondary Syphilis, by Race/Ethnicity, Age, and Sex — United States, 2002
Race/ethnicity | Age Group (years) | Male | Female | Total |
---|---|---|---|---|
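American Indian/Alaskan Native | ≤14 | 1 | 0 | 1 |
American Indian/Alaskan Native | 15–19 | 0 | 1 | 1 |
American Indian/Alaskan Native | 20–24 | 5 | 3 | 8 |
American Indian/Alaskan Native | 25–29 | 3 | 1 | 4 |
American Indian/Alaskan Native | 30–34 | 1 | 2 | 3 |
American Indian/Alaskan Native | 35–39 | 3 | 5 | 8 |
American Indian/Alaskan Native | 40–44 | 4 | 3 | 7 |
American Indian/Alaskan Native | 45–54 | 8 | 8 | 16 |
American Indian/Alaskan Native | ≥55 | 2 | 1 | 3 |
American Indian/Alaskan Native | Total | 27 | 24 | 51 |
Asian/Pacific Islander | ≤14 | 1 | 1 | 2 |
Asian/Pacific Islander | 15–19 | 0 | 2 | 2 |
Asian/Pacific Islander | 20–24 | 9 | 4 | 13 |
Asian/Pacific Islander | 25–29 | 16 | 1 | 17 |
Asian/Pacific Islander | 30–34 | 21 | 1 | 22 |
Asian/Pacific Islander | 35–39 | 14 | 1 | 15 |
Asian/Pacific Islander | 40–44 | 14 | 1 | 15 |
Asian/Pacific Islander | 45–54 | 8 | 0 | 8 |
Asian/Pacific Islander | ≥55 | 0 | 0 | 0 |
Asian/Pacific Islander | Total | 83 | 11 | 94 |
Black, Non-Hispanic | ≤14 | 3 | 9 | 12 |
Black, Non-Hispanic | 15–19 | 89 | 164 | 253 |
Black, Non-Hispanic | 20–24 | 313 | 233 | 546 |
Black, Non-Hispanic | 25–29 | 322 | 163 | 485 |
Black, Non-Hispanic | 30–34 | 310 | 166 | 476 |
Black, Non-Hispanic | 35–39 | 385 | 183 | 568 |
Black, Non-Hispanic | 40–44 | 305 | 142 | 447 |
Black, Non-Hispanic | 45–54 | 370 | 112 | 482 |
Black, Non-Hispanic | ≥55 | 129 | 23 | 152 |
Black, Non-Hispanic | Total | 2,226 | 1,195 | 3,421 |
Hispanic | ≤14 | 1 | 1 | 2 |
Hispanic | 15–19 | 37 | 25 | 62 |
Hispanic | 20–24 | 117 | 29 | 146 |
Hispanic | 25–29 | 139 | 26 | 165 |
Hispanic | 30–34 | 172 | 20 | 192 |
Hispanic | 35–39 | 178 | 22 | 200 |
Hispanic | 40–44 | 93 | 9 | 102 |
Hispanic | 45–54 | 69 | 14 | 83 |
Hispanic | ≥55 | 18 | 1 | 19 |
Hispanic | Total | 824 | 147 | 971 |
White, Non-Hispanic | ≤14 | 3 | 1 | 4 |
White, Non-Hispanic | 15–19 | 9 | 24 | 33 |
White, Non-Hispanic | 20–24 | 89 | 40 | 129 |
White, Non-Hispanic | 25–29 | 188 | 36 | 224 |
White, Non-Hispanic | 30–34 | 373 | 31 | 404 |
White, Non-Hispanic | 35–39 | 541 | 35 | 576 |
White, Non-Hispanic | 40–44 | 429 | 23 | 452 |
White, Non-Hispanic | 45–54 | 370 | 23 | 393 |
White, Non-Hispanic | ≥55 | 106 | 4 | 110 |
White, Non-Hispanic | Total | 2,108 | 217 | 2,325 |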
Data Source: Centers for Disease Control and Prevention. Sexually Transmitted Disease Surveillance 2002. Atlanta: U.S. Department of Health and Human Services; 2003. p. 118.
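One simple way to see how a third variable changes a two-variable pattern is to compare a summary figure across its strata. The Python sketch below uses the Total rows of Table 4.5 to compute the percentage of cases that were female within each race/ethnicity group. It is a descriptive comparison of counts with our own variable names, not a formal test of interaction.

```python
# Total rows copied from Table 4.5: group -> (male cases, female cases).
totals_by_race = {
    "American Indian/Alaskan Native": (27, 24),
    "Asian/Pacific Islander": (83, 11),
    "Black, Non-Hispanic": (2226, 1195),
    "Hispanic": (824, 147),
    "White, Non-Hispanic": (2108, 217),
}

# Percentage of each group's cases that occurred among females.
percent_female = {
    group: round(100 * female / (male + female), 1)
    for group, (male, female) in totals_by_race.items()
}

# All strata together should reproduce the overall total in Table 4.2.
all_cases = sum(male + female for male, female in totals_by_race.values())

for group, pct in percent_female.items():
    print(f"{group:<32} {pct:>5.1f}% female")
print(f"All groups combined: {all_cases:,} cases")
```

The share of cases among females differs considerably from one group to another, which is the kind of pattern a three-way table can reveal and a two-way table would hide. Because these are counts rather than rates, the comparison says nothing about the size of the underlying populations.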
Exercise 4.1
The data in Table 4.6 describe characteristics of the 38 persons who ate food at or from a church supper in Texas in August 2001. Fifteen of these persons later developed botulism.(5)
- Construct a table of the illness (botulism) by age group. Use botulism status (yes/no) as the column labels and age groups as the row labels.
- Construct a two-by-two table of the illness (botulism) by exposure to chicken.
- Construct a two-by-two table of the illness (botulism) by exposure to chili.
- Construct a three-way table of illness (botulism) by exposure to chili and chili leftovers.
Table 4.6 Line Listing for Exercise 4.1
ID | Age | Attended Supper | Case | Date of Onset | Case Status | Ate Any Food | Ate Chili | Ate Chicken | Ate Chili Leftovers |
---|---|---|---|---|---|---|---|---|---|
1 | 1 | Y | N | - | - | Y | Y | Y | N |
2 | 3 | Y | Y | 8/27 | Lab-confirmed | Y | Y | N | N |
3 | 7 | Y | Y | 8/31 | Lab-confirmed | Y | Y | N | N |
4 | 7 | Y | N | - | - | Y | Y | Y | N |
5 | 10 | Y | N | - | - | Y | Y | N | Y |
6 | 17 | Y | Y | 8/28 | Lab-confirmed | Y | Y | Y | N |
7 | 21 | Y | N | - | - | N | N | N | N |
8 | 23 | Y | N | - | - | Y | Y | N | N |
9 | 25 | Y | Y | 8/26 | Epi-linked | Y | Y | N | N |
10 | 29 | N | Y | 8/28 | Lab-confirmed | Y | Unk | Unk | Y |
11 | 38 | Y | N | - | - | N | N | N | N |
12 | 39 | Y | N | - | - | N | N | N | N |
13 | 41 | Y | N | - | - | Y | Y | Y | N |
14 | 41 | Y | N | - | - | N | N | N | N |
15 | 42 | Y | Y | 8/26 | Lab-confirmed | Y | Y | Unk | N |
16 | 45 | Y | Y | 8/26 | Lab-confirmed | Y | Y | Y | Y |
17 | 45 | Y | Y | 8/27 | Epi-linked | Y | Y | Y | N |
18 | 46 | Y | N | - | - | Y | N | Y | N |
19 | 47 | Y | N | - | - | Y | N | Y | N |
20 | 48 | Y | Y | 9/1 | Lab-confirmed | Y | Y | Unk | N |
21 | 50 | Y | Y | 8/29 | Epi-linked | Y | Y | N | N |
22 | 50 | Y | N | - | - | Y | N | Y | N |
23 | 50 | Y | N | - | - | Y | N | N | Y |
24 | 52 | Y | Y | 8/28 | Lab-confirmed | Y | Y | Y | N |
25 | 52 | Y | N | - | - | N | N | N | N |
26 | 53 | Y | Y | 8/27 | Epi-linked | Y | Y | Y | N |
27 | 53 | Y | N | - | - | Y | Y | Y | N |
28 | 62 | Y | Y | 8/27 | Epi-linked | Y | Y | Y | N |
29 | 62 | Y | N | - | - | Y | N | Y | N |
30 | 63 | Y | N | - | - | N | N | N | N |
31 | 67 | Y | N | - | - | N | N | N | N |
32 | 68 | Y | N | - | - | N | N | N | N |
33 | 69 | Y | N | - | - | Y | Y | Y | N |
34 | 71 | Y | N | - | - | Y | N | Y | N |
35 | 72 | Y | Y | 8/27 | Lab-confirmed | Y | Y | Y | N |
36 | 74 | Y | N | - | - | Y | Y | N | N |
37 | 74 | Y | N | - | - | Y | N | Y | N |
38 | 78 | Y | Y | 8/25 | Epi-linked | Y | Y | Y | N |
Data Source: Kalluri P, Crowe C, Reller M, Gaul L, Hayslett J, Barth S, Eliasberg S, Ferreira J, Holt K, Bengston S, Hendricks K, Sobel J. An outbreak of foodborne botulism associated with food sold at a salvage store in Texas. Clin Infect Dis 2003;37:1490–5.
Tables of statistical measures other than frequency
Tables 4.1–4.5 show case counts (frequency). The cells of a table could also display averages, rates, relative risks, or other epidemiological measures. As with any table, the title and/or headings must clearly identify what data are presented. For example, the title of Table 4.7 indicates that the data for reported cases of primary and secondary syphilis are rates rather than numbers.
Table 4.7 Rate per 100,000 Population for Reported Cases of Primary and Secondary Syphilis, by Age and Race — United States, 2002
Age Group (years) | Am. Indian/Alaska Native | Asian/Pacific Is. | Black, Non-Hispanic | Hispanic | White, Non-Hispanic | Total |
---|---|---|---|---|---|---|
10–14 | 0.0 | 0.1 | 0.3 | 0.1 | 0.0 | 0.1 |
15–19 | 0.5 | 0.2 | 8.6 | 1.9 | 0.3 | 1.7 |
20–24 | 5.0 | 1.5 | 20.7 | 4.3 | 1.1 | 4.4 |
25–29 | 2.7 | 1.6 | 19.1 | 4.9 | 1.8 | 4.6 |
30–34 | 2.0 | 2.2 | 18.2 | 6.1 | 3.0 | 5.4 |
35–39 | 4.8 | 1.6 | 20.1 | 7.1 | 3.6 | 6.0 |
40–44 | 4.5 | 1.6 | 16.6 | 4.4 | 2.8 | 4.6 |
45–54 | 6.1 | 0.6 | 11.8 | 2.7 | 1.4 | 2.6 |
55–64 | 1.4 | 0.0 | 4.6 | 0.6 | 0.5 | 0.9 |
65+ | 0.8 | 0.0 | 1.5 | 0.5 | 0.1 | 0.2 |
Totals | 2.4 | 0.9 | 9.8 | 2.7 | 1.2 | 2.4 |
Data Source: Centers for Disease Control and Prevention. Sexually Transmitted Disease Surveillance 2002. Atlanta: U.S. Department of Health and Human Services; 2003.
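Each cell in a table of rates is a case count divided by the population at risk and scaled to a common base. The Python sketch below shows that arithmetic for a rate per 100,000 population. The function name and the example inputs are hypothetical and chosen only for illustration; the lesson does not list the population denominators behind Table 4.7.

```python
def rate_per_100k(cases: int, population: int) -> float:
    """Return cases per 100,000 population, rounded to one decimal place."""
    if population <= 0:
        raise ValueError("population must be a positive number")
    return round(cases / population * 100_000, 1)


# Hypothetical inputs, invented for illustration: 25 cases reported
# in a population of 1,000,000 persons.
example_rate = rate_per_100k(cases=25, population=1_000_000)
print(f"{example_rate} cases per 100,000 population")
```

Stating the base (per 100,000) in the table title or column heading, as Table 4.7 does, lets the reader compare groups of very different sizes.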
Composite tables
To conserve space in a report or manuscript, several tables are sometimes combined into one. For example, epidemiologists often create simple frequency distributions by age, sex, and other demographic variables as separate tables, but editors may combine them into one large composite table for publication. Table 4.8 is an example of a composite table from the investigation of carbon monoxide poisoning following the power failure in Maine.(4)
It is important to realize that this type of table should not be interpreted as a three-way table. The data in Table 4.8 have not been arrayed to indicate the interrelationship of sex, age, smoking, and disposition from medical care. Rather, several one-variable tables (each independently assessing the number of cases by one of these variables) have simply been concatenated to save space. So this table would not help in assessing the modification that smoking has on the risk of illness by age, for example. This difference also explains why portraying total values would be inappropriate and meaningless for Table 4.8.
Table 4.8 Number and Percentage of Confirmed Cases of Carbon Monoxide Poisoning Identified from Four Hospitals, by Selected Characteristics — Maine, January 1998
Characteristic | Number of Cases | Percent of Cases |
---|---|---|
Total cases | 100 | 100 |
Sex (female) | 59 | 59 |
Age (years) | | |
0–3 | 5 | 5 |
4–12 | 17 | 17 |
13–18 | 9 | 9 |
19–64 | 52 | 52 |
≥65 | 17 | 17 |
Smokers | 20 | 20 |
Disposition | | |
- Page last reviewed: May 18, 2012
- Page last updated: May 18, 2012