How to Analyze One Variable

Frequency Distribution

Grouped Data

Cumulative Distributions

Percentage Distributions

Why Graph

Bar Graphs

Histograms

Frequency Polygons

Pie Charts

Rates and Ratios

Descriptive statistics describe and summarize data.
Univariate descriptive statistics describe individual variables.

Obtain a printout of the raw data for all the variables. Raw data resembles a matrix, with the variable names heading the columns, and the information for each case or record displayed across the rows.

Example: Raw data for a study of injuries among county workers (first 10 cases)

Injury Report No. | County Name | Cause of Injury | Severity of Injury |
1 | County A | Fall | 3 |
2 | County B | Auto | 4 |
3 | County C | Fall | 6 |
4 | County C | Fall | 4 |
5 | County B | Fall | 5 |
6 | County A | Violence | 9 |
7 | County A | Auto | 3 |
8 | County A | Violence | 2 |
9 | County A | Violence | 9 |
10 | County B | Auto | 3 |

For example, the variable Severity of Injury:

Severity of Injury |
3 |
4 |
6 |
4 |
5 |
9 |
3 |
2 |
9 |
3 |

Severity of Injury | Number of Injuries with this Severity |
2 | 1 |
3 | 3 |
4 | 2 |
5 | 1 |
6 | 1 |
9 | 2 |
Total | 10 |
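
The tally above can be reproduced with a few lines of Python. This is only a minimal sketch: the list `severity` is copied from the raw data table, and the variable names are illustrative rather than part of the original material.

```python
from collections import Counter

# Severity ratings for the first 10 cases, copied from the raw data table.
severity = [3, 4, 6, 4, 5, 9, 3, 2, 9, 3]

# Counter tallies how many times each rating occurs.
counts = Counter(severity)

# Sort by rating so the table reads from least to most severe.
frequency_table = sorted(counts.items())

for rating, n in frequency_table:
    print(f"{rating} | {n} |")
print(f"Total | {sum(counts.values())} |")
```

Ratings that never occur (here 1, 7, and 8) simply do not appear in the result.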

The severity of injury ratings can be collapsed into just a few categories or groups. Grouped data usually has from 3 to 7 groups. There should be no groups with a frequency of zero (for example, there are no injuries with a severity rating of 7 or 8).

One way to construct groups is to have equal class
intervals (e.g., 1-3, 4-6, 7-9). Another way to construct groups is to
have about equal numbers of observations in each group. Remember that class
intervals must be both mutually exclusive and exhaustive.

Severity of Injury | Number of Injuries with this Severity |
Mild (1-3) | 4 |
Moderate (4-6) | 4 |
Severe (7-9) | 2 |
Total | 10 |
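
As a rough illustration of grouping with equal class intervals, the sketch below assigns each rating to one of the classes 1-3, 4-6, and 7-9 and checks that the classes are mutually exclusive and exhaustive. The dictionary `classes` and the other names are assumptions made for this example.

```python
# Severity ratings for the first 10 cases, copied from the raw data table.
severity = [3, 4, 6, 4, 5, 9, 3, 2, 9, 3]

# Equal-width class intervals covering the ratings 1 through 9.
classes = {
    "Mild (1-3)": range(1, 4),
    "Moderate (4-6)": range(4, 7),
    "Severe (7-9)": range(7, 10),
}

# Mutually exclusive and exhaustive: every possible rating from 1 to 9
# belongs to exactly one class.
for rating in range(1, 10):
    matches = [label for label, interval in classes.items() if rating in interval]
    assert len(matches) == 1, f"rating {rating} falls in {len(matches)} classes"

# Count the injuries that fall in each class.
grouped = {
    label: sum(1 for s in severity if s in interval)
    for label, interval in classes.items()
}

for label, n in grouped.items():
    print(f"{label} | {n} |")
```

If two intervals overlapped (for example 4-6 and 6-9), the check would fail for the shared rating.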

Severity of Injury | Number of Injuries | Cumulative frequency |
2 | 1 | 1 |
3 | 3 | 4 |
4 | 2 | 6 |
5 | 1 | 7 |
6 | 1 | 8 |
9 | 2 | 10 |

A cumulative frequency distribution can answer questions such as: how many of the injuries were at severity level 5 or lower? The answer is 7.
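
A cumulative frequency is a running total of the frequencies, which can be sketched as follows. The helper `count_at_or_below` is a hypothetical name introduced only for this example.

```python
from itertools import accumulate

# Ratings and their frequencies, taken from the frequency distribution.
ratings = [2, 3, 4, 5, 6, 9]
frequencies = [1, 3, 2, 1, 1, 2]

# Running total of the frequencies, in order of increasing rating.
cumulative = list(accumulate(frequencies))


def count_at_or_below(level):
    """Return how many injuries have a severity rating of `level` or lower."""
    total = 0
    for rating, running_total in zip(ratings, cumulative):
        if rating <= level:
            total = running_total
    return total


print(cumulative)
print(count_at_or_below(5))
```

The last cumulative value always equals the total number of cases, which is a quick check on the arithmetic.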

Severity of Injury | Percent of Injuries | Cumulative percentages |
2 | 10 | 10 |
3 | 30 | 40 |
4 | 20 | 60 |
5 | 10 | 70 |
6 | 10 | 80 |
9 | 20 | 100 |
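
The percentages can be checked with a short sketch, assuming each percentage is the frequency divided by the total number of cases, multiplied by 100. Variable names are illustrative.

```python
from itertools import accumulate

# Ratings and their frequencies, taken from the frequency distribution.
ratings = [2, 3, 4, 5, 6, 9]
frequencies = [1, 3, 2, 1, 1, 2]
n = sum(frequencies)  # total number of cases

# Percentage for each rating: frequency / total x 100.
percents = [100 * f / n for f in frequencies]

# Cumulative percentages are a running total of the percentages.
cumulative_percents = list(accumulate(percents))

for rating, pct, cum in zip(ratings, percents, cumulative_percents):
    print(f"{rating} | {pct:.0f} | {cum:.0f} |")
```

The final cumulative percentage should always come to 100.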

Bar graphs can also be rotated so that the bars run horizontally across the page.

For example, in the case of the counties and employee injuries, we might have information on the rate of injury according to the number of workers in each county in State X.

County Name | Rate of Injury per 1,000 workers |
County A | 5.5 |
County B | 4.2 |
County C | 3.8 |
County D | 3.6 |
County E | 3.4 |
County F | 3.1 |
County G | 1.8 |
County H | 1.7 |
County I | 1.6 |
County J | 1.0 |
County K | 0.9 |
County L | 0.4 |

If we group the injury rates into three groups, then
a low rate of injury would be 0.0-1.9 injuries per 1,000 workers; moderate
would be 2.0-3.9; and high would be 4.0 and above (in this case, up to
5.9). The grouped rates could then be graphed.
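
Before graphing, the number of counties in each group has to be counted. The sketch below is one way to do that, assuming the rates are recorded to one decimal place so that "0.0-1.9" means below 2.0 and "2.0-3.9" means below 4.0. The function `rate_group` is a hypothetical helper.

```python
from collections import Counter

# Injury rates per 1,000 workers, copied from the county table.
rates = {
    "County A": 5.5, "County B": 4.2, "County C": 3.8, "County D": 3.6,
    "County E": 3.4, "County F": 3.1, "County G": 1.8, "County H": 1.7,
    "County I": 1.6, "County J": 1.0, "County K": 0.9, "County L": 0.4,
}


def rate_group(rate):
    """Classify a rate as Low (0.0-1.9), Moderate (2.0-3.9), or High (4.0+)."""
    if rate < 2.0:
        return "Low"
    if rate < 4.0:
        return "Moderate"
    return "High"


# Number of counties falling in each group.
group_counts = Counter(rate_group(rate) for rate in rates.values())

for label in ("Low", "Moderate", "High"):
    print(f"{label} | {group_counts[label]} |")
```

These counts are the bar heights for a graph of the grouped rates.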

For example, the following table shows the average injury rate per 1,000 employees for counties in State X for the years 1980 to 1990.

Year | 1980 | 1981 | 1982 | 1983 | 1984 | 1985 | 1986 | 1987 | 1988 | 1989 | 1990 |

Rate | 3.6 | 4.2 | 3.4 | 5.5 | 3.8 | 3.1 | 1.7 | 1.8 | 1.0 | 1.6 | 0.9 |

A cumulative frequency polygon is used to display the cumulative distribution of values for a variable.

A proportion can be expressed as f / N, where f is the frequency of a category and N is the total number of cases.

A percentage is the same as a proportion, multiplied by 100.

This can be expressed as (f / N) x 100

A rate is the relationship between two different numbers, for example,
the number of injuries among county workers and the population of the county.
This can be calculated as the first number (N_{1}, or injuries)
divided by the second number (N_{2}, or population).

This can be expressed as N_{1} / N_{2}

Many health statistics are expressed as rates. For example, the birth rate is the number of births per some population, such as the number of births per 1,000 women.
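
A rate "per 1,000" is the ratio N_{1} / N_{2} scaled by 1,000. The sketch below shows the arithmetic. The function name `rate_per` and the figures in the usage example (11 injuries among 2,000 workers) are hypothetical and do not come from the study data.

```python
def rate_per(n1, n2, per=1000):
    """Return n1 / n2 expressed per `per` units of n2 (default: per 1,000)."""
    if n2 <= 0:
        raise ValueError("the base population n2 must be positive")
    # Multiply before dividing to keep the arithmetic exact where possible.
    return n1 * per / n2


# Hypothetical figures: 11 injuries among 2,000 county workers.
injury_rate = rate_per(11, 2000)
print(f"{injury_rate} injuries per 1,000 workers")
```

Changing `per` gives the same rate on a different base, for example per 100,000 population.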