# Research Methods

Research Paper  •  984 Words  •  May 3, 2011

Accountability Modules Data Analysis: Describing Data - Frequency Distribution

Texas State Auditor's Office, Methodology Manual, rev. 5/95 Data Analysis: Describing Data - Frequency Distribution - 1

WHAT IT IS

Frequency distributions summarize and compress data by grouping it into classes that show how many observations on a given variable have a particular attribute. For example, suppose a survey records the favorite color of 50 people. The frequency distribution might indicate that 15 people selected green, 12 blue, 6 red, 7 yellow, and 10 purple. Converting these raw counts into percentages would then provide an even more useful description of the data.
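
The color survey can be tabulated in a few lines of code. The following is a minimal sketch, not part of the manual: the `responses` list is hypothetical raw data constructed to match the counts in the example, and `frequency_distribution` is an illustrative helper name.

```python
from collections import Counter

# Hypothetical raw survey responses, built to match the counts in the example.
responses = (["green"] * 15 + ["blue"] * 12 + ["red"] * 6
             + ["yellow"] * 7 + ["purple"] * 10)


def frequency_distribution(observations):
    """Return {category: (count, percent)}, most frequent category first."""
    counts = Counter(observations)
    total = len(observations)
    return {category: (count, 100.0 * count / total)
            for category, count in counts.most_common()}


distribution = frequency_distribution(responses)
for category, (count, percent) in distribution.items():
    # One row per category: label, raw count, share of all responses.
    print(f"{category:<8}{count:>4}{percent:>7.1f}%")
```

Reporting the percentage next to each count makes the distribution comparable across surveys with different numbers of respondents.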

The frequency distribution is the foundation of descriptive statistics. It is a prerequisite for both the various graphs used to display data and the basic statistics used to describe a data set, such as the mean, median, mode, variance, and standard deviation. Note that frequency distributions are generally used to describe nominal and interval data, though they can also describe ordinal data.

WHEN TO USE IT

A frequency distribution should be constructed for virtually all data sets. Frequency distributions are especially useful whenever a broad, easily understood description of data concentration and spread is needed. Most data provided by third parties are grouped into a frequency distribution.

HOW TO PREPARE IT

Regardless of whether manual or automated methods are used to prepare a frequency distribution, it is usually necessary to code data numerically to facilitate further data analysis. This requires a data dictionary that defines the numeric codes used to identify data categories. For example, assume that an auditor/evaluator wants to classify both demographic data and information on the opinion of entity staff on a particular policy. A data dictionary for use with computer software might resemble the following:

| Variable Name | Code | Field Width | Field Type |
|---|---|---|---|
| Division | Actual Division | 20 | Alphanumeric |
| Age | Age in Years | 3 | Numeric |
| Gender | 1 = Male | 1 | Numeric |
| | 2 = Female | | |
| Salary Range | 1 = \$0 - 20,000 | 5 | Numeric |
| | 2 = \$20 - 30,000 | | |
| | 3 = \$30 - 50,000 | | |
| | 4 = Over 50,000 | | |
| Policy Opinion | 1 = Excellent | 1 | Numeric |
| | 2 = Good | | |
| | 3 = Fair | | |
| | 4 = Poor | | |
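
In software, a data dictionary like this one can be held as a simple lookup structure. The sketch below is one possible representation, not the manual's: the codes and labels are copied from the dictionary above, while the variable keys, the `decode` helper, and the sample record are hypothetical.

```python
# Codes and labels copied from the data dictionary; key names are assumptions.
DATA_DICTIONARY = {
    "gender": {1: "Male", 2: "Female"},
    "salary_range": {
        1: "$0 - 20,000",
        2: "$20 - 30,000",
        3: "$30 - 50,000",
        4: "Over 50,000",
    },
    "policy_opinion": {1: "Excellent", 2: "Good", 3: "Fair", 4: "Poor"},
}


def decode(record):
    """Replace numeric codes with labels; uncoded fields pass through unchanged."""
    decoded = {}
    for variable, value in record.items():
        codes = DATA_DICTIONARY.get(variable)
        if codes is None:
            # Division and Age are stored as actual values, not codes.
            decoded[variable] = value
        elif value in codes:
            decoded[variable] = codes[value]
        else:
            raise ValueError(f"{value!r} is not a valid code for {variable}")
    return decoded


# Hypothetical survey record for one staff member.
record = {"division": "Audit", "age": 42, "gender": 2,
          "salary_range": 3, "policy_opinion": 1}
decoded = decode(record)
print(decoded)
```

Rejecting codes that are missing from the dictionary catches data-entry errors before they reach the frequency counts.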

It is also necessary to determine how many classes to use for the frequency distribution. Selecting a number of classes is not as arbitrary as it may first appear. If data are nominal, simply list all possible classes (i.e., categories) into which a data point might fall. If data are interval, the table below can serve as a rule of thumb:

| Number of Observations | Number of Classes |
|---|---|
| Under 50 | 5 - 7 |
| 50 - 200 | 7 - 9 |
| 200 - 500 | 9 - 10 |
| 500 - 1,000 | 10 - 11 |
| 1,000 - 5,000 | 11 - 13 |
| 5,000 - 50,000 | 13 - 17 |
| Over 50,000 | 17 - 20 |
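
The rule of thumb above can be applied programmatically. The sketch below rests on two assumptions that the text does not state: a count that sits on a shared boundary (for example, exactly 200) is assigned to the higher band, and interval data are grouped into equal-width classes. The function names and the sample ages are hypothetical.

```python
# Rule-of-thumb table: (exclusive upper bound on observations, (min, max) classes).
CLASS_GUIDE = [
    (50, (5, 7)),
    (200, (7, 9)),
    (500, (9, 10)),
    (1_000, (10, 11)),
    (5_000, (11, 13)),
    (50_000, (13, 17)),
]


def suggested_classes(n_observations):
    """Return the suggested (min, max) number of classes for a data set size."""
    for upper, class_range in CLASS_GUIDE:
        # Assumption: boundary values fall into the higher band.
        if n_observations < upper:
            return class_range
    return (17, 20)


def bin_interval_data(values, n_classes):
    """Count values in n_classes equal-width classes (equal width is an assumption)."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("all values are identical; nothing to bin")
    width = (high - low) / n_classes
    counts = [0] * n_classes
    for value in values:
        # The maximum value goes into the last class rather than a new one.
        index = min(int((value - low) / width), n_classes - 1)
        counts[index] += 1
    edges = [low + i * width for i in range(n_classes + 1)]
    return edges, counts


# Hypothetical ages for a small group of staff.
ages = [22, 25, 27, 31, 34, 35, 38, 41, 44, 47, 52, 58]
min_classes, max_classes = suggested_classes(len(ages))
edges, counts = bin_interval_data(ages, min_classes)
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}: {count}")
```

With only 12 observations the guide suggests 5 to 7 classes; the sketch uses the lower end of that range.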

If