Health & Environmental Research Online (HERO)
HERO ID
2325101
Reference Type
Book/Book Chapter
Title
Comparing data distributions
Author(s)
Chambers, JM; Cleveland, WS; Kleiner, B; Tukey, JA
Year
1983
Publisher
Wadsworth International Group, Duxbury Press
Location
Belmont, California; Boston, Massachusetts
Book Title
Graphical methods for data analysis
Page Numbers
47-73
Language
English
Abstract
In many applications we have two or several groups of observations rather than a single set, and the goal of the analysis is to compare the distributions of the groups. For instance, we can again consider the gross national products of all countries in the United Nations in 1980, but separated into northern hemisphere and southern hemisphere countries. Probably the simplest comparison is to determine whether the "typical" value for one group is above or below the "typical" value for the other; however, much more detailed comparisons are possible and often needed. Virtually any of the distributional questions posed for one group in Chapter 2 can be asked of two or more groups in comparison to each other.
Graphical methods can be used for making such distributional comparisons. We begin below by describing the empirical quantile-quantile plot. Then we discuss how the displays of Chapter 2 for each data set can be combined to allow effective visual comparisons. Finally, we show how certain kinds of derived plots based on differences and ratios can enhance our ability to perceive structure in the data.
One example that will be used to illustrate the methodology of this chapter is a set of data from a cloud-seeding experiment described by Simpson, Olsen, and Eden (1975). Rainfall was measured from 52 clouds, of which 26 were chosen randomly to be seeded with silver iodide. The data are the amounts of rainfall in acre-feet from the 52 clouds, and the objective is to describe the effect that seeding has on rainfall. The data for this and the two examples described below are given in the Appendix.
A second example is the average monthly temperatures in degrees Fahrenheit from January 1964 to December 1973 in Newark, New Jersey, and in Lincoln, Nebraska. Both the Newark and Lincoln data sets have 120 observations.
A third example is the maximum daily atmospheric ozone concentrations in Stamford, Connecticut, described in Chapter 2, together with a similar set of ozone measurements from Yonkers, New York, for the same time period. Although there are 136 Stamford values, the Yonkers data set has 148 observations, since Yonkers has fewer missing values.
Series
Wadsworth statistics/probability series
ISBN
9780534980528
Tags
ISA-SOx
Considered
2nd Draft
Chapter Review
Atmospheric Chemistry
Cited in First ERD Nov2015
Cited Second ERD Dec2016
Cited in Final ISA Dec2017