Histogram [1]
Example: Age of Lung Cancer Patients in British Columbia, Canada.
[Figure: histogram of lung cancer cases by age group, British Columbia, Canada]
A histogram is a graphical display of data like the one shown above. It is composed of rectangles whose widths indicate ranges of values and whose heights represent the number (or the percentage) of data points that fall within each range. For example, in the figure above, each rectangle's width represents a five-year age range (i.e., 0-5 years, 6-10 years, 11-15 years, and so on), and its height represents the number of cases of lung cancer diagnosed in people in British Columbia, Canada, whose ages fall into that range. Each rectangle is called a "bin" because it can be thought of as a container that accumulates data and "fills up" in proportion to the number of data points that belong to it.

The histogram provides a graphical summary of the shape of the data's distribution, including whether the data are skewed and whether they have one or more modes. It is often used in combination with other statistical summaries, such as the boxplot, which conveys the median, quartiles, and range of the data.

The shape of a histogram can be quite sensitive to the number of bins. If the bins are too wide, important information may be lost. For example, the data may be bimodal, but this characteristic may not be evident if the bins are too wide. On the other hand, if the bins are too narrow, what appears to be meaningful structure may really be random variation caused by the small number of data points in each bin. To judge whether the bin width is appropriate, try several different bin widths and compare the results to see how sensitive the histogram's shape is to bin size. Bin widths are usually selected so that there are between 5 and 20 bins, but the appropriate number depends on the situation.

An age-sex pyramid is a special form of histogram.
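The effect of bin width on a histogram's shape can be illustrated with a minimal Python sketch. The ages below are made up for illustration (they are not the British Columbia data), and `histogram_counts` is a hypothetical helper name. It assumes a non-empty list of values, none smaller than `start`.

```python
from collections import Counter


def histogram_counts(values, bin_width, start=0):
    """Return the count of values in each bin.

    Bin k covers the half-open range
    [start + k * bin_width, start + (k + 1) * bin_width).
    Assumes `values` is non-empty and every value is >= start.
    """
    index = Counter(int((v - start) // bin_width) for v in values)
    return [index.get(k, 0) for k in range(max(index) + 1)]


# Synthetic, made-up ages forming two clusters (a bimodal sample).
ages = [22, 24, 25, 26, 27, 28, 29, 31, 33,
        62, 64, 65, 66, 67, 68, 69, 71, 73]

# Compare several bin widths to see how sensitive the shape is.
for width in (1, 5, 50):
    counts = histogram_counts(ages, width)
    print(f"bin width {width}: {len(counts)} bins, "
          f"tallest bin holds {max(counts)} data points")

# Text rendering of the 5-year histogram (empty bins below age 20 skipped).
# For whole-number ages, bin k holds the ages 5k through 5k + 4.
for k, count in enumerate(histogram_counts(ages, 5)):
    if k >= 4:
        print(f"{5 * k:>2}-{5 * k + 4:<2} {'#' * count}")
```

With a bin width of 5, the two clusters appear as two separate peaks. A width of 50 collapses them into two equally tall bars, hiding the bimodal shape. A width of 1 leaves no bin with more than one data point, so the picture is dominated by noise.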
1. This definition is based on "The Histogram" and "Histograms: Construction, Analysis and Understanding."