| Cluster Sampling
Cluster sampling is a sampling technique in which the entire population of interest is divided into groups, or clusters, and a random sample of these clusters is selected. The clusters must be mutually exclusive and together must cover the entire population. In the simplest form, once the clusters are selected, all units within them are included in the sample. No units from non-selected clusters are included. This differs from stratified sampling, in which some units are selected from every group. When all the units within each selected cluster are included, the technique is referred to as one-stage cluster sampling. If a subset of units is selected randomly from each selected cluster, it is called two-stage cluster sampling. Cluster sampling can also be carried out in three or more stages; it is then referred to as multistage cluster sampling.
In cluster sampling, the clusters are the primary sampling units (PSUs) and the units within the clusters are the secondary sampling units (SSUs). It is important to keep these two levels in mind when calculating standard errors from cluster samples. If a cluster sample is analysed as if it were a simple random sample, the reported standard errors will probably be smaller than they should be, giving the impression that the survey results are more precise than they really are. Whereas stratification often increases the precision of estimates compared with simple random sampling, cluster sampling often decreases it. That is because units in the same cluster tend to be more similar to one another than units selected at random from the whole population. When using cluster sampling, it is therefore usually necessary to increase the total sample size to achieve the same precision as simple random sampling.
Nevertheless, there are cases where cluster sampling is useful. The main reason for using it is that it is usually much cheaper and more convenient to sample the population in clusters than to sample individual units at random.
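The selection schemes and the standard-error caveat described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the population (20 equal-sized clusters of 30 units, with values in cluster `c` centred on `c`), the function names, and the sample sizes are all hypothetical. The cluster-based standard error uses the common textbook estimator for a one-stage sample of equal-sized clusters, which is based on the variation between cluster means; other designs call for other estimators.

```python
import random
import statistics

def one_stage_cluster_sample(clusters, n_clusters, rng):
    """Pick n_clusters clusters (PSUs) at random and keep every unit (SSU) in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {cid: list(clusters[cid]) for cid in chosen}

def two_stage_cluster_sample(clusters, n_clusters, n_units, rng):
    """Pick n_clusters clusters at random, then n_units units at random within each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {cid: rng.sample(clusters[cid], n_units) for cid in chosen}

def naive_se(sample):
    """Standard error of the mean if the units are (wrongly) treated as a simple random sample."""
    values = [y for units in sample.values() for y in units]
    return (statistics.variance(values) / len(values)) ** 0.5

def cluster_se(sample, n_population_clusters):
    """Standard error of the mean for a one-stage sample of equal-sized clusters,
    computed from the variance between the sampled cluster means."""
    means = [statistics.mean(units) for units in sample.values()]
    n = len(means)
    fpc = 1 - n / n_population_clusters  # finite population correction for clusters
    return (fpc * statistics.variance(means) / n) ** 0.5

rng = random.Random(42)
# Hypothetical population: 20 clusters of 30 units; units in cluster c are centred on c,
# so units within a cluster resemble each other more than units from different clusters.
population = {c: [c + rng.gauss(0, 1) for _ in range(30)] for c in range(20)}

one_stage = one_stage_cluster_sample(population, 5, rng)
two_stage = two_stage_cluster_sample(population, 5, 10, rng)

se_naive = naive_se(one_stage)
se_cluster = cluster_se(one_stage, len(population))
print(f"naive SE: {se_naive:.3f}, cluster-based SE: {se_cluster:.3f}")
```

Because the units within each hypothetical cluster are much alike, the naive calculation treats 150 similar observations as if they were 150 independent ones and reports a standard error that is too small. The cluster-based calculation effectively has only five independent pieces of information, one per sampled cluster.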
In some cases, constructing a sampling frame that identifies every population element is too expensive or impossible. Cluster sampling can also reduce cost when the population elements are scattered over a wide area. Suppose you want to survey school children of a certain age in a specific area. If you drew a simple random sample of school children, you might have to visit every school in the area to interview your sample. With cluster sampling you could first select the schools to be included in your sample, and then select school children within each of the selected schools. That would probably reduce the number of schools you have to visit and therefore the cost of data collection. In this example, the schools are what are sometimes referred to as natural clusters. In other cases, the population may be widely distributed geographically, and cluster sampling in which the clusters consist of geographical areas can reduce the number of areas that need to be visited. Visiting fewer areas can reduce travel expenses and also allow more efficient supervision of the fieldwork. For more information about cluster sampling, see: Särndal, C.E., Swensson, B., and Wretman, J.H., Model Assisted Survey Sampling, Springer-Verlag, New York, 1992.
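The school example above can be made concrete with a small simulation. This is a sketch only: the figures (50 schools, 40 eligible children per school, a sample of 200 children) and all variable names are hypothetical and chosen for illustration. It compares how many schools would have to be visited under a simple random sample of children and under a two-stage cluster sample of the same size.

```python
import random

rng = random.Random(1)

# Hypothetical area: 50 schools, each with 40 eligible children.
# A child is represented as a (school_id, child_id) pair.
schools = {s: [(s, k) for k in range(40)] for s in range(50)}
all_children = [child for roster in schools.values() for child in roster]

# Simple random sample: 200 children drawn from the complete list of children.
srs = rng.sample(all_children, 200)
schools_visited_srs = len({school for school, _ in srs})

# Two-stage cluster sample: 10 schools first, then 20 children within each school.
chosen_schools = rng.sample(sorted(schools), 10)
cluster_sample = [
    child for s in chosen_schools for child in rng.sample(schools[s], 20)
]
schools_visited_cluster = len({school for school, _ in cluster_sample})

print("schools visited, simple random sample:", schools_visited_srs)
print("schools visited, cluster sample:", schools_visited_cluster)
```

Both designs interview 200 children, but the cluster design fixes the number of schools at 10, while the simple random sample spreads the children over most of the schools in the area. Note also that the simple random sample needs a list of every child in every school, whereas the cluster design needs only a list of schools plus rosters for the selected ones.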
| ||