Impact Evaluation

Impact evaluations aim to assess the impact of given policies in order to improve their effectiveness. A major objective of impact evaluations is to analyse what share of the observed changes is attributable to a given policy and what share depends on other factors. Even when the goals and objectives of a policy have been reached, this does not necessarily mean that the policy itself was responsible for that achievement.

Impact evaluations are technical exercises that rely on econometric and statistical models. Three main kinds of impact evaluation design can be identified: experimental, quasi-experimental and non-experimental, which are associated respectively with control groups, comparison groups and non-participants. Though all three methods provide valid data about the relative effectiveness of a policy compared with other possible interventions, or with doing nothing at all, experimental designs are seen as the most valid and reliable and, when feasible, are used most often.

Experimental designs

As explained by Phil Davies [1], the “purest form of experimental method is the randomised controlled trial (RCT).” The aim of an RCT is to separate possible factors influencing an outcome from the policy itself, by constructing two groups of people on the basis of a purely random selection, and exposing them to exactly the same factors, except for the policy under evaluation. “Randomisation does not mean that [both groups] will be identical, but it reduces the influence of extraneous factors by ensuring that the only difference between the two groups will be those that arise by chance” [2].

The main advantage of such a method is the simplicity of interpreting the results. However, it is not free of problems [3]. For instance, as noted by the World Bank, randomisation may be unethical because it can deny benefits or services to otherwise eligible members of the population for the purposes of the study. Moreover, it may be difficult to ensure that the selection of both groups is truly random, and experimental designs tend to be expensive and time-consuming.
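The logic of random assignment can be illustrated with a minimal Python sketch. It is only an illustration on invented data, not a method taken from the sources cited here: the function names, the group sizes, the baseline distribution and the assumed true effect of 2.0 are all hypothetical.

```python
import random
from statistics import mean


def randomise(people, rng):
    """Split people into two groups of (near-)equal size purely at random."""
    shuffled = list(people)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]  # (policy group, control group)


def difference_in_means(policy_outcomes, control_outcomes):
    """Estimated impact: mean outcome with the policy minus mean without it."""
    return mean(policy_outcomes) - mean(control_outcomes)


def simulate_rct(n=2000, true_effect=2.0, seed=0):
    """Simulate a trial on invented data and return the estimated impact."""
    rng = random.Random(seed)
    # Extraneous factors: each person has a baseline outcome unrelated to the policy.
    baseline = [rng.gauss(10.0, 3.0) for _ in range(n)]
    policy_group, control_group = randomise(range(n), rng)
    # Both groups face the same conditions; only the policy group gets the effect.
    policy_outcomes = [baseline[p] + true_effect for p in policy_group]
    control_outcomes = [baseline[p] for p in control_group]
    return difference_in_means(policy_outcomes, control_outcomes)
```

Under these assumptions, simulate_rct() should return a value close to the assumed effect of 2.0. The remaining gap reflects the chance differences between the two groups described in the quotation above.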

Quasi-experimental designs

Quasi-experimental designs are based on the same logic and objective as experimental designs: exposing two different groups to exactly the same factors, except for the policy under evaluation, in order to assess the true impact of the policy. They differ in that the two groups are constructed using methods other than randomisation.

These alternative methods include matching and reflexive comparisons [4]. Matching techniques involve building a counterfactual by identifying individuals who do not take part in the policy under study but whose essential characteristics are similar to those of policy participants. Though usually quicker and cheaper to implement than experimental designs, matching tends to reduce the reliability of results, because of selection bias, and to make the results more difficult to analyse.
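One simple way to picture matching is nearest-neighbour matching, sketched below in Python. This is one possible variant chosen for illustration, not necessarily the technique the cited sources have in mind. The record layout (an "x" tuple of characteristics and a "y" outcome), the function names and the data are all hypothetical.

```python
from statistics import mean


def nearest_match(participant, pool):
    """Return the non-participant whose characteristics are closest.

    Closeness is squared Euclidean distance on the "x" tuple. In practice the
    characteristics would need to be put on comparable scales first.
    """
    return min(
        pool,
        key=lambda other: sum(
            (a - b) ** 2 for a, b in zip(participant["x"], other["x"])
        ),
    )


def matched_effect(participants, non_participants):
    """Average outcome gap between each participant and their closest match."""
    gaps = [p["y"] - nearest_match(p, non_participants)["y"] for p in participants]
    return mean(gaps)


# Invented records: x = (age, urban indicator), y = outcome of interest.
participants = [
    {"x": (30, 1), "y": 12},
    {"x": (50, 0), "y": 20},
]
non_participants = [
    {"x": (31, 1), "y": 10},
    {"x": (49, 0), "y": 17},
    {"x": (70, 1), "y": 30},
]

estimate = matched_effect(participants, non_participants)  # (2 + 3) / 2 = 2.5
```

The estimate is only as good as the characteristics used for matching. If participants differ from their matches in ways that are not recorded, the selection bias mentioned above remains.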

The reflexive comparison involves constructing a counterfactual based on the characteristics of individuals prior to their involvement in the policy under study. Participants are thus compared to themselves before and after their involvement. The main advantage of reflexive methods is that they make possible the evaluation of policies that cover the entire population, not just subgroups. A major limitation, however, is that the changes in the situation of a group before and after the implementation of a policy may be linked to a whole range of factors independent of the policy itself.
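The before-and-after logic, and its main limitation, can be shown with a short Python sketch. The figures are invented and the names (reflexive_estimate, policy_effect, background_trend) are hypothetical, chosen only to make the point visible.

```python
from statistics import mean


def reflexive_estimate(before, after):
    """Mean change in outcome for the same individuals, before versus after."""
    if len(before) != len(after):
        raise ValueError("before and after must cover the same individuals")
    return mean([a - b for b, a in zip(before, after)])


# Invented example: outcomes rise by 2.0 because of the policy and by a further
# 1.0 because of an unrelated background trend affecting everyone.
policy_effect = 2.0
background_trend = 1.0
before = [10.0, 12.0, 14.0]
after = [b + policy_effect + background_trend for b in before]

estimate = reflexive_estimate(before, after)  # 3.0, not the policy effect of 2.0
```

In this constructed case the reflexive estimate attributes the whole change of 3.0 to the policy, although only 2.0 of it is due to the policy. Without a separate comparison group, the two components cannot be told apart.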

Non-experimental designs

This alternative method of impact evaluation should be used when a counterfactual group cannot be constructed based on a random selection of individuals, when it is not possible to identify a group of individuals who are not participants in a policy but share essential characteristics with participants, or when a group cannot be identified for "before and after" comparisons.

In non-experimental designs, statistical methods and econometric techniques are used to compare participants and non-participants in a given policy. Such methods adjust for the differences between the two groups, and for issues such as selection bias, so that the true impact of the policy can be estimated. As is the case with quasi-experimental designs, this method tends to be cheaper and easier to implement than the experimental method, since it relies on existing data sources. However, the results are less reliable and, in statistical terms, such a method can involve a series of complex operations.
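One common statistical control of this kind is regression adjustment, sketched below in Python with a single observed characteristic. It is an illustration only: it assumes that the one recorded characteristic x captures every relevant difference between the groups, which is rarely true in practice. The function names and the data are hypothetical. The adjusted estimate is the least-squares coefficient on the participation indicator, computed by first removing the part of each variable explained by x.

```python
from statistics import mean


def residuals(v, x):
    """Residuals from a least-squares regression of v on x (with intercept)."""
    mx, mv = mean(x), mean(v)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (vi - mv) for xi, vi in zip(x, v)) / sxx
    return [vi - mv - slope * (xi - mx) for xi, vi in zip(x, v)]


def adjusted_effect(y, d, x):
    """Coefficient on participation d in a regression of y on d and x."""
    ry, rd = residuals(y, x), residuals(d, x)
    return sum(a * b for a, b in zip(ry, rd)) / sum(b * b for b in rd)


def naive_gap(y, d):
    """Unadjusted gap in mean outcomes between participants and non-participants."""
    treated = [yi for yi, di in zip(y, d) if di == 1]
    untreated = [yi for yi, di in zip(y, d) if di == 0]
    return mean(treated) - mean(untreated)


# Invented data in which outcomes follow y = 1 + 2*d + 0.5*x exactly, and
# individuals with higher x are more likely to participate (selection bias).
x = [1, 2, 3, 4, 5, 6]
d = [0, 0, 1, 0, 1, 1]
y = [1 + 2 * di + 0.5 * xi for di, xi in zip(d, x)]
```

On this constructed data, naive_gap(y, d) overstates the impact because participants also have higher x, while adjusted_effect(y, d, x) recovers approximately the assumed effect of 2. With real data the adjustment is only as credible as the assumption that all relevant differences have been observed.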


1. See online version of: Government Chief Social Researcher’s Office, Prime Minister’s Strategy Unit, Guidance Notes for Policy Evaluation and Analysis, Chapter 1: What is Policy Evaluation?, in The Magenta Book, Cabinet Office, London, 2004, p. 7.

2. Ibid.

3. For more information, see the website of the World Bank, section Evaluation Designs.

4. Ibid.