Four Waves of Evaluation Diffusion
Evaluation 16(3): 263–277
DOI: 10.1177/1356389010372452
Evert Vedung
Uppsala University, Sweden
Abstract
This article investigates the dissemination of evaluation as it appears from a Swedish and, to a lesser
extent, an Atlantic vantage point since 1960. Four waves have deposited sediments, which form
present-day evaluative activities. The scientific wave entailed that academics should test, through
two-group experimentation, appropriate means to reach externally set, admittedly subjective,
goals. Public decision-makers were then supposed to roll out the most effective means. Faith in
scientific evaluation eroded in the early 1970s. It has since been argued that evaluation should
be participatory and non-experimental, with information being elicited from users, operators,
managers and other stakeholders through discussions. In this way, the dialogue-oriented wave
entered the scene. Then the neo-liberal wave from around 1980 pushed for market orientation.
Deregulation, privatization, contracting-out, efficiency and customer influence became key phrases.
Evaluation as accountability, value for money and customer satisfaction was recommended. Under
the slogan ‘What matters is what works’, the evidence-based wave implies a renaissance for scientific
experimentation.
Keywords
dialogue-oriented evaluation, diffusion of evaluation, evidence-based evaluation, neo-liberal evaluation,
science-driven evaluation
Corresponding author:
Evert Vedung, Uppsala University, Sweden.
Email: evert.vedung@ibf.uu.se
In one sense, the definition is narrow: it excludes appraisals of merely considered interventions, performed before the interventions are adopted and put into practice. It
should be noted that assessments performed on interventions in empirical pilot trials are included in
the evaluation category. Also, groupings of several evaluations into meta-evaluations are considered
evaluations. In another sense, the definition is wide. It is not limited only to effects of interventions
and activities at the outcome level (i.e. in society or nature) but also includes outputs, implementation
processes, content and organization (Vedung, 1997: 2–13; 2006: 397).
Figure 1. The Engineering Model of the Initiation, Conduct and Use of Evaluation (depicting the flow from means knowledge through outputs to outcomes)
Illustration: Tage Vedung after the model of EV
Source: The figure is my own and it has evolved over the years. Inspiration has been drawn from Albæk, 1988: xx, Naustdalslid and Reitan, 1994: 50 ff., and Owen and Rogers, 1999.
In the engineering model, conceivable means are first tested in small-scale pilot trials for their ability to reach the goals set. The science-based findings from these trials are fed back to the decision-makers, who
make a proper binding decision to impose the most efficient of the examined interventions in full
scale (instrumental use). The intervention decision is then submitted to managers and operators,
who neutrally and faithfully implement it to produce the desired outcome.
According to the engineering model, intervention decisions should be taken in two quite clearly
discernible stages. The first, preliminary stage suggests that conceivable measures to reach given
ends should be rigorously tested in carefully designed, small-scale pilot trials. The findings of the
pilot trials should be fed back into the political system, which in a second stage, on the basis of the
findings, should arrive at a decision about the full-scale introduction of the most effective measure
to achieve the stated ends.
The engineering model posits that evaluative findings are used instrumentally. By instrumental
use is meant that evaluation discoveries about means are accepted as true and transformed into
binding decisions. Evaluation is neutral, objective research. It does not formulate problems. Nor
does it recommend ends (goals). The function of evaluation before an intervention is introduced on
a full scale is to help determine the most efficient means of achieving the previously stated ends,
that is, the means that will ensure goal achievement at the lowest cost. In the engineering model,
knowledge of means is produced in a value-neutral fashion, given that the evaluation meets the
highest scientific standards. For the ends have been set by the decision-makers, and finding the
most efficient means to reach these externally set ends (goals) is regarded as a proper task for
objective, empirical research (Simon, 1976: 37).
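The two-stage logic can be made concrete with a minimal sketch. The following illustration is mine, not the article's: the intervention names, goal-achievement shares and costs are invented, and the selection rule (lowest cost per unit of goal achievement) is just one simple way to operationalize 'most efficient'.

    from dataclasses import dataclass

    @dataclass
    class PilotResult:
        name: str
        goal_achievement: float  # share of the stated goal attained in the pilot (0-1)
        cost: float              # cost of the piloted intervention, arbitrary units

    def most_efficient(results: list[PilotResult], threshold: float = 0.8) -> PilotResult:
        """Stage two of the engineering model: among means that reach the
        externally set goal (here, at least `threshold`), pick the one with
        the lowest cost per unit of goal achievement."""
        effective = [r for r in results if r.goal_achievement >= threshold]
        if not effective:
            raise ValueError("No tested means reached the goal; revise the trials.")
        return min(effective, key=lambda r: r.cost / r.goal_achievement)

    # Hypothetical findings fed back from two small-scale pilot trials (stage one).
    pilots = [
        PilotResult("intervention A", goal_achievement=0.85, cost=120.0),
        PilotResult("intervention B", goal_achievement=0.90, cost=150.0),
    ]
    print(most_efficient(pilots).name)  # -> intervention A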
During the 1960s and even earlier, advanced evaluative thinking and practice were driven by this
notion of the scientification of public policy and public administration. Evaluation would make
government more rational, scientific and grounded in facts. Evaluation was to be performed by
professional academic researchers.
Guba and Lincoln (1989) describe the stakeholder dialogue that this wave demands:
The involvement of stakeholders … implies more than simply identifying them and finding out what their
claims, concerns and issues are. Each group is required to confront and take account of the inputs from other
groups. It is not mandated that they accept the opinions and judgments of others, of course, but it is required
that they deal with points of difference or conflict, either reconstructing their own constructions sufficiently
to accommodate the differences or devising meaningful arguments for why the others’ propositions should not
be entertained.
In this process a great deal of learning takes place. On the one hand, each stakeholder group comes to
understand its own construction better, and to revise it in ways that make it more informed and sophisti-
cated than it was prior to the evaluation experience … On the other hand, each stakeholder group comes
to understand the constructions of other groups better than before. Again we stress that that does not mean
coming to agreement, but it does mean gaining superior knowledge of the elements included in others’
constructions and superior understanding of the rationale for their inclusion.
… [a]ll parties can be mutually educated to more informed and sophisticated personal constructions as
well as an enhanced appreciation of the constructions of others.
Actually, for this wave of evaluation, Guba and Lincoln (1989: 43ff., 83ff.) proposed an alternative,
constructivist paradigm to the conventional, positivist, scientific paradigm in which the science-
driven wave was grounded (see also Dahler-Larsen, 2001). ‘It rests in a belief system that is virtually
opposite to that of science’, the authors argued.
Also labelled the naturalistic, hermeneutic or interpretive paradigm, the constructivist paradigm
was different from the positivist paradigm at three levels: ontology, epistemology and methodol-
ogy. As regards ontology (what is the nature of reality?) the constructivist paradigm denies the
existence of objective reality, asserting instead that realities are social constructions of the mind
and that there exist as many such constructions as there are individuals. There is no objective truth
on which inquiries can converge.
As regards epistemology (how can we be sure that we know what we know?), the constructivist
paradigm denies the possibility of subject–object dualism, suggesting instead that the findings of a
study exist because there is an interaction between observer and observed that literally creates what
emerges from that inquiry. One cannot find out the truth about how things really are or how they
really work. It is impossible to separate the inquirer from the one being inquired into. It is precisely
their interaction that creates the data that will emerge from the inquiry.
As regards methodology (what are the ways of finding out knowledge?), the constructivist para-
digm cannot use approximation to reality as the criterion to ascertain which construction is better
than others because the possibility of an objective reality is denied. Instead, it uses a hermeneutic-
dialectic process. As Guba and Lincoln (1989: 89–90) argue:
[A] process must be instituted that first iterates the variety of constructions (the sense-makings) that
already exist, then analyzes those constructions to make their elements plain and communicable to others,
solicits critiques for each construction from the holders of others, reiterates the constructions in light of
new information or new levels of sophistication that may have been introduced, reanalyzes, and so on to
consensus – or as close to consensus as one can manage. The process is hermeneutic in that it is aimed
toward developing improved (joint) constructions … It is dialectic in that it involves the juxtaposition of
conflicting ideas, forcing reconsideration of previous positions.
The science-driven wave rested upon means-ends rationality, emanating from the thinking of Max
Weber. Given that goals and objectives were set by bodies outside the scientific community and
expressly recognized as subjective, academic research could examine in experimental settings the
ability of various means to reach these externally set ends. Experiments would deliver objective
generalized truths about means. Other names for this train of thought in the social sciences are
behaviouralism and positivism. In contrast to the science-driven wave, the dialogue-oriented wave
rested upon communicative rationality. Instead of producing truths, dialogical evaluation would
generate broad agreements, consensus, political acceptability and democratic legitimacy.
As early as the final years of the 1960s, new social movements emerged that criticized elitist
central societal planning. Coming from the left of centre in the political spectrum, the criticism
also contained environmentalist tenets. Over time, the dialogue-oriented wave developed towards
a participatory criticism of extant representative democratic government.
In the early 1990s, proponents of the dialogical wave started to raise demands for more ‘delibera-
tive democracy’. Representative democracy, with its general elections and concomitant expert policy
analyses, should be supplemented by venues for serious policy communication among ordinary
people. Election campaigns and so-called ‘debates’ in municipal councils and other parliamentary
arenas mostly declined into ‘pie-throwing’, it was argued. New forums were needed, deep down in
systems where clients and other stakeholders could meet for serious discussions of existing public
interventions and proposals for new action. Evaluations were presented as appropriate arenas for
deliberative democracy, which might deepen representative democracy. The point was that users and
other stakeholders, by participating in such evaluative dialogues, as a long-term side-effect, would
learn to become more engaged and better citizens, and thereby strengthen representative democracy
(Sjöblom, 2003). The trend towards ‘empowerment’ and ‘empowerment evaluation’ can be traced to
this time, too.
Rarely does a turn of events fulfil its concomitant expectations. The pioneers of evaluation in the
1960s claimed that the problem with the system of representative democracy and the public sector
was that it was based too much on biased ideological beliefs, political tactics, pointless bickering,
passing fancies and anecdotal knowledge. The cure was a strong dose of science focusing specifi-
cally on intervention effects. Unbiased evaluation research would eradicate the aberrations of the
representative system of democratic governance. Yet, as we have seen, this science-inspired move-
ment soon ran into demands for more stakeholder involvement and dialogue and communication
among concerned stakeholders, interest groups and citizens. Both of these tendencies drew their
strength from the left wing of the political spectrum.
Around 1978–9, a third swing started to sway the field of evaluation. Politically, the new
Zeitgeist implied a turn to the right. Its banner was neo-liberal; its content was confidence in
customer orientation and markets. What was novel was not that goal achievement, effectiveness,
efficiency and productivity became catch phrases but that these objectives were to be achieved by
government marketization instead of stakeholder involvement or scientification from the top down.
Decentralization, deregulation, privatization, civil society and in particular, customer orientation
became new slogans. Previously regarded as the solution to problems, the public sector now
became the problem to be resolved.
The collective term for the neo-liberal public sector reform movement is New Public
Management (Hood, 1991; Hood and Jackson, 1991; Klausen and Ståhlberg, 1998; Osborne and
Gaebler, 1992; Pollitt, 2003; Pollitt and Bouckaert, 2004; Pollitt et al., 1999). More focus on
results, less focus on processes, was the fundamental idea in New Public Management. Under this
umbrella, New Public Management harboured a cluster of ideas drawn from administrative practices
in the private sector. The main dogmas of New Public Management are shown in Figure 2.
Figure 2. The main dogmas of New Public Management: more use of indirect instead of direct control, which means privatization and outsourcing; contracting out; purchaser–provider models; benchmarking; individual performance-based wages and salaries; management by objectives; delegation of control and responsibility; focus on quality and quality assurance
New Public Management contains three major elements. The first element is belief in leader-
ship. ‘Let managers manage’ is the battle cry. It is with leadership at the centre that new and more
dynamic results-oriented organizations should be created. The margin for leadership should
increase. This applies to political as well as agency leadership; to leadership in educational as well
as in service institutions. Leadership should be exercised by management professionals. Good
leadership must be taught and learned. Being an expert on the actual substantive issues is not
enough. Leaders must meet demands for performance, efficiency and other management skills.
The second element involves increased use of indirect instead of direct control. Total privatiza-
tion is included, but only as one tenet (Osborne and Gaebler, 1992: 45: ‘privatization is one arrow
in the government’s quiver’). A major point is that the government should act as the helmsman of
the ship of state, but not necessarily as an oarsman (‘steering not rowing’, Osborne and Gaebler,
1992: 25). In any case, the steering and rowing functions should be separated. Corporatization and
outsourcing of public services as well as increased competition are also important to boost flexibil-
ity, avoid wastage and counterbalance public employees’ self-interest.
The best-known feature of New Public Management is results-based management (ESV, 1999:
20; Kusek and Rist, 2004; Perrin, 1998; Pihlgren and Svensson, 1989; Sandahl, 1992). In its pure
ideal-type form, results-based management (performance-based management, management by
objectives) is a process of several steps (Table 3).
Table 3. Results-based management as an ideal-type, step-by-step process
1. The principal, for example the superior national authority, creates an overall vision and sets some clear outcome goals that indicate successive stages towards the overall vision.
2. The principal and the implementing agent, e.g. the municipality, together develop indicators of the successive outcome goals.
3. The agent develops indicators of measures that the target audiences, such as households, shall take, and of its own final outputs.
4. The agent is awarded financial resources as an unspecified lump sum.
5. Within the budgetary frames set by the principal, the agent independently chooses the means (outputs) to achieve the goals.
6. The principal announces that the indicators will be monitored and that evaluation in the form of effects evaluation will be carried out later.
7. To follow up, data on the indicators are gathered, preferably by the agent, who also accounts for them to the principal.
8. The principal evaluates whether the goals and objectives have been attained and whether the means chosen by the agents have contributed to goal achievement; this may be done on the basis of data emerging from the follow-up.
9. The principal and the agent use the follow-up and the outcome-effects analysis (evaluation) to correct goals and means.
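Steps 6 to 8 can be given a minimal, purely illustrative sketch. The sketch is mine, not drawn from the results-based management literature, and the indicator names, targets and observed values are invented.

    goals = {"households_sorting_waste": 0.60, "energy_saved_share": 0.10}      # principal's targets
    follow_up = {"households_sorting_waste": 0.64, "energy_saved_share": 0.07}  # agent's reported data

    def attainment_report(targets: dict[str, float], observed: dict[str, float]) -> dict[str, bool]:
        """The principal's check in step 8: has each indicator reached its target?"""
        return {indicator: observed[indicator] >= target
                for indicator, target in targets.items()}

    print(attainment_report(goals, follow_up))
    # {'households_sorting_waste': True, 'energy_saved_share': False}

On this basis, the principal and the agent would then correct goals and means (step 9).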
Because market prices cannot signal demand for public services, other mechanisms are needed to
improve the flow of information from the users of the intervention into decision-making processes. This can be achieved if
users are able to choose between alternative service providers or participate on institution and
agency boards. It can also be achieved through hearings and questionnaires. New Public Management
stresses that the authorities have to be more responsive and adapt to customers. An evaluation
moment is introduced when customer satisfaction with the service is measured, for example, with
the help of customer satisfaction indices.
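What such an index can look like is shown in the following sketch, which is my own illustration: real indices are usually estimated with model-based weights, and the item names, weights and scores below are invented.

    def satisfaction_index(scores: dict[str, float], weights: dict[str, float],
                           scale_max: float = 5.0) -> float:
        """Weighted mean of survey item scores, rescaled so the index runs 0-100."""
        total_weight = sum(weights.values())
        weighted_sum = sum(scores[item] * weights[item] for item in scores)
        return 100 * weighted_sum / (total_weight * scale_max)

    # Hypothetical customer survey for one public service (items scored 1-5).
    survey = {"accessibility": 4.2, "staff_treatment": 3.8, "waiting_time": 2.9}
    weights = {"accessibility": 1.0, "staff_treatment": 2.0, "waiting_time": 1.0}
    print(round(satisfaction_index(survey, weights), 1))  # -> 73.5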
It should be stressed that NPM, in contrast to the dialogue-oriented wave, is customer oriented,
not stakeholder oriented.
All this has not led to the disappearance of evaluation. In the neo-liberal wave, it is regarded as
imperative that the fundamental principal in a representative democracy, the demos, has a right to
know how her agents spend her money. This results in an increased emphasis on the accountability
of agents in terms of resource use, by checking for economy, effectiveness and cost efficiency.
Evaluation has thus been strengthened and, above all, taken on new forms. Evaluation has become
a permanent feature of results-based management and of outsourcing. Evaluation has taken on new
expressions in the form of accountability assessments, performance measurement and consumer
satisfaction appraisal. Quality assurance and benchmarking are also recommended.
It is noteworthy that randomized experiments and quasi-experiments are ranked the highest,
while user opinion of effects is ranked the lowest. This is a far cry from the strong client-orientation
of New Public Management. From the English wordplay ‘evidence-based versus eminence-based
medicine’ (Times Literary Supplement, 8 Feb. 2008) it is obvious that the evidence movement
wants to play down professional judgements in favour of scientific experimentation (cf. Rieper and
Foss Hansen’s excellent report, 2007). The evidence wave tends to structure the field from a social
science methodology point of view, not a political, administrative or client-oriented one. Tacitly at
least, it is based on means–ends rationality, where the task of evaluation is to enhance and dis-
seminate knowledge of means. The evidence movement, some pundits argue, involves the return
of science-based evaluation, but in a new disguise.
Most famous among the international cooperation bodies is the Campbell Collaboration, named
after Donald T. Campbell, the celebrated advocate of a science-based, experimental public policy.
Another global institution that works in the same spirit is the Cochrane Collaboration.
Typical of the operation of these new international bodies is that they do not carry out evalua-
tions themselves. Instead, they engage in meta-analysis, or synthesis analysis. The preferred
term, mentioned above, is ‘systematic reviews’. The Centre for Evidence-based Conservation in
Britain gives the following explication of the term ‘systematic review’:
Systematic review is a tool used to summarise, appraise and communicate the results and implications
of a large quantity of research and information. It is particularly valuable as it can be used to synthe-
sise results of many separate studies examining the same question, which may have conflicting find-
ings. Meta-analysis is a statistical technique that may be used to integrate and summarise the results
from individual studies within the systematic review, to generate a single summary estimate for the
effect of an intervention on a subject.
The purpose of a systematic review is to provide the best available evidence on the likely outcomes
of various actions and, if the evidence is unavailable, to highlight areas where further original research
is required. It is, therefore, a tool to support decision-making by providing independent, unbiased and
objective assessment of evidence.
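One standard way to ‘generate a single summary estimate’ is fixed-effect, inverse-variance pooling. The following sketch is mine, not the Centre's; the effect sizes and standard errors of the three hypothetical studies are invented.

    import math

    def fixed_effect_summary(effects: list[float], ses: list[float]) -> tuple[float, float]:
        """Pool study effects weighted by inverse variance; return the
        summary estimate and its standard error."""
        weights = [1 / se ** 2 for se in ses]
        pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
        pooled_se = math.sqrt(1 / sum(weights))
        return pooled, pooled_se

    # Three hypothetical studies of the same intervention, with conflicting findings.
    effects = [0.30, 0.10, -0.05]
    ses = [0.10, 0.15, 0.20]
    estimate, se = fixed_effect_summary(effects, ses)
    print(f"summary effect = {estimate:.2f} +/- {1.96 * se:.2f}")  # -> 0.20 +/- 0.15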
The roles recommended for evaluators are also different from those of the science-driven wave in
the 1960s. In the 1960s, academic evaluators would carry out experimentally designed evaluations
commissioned by governmental bodies. In the evidence movement, two roles are open. One is that
evaluation is conducted at universities as uncommissioned basic research. The second is that public-
sector practitioners should conduct research on their clients as part of their clinical work. The ideal
is the doctor who acts as a clinical practitioner towards her patients while also doing scientific
research on them.
Can studies carried out in academia as basic research really be characterized as evaluations?
The answer is yes. They revolve around interventions, i.e. they are action-oriented. They are care-
fully carried out, concentrated on intervention effects and intended for potential adoption as poli-
cies or programmes in public governance. And syntheses of several such studies are included in my
minimal definition of evaluation at the outset of this article.
In the 1950s and 1960s, the idea that decisions on public policies and programmes should rest
on scientific information came from the defence sector. The RAND Corporation in the US and the
Defence Research Institute in Sweden were the initial proponents of this notion. This time, the
impetus was provided by the medical field. It started with demands for evidence-based social
medicine, which later led to cries for evidence-based social work, evidence-based public health
and evidence-based crime prevention (Sherman, 2002).
Conclusions
Evaluation is currently an exceptionally fashionable management recipe. Virtually every public
sector intervention is and should be evaluated.
The third, neo-liberal wave sought the remedy not in dialogue and participation but in more market
orientation. Deregulation, privatization, efficiency and customer orientation became new key words. Evaluation came to be included
in a neo-liberal, market-oriented train of thought called New Public Management.
New Public Management pushed strongly for evaluation as accountability and value for money.
Accountability evaluation became a permanent feature of performance management and outsourc-
ing. Evaluation took on new expressions in the form of customer-oriented evaluation. Value-for-
money evaluation in the form of cost-effectiveness and productivity studies was highly regarded.
While gaining in strength, the fourth evaluation wave is not yet as strong as the scientific wave of
the 1960s and the neo-liberal wave that grew in popularity from 1980 onwards. Characteristic of this
evidence wave is an effort to make government more scientific and based on real empirical evidence.
It is concerned with what works. This can be interpreted as a renaissance of science and randomized
experimentation. It is basically driven from the right-of-centre end of the political spectrum.
References
Albæk, E. (1988) Fra sandhed til information: Evalueringsforskning i USA før og nu [From Truth to Information: Evaluation Research in the USA Then and Now]. Copenhagen: Akademisk Forlag.
Alkin, M. C. (2004) Evaluation Roots: Tracing Theorists’ Views and Influences. Thousand Oaks, CA: SAGE.
Dahler-Larsen, P. (2001) ‘From Programme Theory to Constructivism: On Tragic, Magic and Competing
Programmes’, Evaluation 7(3): 331–49.
ESV (Ekonomistyrningsverket, the Swedish National Financial Management Authority) (1999) Myndigheternas syn på resultatstyrningen [The Agencies’ View of Results-Based Management], by R. Sandahl. Stockholm: Ekonomistyrningsverket.
Furubo, J.-E. and R. Sandahl (2002) ‘A Diffusion Perspective on Global Developments in Evaluation’, in
J.-E. Furubo, R. C. Rist and Rolf Sandahl (eds) International Atlas of Evaluation, pp. 1–23. New Brunswick,
NJ, and London: Transaction Publishers.
Guba, E. G. and Y. S. Lincoln (1989) Fourth Generation Evaluation. London: SAGE.
Hajer, M. and H. Wagenaar (2003) Deliberative Policy Analysis: Understanding Governance in the Network
Society. Cambridge: Cambridge University Press.
Hood, C. (1991) ‘A Public Management for All Seasons?’, Public Administration 69: 3–19.
Hood, C. and M. Jackson (1991) Administrative Argument. London: Gower.
Karlsson, O. (1995) Att utvärdera – mot vad? Om kriterieproblemet vid intressentutvärdering [To Evaluate – Against What? On the Criterion Problem in Stakeholder Evaluation]. Stockholm: HLS Förlag.
Klausen, K. K. and K. Ståhlberg, eds (1998) New public management i Norden: nye organisations- og ledelseformer i den decentrale velfærdsstat [New Public Management in the Nordic Countries: New Forms of Organization and Management in the Decentralized Welfare State]. Odense: Odense Universitetsforlag.
Kusek, J. Z. and R. C. Rist (2004) Ten Steps to a Results-Based Monitoring and Evaluation System: A Handbook for Development Practitioners. Herndon, VA: World Bank Publications.
Osborne, D. and T. Gaebler (1992) Reinventing Government: How the Entrepreneurial Spirit is Transforming the
Public Sector From Schoolhouse to Statehouse, City Hall to the Pentagon. Reading, MA: Addison-Wesley.
Pawson, R. (2006) Evidence-Based Policy: A Realist Perspective. London: SAGE.
Perrin, B. (1998) ‘Effective Use and Misuse of Performance Measurement’, American Journal of Evaluation
19(3): 367–79.
Pihlgren, G. and A. Svensson (1989) Målstyrning: 90-talets ledningsform för offentlig verksamhet [Management by Objectives: The 1990s’ Form of Management for Public Services]. Malmö: Liber/Hermods.
Pollitt, C. (2003) The Essential Public Manager. Maidenhead: Open University Press.
Pollitt, C. and G. Bouckaert (2004) Public Management Reform: A Comparative Analysis, 2nd edn. Oxford: Oxford University Press.
Pollitt, C., X. Girre, J. Lonsdale, R. Mul, H. Summa and M. Wærness (1999) Performance or Compliance?
Performance Audit and Public Management in Five Countries. Oxford: Oxford University Press.
Evert Vedung is emeritus professor of political science, especially housing policy, at Uppsala University’s
Institute for Housing and Urban Research and Department of Government. His works on evaluation in English
include Public Policy and Program Evaluation (author, 1997, 2000) and Carrots, Sticks and Sermons (coeditor,
1998, 2003). Please address correspondence to: Uppsala University, IBF (Institute for Housing and Urban
Research), PO Box 785, SE-801 29 Gävle, Sweden. [email: evert.vedung@ibf.uu.se]