EVALUATING THE COMPLEX: ATTRIBUTION, CONTRIBUTION AND BEYOND
Editors: Robert Schwartz, Kim Forss, and Mita Marra
Overarching policy initiatives are now a standard modus operandi
for governmental and non-governmental
organizations. Some of these initiatives aim to affect the big problems
of the early 21st century: poverty, hunger, infectious disease,
unhealthy behavior and income disparity - to name a few. Reminiscent
of the American Great Society programs of the 1960s, policymakers
in various jurisdictions are allocating big resources to the solution
of big problems. Unlike the Great Society initiatives, new overarching
policy initiatives often harness a variety of programs and projects
to address different aspects of big problems for a variety of
population groups. There is now an understanding that there is no
single right intervention for all. This makes for complex strategies.
Complex policy initiatives are now commonly used to affect more
routine matters, such as school achievement and the advancement
of handicapped people. Reflecting an understanding that no one program
intervention can address all needs, these policy initiatives provide
an umbrella of resources and implementation infrastructure to advance
various projects adapted to localized conditions.
Overarching policy initiatives can also be found at the supranational
and even global levels in a broad range of areas, including environment,
security, trade, immigration, economic and social development. Trans-jurisdictional
fiscal and monetary policies are now common features of overarching
policy initiatives. Highly articulated strategies are, therefore,
decided globally while they affect people's lives locally.
Of course overarching policy initiatives are not new. What appears
to be new is a growing demand for effectiveness evaluations of complex
policy interventions at the international, national and local levels.
There is now pressure on politicians, stakeholders and senior management
officials to demonstrate that resources invested in policy initiatives
have been well spent. They need to justify the overall cost of the
policy and the allocation of policy resources to international,
regional and local programs and projects. Increasingly they are
asked what value has been obtained for the money, relative to alternative
investment channels. Evaluators are now frequently asked to address
these questions. Book chapters will address the extent of demand
for evaluating the effectiveness and cost-effectiveness of policy
initiatives.
Why this demand? Three overlapping mantras of contemporary public
and non-profit management offer possible explanations: accountability,
results-based management, and evidence-based policy.
Elaborating on Michael Power's (1997) depiction of the audit explosion,
several observers describe accountability overload in a variety
of jurisdictions. Voters love to hear that government will be more
accountable. But who is the government? Voters want to hold elected
politicians accountable, but ministers often feel that they do not
have sufficient control and information. Hence, politicians rely
on evaluation as one of the managerial tools with which to steer
the administration. Not only voters, but also those voted into power,
need evaluation. It really is an explosion. The popularity of accountability
has not been lost on politicians who have pushed for countless accountability
improvement stipulations in supranational, national and regional
jurisdictions. Large complex policy initiatives are natural targets
for accountability seekers as they expend big chunks of resources.
The no-longer-new public management reforms have had a lasting effect
on getting governments and NGOs to focus on results. NGOs in particular,
but also many government services, could in the past legitimate their
existence by pointing to the importance and relevance of the objectives
they were striving to achieve. They achieved symbolic legitimacy
by working, for example, for children in need, human rights, HIV/AIDS
victims, and so on. But as competition for funds, even in fields
such as these, has increased, symbolic legitimacy is no longer enough.
Demonstrating that resources have been spent and activities conducted
is no longer sufficient. Stakeholders, politicians and senior managers
insist on knowing what has been achieved with the resources. Results
count, and those who can point to results stand a better chance
in the fund-raising game. There are often requirements to divulge
results in performance measurement systems, annual reports and periodic
assessments. Results seekers are not often concerned with the difficulties
in attributing results to overarching policy initiatives or to particular
programs and projects. What they want is data.
Evidence-based policy is a relative newcomer to public management.
This movement encourages policymakers and managers to examine empirical
evidence for the effectiveness of existing and proposed policies
and programs. In its extreme manifestation, evidence-based policy
requires the production of evidence on the effectiveness of individual
projects, program interventions and policy strategies. Evidence
seekers contribute significantly to the demand for evaluation of
complex policy strategies.
Accountability fever, results-based management and the evidence-based
policy movement contribute to a sense that everything can and should
be evaluated. Indeed in many jurisdictions evaluation is now a standard
operating procedure, automatically included in budget and work plans.
This ought to please evaluators. Indeed there is now an abundance
of evaluation work to be done. As long as the demand is confined
to the project and program levels, the evaluation tool-kit is sufficiently
well-stocked to cope with a variety of evaluation needs.
Complex policy initiatives, however, challenge evaluators in new
and daunting ways, beyond the scope of what existing tools of the
trade can manage. Two distinguishing characteristics set evaluation
of complex policy initiatives apart: 1) attribution; 2) complexity/complicatedness.
Complex policy initiatives would almost always fail standard evaluability
assessment. Their objectives are seldom clear-cut and measurable.
And there is little evidence to support a causal linkage between unique
components (programs and projects), or mixes of these, and expected
outcomes, at least as causality is defined in classical texts on scientific
method, that is, that a cause must be proven necessary and sufficient for
the effect to be produced. Indeed in complex systems
the linkages amongst interventions, contextual variables and outcomes
may not be linear. In a nutshell, evaluations of complex policy
initiatives face a major challenge of attribution. Even when macro-level
outcomes are measurable, it is very difficult to attribute changes
in outcomes to the implementation of the initiative. The challenge
is even greater if there is no linear causality chain.
The editors have identified five approaches that might be helpful
in evaluating complex policy initiatives:
1. cluster evaluation
2. multi-level evaluation
3. quantified logic models
4. comparative community studies
5. thematic evaluations
These and other approaches will be examined through a series of
case studies of complex policy initiatives. Evaluators may also
seek inspiration from how complex social processes are described
and analysed in other social science disciplines, including perhaps
historical research, which has long sought to explain how
and why things happen in a complex environment such as the scene
of world history. Economics, too, has since its inception explored
the outcomes of policy decisions both on individual and state welfare,
explicitly highlighting the daunting methodological questions of
macro vs. micro units of analysis and the problems of data aggregation
and disaggregation. Economic thinking has offered the "theory"
to inform complex strategies while statistical analyses have gauged
their effects on national GDP, growth prospects, and development
patterns. Currently, sophisticated econometric studies are employed
to estimate the impact of international development aid on economic
convergence between developed and developing countries, or on poverty
reduction and income distribution. The question is whether and in
what circumstances such "evaluative" economic approaches
are complementary to the typical methodologies adopted by evaluators.
The book will begin with a couple of theoretical chapters addressing
attribution / contribution and complexity. These will be followed
by a series of case study chapters.
Each of these chapters should include:
1. Description of the complex policy initiative
2. Exploration of why the initiative came to be evaluated
3. Surfacing of the particular attribution and complexity challenges
in evaluating the policy initiative
4. Description of the evaluation approach taken
5. Analysis of the success of this approach in meeting the challenges
(strengths & weaknesses)
6. Consideration of alternative evaluation approaches
7. Discussion of the pros and cons of evaluating the policy at all
The concluding chapter(s) will identify opportunities for cross-learning
and make preliminary remarks about what approaches might prove useful
in different contexts. It will discuss if, when and how complex
policy initiatives should be evaluated and identify when, and under
what circumstances, it might be better not to evaluate.
Prospective authors have tentatively proposed chapters on the following
case studies:
1. Swiss tobacco control strategy
2. Public sector reforms
3. EU development policies
4. EU multi-level policies
5. EC aid policies
6. Structural reform in Danish municipalities
7. Innovation policy
8. Global responses to the HIV/AIDS pandemic
9. Dangerous offenders' policies
10. Australian case studies
11. World Bank country strategies
12. Sustainable environmental development strategies
EVALUATION AND THE EVIDENCE MOVEMENT
Editors: Frans Leeuw, Tom Ling and Olaf Rieper
1. Introduction (the editors)
Part 1: Concepts and history of the evidence movement
2. Where are we? Taking stock of the evidence movement
3. Evaluation and evidence in retrospect
4. The appearance of second-order evidence producing organizations
Part 2: Forms of evidence
5. How different forms of evidence inform evaluative judgments
6. The end users' evidence
7. Building on the perception of patients - user evidence of treatments
8. Rolling up evidence in multiple level evaluations
9. Strategies for improvement of evaluative evidence
Part 3: Applying evidence
10. Evidence/result based management in the public sector
11. Risk of evidence in result based budgeting
12. Counter-evidence: neighbouring sources of evidence
13. The role of social science in litigation - is that evidence?
14. Solving social crises - does evidence help?
15. Evidence based investments
Part 5: Outlook and reflections
16. Megatrends in scientific paradigms of evaluation: anything new?
17. Conclusion: risks and promises of the new/old evidence movement
EVALUATION - THE POWER GAME?
Editors: Pearl Eliadis, Jan-Eric Furubo, and Claus C. Rebien
Background:
Compared to the situation only ten to fifteen years ago, evaluation
activities have significantly increased. In northern European countries,
including the Netherlands, the UK, etc., thousands of evaluations
are produced annually. Estimates show that the cost of evaluation
today is billions of dollars in many of these countries. In Southern
Europe, much more attention is paid to evaluation than was previously
the case, basically due to the role that the EU has played as an
entrepreneur for evaluation. The World Bank and other institutions
have played similar roles in many other countries. In Canada, the
federal government adopted performance measurement and "results-based
management" as approaches to more accountable government that
have resulted in a higher demand for evaluation.
Many countries have seen institutional developments in terms of
the creation of special organs for evaluation, inspectorates, and
investigatory bodies and of course an expanded role for national
audit offices. These developments have often been accompanied by
evaluation systems that would ensure that decision makers have a
steady flow of information aiming to give them a basis for better
decisions.
Evaluation has therefore become an important part of public discourse
in many countries. Evaluation is intended to have an impact on decisions
and administrative behavior, and to influence the interpretation
of societal questions. It is also a strategic tool of increasing power
in setting the discourse on a range of critical public policy issues.
Evaluation in itself has therefore become something which needs
to be discussed. This inquiry takes us to three dimensions: players,
power and rules.
Who are the players? Who determines that an evaluation is needed?
Who asks the questions? Second, whose values are at stake, and who
sets the context? Whose interests are served, and how is evaluation
used strategically? Third, what are the rules of the evaluation
game?
Players:
1. Who decides which policy arenas or sectors will be - or will
not be - most susceptible to analysis or assessment through evaluative
information? Who negotiates this? And on behalf of whom?
2. Can the various degrees of use and production of evaluation be
explained by the role of a "dominant" discipline within
a sector? Or is it explained by the different roles played by bureaucracies
and administrative systems in different sectors? What role do stakeholders
play? What role do public attitudes play?
3. Who decides or creates the institutional arrangements which carry
out or commission evaluations?
4. Who decides the use and non-use of evaluation results?
Power: Setting Context and Values
5. To what degree do the planning, structuring and resourcing of
evaluation activities reflect underlying, perhaps hidden, values?
If the values underlying a specific policy are taken for granted
by the evaluation (system and individual evaluations), then the
evaluation process may simply be an exercise in affirming those
values rather than a meaningful effort to independently test them.
What are these values? Are these the values of the political majority,
the administrative system, or other interest groups such as private
organizations, think-tanks or others? Does evaluation have
a role in this inquiry?
6. Has New Public Management distracted us from fundamental questions?
The growth of "routinized" information may be used to
justify administrative processes or policy decisions but may be
of little relevance for fundamental re-examinations of policies.
Bureaucrats may commission evaluations, but then completely insulate
the evaluation and the evaluator from any meaningful influence over
either the policy or the result of its findings.
Rules
7. Do assertions of methodological superiority get used to validate
or devalue not only data but underlying policy objectives or ideologies?
8. Can the push to expand evaluation activities be explained by
the need for greater amounts of information as a form of quality
assurance? We are checking everything (quality of education, the
fire-brigades etc.) as a way to establish trust in decision-makers,
and therefore we do not question the value of the underlying policy
or the assumptions behind it.
Proposed Chapters:
* A good evaluator may not always be a good evaluatee - the case
of a government evaluation agency.
* Public policy, power plays and evaluation
* The marriage between administrative structures and evaluative
processes
* Negotiating Evaluation
* Who decides? Evaluation and audit institutions
* Evaluation: insulated in the "Rubber room".
* Principal-Agent Theory in Evaluation
* The straitjacket of randomized controlled trials
MIND THE GAP: PERSPECTIVES ON POLICY EVALUATION AND THE DISCIPLINES
Editors: Frans Leeuw and Jos Vaessen
Historically, evaluations have been shaped by the academic disciplines
of the evaluators conducting the work. For example, early experimental
studies by Hovland and others in the USA in the mid-1940s, which sought
to sort out the impact of information and communication strategies at
the end of World War II, and the evaluation of the Great Society programs
under President Johnson had a very strong disciplinary orientation (be it
communication sciences and psychology or sociology and economics).
By the 70s, particularly
in the US, evaluation had gained the status of a specific branch
of applied research within the social sciences, mainly characterized
by its specific methods (Rossi et al., 2004). From the 70s onwards
several developments took place which led to the further
emancipation of evaluation as a professional practice, along with
a weakening of its direct link with the social and behavioral
sciences. One of the main reasons for the emerging gap between evaluative
practice and the social sciences was in fact the increasing importance
and institutionalization of evaluation in public administration
and policy. While early evaluation studies in the first half of
the 20th century were largely researcher-led studies shaped by the
interests of scientists, in the second half, and especially from
the 70s and 80s onwards, evaluation agendas were more and more determined
by policy makers and administrators of public policy interventions.
This evolution went hand in hand with a decreasing demand for scientifically
rigorous evidence and more emphasis on alternative evaluation
approaches, as well as a growing interest in the utilization of
evaluation results, pioneered by the work of Weiss and others (see
Shadish et al., 2001; Rossi et al., 2004).
Over the last twenty
to thirty years, evaluation has grown in importance throughout the
world in different fields of public policy. As a result, the number
of people involved and specializing in evaluation has very probably
increased markedly. Evidence of this trend includes, for example, publications
such as The International Atlas of Evaluation (Furubo et al., 2002),
new journals covering the field, a growing number of evaluation
societies and increasing numbers of 'systems of evaluation' (Leeuw
and Furubo, 2008). With a growing group of evaluators, the main
reference point has become the assessment of merit and worth of
interventions as such rather than the evaluator's disciplinary background
with its particular theories and methodologies for evaluation. The
growing importance of evaluation as an activity has also led to
an increasing demand for the development of certain competencies
that evaluators must have. In the evaluation community, interesting
debates on processes, methods, skills and standards have continued
to flourish. Gradually, evaluation has begun to take on more or
less its own 'disciplinary' characteristics, with these issues becoming
the trademarks, the flying banners of evaluation as a young profession.
An increasing number of professional associations and training programs
(academic and non-academic) in evaluation are further signs that
evaluation has become something of a 'discipline' of its own.
Further signs of a decline
in attention for substantive knowledge generated by the social and
behavioral sciences in evaluative practice are the following:
- a mushrooming of evaluation approaches and the recent (as it
looks) increasing number of papers, studies and symposia dedicated
to topics such as 'the role of evaluation in democracy', 'evaluation
and ethics', 'values in evaluation' (etc.), emphasizing 'process'
as opposed to 'content';
- professional evaluators, rather than content experts, becoming
(eclectic) do-it-all masters of tools, processes and standards of
evaluation in different fields of intervention (see Stern, 2006);
- a deluge of studies on the evaluation of input allocation, implementation
processes and output delivery; in contrast, in many fields there
continues to be a palpable lack of rigorous (social and behavioral
science) theory-supported impact evaluation studies.
In the light of the foregoing,
we believe it is time to take stock of where we stand and where
we are heading in terms of the relationship between evaluation and
the disciplines. The aim of this book is twofold. Our first goal
is to highlight and characterize the gap between evaluation practices
and debates on the one hand and the substantive knowledge debates
within the social and behavioral sciences on the other. Second,
we do not want to stop there but also to show why this gap is problematic
for evaluation and at the same time illustrate possible ways for
building bridges. We consider our efforts in this book to be both
modest and bold. Modest in the sense that we make no attempt to
be comprehensive in terms of covering the complete array of evaluative
practices as well as fields of substantive disciplinary knowledge
relevant to policy interventions. In addition, we leave aside the
question of why the gap has grown, instead being satisfied with
the fact that to some extent the gap has been inevitable in the
sense that it is the result of evaluation becoming more firmly embedded
in policy practice. However, we will be bold in our efforts to point
out the dangers of the gap becoming too wide. In
this we do not linger too much on showing the shortcomings of 'theory-empty'
evaluations, but instead focus on the crucial added value of producing
evaluations grounded in social science research.
Outline
1. Introduction (Frans Leeuw and Jos Vaessen)
PART I: The Evolving Relationship between Evaluation and the Disciplines
2. US Sociology and Evaluation: Issues in the Relationship between Theory
and Methodology (Nicoletta Stame)
3. Evaluation and the Disciplines in Development (Robert Picciotto)
4. Economics and Evaluation (Sandra Speer)
PART II: Evaluation and its Disciplinary Basis
5. The Intellectual Underpinnings of a Comprehensive Body of Evaluative
Knowledge: The Case of INTEVAL (Lisa Birch and Steve Jacob)
6. Public Management Theory, Evaluation and Evidence-Based Policy
(Robert Schwartz)
PART III: Bridging the Gap between Evaluation and the Disciplines
7. Interventions as Theories: Closing the Gap between Evaluation and the
Disciplines? (Jos Vaessen and Frans Leeuw)
8. Middle-Range Theory and Programme Theory Evaluation: From Provenance
to Practice (Ray Pawson)
9. Realistic Evaluation and Disciplinary Knowledge: Applications
from the Field of Criminology (Nick Tilley)