EVALUATING
THE COMPLEX: ATTRIBUTION, CONTRIBUTION AND BEYOND
Editors: Robert Schwartz, Kim Forss, and Mita Marra
Overarching policy initiatives
are now standard modus operandi for governmental and non-governmental
organizations. Some of these initiatives aim to affect the big problems
of the early 21st century: poverty, hunger, infectious disease,
unhealthy behavior and income disparity - to name a few. Reminiscent
of the American Great Society programs of the 1960s, policymakers
in various jurisdictions are allocating big resources to the solution
of big problems. Unlike the Great Society initiatives, new overarching
policy initiatives often harness a variety of programs and projects
to address different aspects of big problems for a variety of
population groups. There is now an understanding that there is no
single right intervention for all. This makes for complex strategies.
Complex policy initiatives are now commonly used to affect more
routine matters, such as school achievement and the advancement
of handicapped people. Reflecting an understanding that no one program
intervention can address all needs, these policy initiatives provide
an umbrella of resources and implementation infrastructure to advance
various projects adapted to localized conditions.
Overarching policy initiatives can also be found at the supranational
and even global levels in a broad range of areas, including environment,
security, trade, immigration, economic and social development. Trans-jurisdictional
fiscal and monetary policies are now common features of overarching
policy initiatives. Highly articulated strategies are, therefore,
decided globally while they affect people's lives locally.
Of course overarching policy initiatives are not new. What appears
to be new is a growing demand for effectiveness evaluations of complex
policy interventions at the international, national and local levels.
There is now pressure on politicians, stakeholders and senior management
officials to demonstrate that resources invested in policy initiatives
have been well spent. They need to justify the overall cost of the
policy and the allocation of policy resources to international,
regional and local programs and projects. Increasingly they are
asked what value has been obtained for the money, relative to alternative
investment channels. Evaluators are now frequently asked to address
these questions. Book chapters will address the extent of demand
for evaluating the effectiveness and cost-effectiveness of policy
initiatives.
Why this demand? Three overlapping mantras of contemporary public
and non-profit management offer possible explanations: accountability,
results-based management, and evidence-based policy.
Elaborating on Michael Power's (1997) depiction of the audit explosion,
several observers describe accountability overload in a variety
of jurisdictions. Voters love to hear that government will be more
accountable. But who is the government? Voters want to hold elected
politicians accountable, but ministers often feel that they do not
have sufficient control and information. Hence, politicians rely
on evaluation as one of the managerial tools with which to steer
the administration. Not only voters, but also those voted into power,
need evaluation. It really is an explosion. The popularity of accountability
has not been lost on politicians who have pushed for countless accountability
improvement stipulations in supranational, national and regional
jurisdictions. Large complex policy initiatives are natural targets
for accountability seekers as they expend big chunks of resources.
The no-longer-new public management reforms have had a lasting effect
on getting governments and NGOs to focus on results. NGOs in particular,
but also many government services, could in the past legitimate their
existence by pointing to the importance and relevance of the objectives
they were striving to achieve. They achieved symbolic legitimacy
by working, for example, for children in need, human rights, HIV/AIDS
victims, and so on. But as competition for funds, even in fields
such as these, has increased, symbolic legitimacy is no longer enough.
Demonstrating that resources have been spent and activities conducted
is no longer sufficient. Stakeholders, politicians and senior managers
insist on knowing what has been achieved with the resources. Results
count, and those who can point to results stand a better chance
in the fund-raising game. There are often requirements to divulge
results in performance measurement systems, annual reports and periodic
assessments. Results seekers are not often concerned with the difficulties
in attributing results to overarching policy initiatives or to particular
programs and projects. What they want is data.
Evidence-based policy is a relative newcomer to public management.
This movement encourages policymakers and managers to examine empirical
evidence for the effectiveness of existing and proposed policies
and programs. In its extreme manifestation, evidence-based policy
requires the production of evidence on the effectiveness of individual
projects, program interventions and policy strategies. Evidence
seekers contribute significantly to the demand for evaluation of
complex policy strategies.
Accountability fever, results-based management and the evidence-based
policy movement contribute to a sense that everything can and should
be evaluated. Indeed in many jurisdictions evaluation is now a standard
operating procedure, automatically included in budget and work plans.
This ought to please evaluators. Indeed there is now an abundance
of evaluation work to be done. As long as the demand is contained
to the project and program levels, the evaluation toolkit is sufficiently
well-stocked to cope with a variety of evaluation needs.
Complex policy initiatives, however, challenge evaluators in new
and daunting ways, beyond the scope of what existing tools of the
trade can manage. Two distinguishing characteristics set evaluation
of complex policy initiatives apart: 1) attribution; 2) complexity/complicatedness.
Complex policy initiatives would almost always fail standard evaluability
assessment. Their objectives are seldom clear-cut and measurable.
And there is little evidence to support a causal linkage (as causality
is defined in classical texts on scientific method, that is, that
cause must be proven necessary and sufficient for the effect to
be produced) between unique components (programs and projects) or
mixes of these and expected outcomes. Indeed in complex systems
the linkages amongst interventions, contextual variables and outcomes
may not be linear. In a nutshell, evaluations of complex policy
initiatives face a major challenge of attribution. Even when macro-level
outcomes are measurable, it is very difficult to attribute changes
in outcomes to the implementation of the initiative. The challenge
is even greater if there is no linear causality chain.
The editors have identified five approaches that might be helpful
in evaluating complex policy initiatives:
1. cluster evaluation
2. multi-level evaluation
3. quantified logic models
4. comparative community studies
5. thematic evaluations
These and other approaches will be examined through a series of
case studies of complex policy initiatives. Evaluators may also
seek inspiration from how complex social processes are described
and analysed in other social science disciplines, including perhaps
in historical research, which has long sought to explain how
and why things happen in a complex environment such as the scene
of world history. Also economics, since its own inception, has explored
the outcomes of policy decisions both on individual and state welfare,
explicitly highlighting the daunting methodological questions of
macro vs. micro units of analysis and the problems of data aggregation
and disaggregation. Economic thinking has offered the "theory"
to inform complex strategies while statistical analyses have gauged
their effects on national GDP, growth prospects, and development
patterns. Currently, sophisticated econometric studies are employed
to estimate the impact of international development aid on economic
convergence between developed and developing countries, or on poverty
reduction and income distribution. The question is whether and in
what circumstances such "evaluative" economic approaches
are complementary with typical methodologies adopted by evaluators.
The book will begin with a couple of theoretical chapters addressing
attribution / contribution and complexity. These will be followed
by a series of case study chapters.
Each of these chapters should include:
1. Description of the complex policy initiative
2. Exploration of why the initiative came to be evaluated
3. Surfacing of the particular attribution and complexity challenges
in evaluating the policy initiative
4. Description of the evaluation approach taken
5. Analysis of the success of this approach in meeting the challenges
(strengths & weaknesses)
6. Consideration of alternative evaluation approaches
7. Discussion of the pros and cons of evaluating the policy at all
The concluding chapter(s) will identify opportunities for cross-learning
and make preliminary remarks about what approaches might prove useful
in different contexts. It will discuss if, when and how complex
policy initiatives should be evaluated and identify when, under
what circumstances, it might be better not to evaluate.
Prospective authors have tentatively proposed chapters on the following
case studies:
1. Swiss tobacco control strategy
2. Public sector reforms
3. EU development policies
4. EU multi-level policies
5. EC aid policies
6. Structural reform in Danish municipalities
7. Innovation policy
8. Global responses to the HIV/AIDS pandemic
9. Dangerous offenders' policies
10. Australian case studies
11. World Bank country strategies
12. Sustainable environmental development strategies
EVALUATION AND THE EVIDENCE MOVEMENT
Editors: Frans Leeuw, Tom Ling and Olaf Rieper
1. Introduction (the editors)
Part 1: Concepts and history of the evidence movement
2. Where are we? Taking stock of the evidence movement
3. Evaluation and evidence in retrospect
4. The appearance of second-order evidence producing organizations
Part 2: Forms of evidence
5. How different forms of evidence inform evaluative judgments
6. The end users' evidence
7. Building on the perception of patients - user evidence of treatments
8. Rolling up evidence in multiple level evaluations
9. Strategies for improvement of evaluative evidence
Part 3: Applying evidence
10. Evidence/result based management in the public sector
11. Risk of evidence in result based budgeting
12. Counter-evidence: neighbouring sources of evidence
13. The role of social science in litigation - is that evidence?
14. Solving social crises - does evidence help?
15. Evidence based investments
Part 4: Outlook and reflections
16. Megatrends in scientific paradigms of evaluation: anything new?
17. Conclusion: risks and promises of the new/old evidence movement
EVALUATION - THE POWER GAME?
Editors: Pearl Eliadis, Jan-Eric Furubo, and Claus C. Rebien
Background:
Compared to the situation only ten to fifteen years ago, evaluation
activities have significantly increased. In northern European countries,
including the Netherlands, the UK, etc., thousands of evaluations
are produced annually. Estimates show that the cost of evaluation
today is billions of dollars in many of these countries. In Southern
Europe, much more attention is paid to evaluation than was previously
the case, basically due to the role that the EU has played as an
entrepreneur for evaluation. The World Bank and other institutions
have played similar roles in many other countries. In Canada, the
federal government adopted performance measurement and "results-based
management" as approaches to more accountable government that
have resulted in a higher demand for evaluation.
Many countries have seen institutional developments in terms of
the creation of special organs for evaluation, inspectorates, and
investigatory bodies and of course an expanded role for national
audit offices. These developments have often been accompanied by
evaluation systems that would ensure that decision makers have a
steady flow of information aiming to give them a basis for better
decisions.
Evaluation has therefore become an important part of public discourse
in many countries. Evaluation is intended to have an impact on decisions
and administrative behavior, and to influence the interpretation
of societal questions. It is also a strategic tool of increasing power
in setting the discourse on a range of critical public policy issues.
Evaluation in itself has therefore become something which needs
to be discussed. This inquiry takes us to three dimensions: players,
power and rules.
Who are the players? Who determines that an evaluation is needed?
Who asks the questions? Second, whose values are at stake, and who
sets the context? Whose interests are served, and how is evaluation
used strategically? Third, what are the rules of the evaluation
game?
Players:
1. Who decides which policy arenas or sectors will be - or will
not be - most susceptible to analysis or assessment through evaluative
information? Who negotiates this? And on behalf of whom?
2. Can the various degrees of use and production of evaluation be
explained by the role of a "dominant" discipline within
a sector? Or is it explained by the different roles played by bureaucracies
and administrative systems in different sectors? What role do stakeholders
play? What role do public attitudes play?
3. Who decides or creates the institutional arrangements which carry
out or commission evaluations?
4. Who decides the use and non-use of evaluation results?
Power: Setting Context and Values
5. To what degree do the planning, structuring and resourcing of
evaluation activities reflect underlying, perhaps hidden, values?
If the values underlying a specific policy are taken for granted
by the evaluation (system and individual evaluations), then the
evaluation process may simply be an exercise in affirming those
values rather than a meaningful effort to independently test them.
What are these values? Are these the values of the political majority,
the administrative system, or other interest groups such as private
organizations, think-tanks or others? Does evaluation have
a role in this inquiry?
6. Has New Public Management distracted us from fundamental questions?
The growth of "routinized" information may be used to
justify administrative processes or policy decisions but may be
of little relevance for fundamental re-examinations of policies.
Bureaucrats may commission evaluations, but then completely insulate
the evaluation and the evaluator from any meaningful influence over
either the policy or the result of its findings.
Rules
7. Do assertions of methodological superiority get used to validate
or devalue not only data but underlying policy objectives or ideologies?
8. Can the push to expand evaluation activities be explained by
the need for greater amounts of information as a form of quality
assurance? We are checking everything (quality of education, the
fire-brigades etc.) as a way to establish trust in decision-makers,
and therefore we do not question the value of the underlying policy
or the assumptions behind it.
Proposed Chapters:
* A good evaluator may not always be a good evaluatee - the case
of a government evaluation agency.
* Public policy, power plays and evaluation
* The marriage between administrative structures and evaluative
processes
* Negotiating Evaluation
* Who decides? Evaluation and audit institutions
* Evaluation: insulated in the "Rubber room".
* Principal-Agent Theory in Evaluation
* The straitjacket of randomized controlled trials
EVALUATION AND THE DISCIPLINES
Editors: Frans Leeuw and Jos Vaessen
Introduction
The collection of papers will focus on the relationship between
the substantive knowledge funds available within the social science
disciplines and the theory and practice of evaluation.
Historically, evaluations have been shaped by the academic disciplines
of the evaluators conducting the work. Early experimental studies
by Hovland and others in the mid-1940s, which sought to sort out the
impact of information and communication strategies at the end of
World War II, as well as the evaluations of Great Society programs,
had a very strong disciplinary orientation (be it communication sciences
and psychology or sociology and economics). Even earlier, in the 1930s,
social scientists in the UK and the US carried out experimental studies
focused on reducing (juvenile) delinquency
(Oakley, 2000; Leeuw, 2005).
Nowadays, there is a large amount of policy-oriented research and
evaluations of interventions and programs carried out by evaluators
linked to specific academic disciplines, applying the concepts and
tools of these disciplines in the design and analysis of their evaluation
studies. Some of this work is carried out by academics active in
both social science research and evaluations of interventions
that are linked to their field.
While this link between research within the disciplines and specialized
evaluations is still manifest in a number of evaluations (e.g. economic
evaluations of price policies, medical treatment evaluations, crime
reduction studies), it is far from representative of the field of
evaluation as a whole. Over the last twenty to thirty years, evaluation
has grown in importance throughout the world in different fields
of public policy. As a result, the number of people involved full-time
and specializing in evaluation has markedly increased. With a growing
group of full-time evaluators, the main reference point for evaluators
became the assessment of merit and worth of interventions as such
rather than the evaluator's disciplinary background with its particular
theories and methodologies for evaluation.
The growing importance of evaluation as an activity also led to
an increasing demand for the development of certain competencies
that evaluators must have. In the evaluation community interesting
debates on processes, methods, skills and standards began to flourish.
As a result, gradually, evaluation began to take on more or less
its own disciplinary characteristics, with these issues becoming
the trademarks, the flying banners of evaluation as a young profession.
This evolution has had a number of implications for evaluation practice
and theory. Some of the following tendencies can be noted:
- a mushrooming of evaluation approaches and the recent
(as it looks) increasing number of papers, studies and symposia dedicated
to topics such as 'the role of evaluation in democracy', 'democratic
evaluation approaches', 'evaluation and ethics' (etc.), emphasizing
'process' as opposed to 'content';
- professional evaluators, rather than content experts, becoming
(eclectic) do-it-all masters of tools, processes and standards of
evaluation in different fields of intervention;
- a deluge of studies on the evaluation of input allocation, implementation
processes and output delivery; in contrast, in many fields there
continues to be a palpable lack of rigorous (social science) theory-supported
impact evaluation studies.
All in all these developments have (probably unintentionally) contributed
to a reduction of interest in the substantive (i.e. theoretical
and methodological) relationships between sociological, economic
and psychological knowledge on the one hand and the knowledge produced
and diffused by the evaluation community on the other hand. In
light of the foregoing, it is time to take stock of where we
stand and where we are heading in terms of the relationship between
evaluation and the (academic) disciplines.
Basic concepts
The Oxford dictionary defines a discipline as: 'an area of knowledge;
a subject that people study or are taught'. Following this definition,
disciplines are first and foremost associated with the traditional
academic disciplines: physics, sociology, psychology, biology, etc.
Hence, in this collection of papers, when talking about 'the disciplines'
we always refer to the academic disciplines, more specifically (given
their importance in evaluation) the social sciences.
Notwithstanding this particular definition of the concept, it is
important to put into perspective 'the disciplines' vis-à-vis
other areas of knowledge, with disciplinary characteristics, that
shape the practice of evaluation. There is something which for the
sake of simplicity we denominate as professional knowledge. Two
main dimensions of professional knowledge stand out. The first one
refers to bodies of knowledge and experience related to specific
fields of intervention, e.g. education, health, housing. To different
degrees, these fields are associated with distinct professional
communities of people who relate to each other on the basis of shared
experiences, knowledge, methods and norms and standards of practice.
The second dimension refers to bodies of expertise and knowledge
transcending fields of intervention. Examples are financial planning
and management, organizational design, human resource management.
As in the case of professional fields, to some extent these areas
of expertise have their own professional communities of specialists.
Both dimensions, fields of intervention and bodies of expertise
across fields of intervention, are often perceived as disciplines.
Conceptually, evaluation could be perceived as a 'professional discipline',
next to for example financial planning and management. However,
apart from evaluation being the focus of this collection of papers
there are other reasons to view evaluation as a separate category,
a separate 'type' of discipline if you will. First and foremost,
evaluation is the practice of assessing different dimensions (like
organizational design, budgetary planning, i.e. transcending dimensions
of policy interventions) in different fields of policy intervention.
As such, it can be appropriately called a trans-discipline. Second,
the substantive knowledge and methodological basis of evaluation
transcends single academic disciplines. An evaluator (ideally) eclectically
applies different tools and insights from social science disciplines,
as warranted by the evaluative problem at hand. Consequently, this
provides another argument for viewing evaluation as a trans-discipline.
The foci of the collection of papers (tentative)
Many of the papers in this collection look at the evaluation and
social science interface from the perspective of one (or a few)
particular dimension(s) or field(s) of policy intervention. Other
contributions adopt a more general perspective to the issue. The
first section addresses the linkages between evaluation and the
social sciences from a historical perspective. In the first paper,
Nicoletta Stame describes for three different fields of intervention
how evaluation as a professional practice has evolved over time,
paying special attention to the role of specific social science
knowledge therein. Bob Picciotto looks at five decades of development
intervention. He describes the evolution of development evaluation
under the influence of changing policy paradigms. This in turn has
implications for the way in which (mainly) economic theory is mobilized
in evaluation. Finally, he discusses how current global changes
and new challenges in development intervention require new evaluative
approaches securely backed up by relevant social science disciplinary
knowledge and methods.
Subsequent contributions are illustrations of the extent to which
evaluation studies are inspired by social science theories and methods.
The first contribution in this section by Steve Jacob develops an
interesting perspective on the relationship between evaluation and
the disciplines by looking at the intellectual underpinnings of
the 12 evaluation books produced by INTEVAL. Over the years INTEVAL
has covered a wide area of issues related to evaluation, including
contributions from a diverse group of evaluation practitioners and
academics. As a result, the analysis provides interesting insights
on particular social science knowledge funds as inputs to different
areas of evaluation practice and debate. Subsequent contributions
are in-depth illustrations, covering specific bodies of evaluation
studies in a particular field or on a particular dimension of intervention
in relation to relevant knowledge funds from social science disciplines.
Kim Forss explores the coordination dimension and how it is addressed
in evaluation. Starting out from theoretical insights from the social
sciences on coordination, he subsequently assesses to what extent
these insights are reflected in evaluation reports. Looking at a
small stratified sample of evaluation studies, he arrives at the
conclusion that reports are all in all 'theory-empty'. The question
of why this is so is raised at the end of the paper. Rob Schwartz
presents a desk study covering a large body of evaluation research
studies on performance assessment. He analyzes to what extent studies
are theory-oriented, i.e. theorize about the dimension of performance
assessment. More specifically, he looks at three social science
theoretical constructs relevant to performance assessment and the
extent to which they are included in studies about performance assessment.
Subsequently, for the small proportion of studies that do in fact 'theorize'
about the issue, he analyzes how theoretical constructs help further
the understanding of performance assessment. He also explores
what the studies say about the theories as such. Finally, the paper
looks at the issue of how studies that 'theorize' about performance
assessment deal with contextual dimensions, i.e. the circumstances
under which something works. Sandra Speer's contribution transcends
the boundaries of a particular dimension or field of intervention
and looks at the linkages between economics and evaluation in general.
In order to arrive at a feasible yet comprehensive treatment of
this issue, she focuses on the work of recent Nobel laureates. Starting
out from an overview of the major intellectual achievements of Nobel
laureate economists, she subsequently discusses how the work of
each of these laureates has influenced evaluative thought and practice.
A final section of the collection of papers addresses the question of how the relationship
between evaluation and the social science disciplines can be restored.
Steve Jacob does not directly address this question but instead
zooms in on one of the important aspects of the relationship, i.e.
the mono-disciplinary versus multi-disciplinary basis of evaluation
studies. In many cases, evaluators are challenged to deliver a broad
evaluative analysis on different dimensions of a policy intervention.
While in principle such a situation would call for a multidisciplinary
approach to evaluation, some critical reflection appears to be warranted.
Starting out from the analytical (epistemological and methodological)
foundations of multi- versus mono-disciplinarity in evaluation,
the author then critically illustrates, on the basis of examples, the
potential advantages (e.g. knowledge pooling, out-of-the-box thinking)
and pitfalls (e.g. dilution of specialist insights, operational
constraints) of multi-disciplinary evaluation. Jos Vaessen and Frans
Leeuw look at theory-oriented evaluation as a basis for reconciling
disciplinary research with evaluative practice. They start out by
describing the main defining characteristics of the theory-oriented
tradition in evaluation, paying particular attention to the role
of social science theory. Subsequently, the authors present a concrete
framework of how social science substantive knowledge can be usefully
incorporated in an evaluation exercise. Both the design of an evaluation
and the corresponding analysis of the policy problem benefit
from building on existing substantive knowledge generated in the
social sciences. In turn, the evaluation study can effectively test
certain theoretical strands as well as contribute to the overall
knowledge fund of what works in particular conditions, thus providing
a positive feedback loop back to the social sciences. The framework
is illustrated with an example.
Tentative structure based on draft papers received to date
(0. Introduction)
Historical perspectives
1. The origins of evaluation and the disciplines
2. Evaluation and the disciplines in development - tensions, dilemmas
and synergies
Illustrations
3. The intellectual underpinnings of a comprehensive body of evaluative
knowledge: the case of INTEVAL (working title)
4. Public management theory, evaluation and evidence-based policy
5. From banal feelings to exalted visions - the case for competence
in evaluation
6. Economics in evaluation (working title)
Reconciling evaluation and the disciplines
7. Cross-disciplinarization: a new talisman for evaluation?
8. Interventions as theories: closing the gap between evaluation
and the disciplines?
9. Conclusions