
Critical Review of Goal Free Evaluation

By Dexter N. Pante

Introduction

The question about the purpose of evaluation is a crucial issue that needs to be addressed whenever evaluation is undertaken. It is important because the answer clarifies the rationale and scope of the evaluation and even determines the applicable approach and method (Davidson, 2005; Walsh, 1980).

Traditionally, there are two ways of responding to this question. The first is to determine whether the objectives of the evaluand, the object of the evaluation, have been achieved. Conceived by Ralph Tyler during the early 1930s, this approach, popularly known as goal-based evaluation (GBE) or goal attainment evaluation (GAE), remains to this day the predominant and most common denotation of evaluation (Stufflebeam & Shinkfield, 2007). It encompasses the third and fourth evaluation forms of Owen (2006), namely monitoring and impact evaluation, respectively. The former aims to determine current progress towards the goals, while the latter measures the evaluand against its goals at the end point of implementation.

The second option is to measure the extent to which the evaluand addresses the needs of the beneficiaries and of society in general. This paper uses the term goal-free evaluation (GFE), consistent with the ideas of Michael Scriven (1991), who originally coined the term, instead of alternatives such as consumer-oriented or needs-based evaluation (Coryn & Hattie, 2006; Owen, 2006; Scriven, 2004; Stufflebeam & Shinkfield, 2007). According to Scriven (1991, p. 56), goal-free evaluation is the evaluation of actual effects against (typically) a profile of demonstrated needs.

In the next sections, this paper situates GFE within the other ideas of Scriven to demonstrate the soundness and consistency of his reasoning and to show the strengths of GFE over GBE.
It also assesses the worth of the approach in evaluation practice and its significance to the advancement of the evaluation profession. Finally, the paper discusses the limitations of this approach as well as the future application of GFE in combination with other approaches and in complex environments.

Goal-free evaluation: a misunderstood evaluation approach?

As mentioned earlier, GFE proposes abandoning program goals as the standards against which to measure performance, in favour of the needs of the participants and society. At first glance, this approach seems odd or even nonsensical, a reaction that Scriven (2004) admitted hearing from critics. In practice, evaluation practitioners always refer to the goals as defining the scope of the program, of staff work and of the evaluation. Hence, the notion that one's performance would be evaluated on some basis other than one's work is unthinkable and would be seen as unfair. The next paragraphs contextualise GFE within Scriven's ideas on evaluation to show the internal coherence and consistency of his arguments.

Goal-free evaluation is part of the consumerist orientation of Scriven's concept of evaluation. This orientation is born out of three important factors: his understanding of the history of philosophy, his present concern with the focus of evaluation, and his vision of the future of the evaluation discipline. In a sense, understanding GFE requires situating Scriven in a historical milieu.

Firstly, Scriven's idea of evaluation is characterized by his critique of the philosophical movement called logical positivism. This movement was started in the 1920s by a group of philosophers, scientists and mathematicians called the Vienna Circle. These positivists claimed that humans can have knowledge of nature through the use of the senses (Bell, 1997; Dorst, 2007). One of their doctrines severely criticized by Scriven was the notion of value-free science. The value-free doctrine is the belief that facts and values are distinct and that the scientific methods humans use to discover the objective world are independent of value judgments (Beauchamp, 1998). The Vienna Circle believed that the role of science is only to describe the world, not to prescribe what it ought to be. Scriven pointed out that many disciplines, such as medicine, engineering and agriculture, pronounce judgments all the time; hence, he considered the doctrine a misguided belief. For Scriven, the traditional notion that evaluation merely measures the attainment of goals perpetuates the value-free doctrine. In traditional evaluation, the evaluator passively looks at whether or not goals were achieved without pronouncing a judgement about the merit or worth of those goals. At best, GBE is only a partial approach, because evaluation is about judging the merit, worth and significance of an evaluand (Stufflebeam & Shinkfield, 2007). Scriven argued that this is a serious flaw of GBE.

Secondly, when Scriven spoke about evaluation, he was referring to the entire gamut of evaluands, not only to the subset of program evaluation (Scriven, 2004). He always cautioned readers that there are many types of evaluation not covered by program evaluation, such as product evaluation, personnel and student evaluation, and meta-evaluation, among others. So when he talked about the applicability of GFE, he was referring to the entire evaluation enterprise. He acknowledged that although GFE is rarely undertaken in program and policy evaluation, this is not the case in product evaluation, where a variant of GFE called blinding or masking is commonly used (Scriven, 1993).
This is why he advocated GFE: he was speaking about evaluation not in a specific sense but in a general one.

Finally, it is important to note that Scriven advocated the professionalization of evaluation as a transdiscipline with its own subject matter, methods, field of application and societal contribution (Scriven, 1993, 2004). He argued that it is a hallmark of professionalism when evaluators ponder the contribution of their discipline to the welfare of humankind and of the environment, and not merely the extent to which they support the decisions of management. This elevation of human needs is consistent with GFE and its emphasis on the needs of the participants and of society as criteria for measuring program success. This is another justification for the importance of GFE in his overall concept of evaluation.

Now that a brief context for GFE has been provided, a presentation of its beneficial features illustrates the strengths of this approach. Goal-free evaluation differs from GBE in the following areas: 1) the importance of goals, 2) impartiality and 3) the value of unintended effects.

The first defining feature of GFE is the thesis about the superfluity of goals in assessing the merit or worth of the evaluand. Scriven stated that goals might be very ambitious, unrealistic or even unethical, and thus difficult to achieve. There can also be instances where the evaluand is constantly evolving, so that no goal is identifiable, or where existing goals are ambiguous. Even when clarificative evaluation is an option, it is often not advisable because of the complexity of the evaluand and the exorbitant cost of the clarification process. There might also be cases where rhetorical goals, what a program seeks to attain, differ from real goals, what people actually work for (Patton, 2005).
In these situations, the goal-based evaluator could throw up his hands in frustration or iterate the development process just to specify, clarify and arrive at measurable goals. Indeed, if the problem is the clarity and grandiosity of the goals, why not just improve how they are stated? This would seem more sensible than throwing goals away. Scriven countered this argument in two ways. First, improving the goal statement still does not answer the question about the merit of the goal. The goal could still be unethical, arbitrary and capricious. This is particularly true in societies where program designers, policy makers and politicians have no legitimacy; hence, their goals are questionable. The second way concerns goal-bias, which is elaborated below as the next feature of GFE.

In the absence of goals, what criteria would then be used in evaluation? While Scriven espoused that evaluation can be done without goals, it does not follow that evaluation could be implemented without criteria. He criticized those who misinterpreted him as merely substituting the evaluator's goals for the program's goals as standards (Scriven, 1991). For him, this would invalidate the evaluation. He claimed that the criteria that should be used are the needs of consumers and society. Unlike GBE, the purpose of GFE is to assess the evaluand not against its goals but on the basis of consumer and societal needs.

An analytical tool that implements GBE is the logical framework, which describes the causal connection of resources, activities, outputs and outcomes. In the logical framework, needs and goals are two sides of the same coin; goals are needs stated positively. Stufflebeam and Shinkfield (2007), in fact, raised this issue. Alkin (1972) even commented that goal-free is a misnomer because GFE still has goals; the goals are just more distant and have different audiences. Still, Scriven and Roth (1990) responded that a need is more than the discrepancy between something real and something ideal. A need is something required for a satisfactory mode of existence. For example, Scriven cited Vitamin C as a need because its absence will make one sick. Here the need is real (Vitamin C) rather than ideal (to be healthy). Scriven and Roth (1990) pointed out that consumers often act the way they do because of their actual needs, not because of goals they have set in mind.

The second feature is the control of goal-bias.
The goal-free evaluation approach tries to control a type of bias that is overlooked in GBE. According to Scriven (2011, p. 87), bias is,

a statistically likely to tendency to systematic error and an actual systematic increase in the frequency of errors . . . Bias, in the first sense, is statistical tendency of error in a group of which you are a member; in the second sense, it is a tendency which has in fact infected you. (emphasis supplied)

To illustrate, imagine the situation when professors are grading students' assessment papers. The professors are biased already, in the first connotation, because they know the students. The professors may have a tendency towards bias, in the second connotation, but they may overcome it through the application of proper strategies.1 It is with the latter sense that Scriven concerned himself because it undermines impartiality.

Scriven (2011) explained that goal-bias occurs when the goals limit the vision of the evaluator, preventing the detection of other outcomes that are as important as the intended outcomes. This kind of bias affects the independent judgment of program staff, management and even evaluators. In GBE, staff, management and evaluators are already biased because of their knowledge of the goals. Indeed, there is a conflict of interest and a risk of loss of objectivity when program staff and managers evaluate their own program. Medical studies corroborate this argument. Misra (2012) claimed there is clinical evidence showing that outcomes are affected by the expectations of patients and investigators.
1 Another example can be found in Scriven (2011).

To prevent this from happening, researchers have devised the blinding or masking method. A highly popular variant of blinding is the double-blind experiment, where neither the investigator nor the subject knows which treatment the subject is receiving (Hoffer, 1967). The idea of blinding is to isolate the influence of bias in the evaluator and of faith in the patient so that the effect of the treatment can be measured independently. Scriven suggested that GFE is akin to a double-blind experiment and can be used as an evaluation approach to reduce goal-bias (Scriven, 1991, 2011).

One criticism of this argument is that goal-bias could be easily remedied through training and through the use of an external evaluator. Scriven (2011) agrees on the importance of professional training in evaluation. In fact, it could be argued that part of that professional training should be about GFE.

The third feature of GFE is its de-emphasis of the dichotomy between intended and unintended effects, or side effects. Scriven saw the prioritization of intended effects as a narrow view of program effects. In product evaluation, especially in medicine, side effects are often more important because the treatment is futile if the side effects outweigh the intended benefits. In rejecting this dichotomy, Scriven (1991) put the search for unintended effects on an equal footing with the search for intended effects. Taking all effects into consideration allows GFE to provide a better view of the evaluand. Some authors held that findings of no program effect are indicative of the inadequacy of GBE (Chen & Rossi, 1980). Their claims supported Scriven's call to look beyond goal attainment as the criterion of merit. In GFE, knowledge of unintended effects may point to the complexities of the evaluand and thus generate a more compassionate report.
This can occur when findings of positive unintended effects save an otherwise unsuccessful program; hence the importance of keeping unintended effects as part of the inquiry. The criticism here is the impracticality of testing for all possible effects. Scriven (1991) suggested that one can prioritize effects by referring to previous experience and knowledge from the research literature. The other option is to proceed with GFE and then use GBE as a means to rank which effects to probe further. Scriven (2011) concluded that the reversibility and flexibility of GFE make it superior to GBE.

To summarize, we have cited three features of GFE and provided sound reasons for them. We have also described how Scriven parried the criticisms that GFE substitutes the evaluator's goals for the program's goals and that testing all effects is impractical, and how he offered practical suggestions for improving the objectivity of evaluators.

Worth and Significance

This section assesses GFE in terms of worth and significance. This paper operationalizes Scriven's definition of worth as the value of GFE to evaluation practice. Meanwhile, significance means the contribution of GFE to the evaluation profession.

Firstly, GFE is valuable to evaluation practice because it drew the attention of evaluators to the importance of looking at unintended effects and the actual needs of consumers. Scriven's arguments for unintended effects enlightened evaluators about the limiting effects of focusing only on goals. Harpham, Burton, and Blue (2001) argued similarly that GFE is noteworthy because it woke evaluators from the intellectual slumber induced by GBE. Also, Owen (2006) observed that inquiry into unintended effects, cost-benefit analysis and comparison of alternative products have now become key questions in any impact evaluation.
Further, GFE is a good addition to the available evaluation approaches and provides the groundwork for the development of alternative approaches and methods such as multi-goal evaluation, criteria-based evaluation and the most significant change method. These approaches are discussed in relation to GFE in the last section of this paper.

Secondly, GFE is significant because its emphasis on consumer and societal needs as the criteria of merit and worth redefined the role of the evaluator from mere management support to an enlightened surrogate consumer (Scriven, 1993; Stufflebeam & Shinkfield, 2007). This means evaluators should serve as a social conscience to ensure that evaluands are of good quality and serve the welfare of humankind (Stufflebeam & Shinkfield, 2007). Also, Scriven's thrust on objectivity supports his stand on the importance of establishing ethical standards among evaluators as professionals.

Limitations and Constraints of GFE

It is important to note that GFE has several limitations. Firstly, it is highly dependent on the skill of the evaluator. It requires that the evaluator have a keen perception of the changes occurring within the context of the evaluand, a perception honed through extensive experience and knowledge. The evaluator should also have a wide array of interpersonal skills such as negotiation, mediation and networking, among others. These skills are necessary to address the next limitation of GFE.

Secondly, the risk of conflict and misunderstanding is high when using GFE. Conflict could arise between the following groups:

a. Evaluators and management - Conflict could happen between these two when management feels that its performance and the performance of the evaluand are being improperly evaluated through the use of criteria outside its scope of work. Patton (2005) also argued that GFE might veer away from the concerns of management, which has a strong interest in using the results of the evaluation to improve the program.

b. Evaluators and beneficiaries - Beneficiaries who do not understand GFE might view the motives of the evaluator with suspicion, especially when they are asked about program side effects, a term which carries a negative meaning.

c. Management and staff - Intra-office misunderstanding could occur when, as a consequence of the evaluator's penchant for questioning the goals, program staff become sceptical and critical about the feasibility and appropriateness of the goals set by the program designers.

d. Management and beneficiaries - In case of findings of adverse side effects, management might accuse the beneficiaries of misusing the evaluand.

e. Management and those who commissioned the evaluation - Since evaluation is a highly political activity, GFE might be perceived as a strategy of one group to discredit the work of a rival group by looking at the side effects.

Thirdly, there is not yet a well-developed and comprehensive method to implement GFE (Youker, 2011). Evers (1980) and Walsh (1980) claimed that examples of GFE are very rare. Even the double-blind methodology, which works very well in product evaluation, is unclear in its application to program evaluation. Hence, while the arguments in support of GFE are sound and convincing, the absence of a clear procedure discourages evaluators from implementing it. This is in stark contrast with GBE, for which the logic model or logical framework provides a detailed step-by-step guide to monitoring and evaluation.

Finally, it can be argued that no single evaluation approach can answer every question. Evaluation approaches are useful only to the extent that they answer evaluative questions. Patton (2010) referred to this as methodological appropriateness. Scriven (1991) admitted that the emphasis on unintended effects is one of the trade-offs of GFE. He argued, meanwhile, that his intention with GFE was to make people critical of what is not asked rather than what is asked.

Future of Goal-free evaluation

It seems that the future of GFE lies with the next generation of evaluation theorists, who will build on Scriven's idea and extend it to other approaches and methods. This is necessary because Scriven did not provide a clear process for conducting GFE.

Scriven (2011) suggested the use of GFE as a complement to GBE. This requires insulating one member of the evaluation team from any information about the program goals. That evaluator could start with GFE and, after collecting sufficient data to draw inferences about the effects, other team members could critique the findings using their knowledge of the program goals to further refine the hypotheses.

Another idea, pointed out by Alkin (1972), is to look at the needs in GFE as distant goals. Seen this way, GFE would be similar to the proposal of Chen and Rossi (1980), which they called the Multi-Goal Theory-Driven Approach (MGA). Like GFE, MGA recognizes that there are intended and unintended program effects. However, MGA proposes the use of multiple goals based on social theory, research and knowledge as standards to discover the other effects. As shown by the examples of Chen and Rossi (1980), it is important to orient management about this approach, as this improves the feasibility of implementing MGA. Hence, a little tinkering with the name, but not necessarily with the process, would perhaps increase the chance of seeing GFE implemented.

The other option is to consider the attainment of both goals and needs as criteria in the evaluation. This approach, called criteria-based evaluation, looks at the qualities of the evaluand that are important to evaluate (Cronholm & Goldkuhl, 2003). In criteria-based evaluation, the evaluator starts by assessing the alternatives against different criteria that measure the attainment of certain outcomes (Andalecio, 2004). Eilat, Golany, and Shtub (2008) cited the balanced scorecard (BSC) as an example of a criteria-based method.
The BSC is a management tool which uses the concept of cards to represent different managerial perspectives. Our organization, the Department of Education, also uses the BSC, which identifies the attainment of organizational goals and the needs of its constituents as measures of organizational performance (Valisno, 2010).

Goal-free evaluation could also be linked with the Most Significant Change (MSC) approach. Davies and Dart (2005, p. 8) and Sigsgaard (2002) described MSC as a form of participatory monitoring and evaluation that does not use indicators. Simply stated, MSC starts with the collection of significant stories from program participants and then proceeds to determine which of these stories is the most significant. Since the emphasis is on stories of program effects rather than indicators based on goals, MSC is very suitable for use in GFE. MSC could even be set up so that the group members who collect the stories and decide on the most significant one are unaware of the program goals. Another feature of MSC that is similar to GFE is its client-centred orientation. In MSC, the clients' needs and how the program addressed those needs are the main focus of inquiry.

The prospect of GFE is also bright in complex environments, such as international humanitarian emergencies, politics and creative organizations, where there are many unexpected outcomes. In complex environments, there might be no goals, the goals might be constantly evolving, or the context might have so many unaccounted factors affecting the evaluand that it is impossible to predict the effects. In such environments, GFE would be useful because it has goal-independent criteria against which to assess the value of an evaluand.

Reflecting on my own professional practice, I see Scriven's reasons for GFE as still very important to bear in mind. From personal experience of designing impact evaluations using program logic, I could say that GBE has changed significantly compared with the original definition of Ralph Tyler. Evaluation, particularly impact evaluation, now also includes analysis of unintended effects, cost-efficiency and effectiveness, and comparison of alternative services. I see these as many aspects of GFE coming to fruition.

References

Alkin, M. C. (1972). Wider context goals and goals-based evaluators. Eval. Comment J. Educ. Eval., 3, 10-11.
Andalecio, M. N. (2004). Development of a multi-criteria evaluation technique to assess the impacts of fisheries management in a Philippine bay (Doctoral dissertation). Retrieved from http://search.proquest.com.ezp.lib.unimelb.edu.au/docview/305094755?accountid=12372
Beauchamp, T. L. (1998). Value judgments in social science. In E. Craig (Ed.), Routledge Encyclopedia of Philosophy. London: Routledge. Retrieved from http://www.rep.routledge.com/article/W011SECT4
Bell, D. (1997). Logical positivism. In P. V. Lamarque & R. E. Asher (Eds.), Concise Encyclopedia of Philosophy of Language. Kidlington, Oxford: Pergamon.
Chen, H.-T., & Rossi, P. H. (1980). The multi-goal, theory-driven approach to evaluation: A model linking basic and applied social science. Social Forces, 59(1), 106-122.
Coryn, C. L., & Hattie, J. A. (2006). The transdisciplinary model of evaluation. Journal of MultiDisciplinary Evaluation, 4, 107-114.
Cronholm, S., & Goldkuhl, G. (2003). Strategies for information systems evaluation: Six generic types. Electronic Journal of Information Systems Evaluation, 6(2), 65-74.
Davidson, E. J. (2005). Evaluation Methodology: Basics. Thousand Oaks, London: Sage Publications.
Davies, R., & Dart, J. (2005). The Most Significant Change (MSC) Technique: A Guide to its Use.
Dorst, C. H. (2007). The problem of design problems. In N. Cross & E. Edmonds (Eds.), Expertise in Design - Design Thinking Research Symposium 6. Sydney: Creativity and Cognition Studios Press.
Eilat, H., Golany, B., & Shtub, A. (2008). R&D project evaluation: An integrated DEA and balanced scorecard approach. Omega, 36(5), 895-912.
Evers, J. W. (1980). A field of study of goal-based and goal-free evaluation techniques (Doctoral dissertation). Retrieved from http://search.proquest.com.ezp.lib.unimelb.edu.au/docview/303099421?accountid=12372
Harpham, T., Burton, S., & Blue, I. (2001). Healthy city projects in developing countries: The first evaluation. Health Promotion International, 16(2), 111-125.
Hoffer, A. (1967). A theoretical examination of double-blind design. Canadian Medical Association Journal, 97(3), 123.
Misra, S. (2012). Randomized double blind placebo control studies, the "Gold Standard" in intervention based studies. Indian Journal of Sexually Transmitted Diseases, 33(2), 131-134.
Owen, J. M. (2006). Program evaluation: Forms and approaches. St Leonards, N.S.W.: Allen & Unwin.
Patton, M. Q. (2005). Goal-based vs. goal-free evaluation. In K. Kempf-Leonard (Ed.), Encyclopedia of Social Measurements (Vol. II, pp. 141-144). London: Elsevier Academic Press.
Patton, M. Q. (2010). Trends in Evaluation: Unicef Webinar. Retrieved April 4, 2013, from http://www.youtube.com/watch?v=wcnkqL6Kdug
Scriven, M. (1991). Prose and cons about goal-free evaluation. American Journal of Evaluation, 12(1), 55.
Scriven, M. (1993). Hard-won lessons in program evaluation. New Directions for Program Evaluation, 58, 1-107.
Scriven, M. (2004). Reflections. In M. C. Alkin (Ed.), Evaluation Roots: Tracing Theorists' Views and Influences (pp. 183-195). Thousand Oaks: Sage Publications, Inc.
Scriven, M. (2011). Evaluation bias and its control. Journal of MultiDisciplinary Evaluation, 7(15), 79-98.
Scriven, M., & Roth, J. (1990). Special feature: Needs assessment. American Journal of Evaluation, 11(2), 135.
Sigsgaard, P. (2002). Monitoring without indicators: An ongoing testing of MSC approach. Evaluation Journal of Australasia, 2(1), 8-15.
Stufflebeam, D. L., & Shinkfield, A. J. (2007). Evaluation theory, models, and applications. San Francisco: Jossey-Bass.
Valisno, M. (2010). Adoption of DepED Balanced Score Card under the Performance Governance System. Pasig, Philippines. Retrieved from http://former.deped.gov.ph/cpanel/uploads/issuanceImg/DO%20No.%2039,%20s.%202010.pdf
Walsh, P. L. (1980). An empirical evaluative comparison of goal-based and goal-free approaches to educational program evaluation (Doctoral dissertation). Retrieved from http://search.proquest.com.ezp.lib.unimelb.edu.au/pqdtft/docview/303018888/13D48930FBB3A7C6586/2?accountid=12372#
Youker, B. W. (2011). An analog experiment comparing goal-free evaluation and goal achievement evaluation utility (Doctoral dissertation). Retrieved from http://search.proquest.com.ezp.lib.unimelb.edu.au/pqdtft/docview/920881199/13D48DE423E26867413/1?accountid=12372#
