How to Conduct Evaluation of Extension Programs
Murari Suvedi
Kirk Heinze
Diane Ruonavaara
ANRECS Center for Evaluative Studies
Dept of ANR Education and Communication Systems
409 Agriculture Hall
Michigan State University Extension
East Lansing, MI 48824
December 1999
Introduction
Evaluation in extension used to focus primarily on judging a program’s merit or worth. Additionally, the methodology associated with earlier forms of evaluation was portrayed as basically a quantitative activity. In today’s increasingly complex and demanding world, evaluation must deal with issues of accountability, good management, knowledge building and sharing, organizational learning and development, problem identification and policy formation. As the scope of evaluation expands, qualitative approaches and multiple methods are becoming increasingly necessary. Concurrently, today’s evaluator in extension finds that he or she needs to fulfill multiple roles and be familiar with numerous methods. This manual is designed to cover the expanding field of evaluation as it applies to extension and to provide you, the evaluator, with a methodological toolbox containing a broad array of methods and suggestions as to their appropriate use.
What is Evaluation?
Program evaluation is a continual and systematic process of assessing the value or potential value of Extension programs to guide decision-making for the program’s future.
When we evaluate...
Why Evaluate?
Demands on Extension for program efficiency, program effectiveness and for public accountability are increasing. Evaluation can help meet these demands in various ways.
To assess needs.
To set priorities.
To direct allocation of resources.
To guide policy.
To determine achievement of project objectives.
To identify strengths and weaknesses of a program.
To determine if the needs of beneficiaries are being met.
To determine the cost-effectiveness of a program.
To assess causes of success or failure.
To improve program management and effectiveness.
To identify and facilitate needed change.
To continue, expand or terminate a program.
To stakeholders.
To funding sources.
To the general public.
To discover a program’s impact on individuals and/or communities.
To gain support from policy makers and advisory councils.
To direct attention to needs of particular stakeholder groups.
When to Evaluate
There are several basic questions to ask when deciding whether to carry out an evaluation. If the answers to these questions are "No", this may not be the time for an evaluation.
Role of the Evaluator
The role of an evaluator is continually expanding. The traditional role of an evaluator was a combination of expert, scientist and researcher who uncovered clear-cut cause-and-effect relationships. Today evaluators are often educators, facilitators, consultants, interpreters, mediators and/or change agents.
An Evaluator’s Credibility
An evaluator is judged by his or her competence and personal style. Competence is developed through training and experience. Personal style develops over time through a combination of training, experience and personal characteristics.
Competence
Personal Style
Steps to Evaluation
Program evaluation can be an overwhelming process. To make it less intimidating, it can be broken down into several manageable steps. The specifics of each step may vary, depending on the nature, scope and complexity of the programs and the resources available for conducting the evaluations. These steps will be expanded upon in later sessions.
10 Steps to Evaluation – A Flow Chart
Step 1 Identify and describe the proposed or existing program
Step 2 Identify the phase the program is in & the type of evaluation study needed
Step 3 Assess the Feasibility of Implementing an Evaluation
Step 4 Identify & Consult Key Stakeholders
Step 5 Identify Approaches to Data Collection
Step 6 Select Data Collection Techniques
Step 7 Identify Population and Select Sample
Step 8 Collect, Analyze and Interpret Data
Step 9 Communicate Findings
Step 10 Apply and Use Findings
The steps form a cycle: after Step 10, the process returns to Step 2.
Step 1. Identify and describe the program to be evaluated
A description should include:
Step 2: Identify the program phase & the appropriate type of evaluation study
There are a number of types of evaluation studies: needs assessments, baseline studies, formative evaluations, summative evaluations and follow-up studies. The type of evaluation study is selected on the basis of the stage of the program, program requirements and stakeholders’ interests.
Identifying the program phase and type of evaluation study needed
Ask: -> Identify program phase: -> Select type of evaluation study:
Is the program at a design stage? -> Program design -> Needs assessment
Is the program just beginning? -> Program start-up -> Baseline study
Is the program active? -> On-going program -> Formative evaluation
Is the program ending? -> Program wrap-up -> Summative evaluation
Is the program over? -> Program follow-up -> Follow-up study
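The table above is a simple lookup, and it can be sketched in a few lines of Python. This is only an illustration: the stage keywords ("design", "beginning" and so on) and the function name select_study are assumptions made for the example, not part of the manual.

```python
# Hypothetical lookup built from the table above. The stage keywords and
# the function name are illustrative assumptions, not part of the manual.
PHASE_TO_STUDY = {
    "design": ("Program design", "Needs assessment"),
    "beginning": ("Program start-up", "Baseline study"),
    "active": ("On-going program", "Formative evaluation"),
    "ending": ("Program wrap-up", "Summative evaluation"),
    "over": ("Program follow-up", "Follow-up study"),
}


def select_study(stage):
    """Return (program phase, type of evaluation study) for a program stage."""
    key = stage.strip().lower()
    if key not in PHASE_TO_STUDY:
        raise ValueError(f"Unknown program stage: {stage!r}")
    return PHASE_TO_STUDY[key]


phase, study = select_study("active")
print(f"{phase}: {study}")
```

In practice the choice also depends on program requirements and stakeholders' interests, so a lookup like this is a starting point rather than a rule.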
Types of Evaluation Studies
A needs assessment focuses on identifying needs of the target audience, developing a rationale for a program, identifying needed inputs, determining program content, and setting program goals. A needs assessment asks questions about what exists and what is needed:
What do we need and why?
What does our audience expect from us?
What resources do we need for program implementation?
A baseline study establishes a benchmark from which to judge future program or project impact. A baseline study asks questions about what exists:
What is the current status of the program?
What is the current level of knowledge, skills, attitudes and beliefs of our audience?
What are our priority areas of intervention?
What are our existing resources?
A formative, process, or developmental evaluation provides information for program improvement, modification, and management. A formative evaluation asks descriptive questions:
What are we supposed to be doing?
What are we doing?
How can we improve?
A summative, impact, or judgmental evaluation focuses on determining overall success, effectiveness, and accountability of the program. It helps make major decisions about a program’s continuation, expansion, reduction, and/or termination. A summative evaluation asks questions about what happened:
What were the outcomes?
Who participated and how?
What were the costs?
A follow-up study examines long-term effects of a program. A follow-up study asks questions about long-term impacts:
What were the impacts of our program?
What was most useful to participants?
What are the long-term effects?
Step 3. Assess the feasibility of implementing an evaluation study
Assessing the feasibility of a program evaluation helps ensure that the program can be meaningfully evaluated and that the evaluation will contribute to improving program design and/or performance. Consider the following questions carefully and then decide whether this is an appropriate time to begin a program evaluation. If the answers to many of these questions are "No", this may not be an appropriate time to implement an evaluation study.
Step 4: Identify and consult key stakeholders
Stakeholders are people who have a stake or vested interest in the evaluation findings. They can be program funders, staff, administration, clients or program participants. It is important to clarify the purpose and procedures of an evaluation with key stakeholders before beginning. This process can help determine the type of evaluation needed and point to additional reasons for evaluation that may prove even more productive than those originally suggested.
Come to agreement with stakeholders on:
Each objective should:
Example: Members of every household in Ingham County will increase their awareness about water quality by participating in a survey conducted by Michigan State University.
Clarify evaluation questions, issues, indicators and criteria
Evaluations are conducted to answer specific questions, to address programmatic issues, to plan for future programs and/or to apply criteria to judge value or worth of an existing program. If the questions and issues that are being used are not clearly defined and the indicators and criteria that will be used to judge merit or worth are not well thought out, the evaluation may lack focus, be irrelevant, omit important areas of interest or come to unsupported conclusions.
Basic steps in selecting questions, issues, indicators and criteria
List questions, issues and criteria from all sources consulted.
Organize material into a manageable number of categories. Match level of program with indicators appropriate for that level -- remember that it is not possible for an evaluation to address all areas of interest.
Come to agreement with stakeholders on the degree of incompleteness that is acceptable, given monetary and time constraints.
Focus the scope of the evaluation to the crucial and practical.
In addition to talking with stakeholders, consider the following sources when you are clarifying the purpose of the evaluation and developing the questions, issues, indicators and criteria:
Coming to agreement on indicators
Indicators are variables. A variable is an operational representation of an attribute (quality, characteristic, property) of a system. Indicators are observable phenomena that point toward the intended and/or actual condition of situations, programs and outcomes, and help gauge the performance of natural systems as well as human endeavors.
An indicator is a marker that can be observed to show that something has changed. Indicators can help people notice changes at an early stage of a program’s impact.
Characteristics of indicators:
Criteria for choosing indicators
Bennett’s Hierarchy of Evidence
Bennett’s Hierarchy of Evidence provides a way of conceptualizing the relationships between program objectives and outcomes at different program levels. The hierarchy suggests the kind of information appropriate to measure to determine if an objective has been met. This will help ensure that the information you gather is appropriate for the level of the program you are evaluating.
Program Levels | Indicators
End results | Changes in participants’ personal and working lives as a result of program participation.
Practice and behavior changes | Changes in participants’ practices as a result of program participation.
Knowledge, attitude, skill and aspirational changes (KASA) | Changes in participants’ knowledge, attitudes, skills and aspirations as a result of program participation.
Reactions | How participants and clients reacted to the program.
Participation | Who participated and how many.
Activities | Activities that participants were engaged in through the program. The kinds of information and methods used to interact with program participants.
Inputs | The personnel and other resources used during the program.
Step 5. Approaches to Data Collection
There are two basic types of data collection: quantitative and qualitative. Quantitative data are expressed in numbers, while qualitative data are expressed in words.
Quantitative Methods measure a finite number of pre-specified outcomes and are appropriate for judging effects, attributing cause, comparing or ranking, classifying and generalizing results. Quantitative Methods are:
Quantitative methods commonly used in evaluation of extension programs include, but are not limited to:
Existing information
Testing information & knowledge
Surveys
Benefit/cost analysis
Group-administered questionnaire
Personal interviews
Qualitative Methods take many forms, including rich descriptions of people, places, conversations and behavior. The open-ended nature of qualitative methods allows the person being interviewed to answer questions from his or her own perspective. Qualitative Methods are appropriate for:
Qualitative methods commonly used in evaluation of extension programs include, but are not limited to:
Existing Information
Personal Interview
Focus Group
Rapid Rural Appraisal
Participant Observation
Case Study
Group Interview
Multiple Methods combine qualitative and quantitative methods within one evaluation study. This combination can be used to offset biases and complement strengths of different methods. When using multiple methods, care should be taken to ensure that the selected methods are appropriate to the evaluation questions and that resources are not stretched too thinly. Multiple Methods are appropriate for:
An Example of Multiple Methods: Garden Project Evaluation
In culturally and politically complex situations, multiple methods are particularly appropriate. The following methods were combined in an evaluation of garden projects with indigenous and immigrant groups in the Petén of Guatemala.
Introduction to communities
Garden visits and biotic survey
Sampling strategy developed
Focus group interviews
Focused unstructured interviews
Data analysis
Visits to garden projects
Quality of Evidence
The validity and reliability of the data collection instrument determine the quality of evidence for quantitative methods.
Validity - The data collection instrument measures what it is supposed to measure and data collected are relevant to the specific situation or audience.
Reliability - The data collection instrument measures consistently, yielding the same results with the same groups of people under the same conditions.
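One common way to check the consistency described above is to give the same instrument to the same group twice and correlate the two sets of scores. This test-retest approach is not prescribed by this manual; it is offered here as an assumption for illustration. The sketch below computes a Pearson correlation by hand, using made-up scores for five respondents.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("Need two lists of the same length (at least 2 scores).")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("Scores show no variation; correlation is undefined.")
    return sxy / sqrt(sxx * syy)


# Made-up scores: the same five people, two administrations of one instrument.
first_administration = [12, 15, 9, 18, 14]
second_administration = [13, 14, 10, 17, 15]

r = pearson_r(first_administration, second_administration)
print(f"Test-retest correlation: {r:.2f}")
```

A value close to 1 suggests the instrument measures consistently. How high is high enough is a judgment call that depends on the instrument and its use.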
Step 6: Selecting Data Collection Techniques
There is no one best method to use when collecting data for project evaluation. Selection of a method or methods should be influenced by the type of information needed, the time available, and cost. Last, but not least, you should consider whether the information collected will be viewed as credible, accurate and useful by your organization.
A large array of methods exists that can be used in evaluation. We will cover the following:
Quantitative Methods | Qualitative Methods
Existing Information | Focus Group
Testing Information and Knowledge | Rapid Rural Appraisal
Telephone Surveys | Case Study
Mail Surveys | Semi-structured Interviews
Group-administered Questionnaire | Participant Observation
Existing Information
Before you start to collect data, check to see what information already exists. Pre-existing information can be found in documents, reports, program records, historical accounts, minutes of meetings, letters, photographs, census data and surveys.
Existing information is useful for:
Advantages of using existing information:
Disadvantages of existing information as a data source:
Testing Information and Knowledge
Tests can be used as a tool to measure the level of knowledge, understanding and ability that an individual possesses related to a particular program.
Advantages of using testing information and knowledge:
Disadvantages of using testing information and knowledge:
Basic Steps in Testing Changes in Knowledge and Information
Surveys
Surveys are a very popular method of collecting evaluation data and require a carefully designed questionnaire administered by mail, telephone or personal interviews. Surveys can be used to collect data on a participant’s knowledge, attitudes, skills and aspirations, adoption of practices, and program benefits and impacts. It is the responsibility of the evaluator to ensure that ethical standards are maintained. This means that participation is voluntary and survey results are made public in a way that maintains confidentiality.
Advantages of using surveys:
Disadvantages of using surveys:
Key questions that need to be answered before carrying out a survey:
When choosing a survey method, consider the resources you have available:
Telephone Survey
A telephone survey consists of a written questionnaire that is read to a selected group of people over the telephone. The survey sample is often selected from a telephone directory or other lists. People on the list are interviewed one at a time over the phone.
Advantages of telephone surveys:
Disadvantages of telephone surveys:
Implementing a Telephone Survey
Sample call sheet for telephone interviews
A call sheet is used for each number chosen from the sampling frame. The interviewer records information that allows the supervisor to decide what to do with each number that has been processed. Call sheets are attached to questionnaires after an interview is completed.
Telephone interview call sheet
Survey title Questionnaire identification number ____________ Area code & number ( )______ - _________
Contact attempts | Date | Time | Result code & comments | Interviewer I.D.
1 | | | |
2 | | | |
3 | | | |
4 | | | |
5 | | | |
6 | | | |
Additional comments:
Result Codes
Code | Explanation
01 | No answer after seven rings
02 | Busy, after one immediate redial
03 | Answering machine (residence)
04 | Household language barrier
05 | Answered by nonresident
06 | Household refusal
07 | Disconnected or other non-working number
08 | Temporarily disconnected
09 | Business or other non-residence
10 | No one meeting eligibility requirement
11 | Contact only
12 | Selected respondent temporarily unavailable
13 | Selected respondent unavailable during field period
14 | Selected respondent unavailable because of physical/mental handicap
15 | Language barrier with selected respondent
16 | Refusal by selected respondent
17 | Partial interview
18 | Respondent contacted - completed interview
19 | Other
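If call sheets are keyed into a computer, the result codes above make it easy to summarize how the calls turned out. The following is a minimal sketch, assuming the final result code from each call sheet has been entered as a two-digit string. The function name tally_results and the sample data are hypothetical.

```python
from collections import Counter

# Result codes and explanations copied from the call sheet above.
RESULT_CODES = {
    "01": "No answer after seven rings",
    "02": "Busy, after one immediate redial",
    "03": "Answering machine (residence)",
    "04": "Household language barrier",
    "05": "Answered by nonresident",
    "06": "Household refusal",
    "07": "Disconnected or other non-working number",
    "08": "Temporarily disconnected",
    "09": "Business or other non-residence",
    "10": "No one meeting eligibility requirement",
    "11": "Contact only",
    "12": "Selected respondent temporarily unavailable",
    "13": "Selected respondent unavailable during field period",
    "14": "Selected respondent unavailable because of physical/mental handicap",
    "15": "Language barrier with selected respondent",
    "16": "Refusal by selected respondent",
    "17": "Partial interview",
    "18": "Respondent contacted - completed interview",
    "19": "Other",
}


def tally_results(final_codes):
    """Count how many call sheets ended with each result code."""
    unknown = [code for code in final_codes if code not in RESULT_CODES]
    if unknown:
        raise ValueError(f"Unknown result codes: {unknown}")
    counts = Counter(final_codes)
    # Report in code order, using the explanation as the label.
    return {RESULT_CODES[code]: n for code, n in sorted(counts.items())}


# Made-up final codes from six call sheets (hypothetical data).
sheets = ["18", "01", "18", "06", "02", "18"]
summary = tally_results(sheets)
for explanation, count in summary.items():
    print(f"{count:>3}  {explanation}")
```

A summary like this can help the supervisor see at a glance how many numbers still need another attempt and how many interviews are complete.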
Sample Help Sheet for Interviewers
Mail Survey
A mail survey is the most frequently used type of survey in evaluation of Extension programs and requires the fewest resources.
Advantages of using a mail survey:
Disadvantages of using a mail survey:
Basic Steps in Implementing a Mail Survey
Personal Surveys
Personal or face-to-face surveys are conducted by talking individually to respondents and systematically recording their answers to each question.
Advantages of a personal survey:
Disadvantages of a personal survey:
Basic Steps in Implementing a Personal Survey
General Procedures for a Survey Interview
Minimizing interviewer bias:
Initiating contact:
Guidelines for interviewing:
Some surveys are more accurate than others. Accuracy means that survey results closely represent the population from which the sample has been drawn. Inaccuracy can be caused by several types of errors including coverage error, sampling error, selection error, frame error, non-response error and measurement error.
Coverage error
Cause of error: The sampling frame does not include all elements of the population.
Control of error: Redraw the list from which the sample is drawn to include all elements of the population.
Sampling error
Cause of error: A subset or sample of all people in the population is studied instead of conducting a census.
Control of error: Increase the size of the sample; use random sampling; purge the list of duplication.
Selection error
Cause of error: Some sampling units have a greater chance of being chosen than others.
Control of error: Use random sampling.
Frame error
Cause of error: The list is inaccurate or some sampling units are omitted.
Control of error: Use an up-to-date, accurate list.
Non-response error
Cause of error: Subjects can’t be located or fail to respond.
Control of error: Compare early to late respondents. If no difference is apparent, results can be generalized to the larger population. Contact about 10% of non-respondents and gather data from them. Compare these data with those from the respondents. If no difference is apparent, results can be generalized to the larger population. Compare respondents to non-respondents on known characteristics. If no difference is apparent, the results can be generalized to the larger population.
Measurement error
Cause of error: A respondent’s answer is inaccurate or imprecise, or cannot be compared in any useful way to other respondents’ answers. This may be caused by unclearly stated questions; unclear instructions; or respondents giving socially correct responses, not knowing the correct information or deliberately lying.
Control of error: Choose an appropriate method of data collection for your evaluation. Write clear, unambiguous questions that people can and want to answer. Train your interviewers carefully. Use valid and reliable instruments.
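Several of the controls listed above (purging the list of duplication, using random sampling, and contacting about 10% of non-respondents) are mechanical enough to sketch in code. The example below is a minimal illustration using Python's standard random module. The function names, the rule for spotting duplicates (same name after ignoring case and surrounding spaces) and the sample data are all assumptions made for the example.

```python
import random


def purge_duplicates(frame):
    """Remove repeated entries from a sampling frame, keeping the first.

    Assumption: two entries are duplicates if they match after ignoring
    case and leading/trailing spaces. Real lists may need stricter rules.
    """
    seen = set()
    unique = []
    for entry in frame:
        key = entry.strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(entry.strip())
    return unique


def draw_random_sample(frame, size, seed=None):
    """Draw a simple random sample, without replacement, from the frame."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(frame, size)


def follow_up_subset(non_respondents, fraction=0.10, seed=None):
    """Randomly pick about `fraction` of non-respondents (at least one)."""
    if not non_respondents:
        return []
    k = max(1, round(len(non_respondents) * fraction))
    rng = random.Random(seed)
    return rng.sample(non_respondents, k)


# Hypothetical mailing list with one duplicate entry.
mailing_list = ["Ann Lee", "ann lee ", "Bo Diaz", "Cy Wu", "Di Roy"]
clean_list = purge_duplicates(mailing_list)
sample = draw_random_sample(clean_list, 2, seed=1)
print(clean_list)
print(sample)
```

Because every unit on the cleaned list has the same chance of being drawn, this kind of sampling also addresses selection error. It does nothing for coverage or frame error, which depend on how good the list is in the first place.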
Group-administered Questionnaire
A group-administered questionnaire is handed directly to each participant in a group at the end of a workshop, seminar or program. Respondents answer the questions individually and hand them back to the person conducting the evaluation.
Advantages of a group-administered questionnaire:
Disadvantages of a group-administered questionnaire:
Basic steps in doing a group-administered questionnaire
Questionnaire Design
The overall aim of questionnaire design is to solicit quality participation. Response quality depends on the trust the respondent feels for the survey, the topic, the interviewer and the manner in which the questions are worded and arranged. Consider whether the questionnaire is going to be mailed, given directly to respondents, used in a telephone survey or used in personal interviews. Before you begin, it is essential to know what kind of evidence you need for the evaluation and how the information will be used.
Before you begin…
Writing the questionnaire
Special Questionnaire Design Considerations
Telephone questionnaires: Telephone questionnaires depend on oral communication, so special attention must be paid to designing a questionnaire that will assist the interviewer as much as possible in holding the respondent’s attention. Design and construction of the questionnaire are based on utility rather than aesthetics.
Introduction: Special attention is paid to the introduction because it is at this point that most refusals occur. The introduction should include:
Mailed questionnaires: The appearance of a mailed questionnaire is of utmost importance. A mailed questionnaire must "sell" itself to the respondent to be returned. Therefore, considerable care should be taken in designing the format of the questionnaire.
Designing a Questionnaire Cover Letter
1st paragraph:
Explains the purpose of the study.
Describes who will be answering the questionnaire.
Assures confidentiality of responses.
2nd paragraph:
Assures the respondent the study is useful.
Lets the respondent know he or she is important to the success of the study.
3rd paragraph:
Provides directions on how and when to return the questionnaire.
Explains the questionnaire identification number for facilitating follow-up.
4th paragraph:
Reemphasizes the study’s social usefulness.
Promises a copy of survey results if desired.
Indicates a willingness to answer any questions.
Includes a statement of thanks, a closing and the sender’s name and title.
Writing Questions
The questions used in a questionnaire are the basic components that determine the effectiveness of your survey. Writing good questions is not easy and usually takes more than one try. Consider what information to include, how to structure the questions and whether people can answer the questions accurately. Good survey questions are focused, clear, and to the point.
Every question should focus on a single, specific issue or topic.
Poor: Which brand do you like best?
Better: Which of these brands are you most likely to buy?
The objective of these questions is to measure consumer preference. The first question lacks focus: consumers may like a particular brand but not buy it because of its high price.
The meaning of the question must be completely clear to all respondents. Clarity ensures that everyone interprets the question the same way.
Poor: When was the last time you went to the doctor for a physical examination on your own or because you had to?
Better: How many months ago was your last physical examination?
The first question could be interpreted in weeks, months, years, or by date.
Keep questions as short as possible. Short questions are easier to answer and less subject to error by interviewers and respondents. Long questions are more likely to lack focus and clarity.
Poor: Can you tell me how many children you have, whether they’re boys or girls, and how old they are?
Better: What is the age and sex of each of your children?
A respondent may answer the first question ambiguously. For example, "I have two boys and a girl. They are 5, 7, and 10 years old." It is not possible to determine the age of each child from this response.
Questions should be written to avoid bias.
Poor: Is it true that our agents always work long hours?
Better: On average, how many hours do extension agents work in their job?
Types of Information
Questions can be formulated to elicit four types of information: 1) knowledge, 2) beliefs, attitudes and opinions, 3) behavior and 4) attributes. Any one or a combination of these types can be included in a questionnaire.
Knowledge questions include what people know and how well they understand something.
What is the major cause of accidental deaths among children inside the home?
Beliefs, attitudes and opinions include people’s perceptions, their thoughts, their feelings, their judgments or their ways of thinking.
Should the Clearwater Regional Education Center in Minor County continue to offer college-level and/or continuing education courses and programs?
Behavioral questions ask people about what they have done in the past, what they do now or what they plan to do in the future.
Have you or your family ever taken classes at the Clearwater Regional Education Center in Minor County?
Attributes are a person’s personal characteristics, such as age, education, occupation or income. Attribute questions ask respondents who they are, not what they do.
Where do you currently live?
How many children do you have?
What percentage of your household income comes from off-farm employment?
Types of Questions
There are basically two distinct types of questions asked in a survey: closed-ended questions and open-ended questions.
Closed-Ended questions have pre-determined categories of responses from which the respondent can choose. When asking closed-ended questions make sure that all alternative response categories have been included.
Advantages of closed-ended questions:
Disadvantages of closed-ended questions:
Examples of Closed-ended Questions
1. Have you or members of your family ever taken classes at the Regional Education Center in this county? _____Yes _____No
2. To what extent do you agree or disagree with the new zoning code?
Open-ended Questions
Open-ended questions allow respondents to answer in their own words rather than select from predetermined answers.
Advantages of open-ended questions:
Disadvantages of open-ended questions:
Examples of Open-ended Questions
Pre-testing Evaluation Instruments
Pre-testing is usually associated with quantitative methods, though qualitative and participatory methods can be pre-tested as well, albeit using a slightly different format. Pre-testing entails trying out evaluation techniques and instruments before beginning the evaluation process in order to avoid costly errors and wasted effort. When possible, pre-testing should be done in circumstances similar to those anticipated during the evaluation itself. If feasible, use the same sampling plan you will use during the evaluation to select a mini-sample.
In pre-testing, we ask:
Are the issues to be discussed, the questions to be asked and/or the words to be used clear and unambiguous?
Is the technique or instrument appropriate for the people being interviewed or observed?
Are instructions for the interviewer or observer easy to follow?
Are the techniques and/or forms for recording information clear and easy to use?
Are procedures standardized?
Will the technique or instrument provide the necessary information?
Does the technique or instrument provide reliable and valid information using the criteria of the chosen data collection approach?
You may find that you have to modify the technique or instrument after field testing it. If extensive revisions are made, a second field test may be necessary.
Focus Group
A focus group is a small, relatively homogeneous group, typically 8 to 12 people, selected to discuss a specific topic in a non-threatening atmosphere. The focus group is moderated and recorded by a skilled interviewer. A focus group can be used to assess community needs and issues; citizens’ attitudes, perceptions and opinions on specific topics; and impacts of a particular program on individuals and communities.
Advantages of a focus group:
Disadvantages of a focus group:
Steps to a Focus Group Interview
1. Consider your purpose for conducting a focus group interview. Identify the users of the information generated by the focus group. Develop a tentative plan including time required and resources needed.
2. Identify the questions to be asked in the interview. Establish the context for each question. Arrange the questions in a logical sequence.
3. Arrange a meeting place that is neutral and non-threatening, convenient and easy to find. Select a means to record the discussion (tape recorder, note taker, etc.).
4. Identify and contact potential participants by sending a personalized invitation. Explain the purpose of the meeting to them and how their participation will contribute. Reconfirm their availability to participate.
5. Identify a trained moderator and an assistant to conduct the focus group interview. The moderator creates a warm and friendly atmosphere, directs the conversation and keeps it flowing, and takes notes.
6. Arrange for a meeting room. Check the seating and table arrangements.
7. Conduct the focus group interview. The moderator explains the purpose, assures respondents of anonymity and tape-records the meeting.
8. Immediately following the interview, the moderator and assistant discuss the experience and their perceptions. They review the tape together before the next focus group is conducted.
9. Analyze the results of the taped discussion and summarize what was said. Interpret meaning, make recommendations and summarize the interview.
10. Prepare a short report and share findings with stakeholders.
How to Begin a Focus Group Discussion
The first few moments in a focus group discussion are critical. In a brief time, the moderator must create a thoughtful, permissive atmosphere, provide the ground rules, and set the tone of the discussion. Much of the success of group interviewing can be attributed to the development of this open environment. The recommended pattern for introducing the group discussion includes: the welcome, the overview and topic, the ground rules and the first question.
An Example of a Typical Introduction
Good evening and welcome to our session tonight. Thank you for taking the time to join our discussion of county educational services. My name is _______ and I represent ____________. Assisting me is _________ from _________. We are attempting to gain information about educational opportunities in the community. We have invited people who live in several parts of the county to share their ideas.
You were selected because you have certain things in common that are of particular interest to us. You are all employed outside the home and you live in the suburban areas of the county. We are particularly interested in your views because you are representative of others in the county.
Tonight we will be discussing non-formal educational issues in the community. These include all the ways you gain new information about areas of interest to you. There are no right or wrong answers but rather differing points of view. Please feel free to share your point of view even if it differs from what others have said.
Before we begin, let me remind you of some ground rules. Please speak up, but only one person should talk at a time. We’re tape-recording the session because we don’t want to miss any of your comments. If several people are talking at the same time, the tape will get garbled and we’ll miss your comments. We will be on a first-name basis tonight, and in our later reports, there will not be any names attached to comments. You may be assured of complete anonymity of responses. Keep in mind that we’re just as interested in negative comments as positive comments, and at times the negative comments are the most helpful.
Our session will last about an hour and we will not be taking a formal break. Well, let’s begin.
Let’s find out some more about one another by going around the room one at a time. Tell us your name and where you live.
How to ask Questions in a Focus Group
What did you think of the program?
Where do you get new information?
What do you like best about the proposed program?
2. Avoid dichotomous questions – those that can be answered with a yes or a no.
3. "Why" questions are rarely asked
"Why" questions can make people defensive and feel the need to provide an answer.
When you ask "why," people usually respond with attributes or influences.
It’s better to ask, "What prompted you?" or "What features did you like?"
5. Carefully prepare focus questions
Identify potential questions.
Five types of questions are:
a. Opening questions (round-robin).
b. Introductory questions.
c. Transition questions.
d. Key questions.
e. Ending questions.
6. Ask uncued questions first, cued questions second
(Cues are the hints or prompts that help participants recall specific features or details.)
7. Consider using standardized questions
8. Focus the questions by using a sequence that proceeds from general questions to those focusing on specific
Sample Focus Group Questions
Field Crops Industry Advisory Committee
Note: These questions will be distributed to all advisory committee members during the focus group session.
(Facilitator’s notes: Record key words and return to some of these later)
(Facilitator’s notes: Probe positive and negative comments.)
(Facilitator’s notes: Encourage each participant to respond. Refrain from probes until each participant has a chance to react.)
Rapid Rural Appraisal
Rapid rural appraisal (RRA) is a research approach that involves multiple data collection techniques that are quick, flexible and adaptive, yet relevant. RRA helps us learn about local people’s situations, experiences and problems from a local perspective.
Advantages of rapid rural appraisal:
Disadvantages of rapid rural appraisal:
RRA Methods Tool Box
Existing information
Visualization techniques
Individual interviews
Group interviews
Ranking games
Matrices
Basic Steps to Rapid Rural Appraisal
Case Study
A case study is an in-depth analysis of a particular case – a program, a group of participants, a single individual, or a specific site or location. Case studies can be explanatory, descriptive or exploratory. An explanatory case study can measure causal relationships; a descriptive case study can be used to describe the context in which a program takes place and the program itself, and an exploratory case study can help identify performance measures or pose hypotheses for further evaluation. Case studies rely on multiple sources of information and methods to provide as complete a picture as possible of the particular case.
Advantages of a case study:
Disadvantages of a case study:
An Example of a Case Study
A detailed and systematic recording of evidence before and after a producer participates in a comprehensive financial farm management program could provide valuable insights into program impact that might be useful in expanding the program to a larger group of producers.
Steps to planning and conducting a Case Study
Semi-structured Interviews
Semi-structured interviews with project participants and other key informants begin with an interview guide that lists topics to cover and open-ended questions to ask. Probing techniques are used to solicit answers and raise new topics that reflect the people’s perspectives, beliefs, attitudes and concerns.
Advantages of semi-structured interviews:
Disadvantages of semi-structured interviews:
Guidelines for Semi-structured Interviewing
Participant Observation
Participant observation entails gathering information about behavioral actions and reactions through direct observation, interviews with key informants, and participation in the activities being evaluated. As used in evaluation, the PO evaluator immerses him- or herself in the setting being studied with the intent of understanding the world through the eyes of stakeholders. Participant observation is useful in determining community conflicts or misunderstandings, assessing community needs and problems, and/or identifying means to involve local people in problem solving.
Advantages of participant observation:
Disadvantages of participant observation:
General Instructions for Engaging in Participant Observation
Benefit/Cost Analysis
Benefit/cost analysis is typically viewed as an alternative to program evaluation. However, it can also be seen as an extension of the evaluation process. As such, benefit/cost analysis provides a means to systematically quantify and compare program inputs to program outcomes in monetary terms. Valuing both benefits and costs in monetary terms allows them to be compared directly to determine the net impact of the program, make comparisons between alternative programs or projects, assist in program planning, advance organizational accountability and/or expedite program support.
Advantages of a benefit/cost analysis:
Disadvantages of a benefit/cost analysis:
Steps to Benefit/Cost Analysis
A. The cost equation: Cost = L + K + I – i
L = labor: The cost per hour for labor, including salary and fringe benefits. Fringe benefits vary but normally fall within 22 to 35 percent of full salary. The full hourly labor formula (L), using the upper rate of 35 percent, is (S + 0.35S)/260/8, where S = annual salary, 0.35S = fringe benefits, 260 = workdays per year and 8 = hours per workday.
K = direct costs: Direct program costs budgeted for or assigned to the program, e.g., supplies, correspondence, communications, travel and per diem expenses, equipment and audiovisuals. If costs are shared between projects, the total is calculated from a cost/share equation. Opportunity costs are defined as opportunities that participants have lost in order to participate in the program. Opportunity costs are included in direct costs to the participants, the presenters or the stakeholders, depending on the level of analysis.
I = indirect costs: Costs indirectly associated with the participants but directly associated with the program e.g., administrative costs such as facility rental, photocopying, report costs, telephone and prorated equipment and supplies costs.
i = discount amortization: Measurable returns over time (both positive and negative). Discount amortization is not included if returns cannot be traced over time.
B. The benefit equation: B = Cr + DB + IB
Cr = cost reductions attributable to program activities
DB = direct benefits - the primary outcomes experienced by participants and others directly involved in the program. They are typically derived from program objectives.
IB = indirect benefits - secondary or intangible outcomes of the program or project experienced by participants, non-participants or society in general. These outcomes or consequences can be positive or negative.
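The cost and benefit equations above can be combined into a benefit/cost ratio (total benefits divided by total costs). The following is a minimal Python sketch, not an established tool: the function names are illustrative, the 35 percent fringe rate is the upper end of the range given above, and every dollar figure is hypothetical.

```python
def hourly_labor_cost(salary, fringe_rate=0.35, workdays=260, hours_per_day=8):
    """Full hourly labor cost: (S + S * fringe_rate) / 260 / 8."""
    return (salary + salary * fringe_rate) / workdays / hours_per_day


def total_cost(labor, direct, indirect, discount_amortization=0.0):
    """Cost = L + K + I - i."""
    return labor + direct + indirect - discount_amortization


def total_benefit(cost_reductions, direct_benefits, indirect_benefits):
    """B = Cr + DB + IB."""
    return cost_reductions + direct_benefits + indirect_benefits


# Hypothetical program: 120 staff hours charged at a $41,600 annual salary.
rate = hourly_labor_cost(41_600)            # about 27 dollars per hour
labor = 120 * rate
cost = total_cost(labor, direct=1_500, indirect=760)
benefit = total_benefit(cost_reductions=2_000, direct_benefits=6_000,
                        indirect_benefits=3_000)
ratio = benefit / cost                      # a ratio above 1 means benefits exceed costs
print(f"Cost ${cost:,.2f}, benefit ${benefit:,.2f}, ratio {ratio:.2f}")
```

With these made-up figures the program returns about two dollars of benefit for each dollar of cost. In practice, the hard part is assigning defensible monetary values to the benefits, not the arithmetic.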
Sample Sheet for Benefit/Cost Analysis
| Benefits Estimate Worksheet | Number of beneficiaries | Cost Estimate Worksheet | No. of units | Unit value | Total cost |
|---|---|---|---|---|---|
| Direct benefits | | Direct costs | | | |
| 1. | | Labor | Hours: | | |
| 2. | | 1. | | | |
| 3. | | 2. | | | |
| 4. | | 3. | | | |
| 5. | | Direct costs | | | |
| 6. | | 1. Rent | | | |
| | | 2. Utilities | | | |
| Indirect benefits | | Equipment & materials | | | |
| 1. | | 1. Printed materials | Pieces: | | |
| 2. | | 2. Furnishings | | | |
| 3. | | 3. Instructional materials | | | |
| 4. | | 4. Travel | Miles: | | |
| 5. | | Opportunity costs | | | |
| 6. | | 1. Child care | | | |
| | | 2. Food | | | |
| | | 3. Travel | | | |
| | | Indirect costs | | | |
| Total program benefits | | Total program costs | | | |
| Benefit/cost ratio | | | | | |
Step 7. Sampling for Evaluation
A sample is a set of respondents selected from a larger population for the purpose of a survey. When done properly, the sample represents the characteristics of the population as a whole. Sampling saves time, money, materials and effort with little loss of accuracy and precision, provided the sample is drawn properly.
Five Steps in Sampling
1. Define the population: what is the population size, and how varied is the population?
2. Define how much sampling error can be tolerated.
3. Choose and execute the sampling plan: decide on sample size.
4. Draw conclusions based on sample information.
5. Infer conclusions back to the total population.
Sample Size
How large should a sample be? A sample size of 100 respondents is often cited as a minimal number for a large population. The practical maximum size is about 1000 respondents. Generally, a sample of fewer than 30 respondents will not provide enough certainty to prove useful. However, several factors need to be considered when determining actual sample size.
Characteristics of population – addresses the amount of variability in the population to be sampled. A relatively homogeneous population may permit a smaller sample size. Conversely, a more heterogeneous one may require a larger sample size.
Sampling error - the difference between an estimate taken from the population and that taken from the sample when the same method is used to gather the data. Sampling error is larger when the sample size is small. It is therefore advisable to use the largest sample size possible given the constraints on time, money and materials.
Degree of precision - measures the degree to which an estimate approximates the estimate obtained from the total population, assuming the same method of data collection was used. In designing a sample, the evaluator may begin by defining the degree of precision desired.
Margin of error - a matter of choice that depends on the objectives of the inquiry. If we want to be relatively safe about our conclusions, then a 5 percent margin of error is acceptable (see appendix). In general, more subjects are needed for a .01 alpha test than for a .05 alpha test, and a two-tailed test requires a larger sample size than a one-tailed test.
Confidence level - the probability that a value in the population is within a specific, numeric range when compared with the corresponding value computed for the sample. Generally, a 95 percent confidence level will give the security needed to draw conclusions for the larger population based on the sample.
Cost - A small sample size reduces cost.
Table For Determining Sample Size from a Given Population
| n* | s* | n | s | n | s |
|---|---|---|---|---|---|
| 10 | 10 | 220 | 139 | 1200 | 291 |
| 15 | 14 | 230 | 143 | 1300 | 296 |
| 20 | 19 | 240 | 147 | 1400 | 301 |
| 25 | 24 | 250 | 151 | 1500 | 305 |
| 30 | 28 | 260 | 155 | 1600 | 309 |
| 35 | 32 | 270 | 158 | 1700 | 313 |
| 40 | 36 | 280 | 161 | 1800 | 316 |
| 45 | 40 | 290 | 165 | 1900 | 319 |
| 50 | 44 | 300 | 168 | 2000 | 322 |
| 55 | 48 | 320 | 174 | 2200 | 327 |
| 60 | 51 | 340 | 180 | 2400 | 331 |
| 65 | 55 | 360 | 185 | 2600 | 334 |
| 70 | 59 | 380 | 191 | 2800 | 337 |
| 75 | 62 | 400 | 195 | 3000 | 340 |
| 80 | 66 | 420 | 200 | 3500 | 346 |
| 85 | 69 | 440 | 205 | 4000 | 350 |
| 90 | 72 | 460 | 209 | 4500 | 353 |
| 95 | 76 | 480 | 213 | 5000 | 356 |
| 100 | 79 | 500 | 217 | 6000 | 361 |
| 110 | 79 | 550 | 226 | 7000 | 364 |
| 120 | 91 | 600 | 234 | 8000 | 366 |
| 130 | 97 | 650 | 241 | 9000 | 368 |
| 140 | 102 | 700 | 248 | 10000 | 369 |
| 150 | 107 | 750 | 254 | 15000 | 375 |
| 160 | 112 | 800 | 259 | 20000 | 377 |
| 170 | 117 | 850 | 264 | 30000 | 379 |
| 180 | 123 | 900 | 273 | 40000 | 380 |
| 190 | 127 | 950 | 277 | 50000 | 381 |
| 200 | 131 | 1000 | 284 | 75000 | 382 |
| 210 | 135 | 1100 | | 1000000 | 384 |
* n = population size; s = sample size
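Tables of this kind are usually generated from a formula rather than compiled by hand. The sketch below uses one widely used formula for estimating a proportion in a finite population. It is an assumption, not something stated in this manual, that the table above was produced this way, and the computed values can differ from the tabled ones by a few units. The function name and its defaults (95 percent confidence, 5 percent margin of error, p = 0.5) are illustrative.

```python
def sample_size(population, margin_of_error=0.05, z=1.96, p=0.5):
    """Approximate sample size needed for a finite population.

    z = 1.96 corresponds to a 95 percent confidence level; p = 0.5 is the
    most conservative assumption about variability in the population.
    """
    numerator = z ** 2 * population * p * (1 - p)
    denominator = margin_of_error ** 2 * (population - 1) + z ** 2 * p * (1 - p)
    return round(numerator / denominator)


for n in (10, 100, 1000, 10000, 1000000):
    print(n, sample_size(n))
```

The required sample grows quickly for small populations and then levels off: under these defaults, a population of one million needs only about 384 respondents.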
Sampling Techniques
Random or probability sampling is based on random selection of units from the identified population. Random sampling techniques include:
Simple random sample - all the individuals in the population have an equal and independent chance of being selected as a member of the sample. A random numbers table is sometimes used with a randomly selected starting point to identify numbered subjects (see appendix).
Systematic sampling - all members in the population are placed on a list for random selection and every nth person is chosen after a random starting place is selected.
Stratified sampling – is used to ensure that certain subgroups in the population will be represented in the sample in proportion to their numbers in the population. Each subgroup is separately numbered and random selection is used for each subgroup. A definite rationale should exist for selecting any such subgroup.
Cluster sample - the unit of sampling is not the individual but rather a naturally occurring group of individuals such as a classroom, organization or community.
Matrix sample - one sample of people receives a given sampling of questions and another sample of people receives another sampling of questions.
Purposive sample - chosen to include a wide variety of people on the basis of a number of specifically chosen and critical characteristics. Purposive sampling does not rely on random selection of units.
Accidental sample - consists of individuals who are available at the time. This is the weakest type of sample. Generalizations to the larger population cannot be made.
Reputational sample - people are selected to respond to a survey or an interview based on a judgment of who is and who is not a "typical" representative of the population.
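Three of the random sampling techniques described above can be sketched in a few lines of Python using the standard random module. The function names, the numbered list of 200 producers and the two subgroups are hypothetical, chosen only for illustration.

```python
import random


def simple_random_sample(population, k, seed=None):
    """Every member has an equal chance of selection."""
    return random.Random(seed).sample(population, k)


def systematic_sample(population, k, seed=None):
    """Pick a random start, then take every nth member of the list."""
    step = len(population) // k
    start = random.Random(seed).randrange(step)
    return [population[start + i * step] for i in range(k)]


def stratified_sample(strata, k, seed=None):
    """Sample each subgroup in proportion to its share of the population.

    Rounding each subgroup's share can make the total differ slightly from k.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        share = round(k * len(members) / total)
        sample.extend(rng.sample(members, share))
    return sample


# Hypothetical mailing list of 200 producers, numbered 1 to 200.
producers = list(range(1, 201))
strata = {"dairy": producers[:50], "field crops": producers[50:]}

print(simple_random_sample(producers, 20, seed=1))
print(systematic_sample(producers, 20, seed=1))
print(stratified_sample(strata, 20, seed=1))
```

Fixing the seed makes a draw repeatable, which is useful when the selection procedure has to be documented in the evaluation report.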
Step 8. Collecting, Analyzing & Interpreting Data
Various kinds of data analysis exist for both quantitative and qualitative data. You should consider whether the analyses would provide the information needed to answer the questions posed by the evaluation and the analytical skills the evaluator possesses.
Qualitative Data Analysis
Analysis and interpretation of qualitative data are not simple technical processes like the analysis of quantitative data. Analysis of qualitative data is the process of bringing order to the data and organizing what there is into patterns, categories and basic descriptive units. Interpreting qualitative data is the process of bringing meaning to the analysis, explaining patterns, and looking for relationships and linkages among descriptive dimensions. The evaluator and/or stakeholders then make judgments about assigning value or worth to what has been analyzed and interpreted.
Characteristics of qualitative data analysis:
When doing qualitative analysis consider:
Content Analysis: A coding or classifying technique that investigates patterns of information and the meaning of data within a specific conceptual framework.
Content analysis
Content analysis is a research method that uses a set of procedures to make valid inferences from text such as newsletters, meeting minutes, correspondence and interview transcripts. The inferences are about the sender(s) of the message, the message itself, or the audience of the message. Content analysis can be used for many purposes, such as auditing communication content against objectives, coding open-ended questions in surveys, describing attitudinal and behavioral responses to communications, and revealing the focus of individual, group, institutional or societal attention toward something. A central idea in content analysis is that the many words of a text are classified into far fewer content categories. Each category may consist of one, several or many words. Words, phrases or other units of text classified in the same category are presumed to have similar meanings. Content analysis procedures create quantitative indicators that assess the degree of attention or concern devoted to cultural units such as themes, categories or issues. The investigator then interprets and explains the results using relevant theories. Content analysis involves three steps:
Reliability
To make valid inferences from the text, it is important that the classification procedure be reliable in the sense of being consistent: different people should code the same text in the same way. Reliability problems in content analysis usually grow out of the ambiguity of word meanings, category definitions or coding rules. Three types of reliability are pertinent to content analysis: stability, reproducibility and accuracy.
Stability refers to the extent to which the results of content classification are invariant over time, i.e., whether content will be coded in the same way if it is coded more than once by the same coder.
Reproducibility refers to the extent to which content classification produces the same results when the same text is coded by more than one coder.
Accuracy refers to the extent to which the classification of text corresponds to a standard or norm. It is the strongest form of reliability, but a standard coding is rarely available, so accuracy is seldom assessed. It is sometimes used, however, to train coders.
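Reproducibility can be checked by having two coders classify the same units of text and comparing their codes. The sketch below computes simple percent agreement and Cohen's kappa, a common chance-corrected agreement statistic. Kappa is not discussed in this manual and is brought in here only as an illustration; the categories and codes are hypothetical.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Share of units that both coders placed in the same category."""
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned to eight open-ended survey responses.
coder_1 = ["cost", "cost", "access", "quality", "access", "cost", "quality", "access"]
coder_2 = ["cost", "access", "access", "quality", "access", "cost", "cost", "access"]

print(percent_agreement(coder_1, coder_2))       # 6 of 8 units match
print(round(cohens_kappa(coder_1, coder_2), 2))
```

The same functions can be used to check stability by comparing one coder's first and second coding of the same text.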
Validity
The classification procedure must also generate valid variables; that is, it must measure or represent what the investigator intends it to measure. As with reliability, validity problems grow out of the ambiguity of word meanings and of category or variable definitions.
Face validity (weakest) consists of the correspondence between the investigators’ definitions of concepts and their definitions of the categories that measured them. A category has face validity to the extent that it appears to measure the construct it is intended to measure.
A measure has construct validity to the extent that it is correlated with another measure of the same construct. Thus, construct validity entails the generalizability of the construct across measures or methods.
There is no simple right way to do content analysis. Investigators must judge what methods are most appropriate for their purpose. Large portions of text, such as paragraphs and complete texts, usually are more difficult to code as a unit than smaller portions, such as words or phrases, because large units typically contain more information and a greater diversity of topics. Hence, they are more likely to present coders with conflicting cues.
Creating and testing a coding scheme
Interpreting Data Analysis
Data analysis focuses on organizing and reducing information and making logical or statistical inferences; interpretation, on the other hand, attaches meaning to organized information and draws conclusions. All interpretations, to some extent, are personal and idiosyncratic. Therefore, not only the interpretations but also the reasons behind them should be made explicit. Useful interpretation methods include the following:
One method of bringing multiple perspectives to the interpretation task is to use stakeholder meetings. Stakeholders can be supplied in advance with the results, along with other pertinent information such as the evaluation plan and list of questions, criteria, and standards that guided the evaluation; that way, the meeting can be devoted to discussion rather than presentation. At the meeting, findings are systematically reviewed in their entirety, with each participant interpreting each finding, using questions such as: What does this mean? Is it good, bad or neutral? What are the implications? What, if anything, should be done?
Quantitative Data Analysis
Simple statistical analysis
Scales of measurement:
Scales of measurement refer to the type of variable being measured and the way it is measured. Different statistics are appropriate for different scales of measurement. Scales of measurement include:
Nominal: mutually exclusive and logically exhaustive categories.
Examples: marital status; gender; group membership; religious affiliation.
Ordinal: ranked or ordered.
Examples: letter grades; social class; attitudinal variables.
Interval: ranked and ordered in standard units of measurement.
Examples: temperature in degrees; calendar year; scores on a test; IQ.
Ratio: an interval scale with an absolute zero starting point.
Examples: years of age; years of education; time; length; weight.
Analyzing descriptive data:
Measure of central tendency:
The purpose of central tendency is to report a single summary score or category that best describes a set of observations. Mean, median and mode are the most common measurements of central tendency and are used to compare one group with another, identify some behavior that is unknown, or compare a group to a standard.
The mean is used for interval variables. It is the arithmetic average of all observations. You calculate mean by totaling all observations (scores or responses) and dividing by the number of observations. The mean is sensitive to "outliers" or extreme values in the observations. When your data has a few extremely small or large observations, the data are "skewed."
Example: 15 participants received the following scores: -2, -1, 1, 4, 4, 4, 7, 7, 7, 7, 7, 8, 8, 8, 9.
The mean of the scores (ΣX/n) is:
[(-2)+(-1)+(1)+(4)+(4)+(4)+(7)+(7)+(7)+(7)+(7)+(8)+(8)+(8)+(9)]/15 = 78/15 = 5.2
The median is most appropriate for ordinal variables. The median is the middle observation. Half of the observations are larger and half are smaller. The median is not as sensitive to the outliers as the mean.
Examples: Observation 1 = 6,8,13,18,25. The median is 13, because half the scores fall above this number and half fall below; Observation 2 = 1, 4, 7, 8, 10, 11, 21, 22. The median is determined by summing the middle two numbers, 8 and 10, and dividing by 2. The median is 9.
The mode is used for nominal variables. It is the observation or category that occurs most frequently. The mode can be used to show the most "popular" observation or value. A distribution can be either unimodal or bimodal.
| Distribution A score | Frequency | Distribution B score | Frequency |
|---|---|---|---|
| 23 | 2 | 33 | 1 |
| 45 | 6 | 21 | 7 |
| 34 | 8 | 61 | 21 |
| 25 | 11 | 75 | 4 |
| 73 | 15 | 66 | 3 |
| 83 | 18 | 24 | 7 |
| 54 | 10 | 74 | 10 |
| 66 | 12 | 88 | 21 |
Distribution A is unimodal or has a single mode of 83, with 18 responses.
Distribution B is bimodal or has two modes, 61 and 88, with 21 responses each.
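The three measures of central tendency can be checked with Python's standard statistics module. The sketch below reuses the example scores and the frequencies of Distributions A and B given above; the helper function modes is illustrative.

```python
from statistics import mean, median, multimode

scores = [-2, -1, 1, 4, 4, 4, 7, 7, 7, 7, 7, 8, 8, 8, 9]
print(mean(scores))                           # 78 / 15 = 5.2
print(median([6, 8, 13, 18, 25]))             # middle observation: 13
print(median([1, 4, 7, 8, 10, 11, 21, 22]))   # (8 + 10) / 2 = 9.0

# Frequency tables for Distributions A and B: {score: frequency}.
dist_a = {23: 2, 45: 6, 34: 8, 25: 11, 73: 15, 83: 18, 54: 10, 66: 12}
dist_b = {33: 1, 21: 7, 61: 21, 75: 4, 66: 3, 24: 7, 74: 10, 88: 21}


def modes(frequencies):
    """Expand a frequency table into raw scores and return every mode."""
    raw = [score for score, count in frequencies.items() for _ in range(count)]
    return multimode(raw)


print(modes(dist_a))   # [83]
print(modes(dist_b))   # [61, 88]
```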
When to Use Mean, Median or Mode
Use the mean when:
Use the median when:
Use the mode when:
Test for Differences
Chi-square (χ²) is the most popular of all non-parametric inferential statistical methods. Chi-square tests for differences between categorical variables (e.g., nominal or ordinal data). There are both "one-way" and "two-way" chi-square procedures.
Example of one way chi-square: A sample group is asked a question about political party preference, assuming the question on the instrument form requires a categorical response (Democratic, Republican, Independent, etc.). The one-way chi-square would test for differences in popularity between the political party categories relative to the sample’s response to the question.
Example of two-way chi-square: Used if two categorical variables are to be compared. If the same group above were split into male and female, thus creating a new variable, "sex of the respondent," then this categorical variable could be compared (or "cross-tabulated") to political party choice. In this way, comparisons between the sexes on political preference may be evaluated (e.g., significantly more males are Republican and more females are Democratic).
Both the one-way and two-way chi-square procedures result in a chi-square value and associated significance (probability) level. Chi-square is a non-parametric statistic and as such requires no parametric data assumptions. The data must be categorical in nature.
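The chi-square value itself is straightforward to compute by hand or in code. The sketch below is a minimal illustration with hypothetical counts: the one-way function assumes that equal numbers are expected in every category, and the two-way function derives expected counts from the row and column totals. The significance level is then read from a chi-square table or obtained from statistical software.

```python
def chi_square_one_way(observed):
    """Chi-square for one categorical variable, assuming equal expected counts."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def chi_square_two_way(table):
    """Chi-square and degrees of freedom for a cross-tabulation (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical counts: Democratic, Republican, Independent.
print(chi_square_one_way([40, 30, 20]))

# Hypothetical cross-tabulation: rows are male and female respondents.
print(chi_square_two_way([[30, 20, 10], [20, 30, 10]]))
```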
The t-test is used to test the difference between two means, even when the sample sizes are small. The significance of the t statistic depends upon the hypothesis the researcher plans to test. If you are interested in determining whether there is a significant difference between two means, but you do not know which of the means is greater, use the two-tailed test. If you are interested in testing the specific hypothesis that one mean is greater than the other, use the one-tailed test. Data should satisfy parametric assumptions: 1) the samples are selected from populations that are normally distributed; 2) there is homogeneity of variance -- i.e., the spread of the dependent variable within each group tested must be statistically equal; and 3) data are of continuous form with equal intervals of quantity measurement. Dependent variables must be interval or ratio-type data.
T-test for matched pairs: if both groups of data are contained in each data record, the appropriate t-test is for matched pairs. An example of an appropriate use of the t-test for matched pairs might be to compare pretest and posttest scores where each person took a pretest (variable 1) and a posttest (variable 2). Both values are contained in each data record.
T-test for independent groups: If each case in the data file is to be assigned to one group or the other based on another variable, use the t-test for independent groups. For example, to compare reading scores between males and females, split the reading scores into two groups, depending on whether the person is male or female (each record in the data file is assigned to one group or the other).
Degrees of freedom: The degrees of freedom (d.f.) reflect sample size. When two independent samples are being considered, d.f. equal the sum of the two sample sizes minus 2; i.e., d.f. = (n1 + n2 – 2).
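Both forms of the t-test can be sketched with Python's standard library. The functions below are illustrative and the scores are hypothetical. The independent-groups version pools the two variances, which relies on the homogeneity of variance assumption; the matched-pairs version uses n – 1 degrees of freedom, where n is the number of pairs, a standard result that is not stated in this manual. The resulting t value is compared with a t table or evaluated with statistical software.

```python
from math import sqrt
from statistics import mean, stdev, variance


def t_independent(group_1, group_2):
    """t statistic and d.f. for two independent groups (pooled variance)."""
    n1, n2 = len(group_1), len(group_2)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(group_1) + (n2 - 1) * variance(group_2)) / df
    t = (mean(group_1) - mean(group_2)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df


def t_matched_pairs(first, second):
    """t statistic and d.f. for paired scores such as pretest and posttest."""
    differences = [b - a for a, b in zip(first, second)]
    n = len(differences)
    t = mean(differences) / (stdev(differences) / sqrt(n))
    return t, n - 1


# Hypothetical pretest and posttest scores for four participants.
pretest = [1, 2, 3, 4]
posttest = [2, 4, 5, 7]
print(t_matched_pairs(pretest, posttest))

# Hypothetical reading scores for two independent groups.
males = [1, 2, 3, 4, 5]
females = [2, 3, 4, 5, 6]
print(t_independent(males, females))
```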
Measures of variability indicate the spread or dispersion of the scores in a group and include range, variance and standard deviation.
Range is the difference between the largest and the smallest scores in a distribution.
Example: Scores of 3, 6, 8, 10, 14, 17. The range is 14 points. The scores range from 3 to 17.
Variance is, in essence, the mean of the squared deviation scores. Calculate the difference (deviation) between each score and the mean of the scores, square the deviations, sum the squares and divide the sum by the number of scores minus 1.
Standard deviation measures the spread of data about their mean and is an essential part of many statistical tests. It is calculated by taking the square root of the variance. This transforms variance into the same unit of measurement as the raw scores. Standard deviation is expressed in terms of "one standard deviation above the mean" or the like. If the standard deviation is 11 and the mean is 63, then one standard deviation above the mean is 74, two standard deviations above is 85 and so forth. The value of this figure becomes apparent when we understand the relationship between standard deviations and percentiles in a normal curve. The area contained within +1 and -1 standard deviations of the mean includes approximately 68 percent of all scores in the distribution. Therefore, in this example, 68 percent of all scores fall between 52 and 74.
Another way of assessing the meaning of the standard deviation is to compare scores with percentiles. It is known that, in a normal distribution, 97.7 percent of the cases are below two standard deviations above the mean. So when a raw score for one case is found to be two standard deviations above the mean, we know that the case scored higher than 97.7 percent of all other cases.
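The measures of spread and the percentile figures quoted above can be reproduced with Python's standard statistics module. The sketch reuses the range example (scores of 3 to 17) and a normal distribution with a mean of 63 and a standard deviation of 11; the variable names are illustrative.

```python
from statistics import NormalDist, stdev, variance

scores = [3, 6, 8, 10, 14, 17]
score_range = max(scores) - min(scores)      # 17 - 3 = 14
var = variance(scores)                       # sum of squared deviations / (n - 1)
sd = stdev(scores)                           # square root of the variance
print(score_range, round(var, 2), round(sd, 2))

# Share of a normal distribution within one standard deviation of the mean,
# and below two standard deviations above the mean.
normal = NormalDist(mu=63, sigma=11)
within_one_sd = normal.cdf(74) - normal.cdf(52)
below_two_sd = normal.cdf(85)
print(round(within_one_sd, 3), round(below_two_sd, 3))   # 0.683 0.977
```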
Selection Guide for Common Statistical Methods
| Data Type | Scale | Statistical Method | Testing for Differences (between groups) | Testing for Relationships (within one group) |
|---|---|---|---|---|
| Categorical | Nominal, Ordinal | Non-parametric | Chi-square | Contingency Coefficient |
| Continuous | Interval, Ratio | Parametric | T-test (2 groups); ANOVA (3 or more groups) | Pearson Correlation; Multiple Regression; Discriminant Analysis* |
*If variable to be predicted is categorical
Step 9. Communicate Findings
Evaluators have a responsibility to report their findings to stakeholders and other audiences who may have an interest in the results. Communication with stakeholders should occur throughout the evaluation process to help ensure meaningful, acceptable and useful results.
Reporting plan: Developing a reporting plan with stakeholders can help clarify how, when and to whom findings should be disseminated.
Reporting results: A variety of reporting procedures may be used.
An Evaluation report usually contains:
Reporting Tips
Reporting Negative Findings
At times you may be called on to report negative findings - the program may not have met its objectives, the program is being mismanaged or changes are needed. Evaluation can both identify and point to the causes of negative results. Reporting these difficulties can help avoid future mistakes and suggest ways to improve. However, negative findings must be reported in a manner that helps promote learning and improvement, rather than feelings of failure.
Negative findings should be reported in a manner that:
Step 10. Applying and Using Findings
An evaluation should not be considered complete until the findings of the evaluation are applied:
To make decisions about program continuation.
To improve on-going programs.
To plan future programs.
To inform program stakeholders.
When evaluators are evaluating their own programs, there are fewer problems involved in implementing findings. However, where evaluators are not the persons conducting the program, the likelihood of evaluation findings being ignored is greater. When the concerns of stakeholders have been incorporated into the evaluation process, evaluation findings are more likely to be used.
A Final Step: Evaluating the Evaluation
Evaluating evaluation differs little from the actual process of the evaluation itself. It must meet the same standards and follow similar steps as the original evaluation. Evaluating evaluation considers:
Evaluating Evaluation: Hierarchy of Evaluation Accountability
| Program Chain of Events | Matching Level of Evidence |
|---|---|
| Program & decision impacts | To what extent and in what ways was the program improved? To what extent were informed, high-quality decisions made? |
| Practice and program changes | To what extent did intended use occur? Were recommendations implemented? |
| Stakeholders’ knowledge and attitude changes | What did intended users learn? How were users’ attitudes and ideas affected? |
| Reactions of primary users | What do intended users think about the evaluation? What is the evaluation’s credibility? Believability? Relevance? Accuracy? Potential utility? |
| Stakeholder participation | Who was involved? To what extent were key stakeholders and primary decision makers involved throughout? |
| Evaluation activities | What data were gathered? What were the focus, the design and the analysis? What happened in the evaluation? |
| Inputs | To what extent were resources for the evaluation sufficient and well managed? Was there sufficient time to carry out the evaluation? |
Evaluation Planning Worksheet
| Identify the program to be evaluated, its objectives and stakeholders | Assess the feasibility of implementing an evaluation | Consult stakeholders to clarify indicators of program merit | Identify approaches to data collection | Select data collection techniques |
|  |  |  |  |  |
| Identify target population and select sample | Who will collect data? | How will data be analyzed and interpreted? | How will evaluation findings be shared with stakeholders? |
|  |  |  |  |
References
Archer, T. and Layman, J. (1991). "Focus Group Interview" Evaluation Guide Sheet, Ohio Cooperative Extension Service.
Bennett, Claude F. and S. Kay Rockwell (1995). Targeting Outcomes of Programs (TOP): An Integrated Approach to Planning and Evaluation. Lincoln, NE: Cooperative Extension, University of Nebraska.
Bennett, Claude (1979). Analyzing Impacts of Extension Programs. Washington, DC: US Department of Agriculture, Science and Education Administration (ES C-575).
Case, R.; Andrews, M. and Werner, W. (1988). How can we do it? An evaluation training package for development educators. British Columbia, Canada: Research and Development in Global Studies.
Contant, C. K. (1993). "Assessing What and Why: Designing and Using Evaluations Effectively for Local Level Programs." Paper presented at the Rural Nonpoint Source Pollution in the Upper Midwest Conference, March.
Dillman, D. A. (1995). "Survey Methods." Class notes in AG*SAT Graduate Course in Program Evaluation in Adult Education and Training, University of Nebraska-Lincoln.
Fink, A. (1995). How to Sample in Surveys. Thousand Oaks, California: Sage.
Fraenkel, J. R. and Wallen, N. E. (1996). How to Design and Evaluate Research in Education. New York: McGraw-Hill Inc.
Krueger, R.A. (1994). Focus Groups: a Practical Guide for Applied Research. 2nd edition, Thousand Oaks, California: Sage.
Mueller, D. J. (1986). Measuring social attitudes. New York: Teachers College Press.
Neito, R. and Henderson, J.L. (1995). Establishing Validity and Reliability (draft). Ohio State Cooperative Extension.
Patton, M. Q. (1997). Utilization-focused evaluation: The New Century Text. Newbury Park, California: Sage.
Salant, P. and Dillman, D. A. (1994). How to conduct your own survey. New York, NY: John Wiley & Sons, Inc.
Wholey, J. S.; Hatry, H. P. and Newcomer, K. E. (eds.). (1994). Handbook of practical program evaluation. San Francisco: Jossey-Bass Publishers.
Worthen, B. R. and Sanders, J. R. (1987). Educational evaluation: alternative approaches and practical guidelines. New York: Longman.
Yin, R. K. (1984). Case study research: design and methods. Applied Social Research Methods Series. Vol. 5. Newbury Park, California: Sage.