

The International Journal of Educational and Psychological Assessment May 2012, Vol. 10(2)

Assessing the Item Difficulty of a Tool for Screening Teaching Demonstrations

Carlo Magno

De La Salle University, Manila


Abstract

The present study developed a scale used to rate higher education teacher applicants on their teaching demonstration performance. The scale guides a screening committee in deciding on the selection of a faculty applicant. The items in the scale were anchored on Danielson's components of professional practice. The domains include (1) planning and preparation, (2) classroom environment, (3) instruction, and (4) professional responsibility. The items were first reviewed by stakeholders such as administrators and those with experience in screening teachers. The items were then pilot tested on teacher applicants during the hiring period for three consecutive trimesters. A total of 161 faculty applicants were used to pilot test the instrument. The results showed that high internal consistencies were obtained for three domains but a low internal consistency for professional responsibility. The four factors, when used to screen faculty applicants, converged with significant path coefficients. An adequate fit was obtained for the four-factor structure (RMSEA = .08, PGI = .89, GFI = .96, and NFI = .93). When the polychotomous Rasch model was used, only one item did not fit, items were identified as easy and difficult, appropriate step functions were obtained for the four-point scale, and precise measurement of teacher performance was obtained based on the TIF.

Keywords: Teacher performance, teaching demonstration, components of professional practice, polychotomous Rasch model

Introduction

One of the crucial assessments conducted in an educational institution is the selection of the teachers that will make up its teaching force. The quality and standards of a school are reflected in the kind of teachers selected by administrators. The teachers implement the curriculum, and how effectively the curriculum is carried out depends on their ability to teach. According to Kersten (2008), people usually identify excellent teachers as the most important consideration in achieving school improvement and student success. Making faculty selection the top priority of the administration is perceived to increase school success (Tucker & Stronge, 2005; Marzano, Pickering, & Pollock, 2001; Danielson & McGreal, 2000; Kersten, 2008). Miller-Smith (2001) added that the key to providing a good quality education for students is the ability to recruit quality teachers. Schools consider several criteria when hiring a teacher, such as the school's mission and vision, the school culture, and the ability of the teacher to teach. Proper assessment is needed to ensure that the teachers selected in a school can deliver the curriculum well and can carry out the school mission and vision appropriately.

Due to the demand for increasing student achievement, school administrators should have the ability to recruit and select quality teachers who have excellent teaching skills. According to Barber (1998, p. 44), recruitment is the process which includes the practices and activities carried on by the organization with the primary purpose of identifying and attracting potential employees, while selection is the process of reducing the pool of applicants and choosing from among the final job candidates (Young & Castetter, 2004, p. 90). This shows that recruitment is the broader term and selection falls under it. The inability or failure to properly recruit and select faculty applicants results in teachers who are not capable of responding to pressing issues in the educational setting (Kersten, 2008).

Qualities of a Successful Candidate

Studies have noted that in screening teacher applicants in higher education, the characteristics that can be most readily assessed are teaching ability, content knowledge, and research capability. One of the required teacher qualities is teaching ability (Twombly, 2005). In the study of Twombly (2005), the teaching ability of the applicant was measured through a demo-teaching. Demo-teaching is usually accomplished in around 15-30 minutes. Given this time limit, some are skeptical about whether it is enough to measure the excellence of a teacher. However, in her study, search committee members believed that they could tell very quickly whether a candidate had the skills to be a good teacher. In the study, the audience was the screening committee and the applicants were given the same concept to teach. Furthermore, the academic director at SCCC also looked at different aspects of the demo-teaching, as he said:

We judge also, on their demonstration, how they handle themselves in the demonstration. Do they stand up there and just give a straight lecture? Do they just talk to us? Their eye contact is absent. Do they just kind of meander up there?... Do they engage the audience? Do they use interactive things?... What types of learning activities do they include?

While demo-teaching was considered by the deans in the study of Twombly (2005) as the most important part of teacher selection, the study of Kersten (2008) reported that the teaching demonstration accounted for only 7.4% among the selection process components. Multiple interviews obtained the top rate since they are used by school districts to a large extent. On the other hand, in the study of Benson and Buskist (2005), the quality of the teaching demonstration obtained 34% when committee chairs in psychology were asked how teaching excellence was assessed. Another teaching quality aside from teaching ability is being knowledgeable and up-to-date in the field. In some reviews, this is also considered mastery of the subject matter, which is the most frequently identified quality of a successful teacher candidate (Kersten, 2008). Other qualities include the ability to link knowledge to personal experience, knowledge of teaching best practices, and the ability to explain how strategies were incorporated into teaching experience.


In the study of Burke (1988), which involved interviews with department chairs who had just hired new teachers, it was found that research quality and potential were the two factors that were considered, although teaching quality was still a criterion. This suggests that teaching and research have received the same weight in terms of their significance as teacher qualities, with more emphasis on the potential of the candidate to do good research (Youn & Gamson, 1994; Burke, 1988). Similarly, a survey conducted among Department of Education chairs in the United States supported the finding that research ability is an important criterion in selecting faculty in some institutions, especially in research universities. In the said survey, 73% of respondents from research institutions regarded research as an important quality of a teacher while only 43% of them considered teaching as the most important (Russell, Fairweather, & Zimbler, 1991, as cited in Meizlish & Kaplan, 2008). In another study conducted by Sheehan, McDevitt, and Ross (1998), which investigated the top ten criteria for selecting a teacher, the recommendation letters and the fit between the applicant's line of research and the needs of the department were the two most significant factors identified. In the open-ended responses, the top three factors considered were teaching ability and experience, fit between the applicant and the department, and research skill. Still, other qualities that administrators look for in hiring teachers include being enthusiastic, willing to invest time and energy in students and the school, going the extra mile, having strong interpersonal skills, being reflective, and respecting children and enjoying being with them (Kersten, 2008), as well as previous teaching experience and teaching philosophy (Benson & Buskist, 2005).

Hindrances to Success

Another section of Kersten's (2008) study covers the factors that hinder a teacher from being selected. Personal appearance was viewed to be the most common reason why teacher candidates are unsuccessful. This happens when a candidate wears casual or distracting dress. Other qualities that hinder the success of a teacher applicant are unpreparedness (e.g., not having researched the school and the job position they are applying for, not being prepared to answer the questions that will be asked) and lack of knowledge (e.g., not being up-to-date on the latest best practices and emerging trends in education, providing shallow responses about instructional methods, communication skill problems, etc.). Given that the teaching components involve tasks that determine whether an applicant becomes successful in the screening process, it is necessary to identify these tasks through specific items in a measure. Item Response Theory (IRT) allows the determination of item difficulty. In an IRT approach, more specifically the Rasch model, difficult items are the ones where a proportion of participants have a low chance of obtaining a correct answer. In Andrich's polytomous Rasch model (1988), the concept of item difficulty is applied to scales with increasing interval responses. Difficult items in a polytomous Rasch model are those items that are endorsed by few individuals. For example, an item on checking class attendance (responded to highly more often) is easily endorsed compared with keeping the class quiet for one hour (few high responses).
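To make this notion of difficulty for rated items concrete, the following is a minimal sketch of Andrich's rating scale model, in which the probability of each rating category depends on the gap between the person's level and the item's difficulty plus category thresholds. The item difficulties and thresholds used here are made-up illustrative values, not estimates from this study.

```python
import numpy as np

def rsm_category_probs(theta, delta, taus):
    """Andrich rating scale model: probability of each rating category
    0..m for an item with difficulty `delta`, thresholds `taus`
    (len(taus) = m), and person location `theta`, all in logits."""
    # Cumulative sums of (theta - delta - tau_k); category 0 has sum 0.
    steps = np.concatenate(([0.0], np.cumsum(theta - delta - np.asarray(taus))))
    expd = np.exp(steps)
    return expd / expd.sum()

# Hypothetical values: an "easy" behavior (checking attendance) versus a
# "difficult" one (keeping the class quiet), rated on a 4-point scale.
taus = [-1.0, 0.0, 1.0]          # assumed ordered category thresholds
easy, hard = -1.0, 1.0           # assumed item difficulties in logits

print(rsm_category_probs(0.0, easy, taus))  # mass shifts toward high categories
print(rsm_category_probs(0.0, hard, taus))  # mass shifts toward low categories
```

For an average person, the sketch shows high categories being the most probable for the easy behavior and low categories for the difficult behavior, which is the sense in which a rated behavior is "easy" or "difficult" to endorse.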

Teaching Framework

The items in the screening tool in the present study are anchored on Danielson's Components of Professional Practice. It is a teaching framework reflective of learner-centeredness and the constructivist approach in teaching (Magno & Sembrano, 2009). The framework by Danielson has been used by several school districts, state certification departments, and universities worldwide, including in the Philippines. Magno and Sembrano (2009) used the framework to construct three versions of the Student Teachers' Assessment Report (STAR) in a higher education institution in the Philippines. The STAR is an instrument used by students to assess their teachers' performance. The STAR correlated significantly in measurement models with factors of learner-centered practices. The framework was also used in the study of Magno (2012), which constructed a tool for assessing teacher performance used by peers in one higher education institution in the Philippines. A rubric was constructed where the items were based on teachers' best practices in a higher education institution. The four-factor structure was supported by the data, having adequate goodness of fit.

The components of Danielson's framework include the following:

Domain 1: Planning and Preparation - demonstrating knowledge of content and pedagogy, demonstrating knowledge of students, selecting instructional goals, demonstrating knowledge of resources, designing coherent instruction, and assessing student learning;

Domain 2: The Classroom Environment - creating an environment of respect and rapport, establishing a culture for learning, managing classroom procedures, managing student behavior, and organizing physical space;

Domain 3: Instruction - communicating clearly and accurately, using questioning and discussion techniques, engaging students in learning, providing feedback to students, and demonstrating flexibility and responsiveness;

Domain 4: Professional Responsibilities - reflecting on teaching, maintaining accurate records, communicating with families, contributing to the school and district, growing and developing professionally, and showing professionalism.

The present study used Danielson's components of professional practice to construct a screening tool that is used to assess the performance of teacher applicants in higher education on their teaching demonstration. The framework is appropriate for screening higher education faculty because of the strong need for each of the domains. It is important for teachers to have adequate planning and preparation because of the complex learning materials needed for students to learn in their coursework. At the same time, teachers also need the ability to carry out effective instruction and manage the classroom environment in order to build on students' skills. Moreover, higher education faculty need to build their professional responsibility by producing research and publications, actively participating in professional organizations, and running seminars and workshops.

In the present study, item analysis was conducted using a Polychotomous Rasch Model to determine item difficulty, item fit, the test information function, and the test characteristic curve.

Method

Participants

A sample of 161 teacher applicants was used to pretest the screening tool. The raters (screening committee) were composed of the school deans, chairs, heads, and senior faculty members (depending on the program). Each teacher applicant was asked to perform a demo-teaching, and the time requirement was provided by the program chair.

Instrument

The screening tool was based on the four domains and the underlying components of Danielson's Components of Professional Practice. The items were generated based on the responses to a survey that was administered to school deans, chairs, heads, and senior faculty members who are usually involved in the teaching demonstration to screen teacher applicants. The responses from the survey were clustered according to the four components of Danielson's Components of Professional Practice. The responses were rephrased and turned into appropriate items. There were 10 items created for each domain, comprising a total of 40 items. For every component, open-ended items for other comments are provided in case there are observations not captured in the instrument. The response format used for each item is a four-point numeric scale (4 - Very Good, 3 - Good, 2 - Poor, and 1 - Needs Improvement). The items were reviewed by the faculty who are involved in the screening of faculty applicants and by others who identified the characteristics to be included in the screening tool. The items were revised according to their comments. External experts in the field of teacher performance assessment were requested to review the items.

A Confirmatory Factor Analysis (CFA) was conducted to confirm the four-factor structure of the scale. The four factors were planning and preparation, classroom environment, instruction, and professional responsibility. The items for each domain were used as manifest variables. The CFA showed that the items had significant paths to their respective latent factors. When the latent factors were intercorrelated, significant estimates were obtained, which supports the results of the convergent validity established in the bivariate correlations. The solution also showed adequate fit with the data, χ2(773) = 1641.81, RMSEA = .08, PGI = .89, GFI = .96, and NFI = .93.
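As a quick plausibility check, not part of the original analysis, the reported RMSEA can be reproduced from the reported chi-square, degrees of freedom, and sample size using the standard point-estimate formula; the snippet below assumes that the full N = 161 applicants supplied the CFA data.

```python
import math

def rmsea(chi_sq, df, n):
    """Point estimate of RMSEA from a chi-square test of model fit."""
    return math.sqrt(max(chi_sq - df, 0.0) / (df * (n - 1)))

# Values reported in the text: chi-square = 1641.81, df = 773, N = 161.
print(round(rmsea(1641.81, 773, 161), 3))  # ~0.084, consistent with RMSEA = .08
```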


Procedure

The items were used to screen the teachers, and the form was accomplished within the given time for each applicant. The guidelines for accomplishing the instrument include:

1. The purpose of this inventory is to determine the performance of the teaching faculty applicant based on the teaching demonstration.
2. You are highly encouraged to write your comments in the boxes provided after each domain in support of the rating given. The comments may include relevant critical incidents from the teaching demonstration.
3. The accomplished form can be verified through other assessments of the teaching faculty applicant such as the interview and pre-employment tests.
4. Accomplish the form within the duration of screening the faculty in order to provide timely feedback and a decision for the applicant.

The data from the pretest were encoded and analyzed for reliability and validity. The Polychotomous Rasch Model was used to determine the acceptable items by assessing item fit. The items with inadequate fit were revised.

Data Analysis

For the closed-ended items, the internal consistency of the items was determined using Cronbach's alpha and interitem correlations. Person and item reliability were also obtained using the one-parameter Rasch Model. The items were assessed based on the item fit measure using the one-parameter Rasch Model. The Rasch Model is an item response theory approach to analyzing test items, as opposed to classical test theory.

Results

The screening tool was analyzed first by determining descriptive statistics such as means, standard deviations, skewness, and kurtosis. The internal consistency using Cronbach's alpha was also determined. The four-factor structure was tested using Confirmatory Factor Analysis (CFA). The items were further analyzed on their scale functions, item fit, and accuracy using the Polychotomous Rasch Model.
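For readers who want to reproduce the internal consistency figures reported in Table 1, a minimal sketch of Cronbach's alpha computed from a respondents-by-items rating matrix is given below; the 161-by-10 shape and the random ratings are illustrative assumptions, not the study's data.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items matrix of ratings."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                         # number of items in the domain
    item_vars = scores.var(axis=0, ddof=1)      # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of the domain total
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Illustrative data: 161 applicants rated on the 10 items of one domain (1-4 scale).
rng = np.random.default_rng(0)
fake_ratings = rng.integers(1, 5, size=(161, 10))
print(round(cronbach_alpha(fake_ratings), 2))
```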

Table 1

Means, Standard Deviations, Skewness, Kurtosis, and Cronbach's Alphas of the Domains

Domain                        N     M     SD    Skewness  Kurtosis  Cronbach's alpha
Planning and Preparation      161   3.39  0.45   -0.44     -0.78     .87
Classroom Environment         150   3.36  0.50   -0.77      0.80     .91
Instruction                   161   3.29  0.47   -0.60      0.50     .90
Professional Responsibility   109   1.89  0.19   -2.30      5.58     .55

Note. Unequal Ns are due to the removal of participants with incomplete answers in the scale.

The means for the screening tool are quite high considering that the instrument uses a four-point scale, except for professional responsibility, which had the lowest mean score (M = 1.89). The scores tend to be negatively skewed, indicating that the ratings given on the domains tend to be high.

Table 2

Intercorrelations among the Factors of the Screening Tool

Factor                          (1)    (2)    (3)    (4)
1 Planning and Preparation      --
2 Classroom Environment         .72**  --
3 Instruction                   .79**  .80**  --
4 Professional Responsibility   .48**  .42**  .47**  --

**p < .05

The four domains of the screening tool were correlated in order to establish the convergent validity of the scale. The convergence was shown by the significant relationships among the four domains with positive magnitudes (r > .42). The domains planning and preparation, classroom environment, and instruction are highly intercorrelated (r > .72). However, professional responsibility seems to be distant from the first three domains because of the weak correlations (.42 to .48). Professional responsibility may not be perceived to be integrated with teaching in higher education in the setting where the study was conducted.

The Polychotomous Rasch Model was used to determine the fit of the items to the Rasch model, the difficulty level of the items, the scale category analysis, the test characteristic curve (TCC), and the test information function (TIF). Separate person and item reliabilities were also generated using the Rasch model. The person reliability obtained is .90 and the item reliability is .96.

The results of the item analysis showed that only one item did not fit the one-parameter Rasch model. Each item has a mean square value (MNSQ) that is within the bounds of 0.8 to 1.2 except for the item on preparing for an alternative teaching activity, which has a MNSQ value of 1.48. This item is said to be discrepant from the rest of the items in the set. There is also an item that is close to discrepancy, the item on plans activities effectively, which has a MNSQ value of 1.15, but this is still considered good fit. On the other hand, items close to 0.80 are items that are homogeneous or redundant with the rest of the items. Examples of items close to a MNSQ of .80 are considers students' reaction to instruction (MNSQ = .83) and involves students in class (MNSQ = .88). However, these items are still considered good fit because they are above .80.
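The mean square fit statistic mentioned above can be illustrated with a small sketch: infit MNSQ is the information-weighted ratio of observed to model-expected squared residuals, so values near 1.0 indicate that an item's ratings vary about as much as the Rasch model predicts. The observed ratings, expected scores, and model variances below are made-up placeholders, not values from the study's calibration.

```python
import numpy as np

def infit_mnsq(observed, expected, variance):
    """Infit mean square for one item: sum of squared residuals across
    persons divided by the sum of the model variances (Rasch convention)."""
    observed, expected, variance = map(np.asarray, (observed, expected, variance))
    return ((observed - expected) ** 2).sum() / variance.sum()

# Hypothetical ratings for one item across five applicants, with the
# model-expected rating and rating variance for each applicant.
obs = [4, 2, 4, 2, 3]
exp = [3.2, 2.8, 3.2, 2.7, 3.1]
var = [0.6, 0.6, 0.6, 0.6, 0.6]
print(round(infit_mnsq(obs, exp, var), 2))  # ~0.81; roughly 0.8-1.2 is read as adequate fit
```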

The difficulty of the items was determined using the logit measures. Difficult items have positive logit values while easy items have negative logit values. Table 3 shows the easy items. The concept of item difficulty in a polychotomous scale refers to behaviors that are difficult to accomplish, while easy items refer to behaviors that are less difficult. In the case of a scale used to rate performance in a teaching demonstration, difficult items are those aspects of teaching where teachers are mostly scored low because the behavior is difficult to accomplish. On the other hand, the easy items are those aspects of teaching where teachers are mostly scored on the higher end of the scale because the task was easily observed to be evident. For example, communicating with students is perceived to be easier than setting rules and procedures inside the classroom.

Table 3

Easy Items of the Screening Tool

Item                                                                Logit Measure
A1. Demonstrates knowledge of the subject matter.                       -1.07
C21. Communicates clearly.                                              -1.06
C22. Delivers relevant content.                                         -0.91
A4. Uses visuals and materials to enhance learning.                     -0.85
A2. Organizes the presentation for the type of learners.                -0.58
A5. Prepares lesson or course outline for teaching demonstration.       -0.45
A8. Manages the time allotted for the teaching demonstration.           -0.32
C31. Encourages students to participate.                                -0.29
A10. Sets clear goals.                                                  -0.25
C23. Makes instruction appropriate for the types of learners.           -0.21
B15. Established rapport.                                               -0.17
C30. Asks questions that tap critical thinking.                         -0.15
A6. Plans activities effectively.                                       -0.11
B20. Maximizes equipment properly.                                      -0.04
B16. Makes instruction interactive.                                     -0.01

The easiest items in the set are demonstrates knowledge of the subject matter and communicates clearly. These items are the easiest for teacher applicants and can be considered among the expected basic competencies of teaching. Table 4 shows the difficult items. The most difficult items in the set are having rules and procedures in class and conducts some form of assessing student learning. These behaviors are difficult for teacher applicants to show during the teaching demonstration.

The scale categories were analyzed for each response. Higher scale categories must reflect higher measures and must show low values for lower scales, thereby producing a monotonic increase in threshold values. The average step calibrations for the items are -0.10, 0.30, 0.26, and 1.05, respectively, for the four scale categories (Needs Improvement, Poor, Good, and Very Good). All average step functions are increasing monotonically, indicating that the four-point scale used for each item has attained scale ordering and that there is a high probability of observing each scale category.
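The idea of scale ordering can be examined with a short sketch that takes a set of step calibrations and checks whether every rating category is the most probable one somewhere along the continuum, which is the practical meaning of ordered categories in a rating scale analysis. The three thresholds below are illustrative values rather than the study's calibrations, since a four-category scale has three step calibrations between adjacent categories.

```python
import numpy as np

def category_probs(theta, taus):
    """Rating scale model category probabilities for a single item
    centered at difficulty 0, given ordered thresholds `taus`."""
    steps = np.concatenate(([0.0], np.cumsum(theta - np.asarray(taus))))
    expd = np.exp(steps)
    return expd / expd.sum()

# Illustrative thresholds for a 4-point scale (three steps between categories).
taus = [-1.2, 0.1, 1.3]
thetas = np.linspace(-4, 4, 161)
modal = [int(np.argmax(category_probs(t, taus))) for t in thetas]

# If every category (0..3) is modal on some part of the continuum, the
# categories are functioning in an ordered way.
print(sorted(set(modal)))  # expected: [0, 1, 2, 3]
```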


Table 4

Difficult Items in the Screening Tool


Item                                                                                Logit Measure
A3. Answers questions convincingly.                                                      0.04
B18. Responds positively to students.                                                    0.04
B19. Involves students in class.                                                         0.04
B17. Motivates students to participate in class.                                         0.06
C24. Varies the teaching approach according to students' needs.                          0.06
B13. Considers students' reaction to instruction.                                        0.08
C25. Clarifies directions or instruction given.                                          0.09
D34. Attained a relevant graduate degree or experience.                                  0.15
D36.1 Has positive interpersonal characteristics.                                        0.17
C29. Uses activities that tap higher order thinking skills.                              0.20
D35. Personal values are in line with the thrust of DLS-CSB.                             0.22
D32. Attended or facilitated seminars or trainings in the past related to teaching or their profession.   0.24
A7. Indicates the references and resources used in teaching demonstration.               0.26
D36.3 Adapts to learning needs.                                                          0.26
D36.4 Facilitates the learning process.                                                  0.27
D37. Makes himself or herself available for college-wide activities.                     0.27
B11. Maintains orderliness in class.                                                     0.28
D33. Affiliated with at least one professional organization.                             0.28
D36.2 Encourages personal change.                                                        0.30
B12. Handles disruptions appropriately.                                                  0.31
D38. Maintains professional behavior with students/co-faculty members/academic community. 0.33
C28. Provides students the opportunities to learn skills.                                0.42
C26. Provides feedback.                                                                  0.44
A9. Prepares alternative teaching activities when the initial plan does not seem to work. 0.49
B14. Has rules and procedures.                                                           0.56
C27. Conducts some form of assessing student learning.                                   0.59

The item characteristic curve was generated, and it shows that at average ability (θ = 0) there is a 44.44% chance of obtaining a correct response. The lower probability of getting a correct response indicates that the items can discriminate the abilities of teacher applicants in the teaching demonstration. The test information function was also generated to determine the spread of the scale scores relative to the construct being assessed (teaching demonstration performance). Curves that are highly constrained (within a low SD on each side of 0) are imprecise measures for much of the continuum in the domain, while curves that encompass a large area (2 SD below and above 0) provide precise scores for the continuum. The area of the TIF for the screening tool encompasses one SD below and one SD above 0, showing a highly constrained area.
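For the Rasch family of models, the test information function can be obtained by summing, over items, the variance of the modeled item score at each ability level, since information equals the conditional variance of the response under these models. The sketch below assumes a rating scale model with illustrative item difficulties and thresholds rather than the calibrations estimated in this study.

```python
import numpy as np

def rsm_probs(theta, delta, taus):
    """Rating scale model category probabilities (categories 0..len(taus))."""
    steps = np.concatenate(([0.0], np.cumsum(theta - delta - np.asarray(taus))))
    expd = np.exp(steps)
    return expd / expd.sum()

def test_information(theta, deltas, taus):
    """Sum of item information, where each item's information is the
    variance of its modeled category score at ability `theta`."""
    cats = np.arange(len(taus) + 1)
    info = 0.0
    for delta in deltas:
        p = rsm_probs(theta, delta, taus)
        mean = (cats * p).sum()
        info += ((cats - mean) ** 2 * p).sum()
    return info

# Illustrative calibration: 40 items spread between -1.1 and 0.6 logits,
# with assumed thresholds for the 4-point scale.
deltas = np.linspace(-1.1, 0.6, 40)
taus = [-1.2, 0.1, 1.3]
for theta in (-2, -1, 0, 1, 2):
    print(theta, round(test_information(theta, deltas, taus), 1))
```

Plotting these values over a fine grid of θ would give the TIF curve whose spread the text interprets.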


Discussion

The screening tool for teacher applicants performed accurately as a screening tool in both the Classical Test Theory (CTT) and Item Response Theory (IRT) approaches. In the CTT approach, the items showed very high internal consistency for planning and preparation, classroom environment, and instruction. However, low internal consistency was obtained for professional responsibility. The factorial validity of the domains was further supported through the convergence of the four factors. The convergence was also supported in the Confirmatory Factor Analysis (CFA), where the four domains had significant covariances. The CFA also showed that the items significantly loaded under their respective domains. Goodness of fit of the model was also attained, supporting the four-factor structure. The results of the Item Response Theory (IRT) analysis also showed the accuracy of the instrument as a screening tool. Using the Polychotomous Rasch model, the items of the screening tool showed good fit based on the MNSQ. The Rasch model showed that there are notable items that are difficult and easy. Appropriate scale ordering was also attained. The instrument tends to be stringent because a person of average ability has a below 50% chance of getting a correct response. The measurement precision of teaching performance was shown by the TIF.

The descriptive statistics show high mean scores given for the teaching demonstrations except for the domain of professional responsibility. This is supported by the negatively skewed distribution for each domain. The domains of planning and preparation, classroom environment, and instruction are expected teaching skills that are assessed during a teaching demonstration, as supported by previous studies (Kersten, 2008; Twombly, 2005). These are specific areas for which teacher applicants come prepared during the teaching demonstration, so a high score is expected. However, the low mean score for professional responsibility is attributed to the expectations of Filipino higher education faculty about research and professional development in the Philippines (Magno, 2012). According to Magno (2012), faculty in higher education institutions (HEIs) in the Philippines are not very involved in research and publication and see their role as exclusively teaching. Much of their time is also devoted to teaching rather than doing research. This expectation is reflected in the low mean rating for professional responsibility because teacher applicants for HEIs do not meet this requirement sufficiently.

The items on planning and preparation, classroom environment, and instruction showed very high internal consistencies. The domain of professional responsibility had low internal consistency. The scores on the items for the first three domains showed consistently high ratings because of the similar impression the raters formed of the applicants' teaching ability. The low internal consistency for professional responsibility reflects the disparity between the few applicants who are able to fully meet the requirements and those who do not.

Discrepancy in performance is evident among teacher applicants in their professional responsibility.

Factorial validity was established in two ways. First, the domains, when intercorrelated, showed significant relationships. Second, the four latent factors were also intercorrelated in a measurement model that attained adequate fit. In both the bivariate correlations and the CFA, stronger relationships were found among planning and preparation, classroom environment, and instruction. However, these three domains are not as strongly correlated with professional responsibility. This result is consistent with the findings of Magno (2012) and Magno and Sembrano's (2009) study, where professional responsibility was loosely integrated with the other domains. This indicates that professional responsibility is seen as separate from the other teaching domains. It also shows that professional responsibility is seen as distant from the role of HEI faculty in their teaching. In the present study, the context is a teaching demonstration where the applicant shows his or her best teaching. Professional responsibility is assessed more through the documents provided by the applicant, such as publications, certificates, and other evidence, which the applicant may or may not have ready.

Factorial validity was also evidenced in the measurement model that was established. The four-factor solution that was tested showed adequate fit, and the items under each factor were all significant. This shows that the items are accurately classified according to the domain they are measuring. The four-factor structure was also supported by the sample observed. This provides evidence that the four factors are valid measures of a teaching demonstration sample.

The 40 items in the instrument showed good fit according to the Polychotomous Rasch model except for one, preparing for an alternative teaching activity. This behavior may not be very evident in a teaching demonstration. The applicant may not have enough time to actually deliver an alternative teaching activity because of the short time given for the demonstration. Having alternative activities is also difficult to comply with, especially when the teacher has made careful plans for instruction. Since applicants come prepared for teaching demonstrations, it is not necessary for them to prepare alternative activities. Considering that only one item did not fit the model, the overall scale tends to function well.

The general conception of item difficulty applies to tests with right and wrong answers. However, the Polychotomous Rasch model allows one to interpret how difficult or easy each item is to carry out even for scaled measures with no right and wrong answers. In the results of the analysis, the items on knowledge of the subject matter and communicating clearly are easier to accomplish, but the items on establishing rules and procedures and conducting assessment of student learning are difficult to establish. Communication skills and knowledge of the subject matter are commonly understood requirements when applying for a teaching position (Kersten, 2008). A faculty member applying for a higher education teaching post is expected to exhibit expertise in the field they are teaching, and in order to demonstrate this they must effectively communicate their ideas. However, establishing rules and regulations is not as easy as delivering the contents of a lesson. Rules and regulations are part of classroom management, which needs time to be established among students.

Aside from informing students about rules and regulations, the teacher needs to be consistent for the whole semester or term in enforcing the rules. This behavior needs to be practiced consistently, which makes it difficult. On the other hand, assessing student learning is also considered difficult to implement. Assessment requires some training to master the proper procedures. The appropriate ways of conducting assessment require careful study and guidance, which makes assessment difficult to implement.

An increasing scale category structure was obtained. This indicates the appropriateness of the four-point scale in measuring degrees of intensity of the construct measured. The proper functioning of the scale supports the accuracy and distinctiveness of the four levels of the scale. The raters can clearly distinguish the four levels in the scale.

The ICC indicates that the scale can discriminate abilities to some extent. An applicant of average ability intersects with a 44.44% chance of producing an appropriate teaching sample. The discrimination mark is at 50%. In the case of the screening tool developed, the chance is below 50% given an average ability of teachers. The TIF indicates that the scale is a precise measure of a teaching sample. The wide range of area in the TIF indicates that the instrument covered most of the abilities of the participants who were rated using the scale.

The present study contributes to the existing literature on teacher assessment, more specifically on screening teacher applicants. Previous studies were limited to determining the important characteristics that need to be assessed in the selection of teachers. The present study showed that the scale is a useful measure for selecting teachers, as evidenced by the reliability, validity, and functioning of the items. The instrument is recommended to be used consistently for screening teacher applicants conducting a demonstration teaching.

References

Andrich, D. (1988). Rasch models for measurement. Sage Publications.

Barber, A. E. (1998). Recruiting employees: Individual and organizational perspectives. Thousand Oaks, CA: Sage.

Benson, T. A., & Buskist, W. (2005). Understanding "excellence in teaching" as assessed by psychology search committees. Teaching of Psychology, 32(1), 47-52.

Burke, D. L. (1988). A new academic marketplace. New York: Greenwood Press.

Danielson, C. (1996). Enhancing professional practice: A framework for teaching. Alexandria, VA: Association for Supervision and Curriculum Development.

Danielson, C., & McGreal, T. (2000). Teacher evaluation to enhance professional practice. Alexandria, VA: Association for Supervision and Curriculum Development.


Kersten, T. (2008). Teacher hiring practices: Illinois principals' perspectives. The Educational Forum, 72(4), 355-369.

Magno, C. (2012). Assessing higher education teachers through peer assistance and review. The International Journal of Educational and Psychological Assessment, 9(2), 104-120.

Magno, C., & Sembrano, J. (2009). Integrating learner-centeredness and teaching performance in a theoretical model. International Journal of Teaching and Learning in Higher Education, 21(2), 158-170.

Marzano, R., Pickering, D., & Pollock, J. (2001). Classroom instruction that works. Alexandria, VA: Association for Supervision and Curriculum Development.

Meizlish, D., & Kaplan, M. (2008). Valuing and evaluating teaching in academic hiring: A multidisciplinary, cross-institutional study. The Journal of Higher Education, 79(5).

Miller-Smith, K. (2001). An investigation of factors in resumes that influence the selection of teachers. Unpublished doctoral dissertation, The Ohio State University, Columbus.

Russell, S., Fairweather, J. S., & Zimbler, L. J. (1991). Profiles of faculty in higher education institutions, 1988. Washington, DC: National Center for Education Statistics. (ERIC Document Reproduction Service No. ED336058)

Sheehan, E. P., McDevitt, T. M., & Ross, H. C. (1998). Looking for a job as a psychology professor? Factors affecting applicant success. Teaching of Psychology, 25(1), 8-11.

Tucker, P., & Stronge, J. (2005). Linking teacher evaluation and student learning. Alexandria, VA: Association for Supervision and Curriculum Development.

Twombly, S. B. (2005). Values, policies, and practices affecting the hiring process for full-time arts and sciences faculty in community colleges. The Journal of Higher Education, 76(4), 423-447.

Winter, P. A. (1997). Educational recruitment and selection: A review of recent studies and recommendations for best practice. In L. Wildman (Ed.), Fifth NCPEA Yearbook. Lancaster: Technomic.

Youn, T. I. K., & Gamson, Z. F. (1994). Organizational responses to the labor market: A study of faculty searches in comprehensive colleges and universities. Higher Education, 28(2), 189-205.

Young, I. P., & Castetter, W. B. (2004). The human resource function in educational administration (8th ed.). Columbus, OH: Merrill Prentice Hall.

Young, P. (2003). The effects of chronological age and information media on teacher screening decisions for elementary school principals. Journal of Personnel Evaluation in Education, 17(2), 157.


About the Author

Dr. Carlo Magno is presently a faculty member of the Counseling and Educational Psychology Department of De La Salle University, Manila. He has an active research program on the assessment of teacher performance focused on teaching and learning. This paper was supported and funded by the Center for Learning and Performance Assessment of De La Salle-College of Saint Benilde.
