Educational Measurement and Evaluation Studies

Multi-Faceted Rasch Model in Practical Tests: Study Subject: Sorayesh Test

Document Type: Original Article

Authors
1 Ph.D. Student, Department of Curriculum Planning and Educational Methods, Faculty of Psychology and Education, University of Tehran, Tehran, Iran
2 Associate Professor, Department of Curriculum Planning and Educational Methods, Faculty of Psychology and Education, University of Tehran, Tehran, Iran
3 Faculty of Psychology and Education, Kharazmi University, Karaj, Iran
4 Professor, Department of Curriculum Planning and Educational Methods, Faculty of Psychology and Education, University of Tehran, Tehran, Iran
10.22034/emes.2024.2003752.2479
Abstract
Objective: The purpose of this research was to analyze data obtained from practical tests administered by the National Organization of Educational Testing using the Multi-Faceted Rasch model, and to compare the results with those of classical analysis methods.
Methods: The study employed a quantitative, descriptive-analytical design. The participants comprised all songwriting candidates who took the Sorayesh practical test. The analyzed data were obtained from evaluation forms completed by four raters for each candidate.
Results: The findings show that although the correlation coefficient among raters was high (above 0.90), inter-rater agreement as measured by the Kappa coefficient was only moderate. Furthermore, based on the Multi-Faceted Rasch model, the raters' strictness (severity) parameters were moderate.
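In the Multi-Faceted Rasch model, rater strictness (severity) enters the measurement model as a facet alongside examinee ability and task difficulty. A minimal sketch of the dichotomous form is shown below; the study's actual analysis would use a polytomous rating-scale variant, and all numbers here are illustrative, not taken from the Sorayesh data:

```python
import math

def mfrm_prob(theta, task_difficulty, rater_severity):
    """Dichotomous many-facet Rasch model: the log-odds of success equal
    examinee ability minus task difficulty minus rater severity."""
    logit = theta - task_difficulty - rater_severity
    return 1.0 / (1.0 + math.exp(-logit))

# For the same examinee and task, a severe rater (severity 0.5) yields a
# lower success probability than a lenient rater (severity -0.5).
p_severe = mfrm_prob(theta=1.0, task_difficulty=0.0, rater_severity=0.5)
p_lenient = mfrm_prob(theta=1.0, task_difficulty=0.0, rater_severity=-0.5)
```

Because each rater has an individual severity parameter, the model can flag a single overly strict or lenient rater even when group-level indices look acceptable.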
Conclusion: The gap between the correlation and Kappa coefficients shows that neither indicator alone is sufficient for analyzing raters. Moreover, these indicators describe the status of the raters as a group, whereas the Multi-Faceted Rasch model characterizes each rater individually. The results of the Multi-Faceted Rasch model indicated that the raters exhibited neither strictness nor leniency errors.
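The divergence between correlation and agreement described above can be reproduced with a small made-up example: two raters whose scores are perfectly linearly related but never identical give a Pearson correlation of exactly 1.0 while Cohen's Kappa is negative. The data below are hypothetical, not drawn from the Sorayesh test:

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation: covariance over the product of standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def cohen_kappa(x, y):
    """Cohen's Kappa: exact agreement corrected for chance agreement."""
    n = len(x)
    po = sum(a == b for a, b in zip(x, y)) / n          # observed agreement
    cats = sorted(set(x) | set(y))
    pe = sum((x.count(c) / n) * (y.count(c) / n) for c in cats)  # chance agreement
    return (po - pe) / (1 - pe)

# Rater B scores every candidate exactly one point above rater A:
# correlation is 1.0, yet the two never agree exactly, so Kappa is negative.
rater_a = [1, 2, 2, 3, 3, 4]
rater_b = [2, 3, 3, 4, 4, 5]
```

This is why the paper's conclusion treats correlation and Kappa as complementary rather than interchangeable indices.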