The application of IRT polytomous models in scoring high-stakes tests (Case study: the lawyer's license test)

Document Type: Original Article

Authors

1 Ph.D. student, Faculty of Psychology and Education, Allameh Tabataba'i University, Tehran, Iran

2 Associate Professor, Department of Educational Measurement, Allameh Tabataba'i University, Tehran, Iran

3 Professor, Department of Educational Measurement, Allameh Tabataba'i University, Tehran, Iran

10.22034/emes.2023.563268.2426

Abstract

Objective: The aim of this study was to compare the accuracy and measurement error of dichotomous and polytomous IRT models in scoring high-stakes, large-scale ability tests.
Methods: The statistical population comprised all participants in the lawyer's license external examinations of 2016 and 2018, from each of which 5,000 persons were selected by random sampling. The data were these examinees' responses to the exam; accordingly, the research method is experimental.
Results: The analysis showed that, among the dichotomous logistic IRT models, the three-parameter model, and among the nominal polytomous models studied, the three-parameter model, provided better fit and more information than the other models for the data under study.
Conclusion: Given the more favorable fit and the greater information of the three-parameter dichotomous model and the three-parameter polytomous model relative to the other models, using these models for scoring can increase measurement accuracy and reduce error. Their use also supports the fairness of the selection process for applicants to the lawyer's license exam.
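As a rough illustration of the kind of comparison reported above, the sketch below fits dichotomous logistic models (1PL, 2PL, 3PL) to key-scored multiple-choice responses and a nominal polytomous model to the raw option choices, then compares fit indices, test information, and EAP ability estimates. This is not the authors' actual analysis: the use of the R package mirt, the simulated data, and all item and sample sizes are illustrative assumptions.

library(mirt)
set.seed(1)

## Simulate 4-option multiple-choice responses from Bock's nominal response model
## (purely illustrative; in practice these would be the examinees' recorded option choices)
n_person <- 2000; n_item <- 20; n_cat <- 4
theta <- rnorm(n_person)
a_slopes <- t(replicate(n_item, sort(runif(n_cat, -1, 1.5))))  # largest slope = keyed option
c_ints   <- t(replicate(n_item, runif(n_cat, -1, 1)))
resp <- matrix(NA_integer_, n_person, n_item,
               dimnames = list(NULL, paste0("Item", 1:n_item)))
for (i in 1:n_item) {
  z <- exp(outer(theta, a_slopes[i, ]) + matrix(c_ints[i, ], n_person, n_cat, byrow = TRUE))
  resp[, i] <- apply(z / rowSums(z), 1, function(p) sample(n_cat, 1, prob = p))
}

## Dichotomous (0/1) scoring of the option choices against the key
key    <- apply(a_slopes, 1, which.max)
scored <- key2binary(resp, key)

## Dichotomous logistic models, compared among themselves (nested models)
fit_1pl <- mirt(scored, 1, itemtype = "Rasch", verbose = FALSE)
fit_2pl <- mirt(scored, 1, itemtype = "2PL",   verbose = FALSE)
fit_3pl <- mirt(scored, 1, itemtype = "3PL",   verbose = FALSE)
anova(fit_1pl, fit_2pl)   # likelihood ratio test, AIC, BIC
anova(fit_2pl, fit_3pl)

## Nominal polytomous model fitted to the raw option choices
fit_nom <- mirt(resp, 1, itemtype = "nominal", verbose = FALSE)

## Test information and EAP ability estimates of the retained models
grid <- matrix(seq(-3, 3, length.out = 61))
info <- data.frame(theta        = grid[, 1],
                   info_3PL     = testinfo(fit_3pl, grid),
                   info_nominal = testinfo(fit_nom, grid))
eap  <- data.frame(eap_3PL     = fscores(fit_3pl, method = "EAP")[, 1],
                   eap_nominal = fscores(fit_nom, method = "EAP")[, 1])
head(info); cor(eap)

In a real application the raw option matrix would be the examinees' recorded answers and the key would come from the test form. The within-family anova() comparisons report the fit statistics used to choose the better model in each family, while testinfo() shows where on the ability scale each retained model measures most precisely and with the least error.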
