Application of polytomous IRT models in scoring high-stakes tests (case study: the lawyer's license exam)

Article type: Research article

Authors

1 PhD in Assessment and Measurement, Faculty of Psychology and Educational Sciences, Allameh Tabataba'i University, Tehran, Iran

2 Associate Professor, Department of Assessment and Measurement, Faculty of Psychology and Educational Sciences, Allameh Tabataba'i University, Tehran, Iran

3 Professor, Department of Assessment and Measurement, Faculty of Psychology and Educational Sciences, Allameh Tabataba'i University, Tehran, Iran

10.22034/emes.2023.563268.2426

Abstract

Objective: The aim of the present study was to compare the measurement accuracy and error of dichotomous and polytomous IRT models in scoring high-stakes ability tests.
Methods: The research population comprised all participants in the national lawyer's license exams of 1396 and 1398 (Iranian calendar), from which 5,000 examinees from the 1396 administration and 5,000 from the 1398 administration were selected by simple random sampling. Data were collected from the participants' item responses. The independent variable was the scoring method and model, and the dependent variable was the model's fit and information (precision); accordingly, the research method is experimental.
Results: Analysis of the findings showed that, among the dichotomous logistic IRT models, the 3-parameter model, and among the nominal polytomous models studied, the 3-parameter model, showed better fit and provided more information on the data under study than the other models.
Conclusion: Given the more favorable fit and information of the 3-parameter dichotomous model and the 3-parameter nominal polytomous model compared with the other models, using these models for scoring can increase measurement accuracy, reduce error, and help make the selection process for lawyer's license applicants fairer.
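For reference, the standard forms of the dichotomous 3-parameter logistic model and of Bock's nominal categories model (the family from which the nominal polytomous models studied here are drawn), together with the item information and conditional standard error that quantify scoring precision, are sketched below. The exact parameterization used in the study is not stated in the abstract, so this is a reference sketch rather than the study's specification.

```latex
% Dichotomous 3PL model: probability of a correct response to item i
P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + e^{-a_i(\theta - b_i)}}

% Bock's nominal categories model: probability of selecting option k of item i
P_{ik}(\theta) = \frac{\exp(a_{ik}\theta + c_{ik})}{\sum_{j=1}^{m_i} \exp(a_{ij}\theta + c_{ij})}

% Item information under the 3PL, and the conditional standard error of the
% ability estimate, obtained from the test information (the sum over items)
I_i(\theta) = a_i^2\,\frac{\bigl(P_i(\theta) - c_i\bigr)^2}{(1 - c_i)^2}\,
              \frac{1 - P_i(\theta)}{P_i(\theta)},
\qquad
\mathrm{SE}(\hat\theta) = \frac{1}{\sqrt{\sum_i I_i(\theta)}}
```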

Article title [English]

The application of polytomous IRT models in scoring high-stakes tests (case study: the lawyer's license exam)

Authors [English]

  • Reza Payravi 1
  • Mohammadreza Falsafinejad 2
  • Asghar Minaei 2
  • Ali Delavar 3
  • Ali Farrokhi 3
1 Ph.D. student, Faculty of Psychology and Education, Allameh Tabataba'i University, Tehran, Iran
2 Associate Professor, Department of Educational Measurement, Allameh Tabataba'i University, Tehran, Iran
3 Professor, Department of Educational Measurement, Allameh Tabataba'i University, Tehran, Iran
Abstract [English]

Objective: The aim of this study was to compare the measurement accuracy and error of dichotomous and polytomous IRT models in scoring high-stakes, large-scale ability tests.
Methods: The statistical population comprised all participants in the national lawyer's license exams of 2016 and 2018, from which 5,000 examinees from each administration were selected by simple random sampling. Data were collected from the participants' responses to these exams. The independent variable was the scoring method and model, and the dependent variable was model fit and information (precision); accordingly, the research method is experimental.
Results: The analysis showed that, among the dichotomous logistic IRT models, the 3-parameter model, and among the nominal polytomous models studied, the 3-parameter model, provided better fit and more information on the data under study than the other models.
Conclusion: Considering the more favorable fit and information of the 3-parameter dichotomous model and the 3-parameter nominal polytomous model compared with the other models, using these models in scoring can increase measurement accuracy and reduce error. It can also help make the selection process for lawyer's license applicants fairer.
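As an illustration of the two criteria used to compare the models above (fit and information), the following minimal sketch computes the 3PL test information function and the corresponding conditional standard error over a grid of ability values, and compares two competing models with AIC and BIC computed from their log-likelihoods. The item parameters, parameter counts, and log-likelihood values are hypothetical and are not taken from the study's data.

```python
import numpy as np

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response to each item at ability theta."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

def item_info_3pl(theta, a, b, c):
    """Item information under the 3PL model."""
    p = p_3pl(theta, a, b, c)
    return (a ** 2) * ((p - c) ** 2 / (1.0 - c) ** 2) * ((1.0 - p) / p)

def test_info_and_se(theta_grid, a, b, c):
    """Test information (sum of item informations) and conditional SE over a theta grid."""
    info = np.array([item_info_3pl(t, a, b, c).sum() for t in theta_grid])
    return info, 1.0 / np.sqrt(info)

def aic_bic(log_lik, n_params, n_obs):
    """Information criteria commonly used to compare competing IRT models."""
    aic = -2.0 * log_lik + 2.0 * n_params
    bic = -2.0 * log_lik + n_params * np.log(n_obs)
    return aic, bic

if __name__ == "__main__":
    # Hypothetical parameters for a short 5-item test (illustration only).
    a = np.array([1.2, 0.8, 1.5, 1.0, 0.9])       # discrimination
    b = np.array([-1.0, -0.3, 0.0, 0.6, 1.2])     # difficulty
    c = np.array([0.20, 0.25, 0.20, 0.25, 0.20])  # pseudo-guessing

    theta_grid = np.linspace(-3.0, 3.0, 13)
    info, se = test_info_and_se(theta_grid, a, b, c)
    for t, i, s in zip(theta_grid, info, se):
        print(f"theta = {t:+.2f}   information = {i:6.3f}   SE = {s:.3f}")

    # Hypothetical log-likelihoods of a 2PL and a 3PL fit to the same responses
    # (100 items, 5,000 examinees); the preferred model has the smaller AIC/BIC.
    print("2PL:", aic_bic(log_lik=-61250.0, n_params=2 * 100, n_obs=5000))
    print("3PL:", aic_bic(log_lik=-60980.0, n_params=3 * 100, n_obs=5000))
```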

Keywords [English]

  • IRT scoring
  • Dichotomous models
  • IRT nominal polytomous models
  • Fairness of assessment
