Designing Optimal Item Pools for Computerized Adaptive Testing with Consideration of Test Security

Article type: Research article

Authors

1 PhD student in Assessment and Measurement, Allameh Tabataba'i University

2 Associate Professor, Faculty of Psychology and Educational Sciences, Allameh Tabataba'i University

3 Professor, Faculty of Psychology and Educational Sciences, Allameh Tabataba'i University

4 Assistant Professor, National Organization of Educational Testing

Abstract

Computerized adaptive testing requires an item pool that is well designed and contains a suitable number of items for assembling individualized tests. The pool should also contain items that are balanced in content, and it should reduce the cost of test construction. One method for designing item pools is Reckase's method, which uses Monte Carlo simulation to determine the characteristics of an optimal item pool. In this study, the method is used to design pools calibrated with the three-parameter logistic model. The Sympson-Hetter procedure is used to control item exposure rates, and three approaches are used to simulate test items: random (R), mixed random and prediction (MRP), and minimum test information (MTI). The performance of the simulated and operational item pools is compared against a set of evaluation criteria. The results show that the MRP pool offers higher security and measurement accuracy than the other optimal pools and contains many items with high discrimination. Its test overlap rate is lower, and it has only a small percentage of over-exposed and under-exposed items. By contrast, the MTI pool is smaller than the other optimal pools and contains items with lower discrimination. Overall, when exposure control is taken into account, the optimal pools perform better than the operational pool.

Article title [English]

Designing Optimal Item Pools for Computerized Adaptive Testing by Considering the Test Security

Authors [English]

  • Maryam Moghadasin 1
  • Mohammad Reza Falsafinejad 2
  • Ali Delavar 3
  • Ehsan Jamali 4
  • Noor Ali Farokhi 2
Abstract [English]

Computerized adaptive testing requires a well-designed item pool containing an appropriate number of items to build individualized tests. An optimal item pool should also contain content-balanced items that achieve optimal item usage and keep the cost of item creation low. One method for designing the blueprint of an item pool is Reckase's method, which uses Monte Carlo simulation to determine the properties of an optimal item pool. This study applies Reckase's method to design item pools calibrated with the three-parameter logistic model. The Sympson-Hetter procedure is used to control the item exposure rate, and three approaches are used to simulate test items: random (R), mixed random and prediction (MRP), and minimum test information (MTI). The performance of the simulated item pools is compared with that of the operational item pool on a set of evaluation criteria. The results suggest that the optimal pool designed with MRP performs best in terms of test security and measurement accuracy, although it requires more items, including some highly discriminating ones. Tests assembled from MRP pools have smaller test-retest overlap rates and significantly lower percentages of over- and under-exposed items. The MTI design generally leads to smaller pools that contain items with lower a-parameters. Overall, when the Sympson-Hetter procedure is applied, the optimally designed item pools perform better than the operational pool obtained from an operational CAT.
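
As a rough illustration of two ingredients named in the abstract, the sketch below computes the response probability and item information under the three-parameter logistic model and applies a Sympson-Hetter style filter when choosing the next item. It is a minimal sketch and not the authors' implementation: the function names, the `(a, b, c)` tuple layout, the scaling constant `D = 1.7`, and the assumption that the exposure-control parameters `k` are already calibrated are all illustrative choices.

```python
import math
import random


def p_3pl(theta, a, b, c, D=1.7):
    """Probability of a correct response under the 3PL model.

    D = 1.7 is an assumed scaling constant, not taken from the article.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def info_3pl(theta, a, b, c, D=1.7):
    """Fisher information of a 3PL item at ability level theta."""
    p = p_3pl(theta, a, b, c, D)
    return (D * a) ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def select_item_sh(theta, pool, k, used, rng):
    """Select the next item with a Sympson-Hetter style exposure filter.

    pool: list of hypothetical (a, b, c) item-parameter tuples.
    k:    assumed pre-calibrated exposure-control parameters, one per item.
    used: set of item indices already selected for this examinee.
    """
    # Rank the items not yet considered by information at the current theta.
    candidates = sorted(
        (i for i in range(len(pool)) if i not in used),
        key=lambda i: info_3pl(theta, *pool[i]),
        reverse=True,
    )
    for i in candidates:
        used.add(i)  # a selected item is set aside for this examinee
        if rng.random() < k[i]:  # administer with probability k[i]
            return i
    return None  # every remaining item was filtered out


# Usage: item 1 is the most informative at theta = 0, but k[1] = 0 blocks it,
# so the next most informative item (index 0) is administered instead.
pool = [(1.0, 0.0, 0.2), (2.0, 0.0, 0.2), (1.0, 3.0, 0.2)]
chosen = select_item_sh(0.0, pool, k=[1.0, 0.0, 1.0], used=set(), rng=random.Random(0))
```

Setting an item's `k` value below 1 lowers how often it is administered once selected, which is how this kind of filter limits over-exposure of highly discriminating items.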

Keywords [English]

  • Computerized adaptive testing
  • optimal item pool
  • Monte Carlo method of Reckase
  • control of item exposure rate and Sympson-Hetter procedure

References

  1. Ariel, A., Veldkamp, B. P. & van der Linden, W. J. (2004). Constructing rotating item pools for constrained adaptive testing. Journal of Educational Measurement, 41, 345-360.
  2. Bergstrom, B. A. & Lunz, M. E. (1999). CAT for certification and licensure. In F. Drasgow & J. Olson-Buchanan (Eds.), Innovations in Computerized Assessment (pp. 67-91). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
  3. Belov, D. I. & Armstrong, R. D. (2009). Direct and inverse problems of item pool design for computerized adaptive testing. Educational and Psychological Measurement, 69 (4), 533-547.
  4. Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee's ability. In F. M. Lord & M. R. Novick, Statistical theories of mental test scores (pp. 397-479). Reading, MA: Addison-Wesley.
  5. Chang, H. (2007). Book review: Linear models for optimal test design. Psychometrika, 72, 279-281.
  6. Chang, H. H., & Ying, Z. (1999). Alpha-stratified multistage computerized adaptive testing. Applied Psychological Measurement, 23, 211-222.
  7. Chang, H. H., & van der Linden, W. J. (2003). Optimal stratification of item pools in a-stratified computerized adaptive testing. Applied Psychological Measurement, 27, 262-274.
  8. Cheng, Y., & Chang, H. (2009). The maximum priority index method for severely constrained item selection in computerized adaptive testing. British Journal of Mathematical and Statistical Psychology, 62, 369-383.
  9. Chen, S. Y., Ankenmann, R. D. & Spray, J. A. (1999). Exploring the relationship between item exposure rate and test overlap rate in computerized adaptive testing (No. ACT-RR-99-5). Iowa City, IA: American College Testing Program.
  10. De Ayala, R.J. (2009). The theory and practice of item response theory. New York: Guilford Press.
  11. Gu, L. (2007). Designing optimal item pools for computerized adaptive tests with exposure controls. Unpublished doctoral dissertation. Michigan State University.
  12. Gu, L. & Reckase, M. D. (2007). Designing optimal item pools for computerized adaptive tests with Sympson-Hetter exposure control. Paper Presented at the 2007 GMAC Conference on Computerized Adaptive Testing, Minneapolis, MN.
  13. MATLAB User's Guide. (2014). The MathWorks, Inc. Natick, MA, 5, 333.
  14. Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park CA: Sage.
  15. Hau, K. T., & Chang, H. H. (2001). Item selection in computerized adaptive testing: Should more discriminating items be used first? Journal of Educational Measurement, 38 (3), 249-266.
  16. He, W., & Reckase, M. (2010). Optimal item pool design for a highly constrained computerized adaptive test. Unpublished doctoral dissertation. Michigan State University.
  17. He, W., & Reckase, M. (2011). Optimal item pool design for a highly constrained computerized adaptive test. Paper presented at the National Council on Measurement in Education, Denver, CO.
  18. Jensema, C. J. (1977). Bayesian tailored testing and the influence of item bank characteristics. Applied Psychological Measurement, 1, 111-120.
  19. Lord, F. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum.
  20. McBride, J. R. & Weiss, D. J. (1976). Some properties of a Bayesian adaptive ability testing strategy (Research Rep No. 76-1). Minneapolis, MN: Psychometric Methods Program, Department of Psychology.
  21. Millman, J. & Arter, J. A. (1984). Issues in item banking. Journal of Educational Measurement, 21, 315-330.
  22. Owen, R. J. (1975). A Bayesian sequential procedure for quantal response in the context of adaptive mental testing. Journal of the American Statistical Association, 70, 351-356.
  23. Parshall, C., Davey, T., & Nering, M. (1998). Test development exposure control for adaptive tests. Paper presented at the Annual Meeting of the National Council on Measurement in Education, San Diego CA.
  24. Reckase, M.D. (1989). Adaptive testing: The evolution of a good idea. Educational Measurement: Issues and Practice, 8 (3), 11-15.
  25. Reckase, M. D. (2001). Item pool design for computerized adaptive tests. Invited small group session at the 6th Conference of the European Association of Psychological Assessment, Aachen, Germany.
  26. Reckase, M. D. (2003). Item pool design for computerized adaptive tests. Paper presented at the National Council on Measurement in Education, Chicago, IL.
  27. Reckase, M. D. (2007). The design of p-optimal item bank for computerized adaptive tests. In D. J. Weiss (Ed.). Proceedings of the 2007 GMAC Conference on Computerized Adaptive Testing.
  28. Reckase, M. D. (2009). Optimal Item Pool Design for the 2009 NCLEX Exam. A Report Submitted to National Council of State Boards of Nursing, March 2009.
  29. Reckase, M. D. (2010). Designing Item Pools to Optimize the Functioning of Computerized Adaptive Test. Psychological Test and Assessment Modeling, 52 (2), 127-141.
  30. Reckase, M. D. & He, W. (2004). The ideal item pool for the NCLEX-RN examination--Report to NCSBN: Michigan State University, East Lansing, MI.
  31. Reckase, M. D. & He, W. (2005). Ideal item pool design for the NCLEX-RN exam. Michigan State University, East Lansing, MI.
  32. Reckase, M. D. & He, W. (2008). The impact of item disclosure (compromise) on the probability of passing of the NCLEX-RN exam--report to the National Council of State Boards of Nursing (NCSBN): Michigan State University.
  33. Reckase, M. D., & He, W. (2009a). Optimal item pool design for the 2009 NCLEX Exam--report to the National Council of State Boards of Nursing (NCSBN): Michigan State University.
  34. Reckase, M. D., & He, W. (2009b). The influence of item pool quality on the functioning of computerized adaptive tests. Paper presented at the annual meeting of Psychometric Society, Cambridge, U.K.
  35. Robin, F., van der Linden, W. J., Eignor, D. R., Steffen, M. & Stocking, M. L. (2005). A comparison of two procedures for constrained adaptive test construction (ETS Research Rep No. RR-04-39). Princeton, NJ: Educational Testing Service.
  36. Stocking, M. L. & Swanson, L. (1998). Optimal design of item pools for computerized adaptive tests. Applied Psychological Measurement, 22, 271-279.
  37. Stocking, M. L. & Lewis, C. (2000). Methods of controlling the exposure of items in CAT. In W. J. van der Linden & C. A. W. Glas (Eds.), Computerized adaptive testing: Theory and practice (pp. 163-182). Netherlands: Kluwer Academic Publishers.
  38. Sympson, J. B. & Hetter, R. D. (1985). Controlling item-exposure rates in computerized adaptive testing. Proceedings of the 27th annual meeting of the Military Testing Association (pp. 973-977). San Diego, CA: Navy Personnel Research and Development Center.
  39. Urry, V. W. (1977). Tailored testing: A successful application of latent trait theory. Journal of Educational Measurement, 14, 181-196.
  40. Van der Linden, W. J. (2000a). Constrained adaptive testing with shadow tests. In W. J. van der Linden, & C. A. W. Glas (Eds.), Computerized adaptive testing: Theory and practice (pp. 27–52). Boston: Kluwer Academic Publishers.
  41. Van der Linden, W. J. (2000b). Optimal assembly of tests with item sets. Applied Psychological Measurement, 24, 225–240.
  42. Van der Linden, W. J. (2005a). A comparison of item-selection methods for adaptive tests with content constraints. Journal of Educational Measurement, 42, 283-302.