Equating and Linking Scores in National Exams

Document Type: Original Article

Authors

1 Research Expert of National Organization of Educational Testing (NOET), Iran

2 Associate Professor, Faculty of Psychology and Education, University of Tehran, Tehran, Iran

10.22034/emes.2024.561619.2412

Abstract

Objective: Under a resolution of the Supreme Council of Cultural Revolution, the National Organization of Educational Testing is to hold a national exam twice a year, with each test score remaining valid for two years. This research introduces and implements appropriate equating methods within the framework of classical test theory, so that test scores from different administrations can be compared and applicants selected fairly.
Methods: Using a common-item nonequivalent groups design, three differential calculus test forms (X, Y and Z) were administered experimentally to groups of 600, 1111 and 2200 examinees, respectively. The forms contained 21, 21 and 20 items, with six common items in total. Data were analyzed in R with the equate package (Albano & Albano, 2018). The equipercentile function was used to equate the test forms; log-linear presmoothing was applied so that parameters were estimated with smaller standard errors and scores with greater accuracy.
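The equipercentile idea named in the Methods can be illustrated with a minimal Python sketch. It is not the study's procedure: the study used the R package equate with a common-item design and log-linear presmoothing, whereas this sketch assumes two directly comparable groups, no anchor adjustment and no presmoothing. The function names and the toy frequency tables are hypothetical, and every score is assumed to have a nonzero frequency (a condition that presmoothing would normally help to meet).

```python
from bisect import bisect_right


def cumulative_proportions(freqs):
    """Cumulative proportion of examinees at or below each integer score."""
    total = sum(freqs)
    cum, running = [], 0
    for f in freqs:
        running += f
        cum.append(running / total)
    return cum


def percentile_ranks(freqs):
    """Proportion below each score plus half the proportion at that score."""
    total = sum(freqs)
    ranks, below = [], 0
    for f in freqs:
        ranks.append((below + f / 2) / total)
        below += f
    return ranks


def equipercentile(freq_x, freq_y):
    """Map each integer score on form X to the form Y scale.

    freq_x[i] and freq_y[i] are the numbers of examinees scoring i.
    Assumption: both groups are treated as equivalent (no anchor items).
    """
    cum_y = cumulative_proportions(freq_y)
    equated = []
    for p in percentile_ranks(freq_x):
        # Smallest Y score whose cumulative proportion exceeds p.
        y_upper = min(bisect_right(cum_y, p), len(freq_y) - 1)
        g_below = cum_y[y_upper - 1] if y_upper > 0 else 0.0
        step = cum_y[y_upper] - g_below
        # Linear interpolation inside the interval [y_upper - 0.5, y_upper + 0.5].
        frac = (p - g_below) / step if step > 0 else 0.5
        equated.append(y_upper - 0.5 + frac)
    return equated


if __name__ == "__main__":
    # Hypothetical toy data: a 4-point form X and a 2-point form Y.
    print(equipercentile([1, 1, 1, 1], [2, 2]))
```

Equating a form to itself returns the original scores, which is a quick sanity check on any implementation of this kind.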
Results: The average difficulty of the three test forms was 9.03, 7.61 and 7.79, respectively. The score distributions also differed in skewness: 0.203, 0.518 and 0.392, respectively.
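As a rough illustration of how summary statistics of this kind can be obtained, the following Python sketch computes the mean and the moment-based skewness of a list of raw scores. The abstract does not state which skewness estimator was used, so the population (moment) form here is an assumption, and the function name and sample scores are hypothetical.

```python
def mean_and_skewness(scores):
    """Return (mean, skewness) using the population moment estimator m3 / m2**1.5."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((s - mean) ** 2 for s in scores) / n  # second central moment
    m3 = sum((s - mean) ** 3 for s in scores) / n  # third central moment
    skew = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    return mean, skew


if __name__ == "__main__":
    # Hypothetical raw scores; a positive result indicates a right-skewed distribution.
    print(mean_and_skewness([0, 0, 0, 4]))
```

A positive skewness means the long tail lies on the high-score side, which is the pattern reported for all three forms.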
Conclusion: Because the test forms differ in difficulty and their score distributions are right-skewed, it is suggested that they be equated with the equipercentile function in a nonequivalent anchor test design. For tests for which common items cannot be developed, it is recommended that scores be linked using the same equipercentile function.

References

Albano, A., & Albano, M. A. (2018). Package ‘equate’. Available at: https://cran.r-project.org/web/packages/equate/index.html
Albano, A. D. (2016). Equate: An R package for observed-score linking and equating. Journal of Statistical Software, 74, 1-36.
Angoff, W. H. (1984). Scales, norms, and equivalent scores. Educational Testing Service.
Angoff, W. H. (1971). Scales, norms, and equivalent scores. In R. L. Thorndike (Ed.), Educational measurement.
Bahmanabadi, S., Falsafinejad, M., Delavar, A., Farrokhi, N., & Minaei, A. (2020). Identification of Optimal Equating Method in Multidimensional Tests. Educational Measurement and Evaluation Studies, 10(30), 217-264. doi: 10.22034/emes.2020.44489
Brennan, R. L. (2001). Generalizability Theory. Iowa Testing Programs, University of Iowa. Springer-Verlag, New York.
Chen, F., Huang, H., & MacGregor, D. (2009). Equating or linking: Basic concepts and a case study. Originally presented at CAL, Washington. Available at: https://faculty.ecnu.edu.cn/picture/article/220/0c/13/03357e474db0b2d5de11abaef0fb/793ecb9d-fe0b-4ff5-bd56-78148d7d4210.pdf.x
Dorans, N. J., Moses, T. P., & Eignor, D. R. (2010). Principles and practices of test score equating. ETS Research Report Series, 2010(2), i-41.
Heh, V. K. (2007). Equating accuracy using small samples in the random groups design (Doctoral dissertation, Ohio University). Available at: https://etd.ohiolink.edu/apexprod/rws_etd/send_file/send?accession=ohiou1178299995&disposition=inline
Hendrickson, A. B., & Kolen, M. J. (2001). IRT Equating of the MCAT. MCAT Monograph.
Kim, S. H., & Cohen, A. S. (1998). A comparison of linking and concurrent calibration under item response theory. Applied Psychological Measurement, 22(2), 131-143.
Levine, R. (1955). Equating the score scales of alternative forms administered to samples of different ability (ETS Research Bulletin No. 55-23). Princeton, NJ: Educational Testing Service. Available at: https://onlinelibrary.wiley.com/doi/epdf/10.1002/j.2333-8504.1955.tb00266.x
Liang, Z., Zhang, M., Huang, F., Kang, D., & Xu, L. (2021). Application Innovation of Educational Measurement Theory, Method, and Technology in China’s New College Entrance Examination Reform. Chinese/English Journal of Educational Measurement and Evaluation, 2(1), 3.
Livingston, S. A. (2014). Equating test scores (without IRT). Educational Testing Service.
Liu, J., & Low, A. C. (2007). An exploration of kernel equating using SAT® data: Equating to a similar population and to a distant population. ETS Research Report Series, 2007(1), i-22.
MoghadamZade, A. (2015). Optimal Smoothing Method of Data in Test Equating: The Case of TOLIMO and Comprehensive Trial Tests of Iran Educational Testing Organization. Quarterly of Educational Measurement, 6(21), 261-287. doi: 10.22054/jem.2015.5736
Muraki, E., Hombo, C. M., & Lee, Y. W. (2000). Equating and linking of performance assessments. Applied Psychological Measurement, 24(4), 325-337. Available at: https://www.researchgate.net/publication/247742704_Equating_and_Linking_of_Performance_Assessments
Parsaeian, M., NaghiZadeh, S., & Naderi, H. (2018). Selection the best Method of Equating Using Anchor-Test Design in Item Response Theory. Andishe-ye-Amari. Available at: http://andisheyeamari.irstat.ir/article-۱-۵۰۴-fa.html
Rezvanifar, S., Falsafinejad, M., & Delavar, A. (2016). Equating methods. Quarterly of Educational Measurement, 7(26), 1-33. doi: 10.22054/jem.2017.2737.1085
Ryan, J., & Brockmann, F. (2009). A Practitioner's Introduction to Equating with Primers on Classical Test Theory and Item Response Theory. Council of Chief State School Officers.
Schultz, D. P., & Schultz, S. E. (1996). A History of Modern Psychology [1969]. Translated by: Saif, A., Sharifi, H. P., Ali Abadi, K. & Najafi Zand, J. (2005). Dowran publication
Seufert, B. (2012). When, why, and how the business analyst should use linear regression. Available at: https://mobiledevmemo.com/when-why-and-how-you-should-use-linear-regression/
Shea, J. A., & Norcini, J. J. (1995). Equating. Licensure Testing: Purposes, Procedures, and Practices. Edited by Impara JC. Lincoln, NE, Buros Center for Testing, 253-287.
Supreme Council of Cultural Revolution (2021). Policies and criteria for organizing assessment and acceptance of applicants for admission to higher education. Resolution no. 3217. Available at: https://sccr.ir/pro/3217/
Swaminathan, H. (n.d.). Linking and equating of test scores. University of Connecticut. Available at: https://slidetodoc.com/linking-and-equating-of-test-scores-hariharan-swaminathan/
Tian, F. (2011). A comparison of equating/linking using the Stocking-Lord method and concurrent calibration with mixed-format tests in the non-equivalent groups common-item design under IRT. Unpublished doctoral dissertation, Boston College.
von Davier, M., & von Davier, A. A. (2004). A unified approach to IRT scale linking and scale transformations. ETS Research Report Series, 2004(1), i-21.
Zolfagharnasab, S., Khodaei, E., & Yadegarzadeh, G. (2013). Optimum Weighting to Entrance Subtests and Their Items to Make Composite Score. Educational Measurement and Evaluation Studies, 3(4), 79-104.