Using a Multi-Attribute Decision-Making Method to Rank Composite Score Construction Methods

Jahanifar, Mojtaba

doi:10.22034/emes.2022.550667.2366

Article type: Research article

Author

Mojtaba Jahanifar

Assistant Professor, Department of Educational Sciences, Shahid Chamran University of Ahvaz, Ahvaz, Iran

Abstract

Objective: Admission decisions in tests are mostly based on the score earned on the test. A test may consist of several subtests with different content; such a test is called a composite test (test battery), and the resulting score is called the composite score. Different methods of constructing the composite score change the admission decisions made about examinees. This study was conducted to rank the methods used to construct composite scores.
Methods: A random sample of 10,000 examinees from the national university entrance exam, across seven subtests, was used to rank six composite score construction methods. Raw scores were obtained by summing correct responses, and normalization and arcsine transformations were used to convert raw scores into scale scores. Nominal, effective, and Shannon weighting schemes were used to construct the composite score. To rank the composite score construction methods by their conditional standard error of measurement (CSEM), an approach based on multi-attribute decision making (MADM) was used.
Results: The composite score construction methods that use the arcsine scale together with nominal or Shannon weighting schemes ranked higher, and using them to construct composite scores produces less error.
Conclusion: Using the arcsine scale score, because of its smaller error and simpler conversion, can improve the interpretability and precision of composite test scores. Moreover, different weighting methods had little effect on score precision and can be chosen according to the test conditions and the test developer's judgment.
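As an illustration of two of the building blocks named in the abstract, the sketch below shows an arcsine transformation of number-correct scores and Shannon (entropy) weights for combining subtests. It is a minimal sketch under stated assumptions, not the author's procedure: the abstract does not say which arcsine variant was used, so the Freeman-Tukey form is assumed here, and the function names and the small score table are hypothetical.

```python
import math


def arcsine_transform(raw_score, n_items):
    """Freeman-Tukey arcsine transformation of a number-correct score.

    Assumption: the abstract only says "arcsine"; this specific variant
    is chosen for illustration.
    """
    if not 0 <= raw_score <= n_items:
        raise ValueError("raw_score must lie between 0 and n_items")
    lower = math.asin(math.sqrt(raw_score / (n_items + 1)))
    upper = math.asin(math.sqrt((raw_score + 1) / (n_items + 1)))
    return 0.5 * (lower + upper)


def shannon_weights(scores):
    """Entropy weights for the columns (subtests) of a score table.

    `scores` is a list of rows (examinees); all values must be
    non-negative and every column must have a positive sum.
    """
    n_rows = len(scores)
    n_cols = len(scores[0])
    divergences = []
    for j in range(n_cols):
        column = [row[j] for row in scores]
        total = sum(column)
        entropy = 0.0
        for value in column:
            p = value / total
            if p > 0:  # the term p * ln(p) is taken as 0 when p is 0
                entropy -= p * math.log(p)
        entropy /= math.log(n_rows)  # rescale entropy to the range [0, 1]
        divergences.append(1.0 - entropy)
    total_divergence = sum(divergences)
    if total_divergence == 0:
        # Every column is perfectly uniform: fall back to equal weights.
        return [1.0 / n_cols] * n_cols
    return [d / total_divergence for d in divergences]


def composite_scores(scores, weights):
    """Weighted sum of subtest scores for each examinee."""
    return [sum(w * s for w, s in zip(weights, row)) for row in scores]


# Hypothetical data: 4 examinees, 2 subtests of 20 items each.
raw = [[5, 10], [12, 11], [18, 9], [9, 10]]
scaled = [[arcsine_transform(x, 20) for x in row] for row in raw]
weights = shannon_weights(scaled)
print(weights)
print(composite_scores(scaled, weights))
```

In this scheme a subtest whose scores vary more across examinees receives a larger weight, whereas nominal weights are fixed in advance by the test developer.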

Article title [English]

Utilizing the Decision-Making Approach to Rank Composite Score Construction Methods

Author [English]

Mojtaba Jahanifar

Assistant Professor, Department of Education, Shahid Chamran University of Ahvaz, Ahvaz, Iran

Abstract [English]

Objective: Test batteries are commonly used for decision making in education, particularly for admission decisions. There are several methods for constructing composite scores, and each method affects admission decisions differently. Which method, then, produces the least error?
Methods: The present research ranks different methods of composite score construction based on their conditional standard error of measurement (CSEM). Data from a random sample of 10,000 participants in the Iranian university entrance exam were used to rank six composite score construction methods. Participants' raw scores were obtained by summing correct responses. Normalization and arcsine transformations were used to construct scale scores, and nominal, effective, and Shannon weighting schemes were used to combine subtest scale scores. To rank the composite score construction methods, a new approach based on multi-attribute decision making (MADM) was employed.
Results: The methods that use the arcsine transformation to construct scale scores and nominal or Shannon weighting schemes to combine subtest scale scores ranked highest, so using them leads to less error in admission decisions.
Conclusion: Using the arcsine scale score, because of its smaller error and easier conversion, can improve the interpretability and accuracy of composite test scores. Different weighting methods have little effect on the accuracy of scores and can be chosen according to the test conditions or the test developer's decision.
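To make the ranking step concrete, the sketch below ranks alternatives whose criteria are all costs, as CSEM values are (smaller is better). The abstract says only that an MADM-based approach was used; TOPSIS is swapped in here purely as one common MADM technique, and the method labels, CSEM values, and equal criterion weights are hypothetical.

```python
import math


def topsis_cost(matrix, weights):
    """TOPSIS closeness scores when every criterion is a cost.

    `matrix` holds one row per alternative and one column per criterion
    (for example, CSEM at several score levels). A higher closeness
    score means a better alternative.
    """
    n_cols = len(matrix[0])
    # Vector-normalize each column, then apply the criterion weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    weighted = [
        [weights[j] * row[j] / norms[j] if norms[j] else 0.0 for j in range(n_cols)]
        for row in matrix
    ]
    # For cost criteria the ideal point is the column minimum.
    ideal = [min(row[j] for row in weighted) for j in range(n_cols)]
    anti_ideal = [max(row[j] for row in weighted) for j in range(n_cols)]
    closeness = []
    for row in weighted:
        d_ideal = math.dist(row, ideal)
        d_anti = math.dist(row, anti_ideal)
        span = d_ideal + d_anti
        # If all alternatives coincide, no ranking is possible; use 0.5.
        closeness.append(d_anti / span if span else 0.5)
    return closeness


def rank_methods(names, closeness):
    """Names sorted from best (highest closeness) to worst."""
    ordered = sorted(zip(names, closeness), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ordered]


# Hypothetical CSEM values at three score levels for three methods.
names = ["method_A", "method_B", "method_C"]
csem = [
    [1.0, 2.0, 1.0],
    [2.0, 3.0, 2.0],
    [1.5, 2.5, 1.5],
]
equal_weights = [1 / 3, 1 / 3, 1 / 3]
scores = topsis_cost(csem, equal_weights)
print(rank_methods(names, scores))
```

Treating each score level as a separate criterion lets a method that is precise across the whole score range outrank one that is precise only near the middle.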

Keywords [English]

scale score; composite score; weighting scheme; CSEM; decision making

Volume 12, Issue 37
Farvardin 1401 (March-April 2022)
Pages 81-98
