The Analysis of Dimensionality, Testlet Effect, Differential Item Functioning, and Impact in Testlet-Based Tests

Article Type: Research Article

Authors

1 PhD student in Assessment and Measurement (Educational Assessment), Faculty of Psychology and Educational Sciences, University of Tehran, Iran

2 Associate Professor, Department of Curriculum and Instructional Methods, Faculty of Psychology and Educational Sciences, University of Tehran, Iran

3 Associate Professor, Department of Psychology, Faculty of Psychology and Educational Sciences, Kharazmi University, Iran

4 Assistant Professor, Department of Curriculum and Instructional Methods, Faculty of Psychology and Educational Sciences, University of Tehran, Iran

5 Faculty Member, Research Institute for Education, Organization for Educational Research and Planning, Iran

Abstract

This study examined the effect of the monolingualism or bilingualism of Iranian students on the dimensionality, local item dependence, impact, and differential item functioning of the reading passages of PIRLS 2011. Dimensionality was examined by comparing the unidimensional graded response model with the multidimensional bi-factor item response theory model; impact and local item dependence were examined using a two-level bi-factor model; and differential item functioning was examined with the multiple-group bi-factor model of Cai et al. (2011). The dimensionality results showed that the bi-factor model fitted the data better than the graded response model and that local item dependence among the items of the two literary passages caused deviation from unidimensionality, with the language difference explaining a large share of their variance. The impact results showed that the mean ability estimates of monolinguals were higher than those of bilinguals. The results also showed that items with uniform differential item functioning were more difficult for monolinguals than for bilinguals, whereas monolinguals performed better on multiple-choice items with non-uniform differential item functioning. Overall, the results indicated that the traits associated with the two passages are perceived differently by monolingual and bilingual students and that local item dependence is greater among bilinguals than among monolinguals. The results also pointed to a difference in the performance of monolingual and bilingual students on mixed-format items.

Article Title [English]

The Analysis of Dimensionality, Testlet Effect, Differential Item Functioning, and Impact in Testlet-Based Tests

Authors [English]

  • Mohammad Ahmadi Deh Qutbuddini 1
  • Ebrahim Khodai 2
  • Valiollah Farzad 3
  • Ali Moghadam Zadeh 4
  • Masoud Kabiri 5
Abstract [English]

The present study investigated the effect of the monolingualism or bilingualism of Iranian students on the dimensionality, local item dependence, differential item functioning, and impact of the items accompanying the passages of PIRLS (2011). Dimensionality was analyzed by comparing the unidimensional graded response model with the multidimensional bi-factor item response theory model. Next, local item dependence and impact were analyzed using a two-level bi-factor model and, finally, differential item functioning was examined using the multiple-group bi-factor model of Cai et al. (2011).
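As a point of reference for the model comparison, the two measurement models can be sketched in their standard forms (a minimal sketch with assumed notation; the exact parameterization used in the study is not stated in the abstract). Under the unidimensional graded response model, the probability that examinee $j$ scores in category $k$ or higher on item $i$ depends on a single reading trait $\theta_j$, whereas the bi-factor model adds a passage-specific (testlet) dimension $\gamma_{jd(i)}$ for the testlet $d(i)$ containing item $i$:

\[ P(X_{ij} \ge k \mid \theta_j) = \frac{1}{1 + \exp[-a_i(\theta_j - b_{ik})]} \qquad \text{(graded response model)} \]

\[ P(X_{ij} = 1 \mid \theta_j, \gamma_{jd(i)}) = \frac{1}{1 + \exp[-(a_{i0}\,\theta_j + a_{i1}\,\gamma_{jd(i)} + c_i)]} \qquad \text{(bi-factor model)} \]

When the testlet loadings $a_{i1}$ (equivalently, the testlet variances) are negligible, the bi-factor model collapses to the unidimensional case, which is why the model comparison also serves as a check on local item dependence within passages.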
The dimensionality results showed that the bi-factor model fitted the data better than the graded response model. Furthermore, local item dependence among the items of the two literary testlets caused deviation from unidimensionality, and the language difference explained a large share of its variance. The impact results showed that the mean ability estimates of monolinguals were higher than those of bilinguals. In addition, items with uniform differential item functioning were more difficult for monolinguals than for bilinguals, whereas monolinguals outperformed bilinguals on multiple-choice items with non-uniform differential item functioning.
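For readers less familiar with the distinction, uniform versus non-uniform differential item functioning can be expressed in the usual two-group terms (an illustrative sketch, not the study's exact specification). Writing $M$ for the monolingual group and $B$ for the bilingual group, an item shows uniform DIF when only its difficulty differs between groups, and non-uniform DIF when its discrimination differs as well, so the size, and possibly the direction, of the group difference changes along the trait continuum:

\[ \text{uniform DIF: } a_i^{(M)} = a_i^{(B)},\ b_i^{(M)} \ne b_i^{(B)} \qquad \text{non-uniform DIF: } a_i^{(M)} \ne a_i^{(B)} \]

Impact, by contrast, refers to a genuine difference in the latent trait distributions of the two groups rather than to item-level bias.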
Overall, the results showed that the traits associated with the two literary testlets were perceived differently by monolingual and bilingual students and that local item dependence was more evident among bilinguals than among monolinguals. The results also indicated a difference between the performance of monolingual and bilingual students on mixed-format items.

Keywords [English]

  • Graded response model
  • Bi-factor model
  • Dimensionality
  • Differential item functioning
  • Testlet-based tests

References
Alper Kose, I., & Demirtasli, N. C. (2012). Comparison of unidimensional and multidimensional models based on item response theory in terms of both variables of test length and sample size. Procedia - Social and Behavioral Sciences, 46, 135–140. https://doi.org/10.1016/j.sbspro.2012.05.082
Baker, C. (2011). Foundations of bilingual education and bilingualism (5th ed.). Bristol: Multilingual Matters.
Beretvas, S. N., & Walker, C. M. (2012). Distinguishing differential testlet functioning from differential bundle functioning using the multilevel measurement model. Educational and Psychological Measurement, 72(2), 200–223. https://doi.org/10.1177/0013164411412768
Beretvas, S. N., Cawthon, S. W., Lockhart, L. L., & Kaye, A. D. (2012). Assessing impact, DIF, and DFF in accommodated item scores: A comparison of multilevel measurement model parameterizations. Educational and Psychological Measurement, 72(5).
Bryk, A. S., & Raudenbush, S. W. (1992). Hierarchical linear models: Applications and data analysis methods—Advanced quantitative techniques in the social sciences. Newbury Park, CA: SAGE.
Cai, L., Yang, J. S., & Hansen, M. (2011). Generalized full-information item bifactor analysis. Psychological Methods, 16, 221–248. https://doi.org/10.1037/a0023350
Chen, T. T., & Fienberg, S. E. (1974). Two-dimensional contingency tables with both completely and partially cross-classified data. Biometrics, 30(4), 629–642.
DeMars, C. E. (2006). Application of the bi-factor multidimensional item response theory model to testlet-based tests. Journal of Educational Measurement, 43(2), 145–168. https://doi.org/10.1111/j.1745-3984.2006.00010.x
DeMars, C. E. (2013). A tutorial on interpreting bifactor model scores. International Journal of Testing, 13(4), 354–378. https://doi.org/10.1080/15305058.2013.799067
Duan, J. C., Härdle, W. K., & Gentle, J. E. (2012). Handbook of computational finance. Heidelberg: Springer.
Elosua Oliden, P., & Mujika Lizaso, J. (2014). Impact of family language and testing language on reading performance in a bilingual educational context. Psicothema, 26(3), 328–335. https://doi.org/10.7334/psicothema2013.344
Fukuhara, H. (2009). A differential item functioning model for testlet-based items using bi-factor multidimensional item response theory model: A Bayesian approach. Electronic Theses, Treatises and Dissertations, Florida State University Libraries.
Gibbons, R. D., & Hedeker, D. R. (1992). Full-information item bi-factor analysis. Psychometrika, 57(3), 423–436.
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park, CA: Sage Publications, Inc.
Holzmüller, H. H., Singh, J., & Nijssen, E. J. (2002). Multicentric cross-national research: A typology and illustration. Retrieved February 19, 2015, from http://www.wiso.tudortmund.de/wiso/m/Medienpool/Arbeitspapiere/Arbeitsbericht06.pdf
Kim, S., & Kolen, M. J. (2006). Robustness of format effects of IRT linking methods for mixed format tests. Applied Measurement in Education, 19(4), 357–381.
Lee, Y.-W. (2004). Examining passage-related local item dependence (LID) and measurement construct using Q3 statistics in an EFL reading comprehension test. Language Testing, 21(1), 74–100.
Ling Ping, H., & Islam, M. A. (2008). Analyzing incomplete categorical data: Revisiting maximum likelihood estimation (MLE) procedure. Journal of Modern Applied Statistical Methods, 7(2), 488–500. https://doi.org/10.22237/jmasm/1225512780
Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Erlbaum.
Md Desa, Z. N. D. (2012). Bi-factor multidimensional item response theory modeling for subscores estimation, reliability, and classification. Retrieved October 8, 2016, from https://kuscholarworks.ku.edu/bitstream/handle/1808/10126/MdDesa_ku_0099D_12360_DATA_1.pdf;sequence=1
Moore, D. White (2015). Unidimensional vertical scaling of mixed format tests in the presence of item format effect. Doctoral dissertation, University of Pittsburgh.
Morton, J. B. & Harper, S. N. (2007). What did Simon say? Revisiting the bilingual advantage. Developmental Science, 10, 719–726.
Mullis, I. V. S., Martin, M. O., Gonzalez, E. J., & Kennedy, A. M. (2003). PIRLS 2001 international report: IEA's study of reading literacy achievement in primary school in 35 countries. Retrieved October 23, 2017.
Mullis, I. V. S., Martin, M. O., Kennedy, A. M., & Foy, P. (2007). PIRLS 2006 international report: IEA's progress in international reading literacy study in primary schools in 40 countries. Retrieved October 23, 2017.
Mullis, I. V. S., Martin, M. O., Foy, P., & Drucker, K. T. (2012). PIRLS 2011 international results in reading. Retrieved October 23, 2017, from https://timssandpirls.bc.edu/pirls2011/downloads/P11_IR_FullBook.pdf
Rauch, D. P., & Hartig, J. (2010). Multiple-choice versus open-ended response formats of reading test items: A two-dimensional IRT analysis. Psychological Test and Assessment Modeling, 52(4), 354–379.
Ravand, H. (2015). Assessing testlet effect, impact, differential testlet, and item functioning using cross-classified multilevel measurement modeling. SAGE Open, 5, 1–9. https://doi.org/10.1177/2158244015585607
Rijmen, F. (2011). Hierarchical factor item response theory models for PIRLS: Capturing clustering effects at multiple levels. IERI Monograph Series: Issues and Methodologies in Large-Scale Assessments, 4, 59–74.
Sireci, S. G., Thissen, D., & Wainer, H. (1991). On the reliability of testlet-based tests. Journal of Educational Measurement, 28 (3), 237-247.
Smarter Balanced Assessment Consortium. (2015). 2013-2014 technical report: Validity, item and test development, pilot test and field test, achievement level setting. Retrieved February 18, 2015, from http://www.smarterbalanced.org/wpcontent/uploads/2015/08/201314_Technical_Report.pdf
Syahabuddin, K. (2013). Student English achievement, attitude and behaviour in bilingual and monolingual schools in Aceh, Indonesia. School of Education, Faculty of Education and Arts, Edith Cowan University, Perth, Western Australia.
Tao, W. (2008). Using the score-based testlet method to handle local item dependence. Electronic thesis and dissertation, Boston College.
Thissen, D., Steinberg, L., & Mooney, J. A. (1989). Trace lines for testlets: A use of multiple-categorical-response models. Journal of Educational Measurement, 26(3), 247–260.
Thurstone, L. L. (1925). A method of scaling psychological and educational tests. The Journal of Educational Psychology, 16 (7), 433-451.
van de Vijver, F. J. R., & Leung, K. (1997). Cross-cultural psychology series, Vol. 1. Methods and data analysis for cross-cultural research. Thousand Oaks, CA, US: Sage Publications, Inc.
Wainer, H. & Thissen, D. (1993). Combining multiple-choice and constructed-response test scores: Toward a Marxist theory of test construction. Applied Measurement in Education, 6 (2), 103-118.
Wang, X., Bradlow, E. T., & Wainer, H. (2002). A general Bayesian model for testlets: Theory and applications. Applied Psychological Measurement, 26(1), 109–128.
Wei, L. (2010). BAMFLA: Issues, methods and directions. International Journal of Bilingualism, 14(1), 3–9.
Yao, C. (2008). Mixed-format test equating: Effects of test dimension and common-item sets. Unpublished doctoral dissertation, University of Maryland, College Park.
Yen, W. M. (1993). Scaling performance assessments: Strategies for managing local item dependence. Journal of Educational Measurement, 30 (3), 187-213.
Zenisky, A. L., Hambleton, R. K., & Sireci, S. G. (2003). Effects of local item dependence on the validity of IRT item, test, and ability statistics. Retrieved October 8, 2016, from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.134.5458&rep=rep1&type=pdf
Zhang, B. (2010). Assessing the accuracy and consistency of language proficiency classification under competing measurement models. Language Testing, 27, 119–140. https://doi.org/10.1177/0265532209347363
Zhang, O., Shen, L., & Cannady, M. (2010, April). Polytomous IRT or testlet model: An evaluation of scoring models in small testlet size situations. Paper presented at the Annual Meeting of the 15th International Objective Measurement Workshop, Boulder, CO, USA.