Allen, M. J., & Yen, W. M. (1979). Introduction to measurement theory. Monterey, CA: Brooks/Cole.
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
Angoff, W. H. (1971). Scales, norms, and equivalent scores. In R. L. Thorndike (Ed.), Educational measurement (2nd ed., pp. 508-600). Washington, DC: American Council on Education. (Reprinted as W. H. Angoff, Scales, norms, and equivalent scores. Princeton, NJ: Educational Testing Service, 1984.)
Azar, A., & Rajabzade, A. (2015). Applied decision making: MADM approach. Tehran: Negah Danesh.
Brennan, R. L., & Lee, W.-C. (1999). Conditional scale-score standard errors of measurement under binomial and compound binomial assumptions. Educational and Psychological Measurement, 59(1), 5-24.
Brooks, G. P., & Johnson, G. A. (2014). TAP: Test Analysis Program [Computer software]. Chicago.
Chang, S. W. (2006). Methods in scaling the Basic Competence Test. Educational and Psychological Measurement, 66(6), 907-929.
Chang, S. W. (2009). Choice of weighting schemes in forming the composites. Bulletin of Educational Psychology, 40(3), 489-510. National Taiwan Normal University, Taipei, Taiwan, R.O.C.
De Boor, C. (2001). A practical guide to splines (Rev. ed., pp. 207-214). New York: Springer.
Dorans, N. J., Pommerich, M., & Holland, P. W. (2007). A framework and history for score linking. In P. W. Holland (Eds.), Linking and aligning scores and scales (pp. 5-30). New York: Springer.
Feldt, L. S. (2004). Estimating the reliability of a test battery composite or a test score based on weighted item scoring. Measurement and Evaluation in Counseling and Development, 37(3), 184-190.
Gronlund, N. E., & Linn, R. L. (1990). Measurement and evaluation in teaching. New York: Macmillan.
Gulliksen, H. (1950). Theory of mental tests. New York: John Wiley & Sons.
Haertel, E. H. (2006). Reliability. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 65-86). Westport, CT: American Council on Education and Praeger.
Iowa Assessment. (2016). Iowa Tests of Basic Skills. Retrieved from itp.education.uiowa.edu
Ishizaka, A., & Nemery, P. (2013). Multi-criteria decision analysis: Methods and software. New York: John Wiley & Sons.
Kane, M., & Case, S. M. (2004). The reliability and validity of weighted composite scores. Applied Measurement in Education, 17, 221-240.
Kolen, M. J., Hanson, B. A., & Brennan, R. L. (1992). Conditional standard errors of measurement of scale scores. Journal of Educational Measurement, 29, 285-307.
Kolen, M. J., & Hanson, B. A. (1989). Scaling the ACT Assessment. In R. L. Brennan (Ed.), Methodology used in scaling the ACT Assessment and P-ACT+ (pp. 35-55). Iowa City, IA: American College Testing Program.
Kolen, M. J., Zeng, L., & Hanson, B. A. (1996). Conditional standard errors of measurement for scale scores using IRT. Journal of Educational Measurement, 33, 129-140.
Kolen, M. J. (1991). Smoothing methods for estimating test score distributions. Journal of Educational Measurement, 28, 257-282.
Kolen, M. J., & Brennan, R. L. (2014). Test Equating, Scaling and Linking (3rd Ed.). New York: Springer.
Kolen, M. J. (2006). Scaling and norming. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 236-241). Westport, CT: American Council on Education and Praeger.
Kolen, M. J., Wang, T., & Lee, W.-C. (2012). Conditional standard errors of measurement for composite scores using IRT. International Journal of Testing, 12, 1-20.
Lord, F. M., & Novick, M. R. (1967). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
Magnusson, D. (1967). Test theory. Reading, MA: Addison-Wesley.
Nitko, A. J. (2001). Educational assessment and evaluation (3rd ed.). New Jersey: Merrill Prentice Hall.
Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory. New York: McGraw-Hill.
Pei, L. K., & Maller, S. J. (2006). Monte Carlo simulation study of differential weights on composite reliability and validity. Paper presented at the annual meeting of the National Council on Measurement in Education, San Francisco.
Petersen, N. S., Kolen, M. J., & Hoover, H. D. (1989). Scaling, norming, and equating. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 221-262). New York: American Council on Education, and Macmillan.
Price, R. L., Raju, N., Lurrie, A., Wilkins, C., & Zhu, J. (2006). Conditional standard errors of measurement for composite scores on the Wechsler Preschool and Primary Scale of Intelligence-Third Edition. Psychological Reports, 98, 237-252.
Rudner, L. M. (2001). Informed test component weighting. Educational Measurement: Issues and Practice, 20(1), 16-19.
Sutton, R. (2004). Teaching under high-stakes testing: Dilemmas and decisions of a teacher educator. Journal of Teacher Education, 55(5), 463-475.
Testing, National Organization. (2015, September 1). NOET web page. Retrieved from www.sanjesh.org
The ACT. (2014). The ACT technical manual. Retrieved from www.act.org
The SAT. (2015). SAT technical manual. Retrieved from collegereadiness.collegeboard.org
Wang, M. W., & Stanley, J. C. (1970). Differential weighting: A review of methods and empirical studies. Review of Educational Research, 40, 663-705.
Wang, T. (1998). Weights that maximize reliability under a congeneric model. Applied Psychological Measurement, 22(2), 179-187.
Woodruff, D., Traynor, A., Cui, Z., & Fang, Y. (2013). A comparison of three methods for computing scale score conditional standard errors of measurement (ACT Research Report Series, No. 7). Retrieved from www.act.org
Zolfagharnasab, S., Khodaei, E., & Yadegarzadeh, G. (2013). Optimum weighting to entrance subtests and their items to make composite score. Educational Measurement and Evaluation Studies, 3(4), 79-104.