- Bond, TG and Fox, CM (2007). Applying the Rasch model: Fundamental measurement in the human sciences (2nd Ed), Mahwah, NJ, USA: Lawrence Erlbaum.
- Childs, R, Elgie, S, Gadalla, T, Traub, R and Jaciw, A (2004). IRT-linked standard errors of weighted composites, Practical Assessment, Research and Evaluation, 9(13).
- Cronbach, L, Gleser, G, Nanda, H and Rajaratnam, N (1972). The dependability of behavioral measurements: Theory of generalizability for scores and profiles, Chichester: Wiley.
- Feldt, L and Brennan, R (1989). Reliability, Educational Measurement (3rd Edition, R Linn Ed), the American Council on Education, Macmillan, pp. 105-146.
- Gill, T and Bramley, T (2008). Using simulated data to model the effect of inter-marker correlation on classification, Research Matters: A Cambridge Assessment Publication, 5, pp. 29-36.
- Hambleton, R and Swaminathan, H (1983). Item response theory: Principles and applications, the Netherlands: Kluwer-Nijhoff.
- Hambleton, R, Swaminathan, H and Rogers, J (1991). Fundamentals of Item Response Theory, Newbury Park, CA, 12, pp. 177-184.
- Rudner, L (2001). Informed test component weighting, Educational Measurement: Issues and Practice, 20, pp. 16-19.
- Wang, M and Stanley, J (1970). Differential weighting: A review of methods and empirical studies, Review of Educational Research, 40, pp. 663-705.
- Webb, N, Shavelson, R and Haertel, E (2007). Reliability coefficients and generalizability theory, Handbook of Statistics 26: Psychometrics (C Rao and S Sinharay Eds), pp. 81-120.
- Wright, B and Masters, G (1982). Rating scale analysis: Rasch measurement, Chicago, IL, USA: MESA Press.
- Wu, M and Adams, R (2006). Modelling mathematics problem solving item responses using a multidimensional IRT model, Mathematics Education Research Journal, 18, pp. 93-113.