Implicational Scaling of Reading Comprehension Construct: Is it Deterministic or Probabilistic?

Document Type: Research Paper

Authors

1 Faculty of Persian Literature and Foreign Languages, South Tehran Branch, Islamic Azad University, Tehran, Iran

2 Instituto de Evaluacion e Ingenieria Avanzada, S.C. San Luis Potosi, Mexico

Abstract

In English as a Second Language teaching and testing situations, it is common to infer a learner's reading ability from his or her total score on a reading test. This practice assumes that reading items are unidimensional and reproducible. However, few studies have probed this assumption through psychometric analyses. In the present study, the IELTS exemplar module C (1994) was administered to 503 Iranian students of various reading comprehension ability levels. Both deterministic and probabilistic psychometric models of unidimensionality were employed to examine whether implicational scaling exists among the reading items in this test. The results indicated that the reading data in this study did not form a deterministic unidimensional scale (Guttman scaling); rather, they formed a probabilistic one (Rasch model). Because the person map of the measures failed to show a meaningful hierarchical order for the items, these results call into question the assumption of implicational scaling that is normally made in scoring reading items.
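The contrast between the two models can be illustrated with a minimal sketch. It is not the authors' analysis: the function names, the toy response matrix, and the error-counting rule (each person's responses compared with the ideal pattern implied by his or her total score) are assumptions made for illustration. The first function computes a Guttman coefficient of reproducibility, for which values of about 0.90 or higher are conventionally taken to indicate a deterministic scale; the second gives the Rasch probability of a correct response for a person of ability `theta` on an item of difficulty `b`.

```python
import math


def guttman_reproducibility(responses):
    """Coefficient of reproducibility for a persons-by-items 0/1 matrix.

    Illustrative assumption: errors are counted as mismatches between each
    observed response and the ideal Guttman pattern implied by the person's
    total score, with items ordered from easiest to hardest.
    """
    n_persons = len(responses)
    n_items = len(responses[0])
    # Order items from easiest (most correct answers) to hardest.
    totals = [sum(person[i] for person in responses) for i in range(n_items)]
    order = sorted(range(n_items), key=lambda i: -totals[i])
    errors = 0
    for person in responses:
        score = sum(person)
        # Ideal pattern: the `score` easiest items correct, the rest wrong.
        for rank, item in enumerate(order):
            ideal = 1 if rank < score else 0
            errors += int(person[item] != ideal)
    return 1 - errors / (n_persons * n_items)


def rasch_probability(theta, b):
    """Rasch model: probability of a correct response given theta and b."""
    return 1 / (1 + math.exp(-(theta - b)))


# Hypothetical toy data: 4 persons, 3 items (not from the study).
toy = [
    [1, 1, 0],
    [0, 1, 1],  # violates the implicational order
    [1, 0, 0],
    [0, 0, 0],
]
cr = guttman_reproducibility(toy)
p_equal = rasch_probability(0.0, 0.0)
```

Under the deterministic model, any response that departs from the ideal pattern counts as an error. Under the Rasch model, the same response is simply an outcome with lower probability, which is why data can fail Guttman scaling and still fit a probabilistic unidimensional scale.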

Keywords

