Semiotics of Standardized Testing

In the process of criticizing the practice of standardized testing, many authors have engaged with the meaning-making processes of these tests. The first semiotic issue has to do with the words we use to refer to the whole process of assessment: “standardized testing.” The word “standardized” suggests that the assessment has authority because every child experiences it in the same way. In fact, the whole idea of being able to compare a classroom’s scores with the norm sample rests on the assumption that the conditions under which the classroom experienced the test are the same as those of the norm sample. Wodtke (1989) examines testing procedures across ten kindergartens and finds egregious departures from testing manual procedures. Ryan states that motivation may play a part in how people do on a test, which means that the test may be measuring their motivation rather than their math skills. Rodekohr discusses how language tests are problematic for dialect speakers: the tests represent "normal" English rather than identifying any particular issue with language. Slomp uses a Venn diagram of the differences between real writing and test writing to indicate that test writing is not particularly well related to real writing.

A test is a semiotic system in itself, which purports to represent qualities of a person’s interaction with written language, another semiotic system, and to report these qualities in the form of numbers, a third semiotic system. We therefore have issues of semiosis itself, in terms of how semiotic systems “carve” up whatever world they are representing, and we also have issues of translation: one semiotic system (the subject of the test) is translated into another (the test) and then interpreted by a third (statistics). While these semiotic systems overlap considerably, they do so problematically. From a semiotic perspective, a standardized test is a poor form of representing a person’s understanding of a topic or ability to use that information.

Semiotic Theory

Tests as a Semiotic System --Signs in Tests--Relationship of Signifier and Signified--Presence and Absence

Signifier/Signified

Within a test question lies an opposition between the right choice and the wrong ones. The right choice is the one identified as such by the test makers, even though what counts as the "right" choice may be a subtle notion. The distractors may have elements that are correct, but the test taker is told to choose the "best" answer.

Presence and Absence

Jakobson described the two poles of language. We can think of syntax as the horizontal pole: it is the string of words put together, including the ways in which the order of words creates meaning. "The cat played with the ball" and "The ball played with the cat" have exactly the same words but very different meanings. The other pole is vertical. It is the word that was chosen as opposed to all the other possible words. As anyone who has been in a conflict with someone knows, what is not said can be just as meaningful as what is said.

Questions of inference are playing with the vertical pole. When information is to be inferred, it is not present directly in the text. Instead, it is to be extracted from the text by association. The problem is, whose association? One of the challenges of meaning-making is that we all have a different chain of associations with each word. There are enough overlaps that meaning can be made--but when there is a cultural difference, someone can be a successful reader without being able to infer the same thing as the test makers.

Desire and Tests

Stenner and Burdick (1997) present an explanation of the derivation and use of Lexiles in reading comprehension and analyzing texts. At first, the simplicity of the idea behind measuring a reader and measuring a text and then putting the two together seems so desirable and so doable. With the system they describe, a teacher could be sure that students were reading texts at the 75% comprehension level, so they would be learning but not be frustrated. It seems so neat and clean. The math cleans up the ambiguity of words in a way that makes one want to sigh in relief: at last, we have a way to help kids learn to read.

This is a case for Jacques Derrida, in two ways. One is that in teaching a Reading for Learning class (Reading Across the Curriculum) just for music education majors, I present students with information on how to assess readability, for which they are grateful. But then I problematize the idea of readability with two texts. One is from Kofi Agawu's book on the semiotics of Classical (as in the era of Mozart and early Beethoven) music, which scores 25 on the Flesch scale, making it a very difficult text. The other, a passage from one of Jacques Derrida's essays in French philosophy, is an easier text according to Flesch, at 42. My music education majors can make sense of the Agawu text but throw up their hands when faced with Derrida. Derrida uses simpler words than Agawu, but the concept he is explaining is extremely complex. A mathematical analysis of a text, such as the Flesch scale or the method Stenner and Burdick describe, fails to capture the relative difficulty of two texts because it cannot account for what makes these texts different in terms of difficulty. Agawu (see Appendix A) is introducing the idea of linguistic-like characteristics of music during the late 18th century, and while he does so in a complex way that truly represents the complexity of the relations between music and language, he is nevertheless using language in the way that academics normally use language: to describe a specific idea.

Derrida, in contrast (see Appendix A), discusses the problematic idea of structure: while we think in terms of structures with centers and organization, in fact there is play in any structure we create. This includes language as a structure, so while Derrida is discussing the fact that structures always contain a little unavoidable chaos, he is using language not just to get this idea across but to represent the idea through the way in which he uses language. Language is not a transparent medium; it morphs meaning because of its structure and because of the inevitable incoherencies in the structure. So Derrida plays with language as he expresses the idea that structure as structure does not really exist.
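To see why a readability formula cannot register this kind of difference, it helps to look at what such a formula actually counts. The sketch below implements the standard Flesch Reading Ease formula in Python, where higher scores mean easier text. The syllable counter is a crude heuristic added for illustration (an assumption, not part of Flesch's method), so its scores will not match published tools exactly. The first sample sentence is the one used earlier to illustrate syntax; the second is taken from the Derrida passage in Appendix A.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): count vowel runs, drop a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: depends only on sentence length and syllables per word."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


simple = "The cat played with the ball."
paradox = "The center is not the center."

for label, text in (("simple", simple), ("paradox", paradox)):
    print(label, round(flesch_reading_ease(text), 1))
```

Both sentences score as easy reading, because the formula sees only words per sentence and syllables per word. Nothing in it can register that the second sentence states a paradox.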

The world of Stenner and Burdick is a carefully-built structure, bolstered by a lot of sophisticated statistics and careful quantitative research methodologies. In that sense it is really beautiful and therefore desirable--but it can be undone by one French philosopher. In fact, the analogy that Stenner and Burdick use throughout their essay is that of a temperature scale on a thermometer and the process of measuring temperature. Temperature is a physical phenomenon, pretty well understood by people and easily measurable in a reliable and valid way. There are centuries of data to verify the reliability and the validity of the idea of temperature. In the education world there is a desire to make the various aspects of education, including literacy, objective and scientific, so that tests are considered to be like thermometers. Many people in education would like to have the status of people in the "hard" sciences (a Derridean exploration of "hard" and "soft" or "easy" science would be very funny at this point but too much of a sidetrack) and in order to gain this, they have adopted quantitative research methodologies.

Yet ultimately you cannot take a text's temperature and you cannot take a reader's temperature, either. Or you can, but at some point the structure falls apart. I have illustrated one instance of play in a mathematically-determined reading analysis--where Derrida is really more difficult than Agawu, but there are many more. There are instances of play within the texts we read, within the relationship between the texts and the questions around them,

Inference

Enfield describes how culture and language are necessarily intertwined and that while human languages have certain logical forms in common, there is something called cultural logic:

The phenomenon of cultural logic assumes an interpersonal ecology, emphasizing the interdependence of semiotic and conceptual systems, allowing for tightly organized subsystems of conceptual relations (as found in linguistic structure), as well as more loosely organized systems (as in less systematized cultural concepts and practices), embedded within the overall mass of cognitive/experiential representations. This ecological interdependence is operative across conceptual/experiential systems within the realm of the individual, as well as between individuals, providing a flow of significance across the community of egos—as if individuals were rock pools on a coastal reef, apparently separate, yet whose internal ecologies are constantly linked by ebbs and flows of sea water and the myriad forms of life they transport. (Enfield, p. 36)

We have mental representations of the things and people in our world; we use various semiotic systems to align our mental representations with those of other people, but the extent to which this endeavor is successful is based on the degree to which any two people share cultural logic. He describes the cultural logic behind two different approaches to motorcycle headlights: Australia, where headlights are supposed to be used during the day, and Laos, where police can fine people for using their headlights during the day.

Later he states:

Culture emerges from the irresistible tendency for individuals to build convention and to establish stereotypes and other kinds of precedents, so as to form personal libraries of models and scenarios which may serve as reference material in inferring and attributing motivations behind people’s actions, and behind other mysterious phenomena. This process of establishing conceptual convention depends directly on semiotics, since groups of individuals rely on external signs as material for common focus, and, thereby, agreement. And language is the part of culture that has the greatest semiotic presence. (Enfield, p. 37)

Inference means being able to delineate something which is absent, and it is part of the GRE. Yet because language and culture are so inseparable, inference can only be done "correctly" if one shares the cultural logic of those who created the test. Inference in the GRE is not a test of linguistic comprehension or the ability of a person to handle logic per se. Because it is language-based, it is necessarily an artifact of its culture. There is no way to ascertain a person's ability to handle language that is separate from culture, whether that of the test maker or that of the test taker.

Semiotic Issues of Translation

Lost in translation. Why do we have so many semiotic systems (language, maps, visual arts, movement, ASL, music, etc.)? It is because there are some things that can be expressed in one system that cannot be expressed in other systems. A linguistic example I am fond of using is the idiom in Cajun French, "Oh ye yaille," which is used in a lot of Cajun songs. Cajun music sounds happy but the words are often about sad events such as death, the breaking up of a relationship, or getting drunk. "Oh ye yaille" is an expression of an almost inexplicable color of sadness, and people who don't speak Cajun French do not have access to this marvelous phrase to explain their feelings.

When we translate from one language to another there are often difficulties in expressing the subtleties and connotations of the original meaning. For example, the creators of the King James Version of the Bible struggled with the Greek word agape. They ended up with "charity," which does not express the type of love that agape expresses. Yet because there is only one word in English for a concept that the Greeks divided into at least three words (eros, agape, philia), the translators had to pick a necessarily meager word. The richness of the Greek understanding of love and the terms for the types of love were lost in the translation to English. Later translations of the Bible use the word "love," but even so English speakers do not know how to separate the idea of eros from the idea of agape unless they are aware of the original Greek.

The task of translation is much more fraught with pitfalls when the translation moves between semiotic systems. This kind of translation has been used in art, such as Handel's musical setting of words from the Bible in Messiah. Handel uses word painting to represent the meaning of the words in his music; for example, in the chorus "For Unto Us a Child Is Born," the music changes drastically when the chorus begins singing "And His name shall be called, Wonderful, Counselor, etc." The music reflects the excitement and beauty of the names of the Messiah. In this case, the translation of an idea from one semiotic system to another enhances the idea, in part because the musical translation is accompanied by the words. It would be impossible to derive the Bible verses just from listening to the music, but when the music is sung, the melodies and harmonies amplify the words.

Also, when an artist translates from one semiotic system to another, there are typically no high stakes. If the translation does not work, then no one's life is changed, doors to opportunities do not shut. Unfortunately with standardized testing, we have the problems of translating from one semiotic system to a drastically different semiotic system along with some high stakes for the individuals who take the tests. Test scores can determine where one can go to college and even if one can graduate from high school.

Tests translate something called a person's reading or math or whatever ability into a series of questions. Those questions, in turn, are translated into a raw score, which is then translated into statistical concepts which compare the person to all other people.
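This chain of translations can be made concrete with a small Python sketch. The answer key, the responses, and the norm-group scores are invented for illustration, and the use of a normal curve for the percentile is an assumption about how a norm-referenced score might be reported, not a description of any particular test.

```python
from statistics import NormalDist, mean, pstdev


def raw_score(responses, answer_key):
    """First translation: a set of answers becomes a count of keyed matches."""
    return sum(1 for given, keyed in zip(responses, answer_key) if given == keyed)


def z_score(raw, norm_scores):
    """Second translation: the count becomes a distance from the norm group's mean."""
    return (raw - mean(norm_scores)) / pstdev(norm_scores)


def percentile_rank(z):
    """Third translation: the distance becomes a position on an assumed bell curve."""
    return 100 * NormalDist().cdf(z)


# Hypothetical data, for illustration only.
answer_key = ["B", "D", "A", "C", "B", "A", "D", "C", "A", "B"]
responses = ["B", "D", "A", "C", "A", "A", "D", "B", "A", "B"]
norm_scores = [4, 5, 6, 6, 7, 7, 7, 8, 8, 9]

raw = raw_score(responses, answer_key)
z = z_score(raw, norm_scores)
print(raw, round(z, 2), round(percentile_rank(z)))
```

At each step something is dropped. The raw score forgets which questions were missed and why, the z-score forgets everything but distance from the norm group's mean, and the percentile rank reports only a position on an assumed curve.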

In their semiotic analysis of mathematics knowledge, Elia et al. point out that:

The understanding of functions does not appear to be easy, given the diversity of representations related to this concept (Hitt, 1998). Sierpinska (1992) indicated that students have difficulties in making the connections between different representations of the notion (formulas, graphs, diagrams, and word descriptions), in interpreting graphs and manipulating symbols related to functions. Some students’ difficulties in the construction of concepts are linked to the restriction of representations when teaching. Mathematics instructors, at the secondary level, traditionally have focused their instruction on the use of algebraic representations of functions rather than the approach of them from the graphical point of view (Eisenberg and Dreyfus, 1991; Kaldrimidou and Iconomou, 1998). Markovits, Eylon and Bruckheimer (1986) observed that translation from graphical to algebraic form was more difficult than the reverse conversion and that the examples given by the students were limited in the graphical and algebraic form. The findings of the above studies are related to the phenomenon of compartmentalization. The existence of compartmentalization reveals a cognitive difficulty that arises from the need to accomplish flexible and competent conversion back and forth between different kinds of mathematical representations of the same situation (Duval, 2002), which according to Arcavi (2003) is at the core of mathematical understanding. (p. 257)

Mathematical understanding is not just linked to the ability to do math problems; it involves being able to understand the same idea in more than one way and to be able to recognize an idea across a variety of representations such as formulae, graphs, and so forth. If this is truly what we want students to be able to do, then many standardized math assessments are not even approaching the depth and richness of this understanding. When teachers then teach to the test, students receive a shallow, solve-the-problem-oriented approach to mathematics which keeps them from really understanding math in a fundamental way.

People as Numbers

In an article I wrote about the DSM-III, I pointed out how the translation of a patient's complaints into numbers gave the diagnostic process an unwarranted amount of authority due to the objectivity we associate with numbers. Tests do the same thing: they take something complex like a person's ability to read or a person's understanding of teaching and translate that into a score which then gets statistically manipulated so it can have some kind of meaning. This process has three unfortunate consequences. One is that something about a person becomes vastly oversimplified; a test is a single snapshot and the aggregate of the test is a number that cannot adequately represent knowledge or learning, as we have seen.

The second consequence is insidious. We associate numbers with authority, so we fool ourselves into thinking that the test has authority because of the numbers. Psychology and related fields have had science envy for a long time, and the creation of tests that yield fairly consistent numerical data has allowed them entree into the hallowed halls of the scientific method. When you take a second look at how tests really work or fail to work, what they mean and fail to mean, it is amazing how much of a foothold testing has in our educational endeavors. The third consequence is that the raw score of the test undergoes another translation, this time measuring the person against the group of people on whom the test was normed. The tyranny of the bell curve is that it is a social construct that privileges the mean.

Science and Meaning

Brier (2001) states:

Realizing that we make science with our mind an eliminative materialistic view of mind is, seen from a philosophy of science-viewpoint, self-contradictory. Scientific knowledge is constructed socially by subjective minds interacting with nature. It, therefore, seems obvious that we have to admit that our inner “subjective” world is as foundational a part of reality as “objective” external nature and “intersubjective” social worlds. (p. 405)

The desire of science to remove human arbitrariness is a pipe dream.



This is a great way to articulate what is at stake: the aspects of the meaning-making process that can be translated into a mathematical algorithm are so small and so unrepresentative of the entire process of semiosis that the result is meaningless and pretty much worthless.

The point of Brier's work is to create a framework for working across disciplines; he points out the limitations of scientific knowing, especially in relation to the work of the humanities. This is applicable because literacy is the epitome of the humanities, and using a scientific construct (standardized testing) to assess the complex meaning-making process of reading has the same problems as using the scientific method to understand a piece of literature.

Ward's research demonstrates how standardized testing actually disrupts the process of mainstreaming, in that the focus on the test has set up an educational situation where students with disabilities actually spend significantly less time in the regular classroom than similar students in other states. To an extent, then, standardized tests represent authority to teachers, and teachers do what they can to maximize their students' chances of getting good scores by eliminating students who are likely to make poor scores.

References: Elia et al. http://www.aber.ac.uk/media/Documents/S4B/semiotic.html

Appendix A Agawu text and Derrida text

Agawu

If we can assume that the studies by Rosen and Ratner are representative of the range of methodologies followed by students of Classic music, we can go on to observe that the specific concern with normative procedures—whether these are treated axiomatically as with Rosen, or spelled out in the form of formulaic recipes as with Ratner—grows out of the feeling that the classical style approximates a //language// “spoken” by Haydn, Mozart, and Beethoven, and their contemporaries. Most scholars acknowledge the exemplary and polished nature of this music, hence the terms “Classic,” “classical,” and “classic,” even where attempts are made to dispense with the label altogether. The uniformity of intent necessary for this style to attain the status of a language can therefore be inferred from this characterization. But inference is weaker than explicit demonstration—hence my reference to a “feeling,” by which I mean a persistent current that informs these writings in the form of a subtext; it guides the formulation of the authors’ concepts but it is never made explicit. What is the precise nature and the extent of the linguistic analogy in writings about Classic music? To answer this question, we need to examine a few characteristic descriptions of the music. Descriptions of music in terms of language-based disciplines are commonplace in the musicological literature. In the seventeenth and eighteenth centuries, rhetoric provided a useful model for such discourse, and theorists freely borrowed the language and terminology of rhetoricians. Thus Joachim Burmeister, in his //Musica Poetica// of 1601, drew on literary concepts to characterize compositional strategy as a threefold process—//exordium//, //confirmatio//, and //conclusio//. Johann Mattheson also relied a great deal on rhetorical terms in characterizing the process of a piece of music. 
In his //Vollkommene Capellmeister// of 1739, Mattheson extended Burmeister’s three-stage model to a six-stage one as follows: //exordium// (introduction), //narratio// (report), //propositio// (proposal), //confirmatio// (corroboration), //confutatio// (refutation), and //peroratio// (conclusion). Later in the century, Heinrich Koch continued, on the one hand, to borrow from rhetoric while, on the other hand, showing a decisive shift from rhetoric to (or, more accurately, //back// to) linguistics, from rhetorical terms to grammatical ones. These trends have continued to the present day, both informally in music criticism, and more formally in the recent theories of Allan Keiler, Mario Baroni, David Lidov, and Lerdahl and Jackendoff, among others. What distinguishes writing about Classic music from that about other music is not merely a general awareness of the affinities between music and language, but a persistent concern with a shadowy linguistic analogy at all levels. Is it perhaps the case that Mozart and Haydn “spoke one language” whereas Brahms and Wagner, Schumann and Chopin, or Bach and Rameau spoke different languages? Certainly a hasty response to this question might cite the fact that it is, at least superficially, easier to mistake, for example, Haydn for Mozart (and vice versa) than it is to mistake Brahms for Wagner, or Rameau for Bach. One might then go on to cite sociological factors—such as the presence of certain societal uniformity in the late eighteenth century, which was then overthrown in the nineteenth, leading to a profound individualization in artistic expression—to support such a viewpoint? Yet our hasty response will still have left many questions unanswered.

Derrida

“We need to interpret interpretations more than to interpret things.” Montaigne

Perhaps something has occurred in the history of the concept of structure that could be called an “event,” if this loaded word did not entail a meaning which it is precisely the function of structural — or structuralist — thought to reduce or to suspect. Let us speak of an “event,” nevertheless, and let us use quotation marks to serve as a precaution. What would this event be then? Its exterior form would be that of a //rupture// and a redoubling. It would be easy enough to show that the concept of structure and even the word “structure” itself are as old as the //episteme//—that is to say, as old as Western science and Western philosophy—and that their roots thrust deep into the soil of ordinary language, into whose deepest recesses the episteme plunges in order to gather them up and to make them part of itself in a metaphorical displacement. Nevertheless, up to the event which I wish to mark out and define, structure—or rather the structurality of structure—although it has always been at work, has always been neutralized or reduced, and this by a process of giving it a center or of referring it to a point of presence, a fixed origin. The function of this center was not only to orient, balance, and organize the structure—one cannot in fact conceive of an unorganized structure—but above all to make sure that the organizing principle of the structure would limit what we might call the //play// of the structure. By orienting and organizing the coherence of the system, the center of a structure permits the play of its elements inside the total form. And even today the notion of a structure lacking any center represents the unthinkable itself. Nevertheless, the center also closes off the play which it opens up and makes possible. As center, it is the point at which the substitution of contents, elements, or terms is no longer possible.
At the center, the permutation or the transformation of elements (which may of course be structures enclosed within a structure) is forbidden. At least this permutation has always remained //interdicted// (and I am using this word deliberately). Thus it has always been thought that the center, which is by definition unique, constituted that very thing within a structure which while governing the structure, escapes structurality. This is why classical thought concerning structure could say that the center is, paradoxically, //within// the structure and //outside it//. The center is at the center of the totality, and yet, since the center does not belong to the totality (is not part of the totality), the totality //has its center elsewhere//. The center is not the center. The concept of centered structure—although it represents coherence itself, the condition of the //episteme// as philosophy or science—is contradictorily coherent. And as always coherence in contradiction expresses the force of a desire. The concept of centered structure is in fact the concept of a play based on a fundamental ground, a play constituted on the basis of a fundamental immobility and a reassuring certitude, which itself is beyond the reach of play.

Before we decide how something is being represented, we need to consider what is being represented. Reading has been defined many ways.

Reading as a functional act

Different theorists would answer this question in different ways. In the Adult Literacy in America report (2002), literacy is defined as: “Using printed and written information to function in society, to achieve one’s goals, and to develop one’s knowledge and potential” (p. 2). This is a functional definition which is adequate in terms of defining the results of literacy in a person’s life, but it does not get at the “technology” of reading or how reading works as a meaning-making process.

Reading as a technical act

The excellent website, Children of the Code (childrenofthecode.org)

Test resource: http://fairtest.org/

Scherbaum discusses the finding that minorities do worse than whites on easier SAT items and better than whites on harder SAT items. The article also discusses Item Response Theory, which models the intersection between the individual exam taker and the particular questions on the exam.
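As a rough illustration of what Item Response Theory models, the Python sketch below uses the one-parameter (Rasch) form, in which the probability of a correct answer depends only on the gap between a test taker's ability and an item's difficulty. Whether Scherbaum's analysis uses this particular form is not stated in these notes, so treat it as an assumption; the ability and difficulty values are hypothetical.

```python
import math


def p_correct(ability: float, difficulty: float) -> float:
    """Rasch (one-parameter) model: chance of a correct answer from the ability-difficulty gap."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# One hypothetical test taker facing an easy, a medium, and a hard item.
for difficulty in (-1.0, 0.0, 1.0):
    print(difficulty, round(p_correct(0.5, difficulty), 2))
```

The model places every test taker on a single ability scale, and everyone's chances fall as items get harder. A pattern in which one group does worse on easier items and better on harder ones is not something a single scale of this kind would by itself predict.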

Syverson: the issue of how colleges use test scores to represent themselves. Questionable predictive value for the SAT and ACT.

Representation

A standardized test represents a student's abilities (to do what may be up for grabs) at a given time, on a given day. There are so many factors that can influence that representation: whether the student had breakfast or not, whether or not something upsetting has happened to the student on that day, whether or not

Tal--looks at how different types of readers perform on different types of test questions, which indicates that the tests are maybe not representing the test takers' literacy skills very well. Tal states that the score does not let you know what the test asked takers to actually do in relation to literacy. Has a good analysis of the types of questions and their relationship to text.

Validity

Although standardized tests purport to possess authority, one of the most fundamental constructs on which standardized tests are based, validity, cannot be determined by any objective means. Webster discusses bias in tests and the differences in results across several tests.

Teachers teaching to the test affect the tests' validity. Urdan

Urdan p. 138 discusses the uses of standardized tests for many purposes, which is problematic. Urdan p. 139 test "pollution" changing what the test means--related to the issue of standardization and as a result of how people prepare students for the test in the interest of them getting higher scores because of the stakes at issue.

Dealing with the issue of ethical and unethical preparation practices. Teachers get angry with tests because they feel they don't reflect what is going on in their classrooms--they don't represent classrooms well. May not feel compelled to support the integrity of the test because of anger and disagreements with it. That is a source of "pollution."



Quote: definition of testing as they use it in this article, which is a reasonable definition. This means that such assessment (a) is externally imposed by the state government; (b) assesses state-prescribed content standards; (c) follows a uniform procedure in administering, scoring, and interpreting the test; and (d) the results are often used to determine rewards and sanctions for students, teachers, schools, or districts. (Wang, p. 307)

Wang is arguing that this is more than just multiple choice--it accounts for all standards-based assessment. In general, the idea of standards potentially means that businesses can count on people having a certain body of knowledge that is common to everyone who went to an American school. Also, the idea of standards outside of the classroom means that undereducating students is less of an option for teachers. The No Child Left Behind Act suggests that all children can be educated, which is a radical change from just a generation ago when students of particular cultural backgrounds were considered stupid and routinely written off educationally. The question becomes, how do we get the benefits of outside standards without the negative "cookie cutter" effect and without turning some kids off from education? First of all, since information is easily available on the internet, today's students need less of a particular body of knowledge and more a set of metacognitive strategies for managing information and learning new things. Jobs are rapidly changing because technology is rapidly changing. Ten years ago, the web was a much simpler place and a lot less flexible than it is currently.

Templin discusses Arendt's notion of self-disclosure in relation to politics and states that tests actually prevent self-disclosure--they are a poor representation of a person. Includes the idea of how thinking is related to the possibility of self-expression and that this can be limited which in turn limits thinking. Puts me in mind of Orwell's 1984 and its limits on language, also White's ants.

Steele discusses the issue of identification in relation to school performance in terms of stereotype threat; test results may actually represent not ability but the extent to which someone feels threatened.

Semiotic theory explicates the meaning-making processes in communication between two people whether synchronous (e.g., person to person) or asynchronous (reading the text that was written in the past). There are several facets of semiotic theory that are pertinent to understanding reading. In describing the plaque NASA sent on a space probe, art historian Ernst Gombrich (quoted in Chandler) argues that the plaque would be incomprehensible to alien intelligence, even if it could sense the image: “Reading an image, like the reception of any other message, is dependent on prior knowledge of possibilities; we can only recognize what we know.” Reading is a meaning-making activity in the context of a community that has agreed on the symbols that are being read. As Chandler states: “Without realizing it, in understanding even the simplest texts we draw on a repertoire of textual and social codes” (Chandler, Code). Reading is not something that can possibly take place in a vacuum; it is fundamentally connected to culture. Roman Jakobson describes the way in which culture surrounds the meaning-making process with the following graphic (from Chandler, Coding/Decoding):

About this, Jakobson states:

The addresser sends a message to the addressee. To be operative the message requires a context referred to ('referent' in another, somewhat ambivalent, nomenclature), seizable by the addressee, and either verbal or capable of being verbalized, a code fully, or at least partially, common to the addresser and addressee (or in other words, to the encoder and decoder of the message); and finally, a contact, a physical channel and psychological connection between the addresser and the addressee, enabling both of them to stay in communication. (quoted in Chandler, Coding/Decoding)

All of these factors influence whether and how a message is understood, and they are therefore important to the process of reading. Jakobson also delineates six possible functions of language (referential, expressive, conative, phatic, metalingual, poetic) that can also influence how we read and understand text. What this means in relation to testing is that a person’s expertise with the code of reading, which is what is allegedly being tested, can be obscured if the code is presented in an unfamiliar context, particularly when answering the question depends on identifying implications of the text rather than just the text itself.

Signs

The chief concept in semiotics is the sign: an arbitrary sound, image, or motion that is attached to a concept, as in the word tree in English or arbre in French being attached to the idea of a tree. While this sounds like a simple idea, it is actually very complex. For example, across semiotic systems, concepts don’t get carved up in the same way. The classic, probably apocryphal, example is that the Inuit have many more words for types of snow than people who live in warmer climates. Since sign concepts are central to the human thinking process, how a landscape is carved up into concepts actually affects what a person perceives. Presumably a person from Northern Alaska or Canada would see a snowy landscape in a much more complex way than a person from the southern U.S.

Because different semiotic systems carve up the world in different ways, there is a problem with translation between systems. One example might be the idea of love, which in English is expressed by a single word but in ancient Greek is expressed by several words, including philia, eros, and agape, each of which names a different type of love. Another is the expression of emotions: there are some that cannot be expressed in language, but can find expression in music. Finally, a map is a representation of an area, but it is a simplification that focuses only on certain features. Road and topographical maps typically do not show where individual buildings or farms are. A city street map may not show natural features such as creeks and streams.

Signs and Tests

It is not always easy to determine what is a sign within a semiotic system. For example, different semiologists have divided up music in different ways, borrowing from language-related semiotic theory but having to account for the ways in which music is different from language. Some go note for note (Nattiez) and others choose larger units of semiosis that overlap (Agawu).

The same is true for tests. What are the signs of tests?

Let’s take a sample question from the study guide for the Praxis Principles of Learning and Teaching (PLT) Early Childhood exam:

Classroom management research findings suggest that one of the most effective ways to maximize the amount of time elementary school children spend on academic activities is for the teacher to do which of the following?
 * 1) Plan for, teach, and enforce routines for transition times and classroom housekeeping tasks.
 * 2) Assign homework three times a week in the major subjects.
 * 3) Assign individual reading on new topics before discussing the topic in class.
 * 4) Introduce new material in a lecture followed immediately by a questioning session on the material.

(ETS 2008, p. 9) http://www.ets.org/Media/Tests/PRAXIS/pdf/0521.pdf

The Praxis PLT is designed to assess a pre-service teacher for knowledge in four areas that ETS identifies: Students as Learners, Instruction and Assessment, Communication Techniques, and Profession and Community. Presumably this sample question tests a student in instruction, particularly in the techniques of maximizing the amount of classroom time that is devoted to learning. But it also tests knowledge of teaching strategies, since the distracters are all related to traditional forms of teaching. And anyone who is not a strong reader will not be able to answer the question, so this question also tests certain kinds of reading skills. In relation to the definition of the sign, this question and its answers are a signifier, but the signified is not really clear.

The system of signifiers is actually not representational but rather relational. In the above example, traditional teaching forms (homework, individual reading, lecture) are placed in opposition to a newer focus for teachers, the use of classroom time for learning.

Distracters are opposed to the right answer. The test is parallel to the thing it is testing but the correlations are not high—the connections are not all that good because the system of representation is weak. The answer is nothing without the question and without the distracters—it basically has no meaning in relation to the process of assessing a potential teacher. The question and the wrong answers give the right answer meaning because of their opposition.

The test as a whole is made up of a series of signs which are to “cover” a field, to represent various aspects of a field.

Saussure’s diagram was this:

Introduction to Semiotics states about this picture: The arbitrary division of the two continua into signs is suggested by the dotted lines whilst the wavy (rather than parallel) edges of the two 'amorphous' masses suggest the lack of any 'natural' fit between them. The gulf and lack of fit between the two planes highlights their relative autonomy. (http://www.aber.ac.uk/media/Documents/S4B/sem02.html)

This means that there is a plane A that has a group of signifiers and a plane B that has a group of signifieds. There is nothing that causes these two planes to fit together smoothly. There are arbitrary connections made between signifier and signified, although in this case, it looks like the plane of the signified is fairly completely represented by the signifiers.

In the case of testing, the planes may look more like this:

In other words, each question carves out a portion of the idea of reading; however, there are large portions of reading which are not represented on the test.

Saussure points out that signs are conventional—and in relation to this, tests rely on conventionality. If you have access to the conventions of the test—the kind of thinking that gives rise to the test questions—then you will do well on the test. But if you don’t have access to that sign system, then you won’t do well on the test. The test is a test of knowledge of the conventions of testing and the conventions of tested reading, which is not the same thing as real reading.

Saussure identified several key aspects of language that are important to understanding how tests make meaning.

First of all, language is a system: no individual word in a language can make meaning without its relationships with other words. The primary type of relationship that words have is opposition; in other words, a word means in relation to its opposite. Hot means hot in relation to the concept of cold. In fact, there is no temperature that we would say is concretely “hot”: water boils at 212 degrees Fahrenheit, but iron melts only at about 2,800 degrees Fahrenheit, so “hot” in relation to iron is much hotter than “hot” in relation to water. Likewise, water freezes at 32 degrees Fahrenheit, but that would not be cold in relation to hydrogen, which becomes liquid only at -423 degrees Fahrenheit.

These oppositions can actually increase the distance between two concepts. For example, using the concepts of “black” and “white” to label people, none of whom have truly black or truly white skin, creates a large gap not just in our language but also in our concepts about human relationships. If I define myself on one end of that spectrum, then I see a person “opposite” from me as at the other end; if I’m not careful, that linguistic gap will become a relational gap.

As a meaning-making system, standardized tests are also based on oppositions. The most obvious one is right vs. wrong. On a multiple choice test with four options for each question, a person can be right 25% of the time, on average, just by taking a random guess. We have a scoring system which adds up the right answers and then a statistical meaning-making system which interprets the resulting number in relation to a number of linguistic concepts such as “average,” “above average,” “percentiles,” and “grade-equivalents.” Like the opposition between black and white in reference to people, the opposition between right and wrong in the world of standardized testing artificially increases the difference between the two concepts.
In fact, since the answers are right there on the page and since there is a 25% chance of a person getting a right answer just based on guessing, the opposition of right vs. wrong for any given question or for the test as a whole does not tell us a whole lot about the test taker, much as a person can be culturally “black” but have “white” skin.

Another central opposition within standardized testing has to do with the topics. For example, tests such as the SAT often include a literacy portion and a mathematics portion. Theoretically, given these names, the reading portion tests reading and the mathematics portion tests mathematical reasoning. But when the math questions are “story problems,” the opposition is not so dichotomous: the reading test tests some aspects of reading, and the math test also tests some aspects of reading.

Still another opposition has to do with objective vs. subjective. Theoretically, standardized tests are preferred over “authentic” forms of assessment because the test procedures are supposed to be the same at any given test site, multiple choice items are machine-graded, and scores are given in numbers, which provide an aura of precision. Yet there are many subjective aspects of a test that are occluded by the moniker “objective,” including the questions asked and the construction of the possible answers. For example, how does a reading comprehension test relate to the real ability to read? The preparation PDF file for the GRE gives us an ironic example:

The common belief of some linguists that each language is a perfect vehicle for the thoughts of the nation speaking it is in some ways the exact counterpart of the conviction of the Manchester school of economics that supply and demand will regulate everything for the best.
Just as economists were blind to the numerous cases in which the law of supply and demand left actual wants unsatisfied, so also many linguists are deaf to those instances in which the very nature of language calls forth misunderstandings in every day conversation, and in which, consequently, a word has to be modified or defined in order to present the idea intended by the speaker: “He took his stick—no not John’s, but __his own__.” No language is perfect, and if we admit this truth, we must also admit that it is not unreasonable to investigate the relative merits of different languages or of different details in languages. (GRE pdf, p. 37).

The oppositions fool us into thinking the tests are better at measuring characteristics of people than they truly are because they artificially inflate the differences in our minds. This is not just an artifact of language; rather, it is a linguistic fact that allows proponents of standardized testing to continue to suggest that the tests have some type of utility.

Another concept of Saussure’s about language is the idea of presence and absence—the two poles of language: syntagm and paradigm. Syntagm is the “horizontal” pole—the stringing of words together in syntax. Paradigm is the vertical pole—the choice of one word over all the other possibilities, including opposition. For example, if I say “the cat plays with the ball,” paradigmatically speaking, the cat could be a dog, a child, or many other possibilities. The poles of language account for the experience we often have of deriving more meaning from what is not said than from what is said, a handy strategy for sorting out the double-speak of national leaders and would-be national leaders. It is common for students to miss questions because they “overthink,” as if thinking about something deeply is problematic. This overthinking comes about because of absence. Tests dependent on reading passages and then answering questions about them, whether these are tests designed to measure reading comprehension or tests where students are to analyze and respond to scenarios, claim that all information necessary to answer the question is present. When people “overthink,” they are often looking at all the possible responses and thinking of possible conditions where those could be true—they have a hard time distinguishing between the absence of the correct answer from the scenario and the absence that would have allowed the other answers to be correct.

Take, for example, the following question, which comes from ets.org’s test preparation for the Principles of Learning and Teaching, Early Childhood:

Which of the following would be the best indication to a teacher that students are beginning to think critically?
 * A. They talk about earthquakes, space probes, and science-related information in the news.
 * B. They begin to read more books and articles about science on their own.
 * C. They successfully plan and carry out simple experiments to test questions raised in classroom discussions.
 * D. They ask the teacher to read stories to them about scientific topics.

http://www.ets.org/Media/Tests/PRAXIS/pdf/0521.pdf, p. 9 (ETS, 2008)

It would be easy to imagine scenarios (by adding context that is absent but reasonable) in which any of the four choices could be a right answer. The main clues in the stem are “best” and “think critically,” which more or less lead to answer C, based on the absent information that using the scientific method is a form of critical thinking. The other piece of absent information that might lead a person away from answer C, however, is the fact that this is a test regarding the teaching of younger children, who may not have the cognitive maturity to handle the scientific method, so a person may avoid this answer because of knowledge about cognitive development. Tests that depend on students selecting an answer essentially based on absent information become tests that distinguish not students who have knowledge from students who lack it, but students who can guess what the test-makers are thinking from students who cannot.

Something Lost in Translation

The final issue with semiotics and standardized testing is this: a score is at least two levels removed from what the test actually purports to represent. If the test is a reading test, supposedly measuring the ability to read, then test makers select passages for people to read and answer questions about. Right here is where artificiality begins, as I found out when I tried to construct a reading test.

Along with two colleagues, I was hired to create a reading test for the Civil Service examination in my city. The idea was to create a test that did not have bias against ethnic minorities. The test that had been given had people read text from police procedure manuals, which is an authentic enough task, but it favored small town (mostly white) men with police experience in those municipalities, because it measured experience with police-related information, not reading. Having taken a number of reading tests myself, I decided to find texts that were interesting and well-written, but which were outside of the experiences of most people taking the test. I chose a passage about Vietnam from a very well-written short story, a passage about the construction of the Quickie ultralight wheelchair, and a passage from writer Bailey White about the time she brought a snake into her first grade classroom. My colleagues and I interviewed test takers as we piloted our exam and found that the snake passage was very problematic for one person. He got every comprehension question wrong on that passage because he was afraid of snakes and couldn’t make himself read the text. We managed to create a test for debilitating snake phobias!

The substance of reading tests is supposedly a representation of the actual process of reading, only the purpose for reading is entirely different.
No one reads a reading test passage for pleasure (although I was trying to get there in the test I was part of creating), in part because of the context of the test, high-stakes assessment, and in part because test makers cannot anticipate the desires of each reader. What gets lost immediately in the translation between real reading and test reading are the main components of meaning-making: the desire to read and the pleasure of reading. This is not a trivial loss, either; these components are central to the reading process. Reading is not merely technique, and technique is subservient to meaning-making, except when one is reading boring test passages.

Secondarily, the reading test gets converted into a number: the score. Not just the raw score, either. The raw score gets transmogrified into a statistical system that places one person in opposition to everyone else who has taken the test. The bell curve, the “normal” distribution, is based on an opposition between “normal” (the center of the curve) and “not-normal” (the tails). Standard deviations measure the amount of not-normalness for a given test taker. If we know that it is possible that random guessing can add significantly to a person’s score, then what do standard deviations really mean? But they are numbers and we think of numbers as honest and objective. Exactly who first spoke of lies, damned lies, and statistics is uncertain, but the phrase is something to keep in mind.
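The question about guessing and standard deviations can be made concrete with a little arithmetic. The following is a minimal Python sketch, not a description of how any real test is scored. It assumes a hypothetical test taker who truly knows some of the items and guesses blindly and independently on the rest, with every option equally likely; the function name `guessing_summary` and the numbers (40 items, 20 known, 4 options) are invented for illustration.

```python
import math

def guessing_summary(n_items, n_known, n_options=4):
    """Expected raw score and its spread when unknown items are blind guesses.

    Assumes each guess is independent and each option is equally likely,
    so the number of lucky guesses follows a binomial distribution.
    """
    p = 1.0 / n_options                 # chance of a lucky guess on one item
    n_guessed = n_items - n_known       # items answered by guessing
    expected = n_known + n_guessed * p  # known items plus average lucky guesses
    sd = math.sqrt(n_guessed * p * (1 - p))  # spread due to luck alone
    return expected, sd

# Hypothetical test taker: 40 four-option items, 20 of them actually known.
expected, sd = guessing_summary(n_items=40, n_known=20)
print(f"expected raw score: {expected:.1f}, guessing SD: {sd:.2f}")
# prints "expected raw score: 25.0, guessing SD: 1.94"
```

Under these assumptions, two people with exactly the same knowledge could easily end up several raw-score points apart through luck alone, and that luck is folded into whatever standard score is later reported.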

Standardized tests are a counterintuitive way to assess people. They are based on the misconception that there is something standard about the ways in which people, in particular, use and respond to written and spoken language. They are based on the assumption that technique in reading and writing is more important than meaning-making and that technique can be measured apart from meaning-making.

Reading Comprehension

The tests purport to measure reading comprehension, so it is worth looking at that construct. Ryan (1984) states:

One's text comprehension standards will reflect his or her conception of the desired outcome of the reading process. This conception, in turn, will reflect an individual's implicit epistemological beliefs—his or her understanding of the nature of knowledge and the learning process. (p. 248)

In other words, if it is not already obvious, reading comprehension is not a simple idea, the terms of which are consistently held across even fairly well-educated people.

Ryan

Scruggs and Lifson cite studies in which the reading passages of reading comprehension tests have been left out and students were still able to answer questions correctly at a higher rate than chance would predict. They report that test takers use their own prior knowledge rather than the reading passage in order to answer questions. This further compromises what reading tests are supposed to measure. Reading tests not only measure just a subset of what reading comprehension really is, but that measure is further muddied by the construction of the test itself, where all the answers are present and wise test takers can figure out correct answers without doing the actual reading task.
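What "a higher rate than chance would predict" means can be sketched with a binomial calculation. The Python below is an illustration only: the numbers (20 four-option items, 10 answered correctly without the passage) are hypothetical and are not taken from Scruggs and Lifson, and the helper name `prob_at_least` is made up for this example. It assumes independent, equally likely guesses.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of getting k or more
    items right out of n when each answer is an independent blind guess."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical case: 20 four-option items answered with the passage removed,
# 10 of them correct. How often would pure guessing do that well?
print(prob_at_least(10, 20, 0.25))
```

With these made-up numbers, pure guessing would reach 10 or more correct only a little over 1% of the time, so a result like that points to something other than luck, such as prior knowledge or cues in the answer choices.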

Scruggs and Lifson

Duffy, Roehler, and Pearson trace the history of the understanding of reading comprehension, revealing the behavioristic roots of this construct and also how today's cognitive understanding creates a different reader from the behavioristic concept of the reader. Duffy, Roehler, and Pearson

Inference: MacCartney and Manning

Those who create search engines and other computer applications in which human beings use natural language as a basis for finding information have to deal with the concepts of inference and have done a lot towards parsing out just what inference is in the process of trying to get computers to infer. Computer culture leads to a very different inference process than human inference. For example, it is very difficult to get a computer to infer a paraphrase of a given statement. The struggle to get computers to infer demonstrates that this process is very complex and that one's schemas strongly influence the inferences a person can make. A person can be good at inferring but if that person's schemas do not match the schemas of the test maker, then the test will fail to represent that person's ability. One might argue that academic success is based on being able to infer in the way that ETS would have one infer. And yet, to say this means that academic thinking remains monolithic because the only people who are identified as talented enough for an academic education are people who have an inference pattern of a particular type.

Looking at the GRE questions, these reading tests privilege a certain kind of reading: efferent reading, in particular, with a certain type of ability to discern academic significance, the type of points that academics would get out of an article. They privilege not just reading for information but also particular types of reading and a hierarchy of information: being able to infer a hierarchy of ideas, with main ideas, ideas that support the main idea, and so forth. People who pass these tests are demonstrating not so much the ability to read per se but the ability to read academically. Since these tests get people into school, it might be thought this is not such a terrible thing. But when we eliminate people on the basis of how they read and understand what they read, then we are reducing the type of thought that comes to the foreground in academia. This means that regardless of how people look in a university, their brains function in a similar way.

Juanita is an example of a person who was an excellent reader but who would not get the same ideas out of a text as other people, so she would not have done well on the tests.

Reading tests measure a small piece of what reading is. When we use reading tests as gate keepers for teaching, learning, and education, then we are limiting those things to people who think in a particular way. We are therefore limiting the types of ideas that are available, which, considering the problems we face, is a dumb thing to do.

Reading is a complex meaning-making process that involves an interaction between two people that is typically asynchronous in time (although not by much in the case of texting). Reading tests attempt to translate this into an assessment, but too much information is lost in the translation process.