Archive: January 19th, 2009

Geoffrey Sampson: Writing Systems

[Readings] (01.19.09, 2:52 am)

This book is about the linguistics of writing. Generally, linguistics is centered on spoken language and neglects the characteristics of writing. I was interested in this book from two main perspectives. The first is the consideration of written text for analysis. The second is the purpose of developing some sort of writing system for character communication within a simulated world. An in-world language is arguably necessary (as discussed in Crawford), but is an enormously risky venture, fraught with problems and difficulties. Sampson does not provide clear answers to these questions, but does provide a vocabulary and method for thinking about them cohesively.

Linguistics has classically ignored written language in favor of speech. This division comes from several philosophical perspectives. Speech and language played an important part in the evolutionary development of parts of the brain, and this evolutionary root underscores the importance of speech to language. Writing, by contrast, is a cultural development, and its influence becomes strongest after the invention of print. Writing is still very important culturally. Sampson’s aim is to develop a linguistics of writing.

Three categories of study must be distinguished around writing: typology, history, and psychology. Typology deals with form: what types of written languages are there. The types of written languages, as played out in alphabets and such, are often determined by cultural differences. An example is the adoption of Roman versus Cyrillic alphabets in Eastern Europe. The spoken languages are similar, but the alphabets are divided along the line of the Roman Catholic and Eastern Orthodox churches. The history of writing examines how writing changes. The historical development of writing is different from the historical development of speech, as can be seen to play out in spelling, conventions, and the like. Sampson makes an interesting argument about the value of writing. The linguistics of speech avoids declarations of value to ways of speaking. It is inappropriate to argue that one spoken language is better than another, because value is determined within a culture, and cannot be asserted externally. However, writing does not face this problem. Writing is a tool like any other, and writing systems may have more or less value depending on the circumstances of their use.

Sampson introduces a vocabulary for discussing writing. “I shall use the terms script, writing-system, or orthography, to refer to a given set of written marks together with a particular set of conventions for their use.” (p. 19) Orthography has much to do with conventions beyond the actual symbols themselves. A language and a script are often conflated, but they are different. Writing is not the same as the transcription of speech, and this is due to the conventions of use. Writing operates according to different grammatical rules and conventions. Multiple scripts may be used to write one language, and one script may be used for multiple languages as well. The units of writing are graphs. Sampson argues against the use of the terms symbols, characters, letters, and the like, due to their lack of specificity. Sampson defines writing itself as a system for communicating using “permanent, visible marks”.

There are two major kinds of writing systems: semasiographic and glottographic. The former uses images with conventions of reading and interpretation. This can be translated into a spoken language, but not read directly. Semasiographic scripts are not normally understood as writing, but are pervasive in communication. Visual illustrations to convey instructions are semasiographic. More pointedly, mathematics is a semasiographic system. These are generally passed over in favor of glottographic systems. There is a lot to be said for semasiographic systems in digital media, and Crawford’s early work using sentence construction belongs in this category. It is interesting to note that semasiographic symbols may have “names”, or translations, which convey how to read the individual icon, but the entire system is still semasiographic because even with the names, the text cannot be simply read.

Sampson divides glottographic systems into logographic and phonographic subcategories. He notably eschews the term “ideographic” because it is unclear. Logographic systems are similar to semasiographic systems in that they are pictorial, but they are not meant to be interpreted or translated explicitly. Spoken language is “double articulated”, according to Andre Martinet: It articulates thoughts into units, and then provides vocal codes for these units. Thus, a written language that can be read may articulate either the vocal codes, or the units of thought themselves. A pictographic language that uses images to designate words is logographic. Phonographic scripts represent the actual phonetic units in the words, and generally letters are used to denote vocal sounds.

Systems may be classified according to a couple more principles. Systems may be motivated (iconic) or arbitrary. This difference applies to both phonetic and logographic scripts. A motivated phonetic alphabet will have like-sounding characters resemble each other, while a motivated ideographic script might have graphs which resemble the things they are supposed to represent. Systems may also be complete or incomplete (defective). Completeness relies on the capacity of the written language to carry across the range of expression in the actual language. It is relatively straightforward to see how ideographic scripts may be incomplete, but phonographic languages may be incomplete in other respects as well. English writing is unable to carry through in script the various vocal intonations that might be associated with a sentence. In human speech, intonation can carry across much important data.

Having discussed these fundamental points, Sampson reviews many different written languages. Only a few of these were really noteworthy, so I will examine those here:

The first case study is of Sumerian writing. This was developed for the highly specialized purpose of recording transactions. It is composed of both motivated and arbitrary graphs: Many transactions were written with an image denoting the object being bought or sold, and a number, the components of which are arbitrary in comparison. Because writing was specialized and intended for this very specific purpose, it is difficult to consider it incomplete. Sampson makes a brilliant analogy to computer programming. One usually does not say that a programming language is incomplete because it cannot express Tennyson. Both programming languages and Sumerian cuneiform emerged to fill particular needs. Sampson also compares the transaction writing to a kind of mnemonic, like a note that one might jot down in a calendar, which is a distillation of a sentence into its salient elements.

Consonantal writing is phonographic orthography without vowels, as is the case in Hebrew script. Generally, context is sufficient for determining the meaning of ambiguous terms. However, the script has low redundancy. The term redundancy is borrowed from Shannon and Weaver, and is a concept from information theory. “A system possessing relatively high redundancy is one where, in an average signal, the identity of any given part of the signal is relatively easy to predict given the rest of the signal. Suppose that a policeman telephones to give you details of a suspect who needs to be looked out for, but because the line is bad you hear only some of the letters and numbers as they are spelled out: you hear the suspect’s name is F*ANK DAW*ON and his car registration is OWY 9*8P.” The suspect’s name in this example is easy to determine because English names have high redundancy. Car registrations have low redundancy, so the missing digit is impossible to reconstruct. Redundancy is an important consideration in written text, as well as in the language used to communicate itself.
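To make the policeman example concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not anything from Sampson: the name list is a made-up stand-in for "names an English speaker knows", and the plate list assumes any digit is equally valid in the lost position. High redundancy shows up as a single surviving candidate; low redundancy shows up as many.

```python
import string

# Hypothetical vocabulary standing in for names a listener already knows.
KNOWN_NAMES = ["FRANK", "FRED", "FRANCIS", "DAWSON", "DAVIS", "DALTON"]

# Assumption: any digit may appear in the lost position of the plate fragment.
POSSIBLE_PLATES = ["9" + digit + "8P" for digit in string.digits]


def matches(pattern, word):
    """True if word fits pattern, where '*' marks one character lost on the line."""
    return len(pattern) == len(word) and all(
        p == "*" or p == c for p, c in zip(pattern, word)
    )


def candidates(pattern, vocabulary):
    """All entries of vocabulary that could have produced the damaged pattern."""
    return [word for word in vocabulary if matches(pattern, word)]


print(candidates("F*ANK", KNOWN_NAMES))            # one candidate survives
print(candidates("DAW*ON", KNOWN_NAMES))           # one candidate survives
print(len(candidates("9*8P", POSSIBLE_PLATES)))    # every digit still fits
```

The same test could be applied to a consonantal script: the more words that fit a vowel-less skeleton, the more the reader has to rely on context.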

In terms of alphabet and construction, Han’gul composes graphs according to phonetic differences and is clearly differentiated. Graphemes map to phonemes, and similar phonemes have similar graphemes and vice versa. Syllables are organized into larger structures through construction. The tying of these graphs together is powerful for phonetics, but for language construction, I need a semantic system for developing a composed language.
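The compositional nature of Han’gul is also visible in how computers encode it. The sketch below is my own illustration and does not come from Sampson: it uses the Unicode arithmetic for precomposed Han’gul syllables, where a syllable block is computed from the positions of its initial consonant, vowel, and optional final consonant. The function name `compose` is just a label I picked.

```python
# Jamo in the order Unicode uses for composing precomposed syllables.
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"          # 19 initial consonants
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"       # 21 vowels
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28 incl. none

SYLLABLE_BASE = 0xAC00  # code point of the first precomposed syllable


def compose(initial, medial, final=""):
    """Build one syllable block from its component graphs."""
    index = (
        (INITIALS.index(initial) * len(MEDIALS) + MEDIALS.index(medial)) * len(FINALS)
        + FINALS.index(final)
    )
    return chr(SYLLABLE_BASE + index)


# The two syllables of the script's own name.
print(compose("ㅎ", "ㅏ", "ㄴ") + compose("ㄱ", "ㅡ", "ㄹ"))
```

This only shows how graphs combine into syllable blocks. It says nothing about meaning, which is the part I would still need for a composed in-world language.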

Reading Info:
Author/Editor: Sampson, Geoffrey
Title: Writing Systems: A Linguistic Introduction
Tags: specials, media traditions, narrative, linguistics

N. Katherine Hayles visits LCC

[General,Talks] (01.19.09, 12:02 am)

Notable scholar of literature and new media, Katherine Hayles, visited us in LCC last Thursday. Her presentation was about electronic literature, and about the practice of academic study of the humanities. The presentation was posed as a conflict between traditional and digital humanities. The traditional humanities are slow to understand the digital, but the digital must be able to build from the foundation of the traditional. There are tacit and implicit differences between the two disciplines, indicating shifts and differences in modes of thinking. The primary differences occur along the lines of scale, visualization, collaboration, database structures, language and codes, as well as a few others. Hayles’ research was conducted by interviewing several new digital humanities scholars.

The most notable difference is the idea of scale. This relates to the sheer physical limitations on the capacity of the researcher to read the domain of study. Digital technology enables a shallow analysis of a broad corpus of text. The example is of 19th century fiction. A scholar will have read around 300 to 500 texts, but these texts are atypical: they are notable works, read precisely because they stand out. The nature of research, the questions, and the conclusions change when a quantitative analysis is possible, when it is possible to look at thousands of texts at a distance.

Franco Moretti proposes reading texts at the greatest distance possible. Hayles described this as “throwing down the gauntlet to traditional humanities,” whose approach has been to do deep reading, looking within texts to understand psychology, allusions, and connections. Moretti attempts to read texts as assemblies, breaking them into pieces, without ever reading a whole text. This is a dramatic change in method, and comes across as wildly controversial. It is notable that Moretti does have experience with traditional practice, and is well read and familiar with the corpus. He is able to employ this approach precisely because of this familiarity. Moretti focuses on analyzing texts in terms of devices, themes, tropes, genres, or systems. The practice of analysis amounts to a kind of distant statistical profiling. Moretti analyzes how genres are born and die, tracing genres which have passed, such as epistolary and gothic novels. Moretti’s conclusion is that genres die because their readers die (not necessarily literally, but in the sense that they move on to other material).

Another question is how to tell when technology platforms emerge. Hayles’ example is Tim Lenoir. He makes the claim that algorithmic processing of text counts as a form of reading. Lenoir’s project traces citations among a set of scientific papers. These citations form a network that defines the relationships among the papers. This is interesting because the analysis is of material entirely contained within the texts themselves, and does not actually analyze works in terms of some external system of values. The claim that this analysis is reading is inflammatory in the traditional humanities, where reading is a hermeneutic activity focused on interpretation. The problem is that the traditional understanding of reading is wedded to comprehension. Lenoir argues that, at a wide scale, textual meaning is less important, and that what is really interesting are the data streams.
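To get a feel for what tracing citations as a network might look like, here is a minimal sketch in Python. The paper names and citation lists are invented for illustration, and this is not Lenoir's actual method or tooling, which I have not seen. It only shows that the structure can be built from the texts' own references, without interpreting what any paper says.

```python
from collections import defaultdict

# Hypothetical data: each paper maps to the papers it cites.
CITATIONS = {
    "paper_a": ["paper_b", "paper_c"],
    "paper_b": ["paper_c"],
    "paper_c": [],
    "paper_d": ["paper_c", "paper_a"],
}


def cited_by(citations):
    """Invert the citation lists: for each paper, the set of papers citing it."""
    incoming = defaultdict(set)
    for paper, references in citations.items():
        for reference in references:
            incoming[reference].add(paper)
    return incoming


def most_cited(citations):
    """Papers ordered by how many others cite them (ties broken by name)."""
    incoming = cited_by(citations)
    return sorted(citations, key=lambda paper: (-len(incoming[paper]), paper))


print(most_cited(CITATIONS))
```

Nothing here involves comprehension of the papers, which is exactly why calling it "reading" is contentious.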

In common with Moretti, Lenoir is interested in finding patterns. Patterns do not require primary investment in meaning. The traditional humanities are instead interested in hermeneutic interpretation, which is bound tightly to meaning. These two perspectives are opposed, but Hayles is interested in linking patterns with hermeneutic reading, finding some form of common ground on which each may build on the other.

One such example of a work which uses both strategies is Tanya Clement’s analysis of Gertrude Stein’s “The Making of Americans.” This text is a traditional narrative through the first half, but at some point in the middle, the narrative breaks down and becomes virtually unreadable. The text at that point is composed of frequently repeated phrases, content which is essentially an anti-narrative. A deep reading of such a text is difficult or impossible because of the very structure of the text itself. An analysis of pattern is necessary to draw meaningful conclusions. Clement’s analysis finds that the text contains repeated 490-word sequences, where only a few words within these sequences vary. The analogy is made to the notion of character, as character is repetition with only slight variation. This is a way of understanding the text which is arguably very valuable, but would be impossible without pattern analysis.
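A minimal sketch of the simplest version of this kind of pattern analysis, in Python. This is not Clement's method: it only finds exactly repeated word sequences of a fixed length, while her finding concerns long sequences in which a few words vary. The sample sentence is something I made up in a repetitive style, not a quotation from Stein.

```python
from collections import Counter


def repeated_sequences(text, length):
    """Count every run of `length` consecutive words; keep those seen more than once."""
    words = text.lower().split()
    runs = Counter(
        tuple(words[i:i + length]) for i in range(len(words) - length + 1)
    )
    return {" ".join(run): count for run, count in runs.items() if count > 1}


# Made-up sample in a repetitive style (not an actual quotation).
SAMPLE = "any one is one being living any one is one being loving"

print(repeated_sequences(SAMPLE, 4))
```

Finding near-repeats, where a word or two changes, would need some tolerance for mismatches between runs, which is where the interesting part of the analysis presumably lies.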

The traditional humanities are usually solitary, involving a deep communion between the reader and the text. Networked culture is interested in collaborative approaches to study, and when applied to the study of texts and narrative, comes with a shift of assumptions in how to approach a text. One way of looking at this is in scale of participation, but another approach is to break up a text and treat it as a database. An example is David Lloyd’s project “Irish Mobility,” which chops up prose to remove references of subordination and cooperation. The resulting material is then embedded into a database. This allows the user to “refactor” the content. The resulting piece becomes harder to read, but arguably the content is more meaningful. The resulting form is fragmentary hypertext, and gives the user control over the narrative.

Hayles gives a few examples of database projects used in education, where students build on each other’s work, and the work is published. Thus, their work continues to live beyond the class, and is valuable for sharing and feedback. These projects are less interested in representation, and more interested in communication and distribution.

Regarding language and code, Hayles gives a few examples. A succinct quote comes from Tanya Clement: “Software is an exterioralization of desire.” The writer of software must give an exact articulation of what the computer must do, without relying on tacit knowledge. Modifying code is generally easier than modifying tacit knowledge, and once created, code is also easier to observe because it is actually written and visible. Tacit assumptions are by their very nature concealed. This is not to say that digital systems are always explicit about their values, but they more clearly formulate their models, and thus the values are more concretely established within the system.

Disciplines are formed by the violence of exclusion, according to Weber. Disciplines achieve legitimacy by constructing boundaries. On one side of this boundary is placed the material which “belongs” in the discipline, and on the other side is that which is excluded. This process occurs with astronomy and astrology: One side is given legitimacy while the other is denied it. The legitimacy of traditional humanities is threatened by digital humanities, which is outside of the boundaries of the traditional in many senses.

We were not able to extensively discuss the relationship between language and code because the presentation was beginning to run out of time. The relationship between digital and traditional humanities is construed as a conflict. Hayles’ goal is to find a reconciliation between these two. However, the examples described are primarily data-oriented approaches to texts and literature. The approaches of pattern analysis and interpretive hermeneutics presuppose an inherent content-related difference in the reading of texts. I think that it would be useful to have a more process-oriented approach that focuses on the system rather than the structure of narrative. A common ground might be found in considering that both hermeneutics and the digital are dependent on process.