Erving Goffman: Forms of Talk

The introduction frames the book as a largely experimental work. The subject is talk in a general sense, and the content poses models of a few particular types of talk. Goffman's goal is not for these to be seen as definitive and final, but rather as possibilities. I would venture to call the models that Goffman puts forth prototypes. The general aim is to think about how to understand the dense layers of meaning in talk. Talk in person is laden with many forms of communication beyond language, including glances, posture, intonation, and so on. The idea is to understand what these are, but also to understand how people can decipher all these signals in the first place. Three matters are important: ritualization, a participation framework, and the fact that words are frequently not our own. These dimensions are characteristics of dramaturgy. The essence of this is that talk contains the requirements of theatricality.

Replies and Responses

The subject of this chapter is the schema of reply and response. The schema is initially very simple, but grows complex once one recognizes the layers of context and embedding that take place within replies. Responses depend on the frame of the question, and their meaning cannot be understood apart from it. Goffman at first uses examples of questions and answers; of particular interest are layered responses, where the person who was asked the question must pose another question in order to give the response. Alternatively, some implicit contexts might be assumed and simply left unstated. An example of this is a simple diner script: (p. 8, but it comes from Marilyn Merritt)

A: “Have you got coffee to go?”
B: “Milk and sugar?”
A: “Just milk.”

In this brief example, B’s response not only implicitly answers A in the affirmative, but also suggests a state change. A is not asking a question for information, but rather is asking for service. Hearing the request, B moves to fulfill it.

Goffman explains that there are three types of listeners in conversation: those who overhear; those who are ratified participants, but who are not specifically addressed; and those ratified participants who are specifically addressed. The system of participants is again suggestive of theatrical models. The assumption with talk, though, is that its central goal is to communicate, and for the listeners to correctly understand what the speaker means, whether or not they agree with what was said. Speech uses many cues to provide feedback to confirm understanding. This is interesting in relation to games, where, when the player communicates with non-player characters, comprehension is treated as a given (not necessarily rightly so).
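
For my own simulation purposes, the three listener types can be sketched as a small classification. This is only a minimal sketch under my own assumptions: the names `ListenerStatus`, `Utterance`, and `listener_status` are hypothetical, not Goffman's, and the model assumes that ratification is a fixed set per encounter and ignores the speaker's own status.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class ListenerStatus(Enum):
    """The three listener types from the notes above (labels are my own)."""
    OVERHEARER = "overhearer"              # within earshot, but not ratified
    UNADDRESSED = "ratified, unaddressed"  # ratified, but not the target
    ADDRESSED = "ratified, addressed"      # ratified and specifically addressed


@dataclass
class Utterance:
    speaker: str
    text: str
    addressed: set[str] = field(default_factory=set)  # who the speaker targets


def listener_status(name: str, utterance: Utterance, ratified: set[str]) -> ListenerStatus:
    """Classify one listener relative to one utterance.

    Assumption: `ratified` is the set of people officially in the encounter.
    """
    if name not in ratified:
        return ListenerStatus.OVERHEARER
    if name in utterance.addressed:
        return ListenerStatus.ADDRESSED
    return ListenerStatus.UNADDRESSED
```

In a game, a non-player character could use a status like this to decide whether a reply is expected of it at all, which is at least a step away from treating comprehension as a given.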

Talk is presented, initially, as a communication system, in terms of transmitting and receiving messages. Feedback occurs on a “back channel.” In this system, the two-part exchange of question and response is a natural form. The communication-based model is a case in which the theory shapes the interpretation of communications. Later on, Goffman explores examples which are very challenging to the communication model. Goffman suggests an approach to this model that formats exchanges as statements and replies, rather than questions and answers.

For the speaker, the communication model suggests a protocol of gestures and pauses. The general effect of these is to bracket the talk, so that it is clear what each statement means, and what its frame and context are. The full channels model is beyond my needs, but is remarkably thorough. Goffman suggests several requirements for talk in this model: (pp. 14-15)

  1. A two-way capability for transceiving acoustically adequate and readily interpretable messages.
  2. Back-channel feedback capabilities for informing on reception while it is occurring.
  3. Contact signals: means of announcing the seeking of a channeled connection, means of ratifying that the sought-for channel is now open, means of closing off a theretofore open channel. Included here, identification-authentication signs.
  4. Turnover signals: means to indicate ending of a message and the taking over of the sending role by the next speaker. (In the case of talk with more than two persons, next-speaker selection signals, whether “speaker selects” or “self-select” types.)
  5. Preemption signals: means of including a rerun, holding off channel requests, interrupting a talker in progress.
  6. Framing capabilities: cues distinguishing special readings to apply across strips of bracketed communication, recasting otherwise conventional sense, as in making ironic asides, quoting another, joking, and so forth; and hearer signals that the resulting transformation has been followed.
  7. Norms obliging respondents to reply honestly with whatever they know that is relevant and no more.
  8. Nonparticipant constraints regarding eavesdropping, competing noise, and the blocking of pathways for eye-to-eye signals.

Talk is not all about pure and raw communication. Talk is heavily dependent on social codes and other conventions around politeness, etiquette, privacy, and so on. Communication may contain layers of subtext, for instance, a greeting that is meant to invite further conversation or to close it off. To account for these subtexts, it is necessary to see talk as having a ritual form, or as being composed of ritualized interchanges. The system of ritual concerns works by imposing a set of constraints on allowable actions and behaviors. Goffman gives three primary points for ritual communication (p. 21), summarized here.

  1. The speech act makes implications about the character of the speaker and his relationship to the listeners.
  2. Offensive or potentially offensive actions may be ameliorated by apologies, but these must be acknowledged as acceptable to the listener.
  3. Offended parties must give a sign that offense has been made, otherwise they are enabling a lapse of the ritual code.

Addressing the problem of dialogic analysis, Goffman turns to the question of units. What are the units of conversation? Classical linguistics looks at sentences, but a more general term is needed (not least because many utterances are not sentences). Goffman considers the idea of a “turn,” which means an entire period of speech, but settles instead on the idea of a “move.” Both of these terms are associated with games, and lend a certain game-like quality to the model of talk, something which is encouraged in the text.

Goffman is critical of the noncontextual approach to conversation, which is normally introduced in looking at replies and responses. The noncontextual approach is reminiscent of the classical models of linguistics and cognition, where the person is a frontend for a database of known facts. Goffman emphasizes the primacy of context in the comprehension of interactions. The model of statement and reply does not adequately account for the process of communication, only its content. So, Goffman suggests instead that conversation is a system of responses, where each statement is a response to the context which has been induced by the last move. Goffman gives four bullets describing the properties of responses (p. 35). These are well worth considering in the conversation simulation projects.

  1. They are seen as originating from an individual and as inspired by a prior speaker.
  2. They tell us something about the individual’s position or alignment in what is occurring.
  3. They delimit and articulate just what the “is occurring” is, establishing what it is the response refers to.
  4. They are meant to be given attention by others now, that is, to be assessed, appreciated, understood at the current moment.
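
As a note toward the conversation simulation projects, the properties above might be sketched as a record type. This is a rough sketch under my own assumptions: `Move` and `context_chain` are hypothetical names, the free-text `alignment` field is a stand-in for property 2, and property 4 (the claim on attention at the current moment) is not modeled at all.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Move:
    speaker: str                       # property 1: originates from an individual
    content: str
    alignment: str                     # property 2: the speaker's position or stance
    refers_to: Optional[Move] = None   # properties 1 and 3: the prior move it answers


def context_chain(move: Move) -> list[Move]:
    """Return the sequence of moves leading up to `move`, earliest first."""
    chain = []
    current: Optional[Move] = move
    while current is not None:
        chain.append(current)
        current = current.refers_to
    chain.reverse()
    return chain


# The diner script, with each move responding to the one before it.
ask = Move("A", "Have you got coffee to go?", "requesting service")
offer = Move("B", "Milk and sugar?", "fulfilling the request", refers_to=ask)
reply = Move("A", "Just milk.", "accepting service", refers_to=offer)
```

The point of `refers_to` is that a move carries its context with it, rather than being interpreted as a free-standing statement.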

Goffman begins to challenge the primacy of the statement, and then the entire communication-based model. He switches to the idea that talk is simply a sequence of response moves in reference to each other. He emphasizes the idea of context and social setting as fundamental: “So, too, we would be prepared to appreciate that the social setting of talk not only can provide something we call ‘context’ but also can penetrate into and determine the very structure of the interaction.” (p. 53)

Finally, Goffman explores an interactional view of talk: “What, then, is talk viewed interactionally? It is an example of the arrangement by which individuals come together and sustain matters having a ratified, joint, current, and running claim upon attention, a claim which lodges them together in some sort of intersubjective mental world.” (p. 71) Tellingly, he uses the analogy of games. The difference is that the moves of conversation are not composed of tokens and positions, but utterances and other nonverbal cues. Statements and responses may be seen as deriving from moves, not the other way around.

Response Cries

The subject of this chapter is “response cries,” which are exclamations that one might give in response to oneself. Examples are things such as “hmm,” “ow!,” “ooh,” and so on. These are analyzed in detail. These kinds of expressions are not merely situated, they are situational. They are indicators of one’s own mental and physical state, partly aimed at others, to serve as indicators regarding potential interactions. These are thus theatrical in nature.


Footing

The focus of this paper is the concept of “footing”: how the mode and frame of conversation are determined, and how that is controlled (or not) by participants. Goffman’s first example is a transcript of President Nixon teasing and embarrassing a female news reporter, shifting the ground from a serious and official mode to a sexual one where the reporter is disempowered. Footing is important for the general understanding of reference (bracketing), and also for the role of power, which strongly relates to Johnstone’s status.

Instances of footing changes are conversational shifts. Examples of shifts are given here: (p. 127)

  1. direct or reported speech
  2. selection of the recipient
  3. interjections
  4. repetitions
  5. personal directness or involvement
  6. emphasis
  7. separation of topic and subject
  8. discourse type, e.g., lecture and discussion

Footing is important in fiction, where, as in social interaction, it is represented by various cues. In fiction and text it cannot be communicated subtextually through signs of gaze and posture. These signals exist, but they must be explained and raised to the level of the surface text. However, fiction has authorial shifts in voice and focus, and these, especially in the use of different forms of speech (free indirect speech, for example), are closer indicators of footing changes. Footing changes are emphasized in film, and are often accompanied by special cuts that draw attention to the shift. The medium that would probably be most adept at showing games how to communicate footing (in an interactional sense) is comics, which are able to represent content with both text and images, with a great deal of simplification.

Goffman lists several qualities of footing in an attempt to give a definition: (p. 128)

  1. Participant’s alignment, or set, or stance, or posture, or projected self is somehow at issue.
  2. The projection can be held across a strip of behavior that is less long than a grammatical sentence, or longer, so sentence grammar won’t help us all that much, although it seems clear that a cognitive unit of some kind is involved, minimally, perhaps, a “phonemic clause.” Prosodic, not syntactic, segments are implied.
  3. A continuum must be considered, from gross changes in stance to the most subtle shifts in tone that can be perceived.
  4. For speakers, code switching is usually involved and if not this then at least the sound markers that linguists study: pitch, volume, rhythm, stress, tonal quality.
  5. The bracketing of a “higher level” phase or episode of interaction is commonly involved, the new footing having a liminal role, serving as a buffer between two more substantially sustained episodes.

The usual vocabulary of channels, which denotes only a speaker and a hearer, misses the other cues and the other types of relationships that occur between participants: proximity, touch, gaze, and so on. These sorts of elements are crucial in footing, and are incidentally a substantial part of filmic language.

The first dimension elaborates the relationship between the speaker, the addressed recipient, and the bystanders. This framing reveals the complexities of these interactions, especially as bystanders may talk and communicate amongst themselves, or interact in various ways across these levels. Goffman explains these types of interactions as byplay, crossplay, sideplay, and collusion. The act of speaking involves more than just the speaker and receiver: “The point of all this, of course, is that an utterance does not carve up the world beyond the speaker into precisely two parts, recipients and non-recipients, but rather opens up an array of structurally differentiated possibilities, establishing a participation framework in which the speaker will be guiding his delivery.” (p. 137)

Goffman looks at the different modes of the speaker as: the animator, the author, and the principal. These roles each entail a different relationship between the speaker and the actual activity and content of speech. The animator is the dynamic dimension of the speaker in action, with the emphasis on the delivery and performance of speaking. The author links the speaker as the originator of the words that are encoded. The point of the principal is to stand, not as merely the speaker of words, but rather as the authority or the one whose position is established and identified by the words. The principal is not necessarily an authority figure, but rather someone who is committed to the words and who makes a connection between himself or herself and the words spoken. Together, these form the “production format” of the utterance.
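
The three roles can be written down as a simple record, which makes it easy to see when they come apart. This is a hedged sketch of my own: `ProductionFormat` and `speaks_own_words` are hypothetical names, and the press-briefing example is invented for illustration rather than taken from the book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductionFormat:
    animator: str   # who physically delivers the words
    author: str     # who composed the words
    principal: str  # whose position the words commit

    def speaks_own_words(self) -> bool:
        """True when one person fills all three roles."""
        return self.animator == self.author == self.principal


# Invented example: a spokesperson reads a statement drafted by a
# speechwriter on behalf of an official.
briefing = ProductionFormat(
    animator="spokesperson",
    author="speechwriter",
    principal="official",
)
casual_remark = ProductionFormat("Ann", "Ann", "Ann")
```

Here `briefing.speaks_own_words()` is false and `casual_remark.speaks_own_words()` is true, which is one way to make concrete the earlier point that words are frequently not our own.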

Changing footing is not as simple as mechanically dropping one context and assuming another; rather, the old context is held in abeyance, with the potential to be reengaged.
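
One way to model a footing that is held in abeyance is with a stack, where a shift pushes a new footing and the earlier one stays available underneath. This is my own reading, not a structure Goffman proposes: `FootingStack` and its method names are hypothetical, and a strict stack is likely too rigid for real talk, where suspended footings may be dropped or resumed out of order.

```python
from __future__ import annotations


class FootingStack:
    """Tracks the current footing while keeping earlier ones suspended."""

    def __init__(self, initial: str) -> None:
        self._stack = [initial]

    @property
    def current(self) -> str:
        return self._stack[-1]

    @property
    def suspended(self) -> list[str]:
        """Footings held in abeyance, oldest first."""
        return self._stack[:-1]

    def shift(self, new_footing: str) -> None:
        """Take up a new footing without discarding the old one."""
        self._stack.append(new_footing)

    def resume(self) -> str:
        """Re-engage the most recently suspended footing, if there is one."""
        if len(self._stack) > 1:
            self._stack.pop()
        return self.current
```

For example, starting from an "official business" footing, a shift to "teasing aside" leaves the official footing suspended, and a later `resume()` returns to it.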

