CAM, The Cultural Anthropology Methods Journal, Vol. 8, no. 1, 1996
H. Russell Bernard
University of Florida(1)
Every cultural anthropologist ought to be interested in finding or
creating texts, and in analyzing them. By "finding" texts I mean things like diaries,
property transactions, food recipes, personal correspondence, and so on. By
"creating" texts I mean recording what people say during interviews.
But by creating text, I also mean doing what Franz Boas did with
George Hunt, and what Paul Radin did with Sam Blowsnake. In 1893, Boas taught
Hunt to write Kwakiutl, Hunt's native language. By the time Hunt died in 1933,
he had produced 5,650 pages of text -- a corpus from which Boas produced most of
his reports about Kwakiutl life (Rohner 1966).
Sam Blowsnake was a Winnebago who wrote the original manuscript
(in Winnebago) that became, in translation, Crashing Thunder: An
Autobiography of a Winnebago Indian (Radin 1926). More recently, Fadwa El
Guindi (1986), James Sexton (1981), and I (Bernard and Salinas 1989), among
others, have helped indigenous people create narratives in their first
languages.
Original texts provide us with rich data -- data that can be
turned to again and again through the years as new insights and new methods of
analysis become available. Robert Lowie's Crow texts and Margaret Mead's hours
and hours of cinéma vérité footage of Balinese dance are clear examples of the value of
original text. Theories come and go but, like the Pentateuch, the Christian
Gospels, the Qur'an, and other holy writ, original texts remain for continued
analysis and exegesis.
If we include all the still and moving images created in the
natural course of events (all the television sitcoms, for example), and all the
sound recordings (all the jazz and rock and country songs, for example), as well
as all the books and magazines and newspapers, then most of the recoverable
information about human thought and human behavior is naturally-occurring text.
In fact, only the tiniest fraction of the data on human thought and behavior was
ever collected for the purpose of studying those phenomena. I suppose that if we
piled up all the ethnographies and questionnaires in the world we'd have a
pretty big hill of data. But it would be dwarfed by the mountain of
naturally-occurring texts that are available right now, many of them in
machine-readable form.(2)
One of the things I like best about texts is that they are as
valuable to positivists as they are to interpretivists. Positivists can tag text
and can study regularities across the tags. This is pretty much what content
analysis (including cross-cultural hypothesis testing) is about. Interpretivists
can study meaning and (among other things) look for the narrative flourishes
that authors use in the (sometimes successful, sometimes unsuccessful) attempt
to make texts convincing.
Scholars of social change have lots of longitudinal quantitative
data available (the Gallup poll for the last 50 years, the Bureau of Labor
Statistics surveys for the last couple of decades, baseball statistics for over
a hundred years, to name a few well-studied data sets), but longitudinal text
data are produced naturally all the time. For a window on American popular
culture, take a look at the themes dealt with in country music and in
Superman comics over the years.
Or look at sitcoms and product ads from the 1950s and from the
1990s. Notice the differences in, say, the way women are portrayed or in the
things people think are funny in different eras. In the 1950s, Lucille Ball
created a furor when she became pregnant and dared to continue making episodes
of the I Love Lucy show. Now think about almost any episode of
Seinfeld. Or scan some of the recent episodes of popular soap operas
and compare them to episodes from 30 years ago. Today's sitcoms and soaps
contain much more sexual innuendo.
How much more? If you were interested in measuring that, you
could code a representative sample of exemplars (sitcoms, soaps) from the 1950s
and another representative sample from the 1990s, and compare the codes (content
analysis again). Interpretivists, on the other hand, might be more interested in
understanding the meaning across time of concepts like "flirtation," "deceit,"
"betrayal," "sensuality," and "love," or the narrative mechanisms by which any
of these concepts is displayed or responded to by various characters.
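As a rough sketch of what such a comparison looks like once the coding is done, consider the following Python fragment. The episodes, theme tags, and counts are invented for illustration; in practice, each episode would be coded by hand against an explicit codebook before anything is counted.

    # A minimal sketch: compare how often one hand-assigned theme appears
    # in two coded samples of episodes. All tags and data are hypothetical.
    from collections import Counter

    # One list of theme tags per coded episode (invented data).
    episodes_1950s = [
        ["domestic_comedy", "slapstick"],
        ["domestic_comedy", "misunderstanding"],
        ["slapstick", "misunderstanding"],
    ]
    episodes_1990s = [
        ["innuendo", "irony"],
        ["innuendo", "misunderstanding"],
        ["irony", "innuendo"],
    ]

    def theme_rate(episodes, theme):
        """Proportion of episodes in which the theme was coded at least once."""
        hits = sum(1 for tags in episodes if theme in tags)
        return hits / len(episodes)

    for label, sample in (("1950s", episodes_1950s), ("1990s", episodes_1990s)):
        tallies = Counter(tag for tags in sample for tag in tags)
        print(label, dict(tallies), "innuendo rate:", theme_rate(sample, "innuendo"))

The comparison itself -- a difference of proportions, a chi-square test, or whatever suits the sample -- is ordinary statistics; the hard, qualitative work is deciding what counts as an instance of the theme in the first place.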
Suppose you ask a hundred women to describe their last pregnancy
and birth, or a hundred labor migrants to describe their last (or most
dangerous, or most memorable) illegal crossing of the border, or a hundred
hunters (in New Jersey or in the Brazilian Amazon region) to describe their last
(or greatest, or most difficult, or most thrilling) kill. In the same way that a
hundred episodes of soap operas will contain patterns about culture that are of
interest, so will a hundred texts about pregnancies and hunts and border
crossings.
The difficulty, of course, is in the coding of texts and in
finding the patterns. Coding turns qualitative data (texts) into quantitative
data (codes), and those codes can be just as arbitrary as the codes we make up
in the construction of questionnaires.
When I was in high school, a physics teacher put a bottle of
Coca-Cola on his desk and challenged our class to come up with interesting ways
to describe that bottle. Each day for weeks that bottle sat on his desk as new
physics lessons were reeled off, and each day new suggestions for describing
that bottle were dropped on the desk on the way out of class.
I don't remember how many descriptors we came up with, but there
were dozens. Some were pretty lame (pour the contents into a beaker and see if
the boiling point was higher or lower than that of sea water) and some were
pretty imaginative (let's just say that they involved anatomically painful
maneuvers), but the point was to show us that there was no end to the number of
things we could measure (describe) about that Coke bottle, and the point sank
in. I remember it every time I try to code a text.
Coding is one of the steps in what is often called "qualitative
data analysis," or QDA. Deciding on themes or codes is an unmitigated,
qualitative act of analysis in the conduct of a particular study, guided by
intuition and experience about what is important and what is unimportant. Once
data are coded, statistical treatment is a matter of data processing, followed
by further acts of data analysis.
When it comes right down to it, qualitative data (text) and
quantitative data (numbers) can be analyzed by quantitative and qualitative
methods. In fact, in the phrases "qualitative data analysis" and "quantitative
data analysis," it is impossible to tell if the adjectives "qualitative" and
"quantitative" modify the simple noun "data" or the compound noun "data
analysis." It turns out, of course, that both QDA phrases get used in both ways.
Consider the following table:
                               Data
  Analysis           Qualitative    Quantitative
  Qualitative             a               b
  Quantitative            c               d
Cell a is the qualitative analysis of qualitative data.
Interpretive studies of texts are of this kind. At the other extreme, studies of
the cell d variety involve, for example, the statistical analysis of
questionnaire data, as well as more mathematical kinds of analysis.
Cell b is the qualitative analysis of quantitative
data. It's the search for, and the presentation of, meaning in the results of
quantitative data processing. It's what quantitative analysts do after they get
through doing the work in cell d. Without the work in cell b,
cell d studies are puerile.
Which leaves cell c, the quantitative analysis of
qualitative data. This involves turning the data from words or images into
numbers. Scholars in communications, for example, might tag a set of television
ads from Mexico and the U.S. in order to test whether consumers are portrayed as
older in one country than in the other. Political scientists might code the
rhetoric of a presidential debate to look for patterns and predictors.
Archeologists might code a set of artifacts to produce emergent categories or
styles, or to test whether some intrusive artifacts can be traced to a source.
Cultural anthropologists might test hypotheses across cultures by coding data
from the million pages of ethnography in the Human Relations Area Files and then
doing a statistical analysis on the set of codes.
Strictly speaking, then, there is no such thing as a
quantitative analysis of qualitative data. The qualitative data (artifacts,
speeches, ethnographies, TV ads) have to be turned first into a matrix, where
the rows are units of analysis (artifacts, speeches, cultures, TV ads), the
columns are variables, and the cells are values for each unit of analysis on
each variable.
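A minimal sketch of such a matrix, in Python using the pandas library, appears below. The ad identifiers, variables, and values are hypothetical, echoing the television-ad example above.

    # Each row is a unit of analysis (a coded TV ad); each column is a
    # variable; each cell is that ad's value on that variable. Data invented.
    import pandas as pd

    coded_ads = [
        {"ad_id": "MX-01", "country": "Mexico", "consumer_age": "older",   "setting": "home"},
        {"ad_id": "MX-02", "country": "Mexico", "consumer_age": "older",   "setting": "store"},
        {"ad_id": "US-01", "country": "US",     "consumer_age": "younger", "setting": "office"},
        {"ad_id": "US-02", "country": "US",     "consumer_age": "older",   "setting": "home"},
    ]

    matrix = pd.DataFrame(coded_ads).set_index("ad_id")
    print(matrix)

    # Once the ads are reduced to this matrix, the analysis is ordinary
    # quantitative data processing -- for example, a cross-tabulation of
    # country by the age of the consumers portrayed.
    print(pd.crosstab(matrix["country"], matrix["consumer_age"]))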
On the other hand, the idea of a qualitative analysis of
qualitative data is not so clear-cut, either. It's tempting to think that
qualitative analysis of text (analysis of text without any recourse to coding
and counting) keeps you somehow "close to the data." I've heard a lot of this
kind of talk, especially on e-mail lists about working with qualitative data.
Now, when you do a qualitative analysis of a text, you interpret
it. You focus on and name themes and tell the story, as you see it, of how the
themes got into the text in the first place (perhaps by telling your audience
something about the speaker whose text you're analyzing). You talk about how the
themes are related to one another. You may deconstruct the text, look for hidden
subtexts, and in general try to let your audience know the deeper meaning or the
multiple meanings of the text.
In any event, you have to talk about the text and this
means you have to produce labels for themes and labels for articulations between
themes. All this gets you away from the text, just as surely as numerical coding
does. Quantitative analysis involves reducing people (as observed directly or
through their texts) to numbers, while qualitative analysis involves reducing
people to words -- and your words, at that.
I don't want to belabor this, and I certainly don't want to
judge whether one reduction is better or worse than the other. It seems to me
that scholars today have at their disposal a tremendous set of tools for
collecting, parsing, deconstructing, analyzing, and understanding the meaning of
data about human thought and human behavior. Different methods for doing these
things lead us to different answers, insights, conclusions and, in the case of
policy issues, actions. Those actions have consequences, irrespective of whether
our input comes from the analysis of numbers or of words.
1. This was written while I was at the University of Cologne (July 1994-July 1995). I thank the Alexander von Humboldt Foundation, the Institut für Völkerkunde at the University of Cologne, and the College of Arts and Sciences, University of Florida for support during this time.
2. The Human Relations Area Files (HRAF) consists of about one million pages of text on about 550 societies around the world. All the data on a 60-culture sample from that database are now available on CD-ROM. HRAF plans to convert the entire million-page corpus of text to machine-readable form over the next few years. The Center for Electronic Texts in the Humanities at Rutgers University is bringing together hundreds of machine-readable corpora (the Bible, all of Shakespeare's work, all the ancient Greek and Latin plays and epics). Lexis has placed the entire corpus of Supreme Court opinions on line. The list goes on and on. The conversion of text corpora to on-line databases proceeds at a breathtaking pace.