Darwin and Digital Code
From the Origin of Species to the Origin of Speeches
During World War II, Bell Telephone Laboratories worked on the problem of secure voice communications for the United States Army. In 1943, it launched the SIGSALY system, also known as “Project X.” SIGSALY (a fake acronym) enabled thousands of secret telephone calls among Winston Churchill, Franklin Roosevelt, General Douglas MacArthur and other military leaders. Today, the encryption system is hailed as a starting point for digital communication, the first successful demonstration of pulse-code-modulated (PCM) speech. With PCM, patterns of electrical pulses or numbers “describe” the amplitudes of samples from a speech wave. Instead of carrying speech itself (or an analogous electrical wave), telephone wires and wireless radio links transmit only the parameters of the speech signal.
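The idea that numbers can “describe” the amplitudes of samples may be easier to see in a small sketch. The Python below is only an illustration, not a reconstruction of SIGSALY: the function names, the eight quantization levels, and the test tone are all assumptions chosen for brevity.

```python
import math

def pcm_encode(samples, levels=8, lo=-1.0, hi=1.0):
    """Replace each amplitude with the index of the nearest of `levels`
    evenly spaced steps between lo and hi (illustrative values)."""
    step = (hi - lo) / (levels - 1)
    codes = []
    for x in samples:
        clipped = min(max(x, lo), hi)   # keep the amplitude inside the range
        codes.append(round((clipped - lo) / step))
    return codes

def pcm_decode(codes, levels=8, lo=-1.0, hi=1.0):
    """Turn step indices back into approximate amplitudes."""
    step = (hi - lo) / (levels - 1)
    return [lo + c * step for c in codes]

# Hypothetical input: one cycle of a pure tone, sampled eight times.
samples = [math.sin(2 * math.pi * n / 8) for n in range(8)]
codes = pcm_encode(samples)      # small integers: this is what gets sent
rebuilt = pcm_decode(codes)      # the receiver's approximation of the wave
```

Only the list of integers in `codes` would need to travel over the line. The receiver recovers each sample to within half a quantization step, which is the trade between fidelity and economy that PCM makes.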
A central component of the SIGSALY system was the vocoder, or “VOice-CODER,” a device best known for generating “robotic” voices in electronic music. Bell engineer Homer Dudley originally designed the vocoder in the 1920s as a tool for speech compression—to reduce the bandwidth, and hence the expense, of telephone transmission. The vocoder “analyzed” or separated telephone speech into frequency bands, sampled and quantized each band, and then remade the speech at the receiver. “In ordinary telephony,” Dudley explained to the readers of Bell Laboratories Record in 1936, “we move a sound wave electrically from one point to another by direct transmission but in the synthesizing process, only the specifications for reconstructing the sound wave are directly transmitted.”
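The analysis half of this process, separating a sound into frequency bands and reducing each band to a number, can be sketched in a few lines of Python. This is a loose illustration under stated assumptions: a discrete Fourier transform stands in for Dudley's analog filters, and the four bands, the 64-sample frame, the 16 quantization steps, and the function names are all hypothetical choices rather than details of the vocoder itself.

```python
import cmath
import math

def band_levels(frame, n_bands=4):
    """Average spectral magnitude in each of n_bands equal-width bands.
    A plain DFT is used here in place of analog band-pass filters."""
    n = len(frame)
    half = n // 2
    magnitudes = []
    for k in range(half):
        total = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        magnitudes.append(abs(total) / n)
    width = half // n_bands
    return [sum(magnitudes[b * width:(b + 1) * width]) / width
            for b in range(n_bands)]

def quantize(levels, steps=16):
    """Scale the band levels to the strongest band and round each one
    to one of `steps` integer codes (0 .. steps - 1)."""
    peak = max(levels) or 1.0   # avoid dividing by zero on silence
    return [round(v / peak * (steps - 1)) for v in levels]

# Hypothetical frame: a strong low tone plus a weaker, higher tone.
n = 64
frame = [math.sin(2 * math.pi * 5 * t / n)
         + 0.5 * math.sin(2 * math.pi * 20 * t / n) for t in range(n)]
levels = band_levels(frame)
codes = quantize(levels)   # four small integers describe the frame
```

Here 64 samples shrink to four integers, one per band. A receiver would use such numbers as “specifications” for rebuilding a similar sound; that synthesis step is omitted from the sketch.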
Dudley based the vocoder’s sampling mechanism on the theory that concrete and quantifiable “gestures” lay beneath speech. His publications often referred to lip-reading, artificial larynges, and speaking automata as evidence for the viability of speech compression. Each of these fields had already demonstrated that speech could be divided into a sound stream and a “message,” imprinted onto the breath by the movements of the lips, tongue, teeth and other vocal organs. These speech “codes,” or their numerical descriptions, might be transmitted with great efficiency over a telephone line.
Dudley acknowledged a debt to one of his contemporaries from England, Sir Richard Arthur Surtees Paget. Baronet and barrister, born in Somersetshire in 1869, Paget was one of the last “gentlemen scientists” to have substantial influence. In childhood and adolescence he met all the unofficial benchmarks for a promising speech researcher—possessing “perfect pitch”; performing “one man duets” (by whistling and singing, or whistling and humming, at the same time); and teaching his black poodle, Pompey, to utter a few words.
His life’s work was given over to debunking venerable notions of orality. In a lecture to the Oxford University Anthropological Society in 1934, Paget explained, “Speech is usually described as a system of significant sounds by which we communicate ideas—but a very little reflection will show that this is a mistake…It is the gestures that we make with our tongue, lips, etc., which carry the meaning of speech. The sounds are only consequences by which we (subconsciously) recognize the gestures.” The role of sound, Paget believed, was to convey emotion, as in the mating calls of birds, the songs of crickets, and the tones of human voices. Speech “gestures,” on the other hand, conveyed information.
Paget marshaled a wide range of evidence to support this gesture theory. For one thing, he knew that many deaf people were able to communicate through lip-reading. For another, he observed firsthand the actions of the vocal organs. At the 1932 International Phonetic Conference, he peered down phonetician George Oscar Russell’s throat with a “pharyngeal periscope.” In a subsequent lecture to the Physiological Society at University College London, he recalled “the extreme activity of his larynx and surrounding parts…during normal speech. There was a different ‘attitude’ at each change of vowel – there was a different attitude in forming the so-called voiced and unvoiced consonants…even when all these consonants were being whispered.”
Paget also argued from evolution: articulation must have emerged from a “primitive” sign language. In The Descent of Man, Charles Darwin had insisted that complex language distinguished the human species—and was subject to the laws of evolutionary theory. Other animals, of course, communicated through sound, but humans had the “mental faculties” for “connecting definite sounds with definite ideas.” Different human languages, like so many behaviors, had a common basis in biology. “I was incessantly struck whilst living with the Fuegians on board the Beagle,” Darwin reflected, “with the many little traits of character, shewing how similar their minds were to ours.” Certain facial expressions, emotional vocalizations, and gestures seemed even to be universal.
Darwin described gesture as a longstanding accompaniment to vocalization, and “gesture-language” as analogous to speech. “I cannot doubt that language owes its origin to the imitation and modification, aided by signs and gestures, of various natural sounds, the voices of other animals, and man’s own instinctive cries…As bearing on the subject of imitation, the strong tendency in our nearest allies, the monkeys, in microcephalous idiots, and in the barbarous races of mankind, to imitate whatever they hear deserves notice.” Some gestures, such as shrugging, were almost certainly innate. Yet Darwin cited the theory of Edward Burnett Tylor, anthropologist of ancient Mexico, which held that most gestural signs were related to nature through instinctual mimicry, and then condensed, exaggerated, or otherwise altered over time. In Researches into the Early History of Mankind and the Development of Civilization, Tylor argued, “The Indian pantomime and the gesture-language of the deaf-and-dumb are but different dialects of the same language of nature.” Speech had evolved—somehow—as an improvement upon this “original language of man.”
In The Expression of the Emotions in Man and Animals, Darwin introduced the concept of “serviceable associated habits,” referring to behaviors that become habitual under certain conditions, but then can be triggered in new settings by mood or mental state. In a related phenomenon, Darwin argued, “There are other actions which are commonly performed under certain circumstances, independently of habit, and which seem to be due to imitation or some sort of sympathy. Thus a person cutting anything with a pair of scissors may be seen to move their jaws simultaneously with the blades of the scissors. Children learning to write often twist about their tongues as their fingers move, in ridiculous fashion.” Based on this last model, Paget insisted upon a more central, causal role for gesture in the evolution of language: early humans communicated via “pantomime,” unconsciously making the same gestures with their mouths; eventually, they dedicated their hands fully to labor and spoke orally instead. In his lectures on this topic, he often exclaimed, “Darwin has not only given us the origin of species, but also the origin of speeches!”
Paget built a range of “artificial talkers” to test his gesture-theory of speech. His minimalist Cheirophone, for instance, consisted of a vibrating reed held within the resonator of his own clasped hands, fed by an airstream. With three fingers simulating the tongue, and thumb and forefinger the lips, the Cheirophone was able to create most English sounds. It spoke the sentences “Hullo London, are you there?” and “Oh, Lilah, I love you” to radio audiences in England and America, demonstrating that hand and mouth might perform the same gestures. To identify the fundamental vocal posture for each vowel, Paget built other models from clay and plasticine. In a 1922 article, “The Origin of Speech—A Hypothesis,” he went further and asked: if hands and clay could “talk,” could not electrical circuits be designed with the appropriate patterns of resonance to synthesize speech?
While the gestural origins of speech are now considered questionable, and the notion that sign language is “primitive” has been discredited, telephone engineers of the early twentieth century invested heavily in Paget’s theory. Harvey Fletcher, who directed the speech and hearing research at Bell Laboratories while Dudley was assembling the vocoder, opened his textbook Speech and Hearing with a discussion of the gestural “mechanism of speaking”:
It is very probable that such signs, gestures, and expressions of the face were used before the evolution of the spoken language had progressed very far. According to some philologists, the vocal sounds of very primitive people were exclamatory and song-like and used mainly to express emotion. Sound mimicking nature came to designate certain things connected with the thing imitated. As man’s power of analysis developed, the sounds gradually developed into spoken words having definite meanings. According to Sir Richard Paget, human speech began by the performance of sequences of simple pantomimic gestures of the tongue, lips, etc., comparable with the natural gestures (of hands, etc.) which are still made by deaf mutes, and that these gestures were made audible by breathing or grunting.
“Signs” were at once symbolic and wonderfully concrete; mouth gestures could be described, quantified, and coded. Speech could be compressed by separating out these gestures from audible sound. Placed at the foundation of oral communication, gesture was at once universal and “primitive.” For the engineer, this suggested that communication was inherently amenable to translation between media—and it was perfectible, open to a modern and efficient re-tooling.
Spring 2009, Volume VIII, Number 3
Mara Mills is a Mellon Postdoctoral Fellow at the University of Pennsylvania. She earned a Master’s degree in Biology from Harvard in 2006, followed by her Ph.D. in History of Science in 2008.