Animal Behavior Laboratory Manual
ANALYZING ANIMAL SOUNDS


The study of animal vocalizations and sounds first became important about
1950 following the development of portable tape-recorders and sound
spectrographs. These instruments allowed scientists to record and
then to analyze in detail the structure of sounds.
How Animals Make Sounds
Many vocalizations and other sounds produced by animals are specialized
for communication with other individuals of the same species.
Animals use a multitude of different mechanisms to produce sounds.
Mammals use two thin folds in the larynx (vocal cords) to produce sound.
Small muscles control the tension on the vocal cords and thus the
timing and pitch of the sounds. The cavities in the throat, mouth,
and nose sometimes resonate and thus modify the nature of the sounds
issuing from the animal's mouth.
Birds produce their songs by means of a special organ (the syrinx) which
consists of two (or in some species four) thin membranes in the wall of
the bronchi or lower trachea. As many as twelve pairs of tiny
muscles control the tension of these membranes. Birds can produce
exceptionally pure whistles with rapidly changing pitch and intricate
timing.
Frogs have a larynx with vocal cords, although the structures are not
homologous with those of mammals. Crickets scrape a row of small teeth
(a file) on one wing against a hardened edge (a scraper) on the other.
There are many other mechanisms of sound production by animals.
The important point is that, regardless of the exact mechanism for sound
production, the detailed structure of a sound reflects the precise
neuromuscular coordination that produces it. Analysis of sounds
thus provides a way to compare neuromuscular coordinations in detail.
It provides an opportunity to study the structure of behavior with
great precision and convenience.
Sound and Sound Frequencies
Sound is a pressure wave, in which the molecules of the medium (air or
water) move minute distances to create rapidly alternating higher and
lower pressure. These changes in pressure are transduced by a
microphone into changes in electrical voltage. The amplitude of the
changes in pressure determines the intensity (our sensation of the
loudness) of the sound. The rate of the changes (cycles/second or
hertz, abbreviated Hz) is the frequency of the sound. The
frequency of sound determines our sensation of pitch.
A sound that consists of a sinusoidal change of pressure at a constant
frequency is a pure tone. A clear whistle is an example. The
frequency of a tone often changes (called frequency modulation).
Many birds' songs consist of pure tones that change rapidly in frequency.
We are for the most part unaware of the intricate changes in
frequency in birds' songs, but a spectrograph can display these changes
for us to study at leisure.
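The difference between a pure tone and a frequency-modulated tone can be sketched numerically. The following is a minimal illustration, not part of the lab software: the function names, the 44100 samples/second rate, and the example frequencies and durations are all assumptions chosen for convenience.

```python
import math

SAMPLE_RATE = 44100  # samples per second (assumed for this sketch)


def pure_tone(freq_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Sinusoidal change of pressure at one constant frequency."""
    n = int(round(duration_s * sample_rate))
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def linear_sweep(f_start_hz, f_end_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Tone whose frequency moves steadily from f_start_hz to f_end_hz."""
    n = int(round(duration_s * sample_rate))
    rate = (f_end_hz - f_start_hz) / duration_s  # frequency change in Hz per second
    samples = []
    for i in range(n):
        t = i / sample_rate
        # The phase is the running integral of the instantaneous frequency.
        phase = 2 * math.pi * (f_start_hz * t + 0.5 * rate * t * t)
        samples.append(math.sin(phase))
    return samples


# Hypothetical examples: a 10 ms whistle at 3000 Hz and a 50 ms upward sweep.
whistle = pure_tone(3000, 0.01)
sweep = linear_sweep(2000, 6000, 0.05)
```

On a spectrogram the first signal would appear as a horizontal line and the second as a line rising from 2000 Hz to 6000 Hz.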
A spectrograph is an instrument that displays the frequency of
sound as a function of time. Such a display is called a
spectrogram or sonagram. A pure whistle thus appears
as a horizontal line at a particular frequency (which, remember,
determines our sensation of its pitch). Many birds' songs appear as
a series of lines that sweep upward or downward, or up and downward,
sometimes in extremely brief intervals of time. Nevertheless, close
inspection of the spectrogram reveals that only one frequency is present
at any instant.
Many sounds do not contain just a single frequency at a time. In
fact, it is rather difficult to produce such a sound. Instead,
sounds often have many frequency components, and some consist of white
noise (all frequencies are present at once).
Frequency Components of Complex Sounds
Any waveform can be decomposed into a series of sinusoidal frequency
components, each with its own amplitude and phase. This mathematical
process is called a Fourier Transform. A fundamental theorem states
that a waveform can be converted into a unique set of frequency
components, and vice versa. A spectrograph does this transformation
for us. It decomposes a complex waveform into its simultaneous
frequency components.
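As a rough illustration of this decomposition, the sketch below builds a waveform from two sinusoids, recovers their frequencies and amplitudes with a discrete Fourier transform, and then rebuilds the waveform from the components. It uses a slow, direct form of the transform for readability. The function names and the test waveform are assumptions made for this example.

```python
import cmath
import math


def dft(samples):
    """Discrete Fourier transform: waveform -> complex frequency components."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(components):
    """Inverse transform: frequency components -> waveform."""
    n = len(components)
    return [
        sum(components[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


n = 64
# Hypothetical waveform: 4 cycles per window at amplitude 1.0
# plus 10 cycles per window at amplitude 0.5.
wave = [
    math.sin(2 * math.pi * 4 * t / n) + 0.5 * math.sin(2 * math.pi * 10 * t / n)
    for t in range(n)
]

spectrum = dft(wave)
# Scale magnitudes so that a sinusoid of amplitude A reads as A.
amplitudes = [2 * abs(c) / n for c in spectrum[: n // 2]]
peaks = [k for k, a in enumerate(amplitudes) if a > 0.1]  # components 4 and 10
rebuilt = inverse_dft(spectrum)  # matches wave to within rounding error
```

The round trip from waveform to components and back is the "vice versa" in the theorem above.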
A spectrograph performs this operation repeatedly, in successive small
intervals of time. So a spectrogram displays the frequency
components present in each small interval of time. The duration of
the time interval for analysis is called the analysis period.
If pulses of sound are separated by intervals longer than the analysis
period, the spectrogram displays these pulses separated by gaps (analysis
periods with no sound). In contrast, if pulses of sound are
separated by intervals shorter than the analysis period, then the
spectrogram displays continuous horizontal bands, the frequency components
of a pulsed sound. A spectrograph often allows us to choose the
duration of the analysis period, so we can change the appearance of pulses
of sound.
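The effect of the analysis period on a pulsed sound can be checked with a small numerical sketch. The pulse train, the frame lengths, and the function name below are assumptions chosen to keep the arithmetic simple; the frames are not overlapped or tapered, which real spectrograph software usually does.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitudes of the frequency components of one analysis period."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


# Hypothetical pulsed sound: one brief pulse every 8 samples, 64 samples in all.
pulse_interval = 8
signal = [1.0 if i % pulse_interval == 0 else 0.0 for i in range(64)]

# Long analysis period (64 samples, spanning 8 pulses): the energy falls in
# evenly spaced frequency bands, 64 / 8 = 8 bins apart.
long_spectrum = magnitude_spectrum(signal)
bands = [k for k, m in enumerate(long_spectrum) if m > 1e-6]

# Short analysis period (4 samples, shorter than the pulse interval):
# every other analysis period contains no sound at all.
short_frames = [signal[i:i + 4] for i in range(0, len(signal), 4)]
silent = [all(x == 0.0 for x in frame) for frame in short_frames]
```

With the long period the pulses merge into horizontal bands; with the short period they appear as separate events with gaps between them.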
Our ears perform a similar operation in analyzing the frequency components
of sound. The cochlea is a mechanical device for separating the
frequency components of sounds. These frequency components
determine our sensations of pitch. The cochlea, like the
spectrograph, analyzes sound in small intervals of time, about 1/50 to
1/20 of a second (0.02-0.05 seconds, or 20-50 milliseconds).
A sound consisting of pulses separated by intervals longer than 20-50 ms
is perceived as a series of distinct pulses. In contrast, a sound
consisting of pulses repeated more often than once every 20 ms is
perceived as a continuous sound with a specific timbre determined by its
frequency components.
PROCEDURES
Sound Spectrograph
In this lab, our spectrograph is a computer with frequency-analysis
software. The spectrogram is displayed on the monitor.
You can think of the spectrogram as a result of the following steps:
- First, the computer digitizes the continuous waveform of a sound into
a series of numbers (44100 numbers/second) that indicate the amplitude of
the waveform at evenly spaced moments of time.
- Then the software computes the Fourier Transform of this waveform
(the exact algorithm is called a Fast Fourier Transform or FFT). An
FFT converts the series of numbers representing the waveform into a series
of numbers representing frequency components.
- The software repeatedly computes FFTs in successive brief intervals
of time. The duration of these intervals, as explained above, is
called the analysis period. It is set by choosing the length of the
FFT. The length of the FFT (also called the Transform Size)
is simply the number of points converted into frequency components.
The more points, the longer the analysis period: at 44100 numbers/second,
the analysis period in seconds is the Transform Size divided by 44100.
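The steps above can be sketched in a few lines. This is a simplified illustration, not the algorithm used by the lab software: it substitutes a slow direct transform for the FFT, uses non-overlapping frames with no tapering, and all function names and the test tone are assumptions.

```python
import cmath
import math

SAMPLE_RATE = 44100  # numbers per second, as in the first step above


def analysis_period_s(transform_size, sample_rate=SAMPLE_RATE):
    """Duration in seconds of one transform of transform_size points."""
    return transform_size / sample_rate


def bin_spacing_hz(transform_size, sample_rate=SAMPLE_RATE):
    """Spacing in Hz between adjacent frequency components of the transform."""
    return sample_rate / transform_size


def spectrogram(samples, transform_size):
    """Magnitude spectra of successive non-overlapping analysis periods."""
    half = transform_size // 2 + 1
    # Precompute the complex exponentials shared by every frame.
    twiddle = [cmath.exp(-2j * math.pi * j / transform_size) for j in range(transform_size)]
    rows = []
    for start in range(0, len(samples) - transform_size + 1, transform_size):
        frame = samples[start:start + transform_size]
        rows.append([
            abs(sum(frame[t] * twiddle[(k * t) % transform_size]
                    for t in range(transform_size)))
            for k in range(half)
        ])
    return rows


# Hypothetical test tone placed exactly on component 10 of a 128-point transform.
tone_hz = 10 * SAMPLE_RATE / 128
samples = [math.sin(2 * math.pi * tone_hz * i / SAMPLE_RATE) for i in range(512)]
spec = spectrogram(samples, 128)
loudest_bins = [row.index(max(row)) for row in spec]  # one entry per analysis period
```

With these definitions a Transform Size of 128 gives an analysis period of about 2.9 ms and components about 345 Hz apart, while 512 gives about 11.6 ms and about 86 Hz. A longer transform resolves frequency more finely but time more coarsely.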
Animal Vocalizations
First look at the spectrograms of some birds' songs. Notice the
vertical scale in frequency (depends on the display, often 0-11000 Hz or
0-11 kHz) and the horizontal scale in seconds (also depends on the
display, often tens or hundreds of milliseconds).
Confirm that in many species a single frequency is present at a time and
notice that this frequency can change remarkably fast. Calculate
the slopes of some frequency sweeps (Hz/second).
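The slope is the change in frequency divided by the change in time between two points read off the spectrogram. A minimal sketch of the arithmetic follows; the function name and the example readings are hypothetical.

```python
def sweep_slope_hz_per_s(f_start_hz, t_start_s, f_end_hz, t_end_s):
    """Slope of a frequency sweep between two points read from a spectrogram."""
    if t_end_s == t_start_s:
        raise ValueError("the two points must be at different times")
    return (f_end_hz - f_start_hz) / (t_end_s - t_start_s)


# Hypothetical readings: 3000 Hz at 0.120 s rising to 5000 Hz at 0.140 s.
slope = sweep_slope_hz_per_s(3000, 0.120, 5000, 0.140)  # about 100000 Hz/second
```

A downward sweep gives a negative slope. Remember to convert milliseconds to seconds before dividing.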
Look at spectrograms of some other animals (crickets, frogs, wolves).
Human Voice
Look at spectrograms of the human voice when you choose a relatively short
analysis period (for instance, Transform Size = 128). Human voices
consist of a very rapid series of pulses of sound. Each pulse is
created by the vocal cords momentarily opening or closing. These
impulses appear as thin vertical lines on the spectrogram. What
does this indicate about the frequency composition of the pulses?
What is the repetition rate of these pulses?
Use the microphone to compare the voices of men and women.
The frequency composition of a voice depends on the resonant frequencies
of the cavities in the throat, mouth, and nose. Frequencies in the
impulses from the larynx that match a resonance of these cavities are
emphasized in the sound that issues from the mouth. Other frequencies are
attenuated.
The emphasized frequencies are called formants.
Compare spectrograms of the same person's voice with a relatively long
analysis period (Transform Size = 512) and with a relatively short
analysis period (Transform Size = 128). Does the sound appear as a
series of closely spaced vertical lines (impulses) in one case, but a
series of closely spaced horizontal lines (harmonics) in the other case?
Which analysis period produces which display? Why?
Does the spacing of impulses correspond to the spacing of harmonics as
follows:
interval between frequency components (in Hz) = 1 / interval between impulses (in seconds)
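The relation between impulse spacing and harmonic spacing is simple enough to check with a few lines of arithmetic. In this sketch the function names and the impulse intervals are hypothetical, not measurements from the lab.

```python
def harmonic_spacing_hz(impulse_interval_s):
    """Spacing between harmonics for impulses repeating at the given interval."""
    return 1.0 / impulse_interval_s


def impulse_interval_s(spacing_hz):
    """Interval between impulses implied by a measured harmonic spacing."""
    return 1.0 / spacing_hz


# Hypothetical readings from two spectrograms.
spacing_a = harmonic_spacing_hz(0.008)  # impulses 8 ms apart -> about 125 Hz
spacing_b = harmonic_spacing_hz(0.004)  # impulses 4 ms apart -> about 250 Hz
```

To test the relation on your own recordings, measure the interval between impulses on the short-period display and the spacing of the horizontal lines on the long-period display, then compare the two with these functions.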
What is the difference between harmonics and formants? Which is
responsible for the basic difference between male and female voices?
What anatomical difference between men and women could produce this
difference in sound production?

