Is babbling language-specific? A listening test using vocalizations produced by Swedish and American 12- and 18-month-olds

Olle Engstrand, Karen Williams and Francisco Lacerda

Department of Linguistics, Stockholm University

Abstract

Two hypotheses concerning infants’ transition from babbling to speech are considered: a) that the phonetic characteristics of babbling are essentially independent of the child’s linguistic environment; and, in contrast, b) that babbling typically drifts in the direction of the child’s ambient language. Preliminary results from a listening test using babbled vocalizations from Swedish and American English 12- and 18-month-olds suggest that effects of linguistic environment can be discerned at both ages and, as expected, that the effect is stronger at 18 than at 12 months.

Introduction

While there is now general agreement - contrasting with Jakobson’s (1968/1941) original view - that infants’ phonetic route from babbling to speech is essentially continuous, it is not clear to what extent phonetic effects of the ambient language can be discerned in babbling or early words. According to an ‘independence hypothesis’, the phonetic characteristics of late babbling and early word production are assumed to be essentially independent of the child’s linguistic environment such that "the child’s first words can be seen as [...] a matter of choosing from the babbling repertoire a set of approximations to adult word forms" (MacNeilage, 1979, p. 30; see also Locke, 1983). But there is also the radically different ‘babbling drift hypothesis’ stating that "the most important thing about babbling [is] the fact that it drifts in the direction of the speech the infant hears" (Brown, 1958, p. 199).

For a long time, the empirical support for the babbling drift hypothesis was rather weak and based on methodologically shaky evidence (cf., e.g., Weir, 1966); and most attempts to experimentally verify the hypothesis have provided negative results (e.g., Atkinson et al., 1968; Olney & Scholnick, 1976; Oller & Eilers, 1982; Locke, 1983). For example, Atkinson et al. (1968) concluded that "adults can neither identify babbling infants [...] as English or non-English up to the age of 17 months, nor judge whether two samples from infants at a given age (5-6 or 16-17 months) are from the same or different language communities". In contrast, a number of recent cross-language studies reported by Boysson-Bardies and colleagues have suggested relatively early ambient language effects on infants’ babbling behavior; see also Whalen et al. (1991). For example, in Boysson-Bardies et al. (1984), the linguistic origin of vocalization samples from six-month-old French and Arabic infants was identified above chance by a group of French phoneticians, and the same result was obtained by non-phoneticians for the corresponding group of eight-month-olds. Further cross-language effects, all going in the direction of the respective adult language norms, were reported in experiments using long-term spectra (Boysson-Bardies et al., 1986), formant measurements (Boysson-Bardies et al., 1989) and counts of consonant types (Boysson-Bardies et al., 1992).

The phonetic acquisition literature thus offers conflicting evidence concerning the phonetic path from babbling to speech in young children. This paper is a preliminary report on a listening experiment undertaken to help resolve this controversy, i.e., to determine to what extent effects of linguistic origin can be discerned in young children’s babbles and early words.

Methods

The children whose vocalizations were used in the present experiment were 8 American English and 8 Swedish 12-month-olds, and 8 American English and 8 Swedish 18-month-olds (a total of 32 children). All speech samples were recorded in sound-treated rooms, Swedish (SE) samples at the Department of Linguistics, Stockholm University, and American English (US) samples at the Department of Speech and Hearing Sciences, University of Washington, Seattle. Recordings were made using Lavalier microphones and transmitted via FM wireless systems (Sennheiser MKE 2 microphones, SK-2012 transmitters and EM-1005 receivers in Stockholm; Countryman MEMF05 microphones, and HME RX722 and Telex FMR-50 transmitters/receivers in Seattle). Signals were recorded on Panasonic VHS videocassette recorders using High-Definition audio tracks (model AG-7450 in Stockholm, model AG-1950 in Seattle; AG-W1 recorders were used at both sites to convert between the American and European video formats). To prepare stimuli for perceptual and acoustic tests, the US vocalizations were digitized in Seattle and the SE vocalizations in Stockholm using identical models of the Kay Elemetrics CSL (Ver. 4.0) speech analysis system (20 kHz sampling rate, 16 bit quantization).

The stimuli for the listening test consisted of 20 digitized utterances per child (a total of 640 stimuli), selected by rule. The rule was to take the first two distinct utterances from the beginning of the second session, then jump ahead 3 minutes and take the next two, and so on, until there were 20 utterances. To be eligible, an utterance had to be non-cry, non-screaming, non-whispered, longer than 400 ms, and free from environmental noise.
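The selection rule can be illustrated with a minimal sketch. The `Utterance` record, the `select_stimuli` function and its parameter names are hypothetical and are not part of the original procedure. The sketch further assumes that the 3-minute jump is measured from the onset of the second utterance in each pair, and that the non-cry, non-scream, non-whisper, noise-free and distinctness criteria have already been judged and stored in a single flag.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    onset_s: float      # onset time within the session, in seconds
    duration_ms: float  # utterance duration in milliseconds
    eligible: bool      # non-cry, non-scream, non-whispered, noise-free, distinct


def select_stimuli(utterances, per_block=2, jump_s=180.0, target=20, min_ms=400.0):
    """Take `per_block` eligible utterances, jump ahead `jump_s` seconds, repeat."""
    selected = []
    earliest = 0.0        # earliest onset allowed for the current pair
    taken_in_block = 0
    for utt in sorted(utterances, key=lambda u: u.onset_s):
        if len(selected) == target:
            break
        if utt.onset_s < earliest:
            continue      # still inside the skipped 3-minute stretch
        if not utt.eligible or utt.duration_ms <= min_ms:
            continue      # fails the inclusion criteria
        selected.append(utt)
        taken_in_block += 1
        if taken_in_block == per_block:
            # Assumption: the jump is measured from the onset of the last utterance taken.
            earliest = utt.onset_s + jump_s
            taken_in_block = 0
    return selected
```

If a session contains too few eligible utterances, the function simply returns fewer than 20 items; how such cases were handled is not stated in the text.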

Two independent judges, one native Swede and one American, who were both fluent in the non-native language, moderately phonetically trained, and blind to the purpose of the experiment, used the audio and video recordings in combination to code each utterance in terms of its Word Status (was it meaningful?) and Imitation Status (was it a direct imitation?). They used three categories: Yes, No and ? (uncertain). Each utterance was coded using their combined scores.

The listeners, who were all phonetically trained, were one male American (RM) who was also fluent in Swedish, one female American (CS) who was not a speaker of Swedish, one female Estonian (DK) who was also fluent in Swedish and English, one female American (KW, the second author) who was fluent in Swedish, and one male Swede (OE, the first author) who was also fluent in English.

The listening test was performed using the Suxess program (written by J. Stark) running on an Apollo workstation. Each listener had his/her own randomization of the 640 utterances to be judged. The whole task took about 6 hours and was divided into sessions of approximately one hour each. The listeners set their own pace. First, the listener heard 4 repetitions of the given utterance and was required to decide whether the utterance was produced by a Swedish or an American child; at this stage, the listener was free to listen repeatedly before making a choice. The language choice was indicated by clicking the appropriate alternative on the screen. Next, the utterance was repeated another 4 times. The listener was now required to justify his/her choice of language by clicking one of the alternatives ‘Word’ (if he/she had recognized a word as either Swedish or American English), ‘Phonetic Cue’ (if he/she had recognized some phonetic property of the utterance as typical of either language), or ‘Guess’ (if the choice of language was not based on either of these criteria). If the language choice was based on identification of a word or a phonetic property, the listener was asked to write the word or property on the screen. Then the listener could either click for a new utterance or take a break.
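A per-listener randomization of the kind described above could be generated as in the following sketch. It does not reproduce the Suxess program, whose internals are not described here; the function name `presentation_order` and the seeding scheme are assumptions made for illustration only.

```python
import random


def presentation_order(n_stimuli, listener_id, base_seed=0):
    """Return a reproducible shuffled list of stimulus indices for one listener."""
    # Assumption: seeding with the listener's initials gives each listener
    # a different, but repeatable, order.
    rng = random.Random(f"{base_seed}-{listener_id}")
    order = list(range(n_stimuli))
    rng.shuffle(order)
    return order


listeners = ["RM", "CS", "DK", "KW", "OE"]
orders = {lid: presentation_order(640, lid) for lid in listeners}
```

Because the order is derived from a fixed seed, a listener who stops after one session can resume the same sequence in the next.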

Results

Figure 1 displays the mean number of ‘American’ judgments and their 95% confidence intervals for the babbled utterances, i.e., the utterances that were judged as having Word Status=no. The data are clustered by age group and by the children’s ambient language. For each utterance, the maximum score is 5 (all five listeners agree that it is an American babble) and the minimum is 0 (all five listeners agree that it is a Swedish babble). A preliminary analysis of variance indicated that the children’s ambient language significantly influenced the judgments (F(1,520)=70.89, p<0.0005) and that there was a highly significant interaction between the children’s ambient language and their age (F(1,520)=15.51, p<0.0005).

Figure 1. Summary of listeners’ ‘American’ judgments based on vocalizations having Word Status=no.
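The per-utterance score and the group means with confidence intervals summarized in Figure 1 could be computed along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, the judgments are invented for illustration and are not data from the study, and the interval uses a normal approximation (z = 1.96), which may differ from the method actually used. If listeners guessed with equal probability, the expected score for any utterance would be 2.5.

```python
from math import sqrt
from statistics import mean, stdev


def american_score(judgments):
    """Count the listeners who judged the utterance 'American' (0 to 5 here)."""
    return sum(1 for j in judgments if j == "US")


def mean_ci95(scores):
    """Return the mean and the half-width of a normal-approximation 95% CI."""
    m = mean(scores)
    half = 1.96 * stdev(scores) / sqrt(len(scores))  # assumption: z = 1.96
    return m, half


# Hypothetical judgments for one age-by-language group:
# one list of five listener responses per utterance.
cell = [
    ["US", "US", "SE", "US", "US"],
    ["SE", "US", "SE", "US", "SE"],
    ["US", "US", "US", "US", "US"],
    ["SE", "SE", "US", "US", "US"],
]
scores = [american_score(j) for j in cell]
m, half = mean_ci95(scores)
print(f"mean = {m:.2f}, 95% CI = [{m - half:.2f}, {m + half:.2f}]")
```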

Summary and discussion

This listening experiment tested two hypotheses concerning infants’ transition from babbling to speech: the ‘independence hypothesis’ and the ‘babbling drift hypothesis’. Overall, the results suggested that it is possible for listeners to determine above chance the linguistic background of those Swedish and American English infants’ vocalizations that did not have word status as determined by the procedure outlined above; as expected, vocalizations were identified more correctly with respect to ambient language at 18 than at 12 months of age. However, the specification of word status used here was conservative, i.e., there may still be ‘words’ in the material underlying Figure 1. Thus, the results presented here provide at best a preliminary indication that the ambient language may have a phonetic effect on infants’ babbling at 12 months of age.

Acknowledgments

Thanks to Jeanette Blomquist and Keith Hayes for technical assistance, Johan Stark for programming, and Jessica Jarrett, Margaret Kehoe, Liselotte Roug-Hellichius and Ingrid Landberg for recording assistance. This work was supported by the Swedish Council for Research in the Humanities and Social Sciences (HSFR), the Nordic Coordination Committee for Research in the Humanities (NOS-H), the National Institute on Deafness and Other Communicative Disorders (NIDCD), and the Virginia Merrill Bloedel Hearing Research Center of the University of Washington, Seattle.

References

Atkinson, K., MacWhinney, B. & Stoel, C. 1968. An experiment on the recognition of babbling. Language behavior research laboratory working paper 14. Berkeley: University of California.

Boysson-Bardies, B. de, Halle, P., Sagart, L. & Durand, C. 1989. A crosslinguistic investigation of vowel formants in babbling. Journal of Child Language, 16, 1-18.

Boysson-Bardies, B. de, Sagart, L., Halle, P. & Durand, C. 1986. Acoustic investigation of crosslinguistic variability in babbling. In Lindblom, B. & Zetterström, R. (eds.): Precursors of early speech. New York: Stockton Press.

Boysson-Bardies, B. de, Sagart, L. & Durand, C. 1984. Discernible differences in the babbling of infants according to target language. Journal of Child Language, 11, 1-15.

Boysson-Bardies, B. de, Vihman, M.M., Roug-Hellichius, L., Durand, C., Landberg, I. & Arao, F. 1992. Material evidence of infant selection from the target language: A cross-linguistic phonetic study. In Ferguson, C., Menn, L. & Stoel-Gammon, C. (eds.): Phonological development: Models, research, implications. Timonium, Maryland: York Press.

Brown, R. 1958. Words and things. Glencoe, Ill.: Free Press.

Jakobson, R. 1968. Child language, aphasia, and phonologic universals. The Hague: Mouton. (Kindersprache, Aphasie und allgemeine Lautgesetze, Uppsala: Almqvist & Wiksell, 1941.)

Locke, J.L. 1983. Phonological acquisition and change. New York: Academic Press.

MacNeilage, P.F. 1979. Speech production. Proc. 9th International Congress of Phonetic Sciences, Copenhagen.

Oller, D.K. & Eilers, R.E. 1982. Similarity of babbling in Spanish- and English-learning babies. Journal of Child Language, 9, 565-577.

Olney, R.L. & Scholnick, E.K. 1976. Adult judgments of age and linguistic differences in infant vocalization. Journal of Child Language, 3, 145-155.

Weir, R.H. 1966. Some questions on the child’s learning of phonology. In Smith, F. & Miller, G.A. (eds.): The genesis of language. Cambridge, MA: MIT Press.

Whalen, D.H., Levitt, A.G. & Wang, Q. 1991. Intonational differences between the reduplicative babbling of French- and English-learning infants. Journal of Child Language, 18, 501-516.