The appended transformations in speaker age and sex have been obtained by analyzing a natural utterance using linear predictive coding and by re-synthesis after recalculation of the parameter values descriptive of the speech signal [1]. The recalculations of F0 and the formant frequencies were based on the values listed in [2], Table 2, except for one error in that table: the value of kF0 in the transformation from woman to girl, 12-14 years, should be 1.03 (instead of 1.27). Speech rate has also been modified [3]. Q-values have been conserved.
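The recalculation step can be illustrated with a minimal sketch. It assumes that the factors act as plain multipliers on frequencies in Hz, which the text above does not confirm, and it is not the implementation described in [1]. The names `Frame` and `transform_frame` are hypothetical, and so are the formant factors used in the usage note; only kF0 = 1.03 is taken from the text. Conserving the Q-value (Q = F / B) means that each bandwidth B is rescaled together with its formant frequency F.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Frame:
    """One analysis frame (hypothetical layout, not the format used in [1])."""
    f0_hz: float                # fundamental frequency; 0.0 could mark an unvoiced frame
    formants_hz: List[float]    # formant centre frequencies
    bandwidths_hz: List[float]  # formant bandwidths, same order as formants_hz


def transform_frame(frame: Frame, k_f0: float, k_formants: List[float]) -> Frame:
    """Scale F0 and the formant frequencies while keeping each Q-value unchanged."""
    new_formants = []
    new_bandwidths = []
    for f, b, k in zip(frame.formants_hz, frame.bandwidths_hz, k_formants):
        q = f / b                         # Q-value of this resonance
        f_new = k * f                     # assumed: factor is a simple multiplier in Hz
        new_formants.append(f_new)
        new_bandwidths.append(f_new / q)  # bandwidth follows the formant, so Q is conserved
    return Frame(frame.f0_hz * k_f0, new_formants, new_bandwidths)
```

For example, `transform_frame(Frame(200.0, [500.0, 1500.0], [50.0, 100.0]), 1.03, [1.1, 1.1])` raises F0 by 3 % and both formants by an illustrative 10 %, leaving the Q-values of 10 and 15 as they were. The modification of speech rate is left out of this sketch.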
Age rating experiments with speech manipulated in this way [4] show some bias towards the original age of the speaker. This can be attributed to the conserved 'verbal maturity' in the transformed versions.
In order to obtain a whispered version, a noise source with the right spectrum has to be substituted for the buzz source, and the formant frequencies have to be increased [5].
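The substitution of a noise source for the buzz source can be sketched with a simple source-filter model. This is only an illustration under stated assumptions, not the procedure of [5]: white Gaussian noise stands in for "a noise source with the right spectrum" (no spectral shaping is applied), the formants are modelled as cascaded two-pole resonators, the factor `k_whisper` is a hypothetical placeholder, and the bandwidths are scaled along with the formants as a further assumption. All function names are hypothetical.

```python
import math
import random


def resonator_coeffs(freq_hz, bandwidth_hz, sample_rate_hz):
    """Coefficients of a two-pole resonator y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate_hz)  # pole radius, below 1
    theta = 2.0 * math.pi * freq_hz / sample_rate_hz        # pole angle
    return 2.0 * r * math.cos(theta), -r * r


def synthesize(excitation, formants, sample_rate_hz):
    """Pass the excitation through one resonator per (frequency, bandwidth) pair."""
    signal = list(excitation)
    for freq_hz, bandwidth_hz in formants:
        a1, a2 = resonator_coeffs(freq_hz, bandwidth_hz, sample_rate_hz)
        y1 = y2 = 0.0
        out = []
        for x in signal:
            y = x + a1 * y1 + a2 * y2
            out.append(y)
            y2, y1 = y1, y
        signal = out
    return signal


def buzz_source(n_samples, f0_hz, sample_rate_hz):
    """Impulse train at F0: the 'buzz' excitation of phonated speech."""
    period = max(1, round(sample_rate_hz / f0_hz))
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]


def noise_source(n_samples, seed=0):
    """White Gaussian noise; a stand-in for a properly shaped noise source."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def whisper_version(formants, k_whisper, n_samples, sample_rate_hz, seed=0):
    """Replace the buzz by noise and raise the formants by the factor k_whisper."""
    # Assumption: bandwidths are scaled with the formants, so Q stays the same.
    raised = [(f * k_whisper, b * k_whisper) for f, b in formants]
    return synthesize(noise_source(n_samples, seed), raised, sample_rate_hz)
```

A phonated frame would be produced with `synthesize(buzz_source(800, 200.0, 16000), formants, 16000)`, and its whispered counterpart with `whisper_version(formants, 1.1, 800, 16000)`, where 1.1 is an arbitrary illustrative factor.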
It is also possible to change the sex of the speaker.
Not quite convincing? With better knowledge of the female-male differences in the acoustic properties of speech, it could probably be done in a more convincing way.
Now you may listen to a table conversation by a synthetic Swedish family.
References:
[1] H. Traunmüller, P. Branderud and A. Bigestans (1989) "Paralinguistic speech signal transformations", PERILUS X: 47-64. Department of Linguistics, Stockholm University.
[2] H. Traunmüller (1988) "Paralinguistic variation and invariance in the characteristic frequencies of vowels", Phonetica 45: 1-29.
[3] G.J.T. Haselager, I.H. Slis and A.C.M. Rietveld (1991) "An alternative method of studying the development of speech rate", Clinical Linguistics and Phonetics 5: 53-63.
[4] H. Traunmüller and R. van Bezooijen (1994) "The auditory perception of children's age and sex", Proceedings ICSLP-94, Yokohama: 1171-1174.
[5] I. Eklund and H. Traunmüller (in print) "A comparative study of male and female whispered and phonated versions of the long vowels of Swedish", Phonetica.