LONG-MINGLE is a longitudinal corpus of child-directed speech. The corpus consists of orthographic transcripts of audio and video recordings of naturalistic free play sessions. The sessions were recorded at the Phonetics Laboratory at Stockholm University. The studio was equipped with a set of toys, and the parents were instructed to play with these toys as they normally would at home. All participating mothers and fathers are native speakers of Swedish.

LONG-MINGLE (version 1.0) consists of 57 transcripts (about 46 000 tokens and 12 000 utterances) from longitudinal dyads with 13 children between 2 and 33 months of age. A subset of this corpus, called MINGLE-3, has been multimodally annotated with eye gaze, gestures, and object-related actions (Nilsson Björkenstam & Wirén, 2014).

LONG-MINGLE is distributed as text files with three TAB-separated columns: 1) utterance start time, 2) utterance end time, and 3) utterance. LONG-MINGLE is available for research.
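
The three-column layout described above can be read with a few lines of Python. The following is a minimal sketch, not an official reader for the corpus: the names Utterance and read_utterances are hypothetical, the time values are kept as raw text because their format is not stated here, and the sample lines are invented placeholders rather than corpus data. It also assumes that the files have no header row.

```python
from typing import Iterable, Iterator, NamedTuple


class Utterance(NamedTuple):
    """One transcript line (hypothetical container, not part of the corpus)."""
    start: str  # utterance start time, kept as text (format is an assumption)
    end: str    # utterance end time, kept as text
    text: str   # the orthographic utterance


def read_utterances(lines: Iterable[str]) -> Iterator[Utterance]:
    """Yield one Utterance per TAB-separated line with three columns."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the first two TABs only, so the utterance column stays whole.
        fields = line.split("\t", 2)
        if len(fields) != 3:
            continue  # skip lines that do not match the three-column layout
        yield Utterance(*fields)


# Invented placeholder lines, for illustration only (not taken from the corpus).
sample = ["0.0\t1.5\ttitta en boll\n", "\n", "1.5\t2.0\tja\n"]
for utterance in read_utterances(sample):
    print(utterance.start, utterance.end, utterance.text)
```

To read an actual transcript, pass an open file object instead of the sample list, for example open(path, encoding="utf-8"), where both the path and the UTF-8 encoding are assumptions to be checked against the distributed files.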


Nilsson Björkenstam, K. & Wirén, M. (2014). Multimodal Annotation of Synchrony in Longitudinal Parent–Child Interaction. In: MMC 2014 Multimodal Corpora: Combining applied and basic research targets: Workshop at LREC 2014. Paper presented at The 9th edition of the Language Resources and Evaluation Conference, 26-31 May, Reykjavik, Iceland. European Language Resources Association.


Kristina Nilsson Björkenstam