Mats Wirén, Department of Linguistics, Stockholm University, will speak under the title "A piece of industrial corpus linguistics: Eliciting and analysing data for the Telia 90200 call routing system".

Time and place:

15 April, 15:00-17:00. Room to be announced.


The launch of the Telia (Swedish Telecom) call routing system for the 90200 customer care service in 2006/07 was the most ambitious attempt up to that point to create a spoken natural-language interface to a commercial service in Sweden. The service was one of the first in Europe to combine a statistical language model for speech recognition with a statistical classifier for semantic analysis of utterances. The basis for the design and training of the system was a data collection using Wizard-of-Oz methodology, involving 42,000 Telia customers and ten wizards; as far as I know, this is the single largest data collection of its kind ever carried out. In the seminar, I will describe the novel methodology used for collecting the speech corpus, the overall structure of the dialogue, the definition of the semantic formalism for caller utterances, and some experiments that used different sets of system utterances to try to influence caller behaviour.