Stochastic Spoken Natural Language Parsing in the Framework of the French MEDIA Evaluation Campaign

A stochastic parsing component has been applied to a French spoken language dialogue corpus recorded in the framework of the MEDIA evaluation campaign. Realized as an ergodic HMM using Viterbi decoding, the parser outputs the most likely semantic representation given a transcribed utterance as input. The semantic sequences used for training and testing have been derived from the semantic representations of the MEDIA corpus. The HMM parameters have been estimated from the word sequences along with their semantic representations. The performance score of the stochastic parser has been determined automatically using the mediaval tool applied to a held-out reference corpus. Evaluation results will be presented in the paper.
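
To make the decoding step concrete, the following is a minimal sketch of Viterbi decoding over an ergodic (fully connected) HMM in which hidden states are semantic labels and observations are words. It is an illustration under stated assumptions, not the parser described in the paper: the label names (null, command, city), the toy probabilities, the English example words, and the log-probability floor for unseen words (unk) are all hypothetical, and transitions are set uniform purely for simplicity.

```python
import math

def viterbi(words, states, log_init, log_trans, log_emit, unk=-20.0):
    """Return the most likely state (semantic label) sequence for `words`."""
    if not words:
        return []
    # delta[s]: best log-probability of any path that ends in state s
    delta = {s: log_init[s] + log_emit[s].get(words[0], unk) for s in states}
    backptrs = []
    for w in words[1:]:
        new_delta, ptr = {}, {}
        for s in states:
            # Ergodic topology: every state may follow every other state.
            best_prev = max(states, key=lambda p: delta[p] + log_trans[p][s])
            new_delta[s] = (delta[best_prev] + log_trans[best_prev][s]
                            + log_emit[s].get(w, unk))
            ptr[s] = best_prev
        delta = new_delta
        backptrs.append(ptr)
    # Backtrack from the best final state.
    path = [max(states, key=lambda s: delta[s])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path

def to_log(dist):
    """Convert a dict of probabilities to log-probabilities."""
    return {k: math.log(v) for k, v in dist.items()}

# Hypothetical toy model (labels and numbers are assumptions for illustration).
states = ["null", "command", "city"]
log_init = to_log({s: 1 / 3 for s in states})
log_trans = {p: to_log({s: 1 / 3 for s in states}) for p in states}
log_emit = {
    "null": to_log({"i": 0.3, "want": 0.3, "to": 0.2, "in": 0.2}),
    "command": to_log({"book": 0.6, "reserve": 0.4}),
    "city": to_log({"paris": 0.5, "lyon": 0.5}),
}

utterance = ["i", "want", "to", "book", "in", "paris"]
labels = viterbi(utterance, states, log_init, log_trans, log_emit)
print(list(zip(utterance, labels)))
```

In a trained system the initial, transition, and emission tables would be estimated from word sequences paired with their semantic labels rather than written by hand; working in log space avoids numerical underflow on longer utterances.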
Published in 2006