
Prosodic Phrase Break Prediction: Problems in the Evaluation of Models against a Gold Standard

Claire Brierley, Eric Atwell

School of Computing
University of Leeds
Woodhouse Lane
Leeds, LS2 9JT
United Kingdom
(claireb, eric)@comp.leeds.ac.uk

The goal of automatic phrase break prediction is to identify prosodic-syntactic boundaries in text which correspond to the way a native speaker might process or chunk that same text as speech. This is treated as a classification task in machine learning, and output predictions from language models are evaluated against a ‘gold standard’: human-labelled prosodic phrase break annotations in transcriptions of recorded speech (the speech corpus). Despite the introduction of rigorous metrics such as precision and recall, the evaluation of phrase break models is still problematic because prosody is inherently variable; the morphosyntactic analysis and prosodic annotations for a given text are not representative of the range of parsing and phrasing strategies available to, and exhibited by, native speakers. This article recommends creating automatically generated, POS-tagged and prosodically annotated variants of a text to enrich the gold standard and enable more robust, ‘noise-tolerant’ evaluation of language models.
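
As a rough illustration of the evaluation problem described above, the following Python sketch scores predicted phrase breaks against a reference annotation using precision, recall and F1. It is not the authors' implementation. The binary encoding (1 = break after a token, 0 = no break), the function names and the toy label sequences are all assumptions made for this example, and the "best score over several reference variants" helper is only one possible reading of noise-tolerant evaluation, not a procedure taken from the article.

```python
from typing import Sequence, Tuple


def precision_recall_f1(gold: Sequence[int], pred: Sequence[int]) -> Tuple[float, float, float]:
    """Score predicted breaks (1) against a reference annotation.

    Assumption: both sequences label the same token junctures,
    with 1 meaning "phrase break here" and 0 meaning "no break".
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must label the same token junctures")
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def best_f1_over_variants(variants: Sequence[Sequence[int]], pred: Sequence[int]) -> float:
    """Hypothetical helper: take the best F1 across several reference variants."""
    return max(precision_recall_f1(v, pred)[2] for v in variants)


# Toy data (invented): two plausible phrasings of the same eight-juncture text.
gold = [0, 0, 1, 0, 0, 1, 0, 1]
alt = [0, 0, 1, 0, 1, 0, 0, 1]
pred = [0, 0, 1, 0, 1, 0, 0, 1]

print(precision_recall_f1(gold, pred))
print(best_f1_over_variants([gold, alt], pred))
```

In this toy case the prediction is penalised against the single reference even though it matches an equally plausible alternative phrasing exactly, which is the kind of mismatch an enriched gold standard is meant to absorb.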



TAL, Volume 48 (2007), No. 1: Principes de l’évaluation en Traitement Automatique des Langues (Principles of Evaluation in Natural Language Processing)

Last updated: 7 May 2008. Author: Editors-in-chief.