PORT-MEDIA (ANR 08 CORD 026 01) is an ANR-funded project.
The start date is March 2009.
The project's duration is 36 months.
In today's commercial applications based on speech recognition, human-machine interaction is still far from enjoyable or effective. One good way to improve the usefulness and acceptability of automatic dialog systems is to raise their level of intelligence up to spoken language understanding (SLU).
This proposal is a natural follow-up to the Technolangue EVALDA/MEDIA evaluation campaign for understanding systems. The MEDIA project gave French public and industrial laboratories working on spoken language understanding for man-machine dialogue a common platform for evaluating their understanding systems, both with and without dialog context.
The objective of the PORT-MEDIA project is to extend the MEDIA corpus with three additional aspects of great importance in spoken dialog systems:
Robustness: Integration/coupling of the automatic speech recognition component in the understanding process.
Portability across domains and languages: evaluation of how generic and adaptable the approaches implemented in the understanding systems are. This is done by confronting the models with new data produced for a different information-query dialogue task or a different language, and also by exploiting unannotated data.
Rich structures for high-level semantic knowledge representation: a solid foundation for the basic semantic units (concepts) was proposed and tested during the first project. We intend to enrich the semantic representation of the MEDIA data with a new standardized high-level semantic representation able to account for semantic composition within and between consecutive user interactions.
Therefore, the objectives of PORT-MEDIA can be summarized as follows:
First, we aim to enrich the existing set of resources, tools and methodologies obtained in the framework of the previous MEDIA project. Setting up an evaluation protocol required a considerable investment from every participant which, in return, provided the research community with valuable resources and tools.
The first MEDIA campaign was carried out on manual transcriptions of dialogs; the objective of PORT-MEDIA is to evaluate the robustness of our approaches when confronted with the transcription errors introduced by automatic speech recognition systems. The audio data from the MEDIA corpus will mainly be used to develop a high-performance speech recognition system, which will in turn support the fast and low-cost development of a new corpus.
Moreover, portability across domains and languages will be addressed by developing comparable corpora focusing on other tasks and languages. The transition from the MEDIA hotel reservation domain to another information-query domain will make it possible to focus research on generic methods that are easily adaptable to a new domain with a small amount of new data. Portability to a new language will involve the use of machine translation techniques. In both cases, data preparation will also be an opportunity to evaluate new techniques, such as active learning, that speed up the process and can drastically reduce its cost.
Finally, an attempt will be made to establish a new annotation standard allowing a thorough representation of the semantic composition within users' turns.
Study of the dependencies between the speech recognition systems and the understanding modules. Proposal for the joint development of the speech and understanding modules (coupling).
Fast and low-cost production of new data for a particular task or language (adaptive training).
Use of automatically annotated data by several different understanding systems (unsupervised training).
Test of the genericity of the semantic representation.
Development of generic and adaptable approaches that exploit the data available for one domain and language to develop a system for another domain and/or language.
Validation of a high-level semantic annotation protocol (semantic composition annotation).
Availability to the largest possible audience (public research groups and industrial entities) of the resources and tools needed for a comparative evaluation of speech understanding system performance.
Availability of a large amount of data (more than 2000 dialogs) enriched with high-quality metadata (manual and automatic transcriptions along with manual and automatic semantic annotations, benchmarks with reference performance) allowing a straightforward development of new speech understanding systems.
Language resources.
Open source tools to exploit language resources on every aspect dealt with in the project: SLU components, ASR module, translation module.
Platform for the evaluation of automatic speech understanding systems with manual or automatic transcriptions.
Free availability of all the data and results to the project partners.
Final corpus, including the manual and automatic transcriptions and the semantic annotations, made available at very low cost for research purposes by ELDA (the distribution includes the anonymous benchmark results and the tools developed for the campaign, data preparation and scoring).
The consortium will take special care to ensure the reusability of these resources in order to contribute to the standardization of evaluation methods.
| WP 0: | Project management |
| WP 1: | Specifications |
| WP 2: | Corpus acquisition and preparation |
| WP 3: | Automatic speech recognition module development |
| WP 4: | Portability of SLU components across domains and languages |
| WP 5: | High-level semantic annotation |
| WP 6: | System adaptation and benchmarks |
| WP 7: | Information dissemination, exploitation and capitalization on project results |
— Fabrice Lefevre 2009/05/31 20:05