Tutorial 2: Monday, 2 SEPT 2013, 2 pm - 6 pm
Domain Adaptation in Machine Translation
Cristina Vertan, University of Hamburg, Research Group "Computerphilology"
Mirela-Stefania Duma, University of Hamburg, Natural Language Systems Division
Machine Translation has made massive progress over the last couple of years. Systems performing translation for assimilation are available for various language pairs. They rely mainly on statistical methods and therefore on the availability of large amounts of training data in the form of parallel corpora. The dependency between the quality of the output and the type and quantity of the training material leads to the problem of domain adaptation: when a machine translation system is applied to text from a domain different from that of its training corpus, the accuracy of its output is significantly lower.
As it is impossible to build large parallel corpora for all possible domains and language pairs, current research focuses on domain adaptation methods.
State-of-the-art domain adaptation techniques can be classified into:
- Alignment Domain Adaptation
- Language Model Adaptation
- Translation Model Adaptation
- Reordering Model Adaptation
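To make one of these categories concrete, the following is a minimal sketch of language model adaptation by linear interpolation of an in-domain and an out-of-domain model. It is only an illustration and not necessarily the method presented in the tutorial: the unigram models, the add-one smoothing, the toy sentences and all function names are assumptions chosen for brevity.

```python
import math
from collections import Counter


def train_unigram(tokens, vocab):
    """Add-one smoothed unigram model over a fixed vocabulary (toy assumption)."""
    counts = Counter(tokens)
    total = len(tokens) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def interpolate(p_in, p_out, lam):
    """Linear interpolation: lam * in-domain + (1 - lam) * out-of-domain."""
    return {w: lam * p_in[w] + (1 - lam) * p_out[w] for w in p_in}


def perplexity(model, tokens):
    """Perplexity of a unigram model on a token sequence."""
    log_sum = sum(math.log(model[t]) for t in tokens)
    return math.exp(-log_sum / len(tokens))


# Hypothetical toy data: a small in-domain corpus, an out-of-domain corpus,
# and an in-domain development text used to tune the interpolation weight.
in_domain = "the patient received the drug".split()
out_domain = "the parliament adopted the resolution on the budget".split()
dev = "the patient received the drug dose".split()

vocab = set(in_domain) | set(out_domain) | set(dev)
p_in = train_unigram(in_domain, vocab)
p_out = train_unigram(out_domain, vocab)

# Grid search for the weight that minimises perplexity on the development text.
grid = [i / 10 for i in range(11)]
best_lam = min(grid, key=lambda lam: perplexity(interpolate(p_in, p_out, lam), dev))
mixed = interpolate(p_in, p_out, best_lam)
print("best lambda:", best_lam)
print("dev perplexity:", perplexity(mixed, dev))
```

In a real system the same idea would be applied to higher-order n-gram models trained on far larger corpora, with the weight tuned on held-out in-domain data.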
The tutorial intends to give a broad overview of domain adaptation methods in statistical machine translation and also to discuss aspects of this topic that are relevant for other machine translation paradigms, such as Example-Based Machine Translation or Rule-Based Machine Translation.
The first part of the tutorial will focus on the problems that MT systems face when they are exposed to a type of input different from the one they were trained on or for which their resources were built.
Part II of the tutorial will introduce methods for domain adaptation in statistical machine translation, as listed above.
Part III is dedicated to problems of domain adaptation for other machine translation paradigms.
Finally, the last part will deal with open problems and will present a practical case of domain adaptation: the ATLAS system, in which an MT engine was embedded into a web content management system. Domain adaptation was performed for 13 domains and 15 language pairs.
Tutorial website: http://www.c-phil.uni-hamburg.de/view/Main/DomainAdaptation2013