|
Abstract:
|
Weights of the various components in a standard Statistical Machine Translation model are usually estimated via Minimum Error Rate Training (MERT). This procedure finds their optimal values on a development set, with the expectation that these optimal weights generalise well to other test sets. However, this is not always the case when domains differ. This work uses a perceptron algorithm to learn more robust weights to be used on out-of-domain corpora without the need for specialised data. For an Arabic-to-English translation system, the generalised weights yield an improvement of more than 2 BLEU points over the MERT baseline using the same information. |