Download | View accepted manuscript: Conditional significance pruning: discarding more of huge phrase tables (PDF, 256 KiB) |
---|
Author | Johnson, J. Howard |
---|
Affiliation | National Research Council of Canada. Information and Communication Technologies |
---|
Format | Text, Address |
---|
Conference | The Tenth Biennial Conference of the Association for Machine Translation in the Americas (AMTA), 28 October - 1 November 2012, San Diego, California, USA |
---|
Abstract | Pruning the phrase tables used for statistical machine translation (SMT) can substantially reduce their bulk and improve translation quality, especially for very large corpora such as Giga-FrEn. This can be further improved by conditioning each significance test on other phrase pair co-occurrence counts, resulting in an additional reduction in size and an increase in BLEU score. A series of experiments using Moses and the WMT11 corpora for French to English was performed to quantify the improvement. Adhering strictly to the recommendations for the WMT11 baseline system provided a strong, reproducible research baseline. |
---|
Publication date | 2012-11-01 |
---|
Language | English |
---|
Peer reviewed | Yes |
---|
NPARC number | 21249500 |
---|
Record identifier | bb26b75e-ff34-47e4-82db-2f71940cf9bf |
---|
Record created | 2013-02-20 |
---|
Record modified | 2020-06-04 |
---|