Unpacking and transforming feature functions: new ways to smooth phrase tables

Proceedings title: Proceedings of the 13th Machine Translation Summit
Conference: Machine Translation Summit XIII, 19-23 September 2011, Xiamen, China
Pages: 269-275; # of pages: 7
Abstract: State-of-the-art phrase-based statistical machine translation systems typically contain two features that estimate the “forward” and “backward” conditional translation probabilities for a given pair of source and target phrases. These two “relative frequency” (RF) features are derived from three counts: the joint count of the source and target phrase, and the marginal count of each. We propose to “unpack” these three statistics, making them independent “3-count” features instead of two RF features. In our experiments, the 3-count features perform better than the RF ones in three of the four systems we tested. Transforming and generalizing these 3-count features slightly yields further improvements. In addition, under several different experimental conditions, we compare 3-count and generalized 3-count features to new features derived from Kneser-Ney smoothing, to a new low-frequency penalty feature, and to several known smoothing/discounting schemes. Generalized 3-count performs similarly to or better than all of the smoothing methods except modified Kneser-Ney. In our experiments, the best phrase table (not language model) smoothing yields a gain of +0.6 to +1.4 BLEU.
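The relationship between the two RF features and the three underlying counts can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the use of log counts as feature values is an assumption (the abstract does not say how the counts enter the model), and which conditional direction is called “forward” is left open because the abstract does not define it.

```python
import math
from collections import Counter


def count_phrase_pairs(pairs):
    """Collect the joint count and both marginal counts from extracted phrase pairs."""
    joint = Counter(pairs)                # c(s, t)
    src = Counter(s for s, _ in pairs)    # c(s)
    tgt = Counter(t for _, t in pairs)    # c(t)
    return joint, src, tgt


def rf_features(joint, src, tgt, s, t):
    """Two relative-frequency features in log space: log P(t|s) and log P(s|t)."""
    c = joint[(s, t)]
    return math.log(c / src[s]), math.log(c / tgt[t])


def three_count_features(joint, src, tgt, s, t):
    """Unpacked features (assumed form): the three log counts, each free to get its own weight."""
    return math.log(joint[(s, t)]), math.log(src[s]), math.log(tgt[t])


# Toy data, invented for illustration only.
pairs = [("maison", "house"), ("maison", "house"), ("maison", "home"), ("chez", "home")]
joint, src, tgt = count_phrase_pairs(pairs)
print(rf_features(joint, src, tgt, "maison", "house"))
print(three_count_features(joint, src, tgt, "maison", "house"))
```

Under this assumption, each RF feature is a fixed difference of two of the log counts, so a log-linear model with three independently weighted count features can represent anything the two RF features can, along with weightings they cannot.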
Affiliation: Information and Communication Technologies; National Research Council Canada
Peer reviewed: Yes
NPARC number: 21267976
Record identifier: f943e893-18a5-4f0b-95db-6e080fedb4bb
Record created: 2013-03-27
Record modified: 2016-05-09