Exploring Sentence Variations with Bilingual Corpora

Conference: Corpus Linguistics 2005 Conference, July 14-17, 2005, Birmingham, United Kingdom
Abstract: We propose a system for retrieving similar sentences from a corpus that treats sentences as pure strings. Compared with more linguistically motivated approaches, this approach lets the system quickly retrieve similar sentences from a large corpus (over one million sentences), work well with ill-structured sentences, and work across different human languages. The system has been tested on English, French and Chinese corpora, and the results have been manually evaluated. The application suggested in this paper is to use our similar-sentence search engine in a language-learning context, helping learners improve their writing skills and better understand the grammar rules of their second language by studying sentence variants drawn from realistic examples. We further suggest using the system with bilingual parallel corpora to help translation students enhance their translation skills by accessing professional translations.
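The abstract does not say how the string matching is carried out, so the following is only a minimal sketch of the general idea of ranking sentences as plain character strings. It substitutes Python's standard `difflib.SequenceMatcher` for whatever measure the paper actually uses, and the function name `similar_sentences` and the sample corpus are hypothetical illustrations, not taken from the paper.

```python
from difflib import SequenceMatcher


def similar_sentences(query, corpus, top_k=3):
    """Rank corpus sentences by character-level similarity to the query.

    Assumption: SequenceMatcher.ratio() stands in for the paper's
    (unspecified) similarity measure. No tokenization or parsing is
    applied, so the same code runs on any language or on ill-formed input.
    """
    scored = [(SequenceMatcher(None, query, s).ratio(), s) for s in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical toy corpus; a real corpus would hold over a million sentences.
corpus = [
    "The cat sat on the mat.",
    "The cat is sitting on the mat.",
    "Le chat est assis sur le tapis.",
    "Stock prices fell sharply today.",
]

results = similar_sentences("The cat sat on a mat.", corpus, top_k=2)
for score, sentence in results:
    print(f"{score:.2f}  {sentence}")
```

A linear scan like this would be far too slow for a corpus of the size mentioned in the abstract; a practical system would presumably need some form of indexing, which this sketch does not attempt.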
Affiliation: NRC Institute for Information Technology; National Research Council Canada
Peer reviewed: No
NRC number: 48511
NPARC number: 5764603
Record identifier: b167b1d5-1e96-4599-90e9-c911f769e82d
Record created: 2009-03-29
Record modified: 2016-05-09