Parallel text mapping of web-based bilingual corpus materials


Abstract

The chief objective of this thesis is to design and develop a Bitext Mapping Intelligent Agent (BMIA), a computational model that can be used to pair and compare translation texts. BMIA has two main components. The first is the StatCan Daily Translation Extraction System (SDTES), which automatically extracts translations from web-based materials to construct the StatCan Daily Corpus (SDC). Alongside it, a translation concordance system (TransConcord) has been developed to provide ready access to SDC and other bilingual corpora. The second component of BMIA is the StatCan Bilingual Text Comparison System (TextComp), which aligns and compares bilingual texts for translation discrepancy detection and Translation Correspondence Profiling (TCPro). To deal with potentially noisier data sets in the translation checking process, different text mapping algorithms have been designed to parse the input texts, align them, and scan through them to detect translation discrepancies. To give a more detailed picture of translation correspondences, TextComp maps translations at a more fine-grained level: the translation constituent level. A TCPro scaling metric is designed to compute the TCPro score for each aligned segment pair so that levels of translation correspondence can be estimated and distinguished. This scale-based view can help in identifying correspondence deviations and objectively assessing the faithfulness of translations. The two component systems in BMIA not only support human translators, but also shed light on machine translation, translation studies, and translation quality assessment.

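Bibliographic record

  • Author: Zhu, Qibo
  • Author affiliation: Carleton University (Canada)
  • Degree-granting institution: Carleton University (Canada)
  • Subject: Artificial Intelligence; Computer Science
  • Degree: Ph.D.
  • Year: 2009
  • Pages: 253 p.
  • Total pages: 253
  • Original format: PDF
  • Language of text: English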
