International Conference on Signal Processing (ICSP'06); November 16-20, 2006; Guilin (CN)

The Research and Application about the Information Extraction in Chinese Domain



Abstract

This paper proposes a prototype information service system that extracts information of interest from unstructured text and delivers it to users through database search. To achieve this goal, two fundamental issues, named entity recognition and relation extraction, are studied using the maximum entropy (ME) algorithm. Our named entity recognition approach differs from most previous approaches in several ways. One difference from most previous ME-based models is that probabilistic feature functions are used instead of binary feature functions. We also explore several new features in our model, including confidence functions and the position of features, among others. As in some previous works, we use separate sub-models for Chinese person names and foreign names, but we introduce some new techniques in these sub-models. The experimental results are promising. Moreover, this is the first time the ME algorithm has been used to extract relations between entities from Chinese texts. Twelve features were designed, covering morphological, grammatical, and semantic features. The experimental results are satisfactory. Both research results were applied in our information extraction system, achieving the goal of providing an information service from unstructured text.
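To illustrate the distinction between binary and probabilistic feature functions mentioned in the abstract, the following is a minimal sketch of a generic conditional maximum entropy model, p(y|x) proportional to exp(sum_i w_i f_i(x, y)). It is not the authors' implementation: the labels `PER` and `O`, the function names, the weights, and the numeric values in `FIRST_CHAR_PROB` are all hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence

# A feature function maps (input, label) to a real value.
FeatureFn = Callable[[str, str], float]


def maxent_probs(
    x: str,
    labels: Sequence[str],
    features: Sequence[FeatureFn],
    weights: Sequence[float],
) -> Dict[str, float]:
    """Return p(y | x) for every label under a conditional maxent model."""
    scores: List[float] = [
        sum(w * f(x, y) for f, w in zip(features, weights)) for y in labels
    ]
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)  # normalization constant Z(x)
    return {y: e / z for y, e in zip(labels, exps)}


# Hypothetical values (NOT from the paper): an assumed estimate of how
# likely a character is to begin a person name.
FIRST_CHAR_PROB = {"王": 0.8, "北": 0.05}


def binary_first_char(x: str, y: str) -> float:
    """Binary feature: fires (1.0) only when the first character is listed."""
    return 1.0 if y == "PER" and x[:1] in FIRST_CHAR_PROB else 0.0


def probabilistic_first_char(x: str, y: str) -> float:
    """Real-valued feature: returns the assumed probability itself."""
    return FIRST_CHAR_PROB.get(x[:1], 0.0) if y == "PER" else 0.0


LABELS = ["PER", "O"]
WEIGHTS = [2.0]  # hypothetical weight; in practice it is learned from data

binary_a = maxent_probs("王小明", LABELS, [binary_first_char], WEIGHTS)
binary_b = maxent_probs("北京", LABELS, [binary_first_char], WEIGHTS)
prob_a = maxent_probs("王小明", LABELS, [probabilistic_first_char], WEIGHTS)
prob_b = maxent_probs("北京", LABELS, [probabilistic_first_char], WEIGHTS)
```

Under these assumed values, the binary feature gives both strings the same score for `PER`, whereas the real-valued feature lets the model separate a common surname character from an unlikely one.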
