A METHOD OF KNOWLEDGE COLLECTION BASED ON WEB TEXT MINING

Abstract

The World Wide Web serves as a huge, widely distributed, global information service center for various applications. The web contains a rich and dynamic collection of hyperlink information, and web mining aims to discover access patterns and hidden knowledge from this huge collection of documents together with hyperlink, access, and usage information. Web mining is divided into three kinds: web content mining, web structure mining, and web usage mining. Web content mining includes text mining and multimedia mining. Among these, web text mining is an efficient technique that discovers valuable and potential knowledge in semi-structured and unstructured texts and hypertexts. After discussing the deficiencies of the existing knowledge collection methods for documents of specialized knowledge, namely natural language understanding and expert systems, a new method of knowledge collection based on web text mining is put forward. First, a unified knowledge representation model that can cover many kinds of knowledge carriers and knowledge representation methods is proposed. XML is used as a bridge to describe the structure of the texts, and the XML tags for knowledge are then designed on the basis of this model. After the text material has been tagged, the knowledge still resides in the original web documents but forms a web document knowledge base with a hierarchical structure. Finally, the corresponding knowledge collection method is studied and a knowledge collection system is developed. The system can understand users' questions posed in natural language, find the related knowledge in the documents, and carry out reasoning. It then answers the users' questions according to the result of the reasoning, achieving semiautomatic knowledge collection from web documents.
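The tagging and lookup steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual XML tag set or the reasoning procedure, so everything below is an assumption made for illustration: the `document`, `section`, and `knowledge` tags, the `keywords` attribute, and the helper names `build_index` and `find_knowledge` are all hypothetical, and plain keyword overlap stands in for the system's natural language understanding and reasoning.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these tag and attribute names are illustrative only
# and are not taken from the paper. The sentences are from the abstract.
TAGGED_DOC = """
<document title="Web mining">
  <section name="Taxonomy">
    <knowledge type="fact" keywords="kinds types mining">
      Web mining is divided into web content mining, web structure mining
      and web usage mining.
    </knowledge>
    <knowledge type="fact" keywords="content text multimedia include">
      Web content mining includes text mining and multimedia mining.
    </knowledge>
  </section>
</document>
"""


def build_index(xml_text):
    """Collect tagged knowledge items together with their place in the hierarchy."""
    root = ET.fromstring(xml_text)
    entries = []
    for section in root.iter("section"):
        for item in section.iter("knowledge"):
            entries.append({
                "path": f'{root.get("title")}/{section.get("name")}',
                "type": item.get("type"),
                "keywords": set(item.get("keywords", "").split()),
                # Collapse the whitespace left over from the XML layout.
                "text": " ".join(item.text.split()),
            })
    return entries


def find_knowledge(entries, question):
    """Rank entries by keyword overlap with the question (a stand-in for reasoning)."""
    words = {w.strip("?.,").lower() for w in question.split()}
    scored = [(len(e["keywords"] & words), e) for e in entries]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [entry for score, entry in ranked if score > 0]


if __name__ == "__main__":
    index = build_index(TAGGED_DOC)
    for hit in find_knowledge(index, "What does web content mining include?"):
        print(hit["path"], "->", hit["text"])
```

The point of the sketch is that the knowledge stays inside the tagged document: the index only records where each item sits in the hierarchy, which mirrors the abstract's description of a document knowledge base rather than a separate extracted store.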
