
OVERSEAS SCIENTIFIC ELECTRONIC PLAIN TEXT COLLECTING/INDEX/EXTRACTION SYSTEM AND METHOD THEREOF, AND MEDIA THAT CAN RECORD COMPUTER PROGRAM THEREOF


Abstract

The present invention relates to a system and method for collecting, indexing, and extracting overseas science and technology electronic full texts, and to a recording medium storing a computer program for the same. To obtain overseas science and technology electronic texts at low cost, and to handle only electronic texts of proven quality and reliability, a robot collects electronic texts from particular target sites (OA sites). The collected electronic texts are converted to text files, and a database is built from the bibliographic metadata extracted from the generated text files. After the file conversion stage, the collected data and the meta-information are stored together and used to build a search index, so that the targeted material can be searched in the same way as an ordinary web search.

The method for collecting, indexing, and extracting overseas science and technology electronic full texts according to the present invention is characterized in that it comprises the steps of: a user entering site information for sites that contain electronic texts, in order to identify overseas sites that provide electronic texts; a web robot collecting the electronic texts using the site list information; converting the collected electronic texts to text information; extracting bibliographic metadata from the converted text information; and inputting the extracted metadata to the database.
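The steps of the method above (site list, collection by a web robot, conversion to text, metadata extraction, database input) can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not specify file formats, extraction rules, or a database, so the HTML input, the "Label: value" layout of the bibliographic fields, the SQLite table, and every function name here (`convert_to_text`, `extract_metadata`, `store_metadata`, `run_pipeline`) are assumptions made for illustration. The `fetch` callable stands in for the web robot and is passed in by the caller, so the sketch runs without network access.

```python
import re
import sqlite3
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document and drops the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        if data.strip():
            self.parts.append(data.strip())


def convert_to_text(raw_html):
    """Conversion step: turn a collected document into plain text."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()
    return "\n".join(parser.parts)


def extract_metadata(text):
    """Extraction step: read 'Label: value' lines (an assumed layout)."""
    fields = {}
    for label in ("Title", "Author", "Year"):
        match = re.search(rf"^{label}:[ \t]*(.+)$", text, flags=re.MULTILINE)
        fields[label.lower()] = match.group(1).strip() if match else None
    return fields


def store_metadata(conn, url, fields):
    """Database step: insert one record per collected document."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents "
        "(url TEXT PRIMARY KEY, title TEXT, author TEXT, year TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO documents VALUES (?, ?, ?, ?)",
        (url, fields["title"], fields["author"], fields["year"]),
    )
    conn.commit()


def run_pipeline(site_list, fetch, conn):
    """Run collection, conversion, extraction, and storage for each URL.

    `site_list` is the user-supplied list of sites; `fetch` is a
    hypothetical stand-in for the web robot (URL in, document out).
    """
    for url in site_list:
        raw = fetch(url)
        text = convert_to_text(raw)
        store_metadata(conn, url, extract_metadata(text))


if __name__ == "__main__":
    # Fake "site" held in memory so the demo needs no network access.
    pages = {
        "https://example.org/paper1": (
            "<html><body><p>Title: Sample Paper</p>"
            "<p>Author: A. Author</p><p>Year: 2005</p></body></html>"
        )
    }
    conn = sqlite3.connect(":memory:")
    run_pipeline(list(pages), pages.__getitem__, conn)
    print(conn.execute("SELECT * FROM documents").fetchall())
```

In a real deployment the `fetch` argument would be replaced by an HTTP client restricted to the registered OA sites, and the extraction rules would depend on the document formats those sites actually publish.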
