AUTOMATIC SEMANTIC INFORMATION EXTRACTION FROM WEB DOCUMENTS FOR SEMANTIC WEB ANNOTATION
Abstract
A method and a system are provided for automatically extracting semantic information from web documents for semantic web annotation, in order to speed up automated semantic processing of large-scale web content. The system comprises a learning data generator (100), an integrated classifier generator (400), and a semantic information extractor (800). The learning data generator collects a large volume of web documents, removes HTML tags from the collected documents, splits compound words, and produces learning data to which semantic tags are attached via a learning data editor. The integrated classifier generator trains a support vector machine (200) and a Bayesian classifier on the learning data and combines the two into an integrated classifier. The semantic information extractor uses the integrated classifier to automatically extract semantic information from new web documents and outputs that information as ontology instances.
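The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the two classifiers are integrated, so averaging their softmax-normalised scores is an assumption made here, as are all names (`TRAIN`, `strip_html`, `IntegratedClassifier`, `extract`), the toy tagged fragments, the hyperparameters, and the dictionary used to stand in for an ontology instance. Compound-word splitting is omitted.

```python
import math
import re
from collections import Counter, defaultdict
from html.parser import HTMLParser

# Hypothetical hand-tagged fragments standing in for the learning data.
TRAIN = [
    ("<b>Price:</b> 25 dollars", "Price"),
    ("<p>costs 40 dollars today</p>", "Price"),
    ("<i>written by John Smith</i>", "Author"),
    ("<span>author: Jane Doe</span>", "Author"),
]

class _TextOnly(HTMLParser):
    """Keeps text nodes and drops every tag."""
    def __init__(self):
        super().__init__()
        self.parts = []
    def handle_data(self, data):
        self.parts.append(data)

def strip_html(html):
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())  # normalise whitespace

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def softmax(scores):
    top = max(scores.values())
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exp.values())
    return {c: e / total for c, e in exp.items()}

class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing."""
    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for tokens, label in zip(docs, labels):
            self.counts[label].update(tokens)
        self.vocab = {t for c in self.counts.values() for t in c}
        return self
    def scores(self, tokens):
        n_docs = sum(self.prior.values())
        out = {}
        for c in self.classes:
            denom = sum(self.counts[c].values()) + len(self.vocab)
            s = math.log(self.prior[c] / n_docs)
            for t in tokens:
                if t in self.vocab:  # ignore unseen words
                    s += math.log((self.counts[c][t] + 1) / denom)
            out[c] = s
        return out

class LinearSVM:
    """One-vs-rest linear SVM: subgradient descent on the L2-regularised hinge loss."""
    @staticmethod
    def _features(tokens):
        feats = Counter(tokens)
        feats["<bias>"] = 1  # cannot collide with a token from tokenize()
        return feats
    def fit(self, docs, labels, epochs=50, lr=0.1, lam=0.01):
        self.classes = sorted(set(labels))
        self.w = {c: defaultdict(float) for c in self.classes}
        for _ in range(epochs):
            for tokens, label in zip(docs, labels):
                feats = self._features(tokens)
                for c in self.classes:
                    y = 1.0 if c == label else -1.0
                    w = self.w[c]
                    margin = y * sum(w[t] * v for t, v in feats.items())
                    for t in list(w):  # regularisation step
                        w[t] *= 1.0 - lr * lam
                    if margin < 1.0:  # hinge loss is active
                        for t, v in feats.items():
                            w[t] += lr * y * v
        return self
    def scores(self, tokens):
        feats = self._features(tokens)
        return {c: sum(self.w[c].get(t, 0.0) * v for t, v in feats.items())
                for c in self.classes}

class IntegratedClassifier:
    """Assumed integration rule: average the two softmax-normalised score sets."""
    def fit(self, docs, labels):
        self.nb = NaiveBayes().fit(docs, labels)
        self.svm = LinearSVM().fit(docs, labels)
        return self
    def predict(self, tokens):
        p_nb = softmax(self.nb.scores(tokens))
        p_svm = softmax(self.svm.scores(tokens))
        combined = {c: 0.5 * (p_nb[c] + p_svm[c]) for c in p_nb}
        return max(combined, key=combined.get)

def train(examples):
    docs = [tokenize(strip_html(html)) for html, _ in examples]
    labels = [label for _, label in examples]
    return IntegratedClassifier().fit(docs, labels)

def extract(clf, html):
    """Returns a hypothetical ontology instance for one HTML fragment."""
    text = strip_html(html)
    return {"type": clf.predict(tokenize(text)), "text": text}
```

On this toy data, `extract(train(TRAIN), "<div>the price is 30 dollars</div>")` yields an instance of type `Price`. Averaging normalised scores is only one of several plausible integration rules; voting or a learned combiner would fit the abstract equally well.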