Published in: International Journal on Computer Science and Engineering

Populating domain specific words from Academic web pages of Tamil Nadu Universities to build domain ontology for educational websites

Abstract

Machine translation from one natural language to another is a challenging task. One approach is interlingua-based: the source language is represented in an intermediate form, which is then translated into the target language. Generating a natural language sentence combines knowledge about the language and the application domain to produce a correct translation, so it is important to prepare a domain-specific corpus. It is equally important to establish the semantic hierarchy among the sets of domain words used in machine translation of a document, since the hierarchy provides semantic links and ontological information for words. Ontologies define concepts and their interrelationships in order to provide a shared vision of a given application domain. One of the main problems is the difficulty of identifying and defining the relevant concepts in the domain. This paper aims to extract knowledge from Tamil Nadu university websites in order to identify the domain-specific words for educational sites. It proposes a method that identifies domain-specific words by utilizing the hierarchical structure of web directories node by node. The method produces a list of domain-dependent words that occur with high frequency.
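
The abstract does not spell out the algorithm, so the following is only a minimal sketch of what node-by-node word counting over a directory hierarchy might look like. The `Node` structure, the `STOP_WORDS` list, the `min_count` threshold, and all function names are assumptions made for illustration and are not taken from the paper.

```python
from __future__ import annotations

import re
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical stop-word list; a real system would use a much fuller one.
STOP_WORDS = {"the", "of", "and", "a", "in", "to", "for", "is", "on", "at"}


@dataclass
class Node:
    """One node of a web-directory hierarchy (assumed structure)."""
    name: str
    text: str = ""
    children: list[Node] = field(default_factory=list)


def tokenize(text: str) -> list[str]:
    """Lower-case the text, keep alphabetic tokens, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def count_words(node: Node) -> Counter:
    """Visit the hierarchy node by node and sum word counts over the subtree."""
    counts = Counter(tokenize(node.text))
    for child in node.children:
        counts += count_words(child)
    return counts


def frequent_words(root: Node, min_count: int = 2) -> list[str]:
    """Return, in alphabetical order, the words seen at least min_count times."""
    counts = count_words(root)
    return sorted(word for word, n in counts.items() if n >= min_count)
```

Calling `frequent_words(root)` on a small tree of university pages returns the words that recur across nodes. Whether the paper filters by raw frequency alone or also compares against a general-language corpus is not stated in the abstract, so the threshold here is a placeholder.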
