ONLINE INTERNET TOPIC MINING METHOD BASED ON IMPROVED LDA MODEL
Abstract
Disclosed is an online Internet topic mining method based on an improved LDA model (On-LDA). The method performs continuous, streaming topic mining in segments: n web pages are processed at a time, these pages are usually acquired from the Internet online and in real time by web crawlers, and mining their contents yields k topics. Once the current n web pages have been processed, the next n newly acquired web pages are processed in the same way. The process mainly comprises initialization of the On-LDA model hyper-parameters, dynamic updating of the On-LDA model hyper-parameters, Internet topic mining based on the On-LDA model, and the like. The invention fundamentally changes how the hyper-parameters α and β of a traditional LDA model are assigned and used during topic mining. The category information of the web page contents is used to assign initial values to the model hyper-parameters, so that these initial values depend entirely on the web page contents to be mined, which simplifies the computation and gives the assignment a rational basis.
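The segmented, streaming control flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the disclosed On-LDA procedure: the abstract gives no formulas, so the names `batches`, `init_alpha_from_categories`, `run_stream` and the `mine_topics` callback are hypothetical, and the count-proportional initialization rule is an assumption chosen only to show hyper-parameters being derived from category information.

```python
from collections import Counter
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple

Page = Tuple[str, str]  # (category label, page text); this layout is an assumption


def batches(pages: Iterable[Page], n: int) -> Iterator[List[Page]]:
    """Yield consecutive segments of up to n pages from a possibly endless stream."""
    it = iter(pages)
    while True:
        segment = list(islice(it, n))
        if not segment:
            return
        yield segment


def init_alpha_from_categories(labels: Sequence[str], k: int,
                               smoothing: float = 1.0) -> List[float]:
    """Hypothetical initializer: k prior weights proportional to the smoothed
    counts of the k most frequent category labels in the segment."""
    weights = [count + smoothing for _, count in Counter(labels).most_common(k)]
    weights += [smoothing] * (k - len(weights))  # pad if fewer than k categories
    total = sum(weights)
    return [w / total for w in weights]


def run_stream(pages: Iterable[Page], n: int, k: int,
               mine_topics: Callable[[List[Page], int, List[float]],
                                     Tuple[List[str], List[float]]]
               ) -> Iterator[List[str]]:
    """Process the stream n pages at a time, yielding k topics per segment.

    mine_topics stands in for the model itself: it receives the segment, k and
    the current hyper-parameter vector, and returns (topics, updated vector).
    """
    alpha = None
    for segment in batches(pages, n):
        if alpha is None:
            # Initial values come from the category labels of the first segment.
            alpha = init_alpha_from_categories([label for label, _ in segment], k)
        # The updated vector is carried over to the next segment.
        topics, alpha = mine_topics(segment, k, alpha)
        yield topics
```

Any topic model that accepts prior weights could be plugged in as `mine_topics`; the sketch only fixes the order of the steps named in the abstract: initialize, mine, update, continue with the next n pages.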