Weibo is a widely used social media platform, yet the correlation between the images and the text of Weibo posts has received little study. To investigate this correlation in Chinese image-text Weibo posts, this paper computes image-text similarity features with three methods, combines them with the text features and social features of the posts, and applies three machine learning methods to classify correlation. The experimental results show that, among the three similarity methods, the WordNet-based and Word-Embedding-based methods perform better, while the cosine-similarity-based method performs worse. Adding text features and social features improves the correlation recognition results for all three machine learning algorithms. Taking these three factors together, the best recognition result is obtained by computing the similarity feature with the Word-Embedding method, combining it with text features and social features, and classifying with a BP neural network.
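As an illustration of the cosine-similarity baseline named above, the following is a minimal sketch, not the paper's implementation. The abstract does not say how an image is turned into comparable terms, so the sketch assumes (hypothetically) that each image has already been described by a list of label words and that the post text has already been segmented into words; the function name `cosine_similarity` and the sample tokens are illustrative only.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two token lists, using raw term counts.

    Assumption: both inputs are plain lists of words, e.g. label words
    describing an image and the segmented words of the post text.
    """
    counts_a = Counter(tokens_a)
    counts_b = Counter(tokens_b)

    # Dot product over the terms the two bags of words share.
    shared_terms = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared_terms)

    # Euclidean norms of the two count vectors.
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))

    # An empty token list has no direction; treat it as zero similarity.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical inputs: image label words and segmented post text.
image_labels = ["cat", "grass"]
post_words = ["cat", "sleeping", "sofa"]
score = cosine_similarity(image_labels, post_words)
```

Because this measure only rewards exact term overlap, an image label and a post word that are synonyms contribute nothing to the score. That limitation is one plausible reason a semantic measure such as a WordNet-based or Word-Embedding-based one could do better, which would be consistent with the ranking reported above.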