Adaptive semi-supervised learning on labeled and unlabeled data with different distributions

Akinori Fujino; Naonori Ueda; Masaaki Nagata

Abstract

Designing good classifiers from labeled samples whose distribution differs from that of the test samples is an important and challenging research issue in machine learning and its applications. This paper focuses on designing semi-supervised classifiers with high generalization ability by using unlabeled samples drawn from the same distribution as the test samples, and presents a semi-supervised learning method based on a hybrid discriminative and generative model. Although JESS-CM is one of the most successful semi-supervised classifier design frameworks based on a hybrid approach, it suffers from overfitting in the task setting considered in this paper. We propose an objective function that uses both labeled and unlabeled samples for the discriminative training of hybrid classifiers, with the expectation that it mitigates the overfitting problem. We show the effect of the objective function through theoretical analysis and empirical evaluation. Our experimental results for text classification on four typical benchmark test collections confirmed that, in our task setting, the proposed method outperformed the JESS-CM framework in most cases. We also confirmed experimentally that the proposed method was useful for obtaining better performance when classifying data samples into known or unknown classes, that is, classes that were or were not included in the given labeled samples.

Bibliographic details

Source
Knowledge and Information Systems | 2013, Issue 1 | 26 pages
Authors
Akinori Fujino; Naonori Ueda; Masaaki Nagata
Original format: PDF
Language: English
Chinese Library Classification: Automation system theory
Keywords
Semi-supervised classifier; Hybrid discriminative and generative model; Transfer learning; Text classification;
