Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing multi-task learning. In one method, a system obtains a respective set of training data for each of multiple machine learning tasks. For each of the machine learning tasks, the system configures a respective teacher machine learning model to perform the machine learning task by training the teacher machine learning model on the training data for that task. The system then trains a single student machine learning model to perform the multiple machine learning tasks using (i) the configured teacher machine learning models, and (ii) the obtained training data.
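The method summarized above can be illustrated with a minimal NumPy sketch. Everything specific in it is an assumption made for illustration and is not taken from the abstract: the synthetic tasks, the linear softmax teachers, the student with one shared hidden layer and one output head per task, the mixing weight `alpha`, and all function names are hypothetical. The abstract does not say how the teachers and the training data are combined; blending teacher predictions with the labels into a single cross-entropy target is just one plausible choice.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def make_task(n=200, d=4, k=3):
    # Hypothetical synthetic task: labels follow a random linear rule.
    X = rng.normal(size=(n, d))
    y = (X @ rng.normal(size=(d, k))).argmax(axis=1)
    return X, y

def train_teacher(X, y, k=3, steps=300, lr=0.5):
    # One teacher per task: a softmax linear classifier trained only on that task's data.
    W = np.zeros((X.shape[1], k))
    Y = np.eye(k)[y]
    for _ in range(steps):
        W -= lr * X.T @ (softmax(X @ W) - Y) / len(X)
    return W

def train_student(tasks, teachers, k=3, hidden=16, steps=500, lr=0.2, alpha=0.5):
    # Single student: a shared hidden layer plus one output head per task.
    d = tasks[0][0].shape[1]
    W1 = rng.normal(scale=0.5, size=(d, hidden))
    heads = [np.zeros((hidden, k)) for _ in tasks]
    for _ in range(steps):
        gW1 = np.zeros_like(W1)
        for t, ((X, y), Wt) in enumerate(zip(tasks, teachers)):
            # Assumed target: blend of (i) teacher predictions and (ii) training labels.
            target = alpha * softmax(X @ Wt) + (1 - alpha) * np.eye(k)[y]
            H = np.tanh(X @ W1)
            G = (softmax(H @ heads[t]) - target) / len(X)  # cross-entropy gradient w.r.t. logits
            gW1 += X.T @ ((G @ heads[t].T) * (1 - H ** 2))  # backpropagate through tanh
            heads[t] -= lr * H.T @ G
        W1 -= lr * gW1  # shared layer receives gradients from every task
    return W1, heads

def accuracy(X, y, W1, head):
    return float(((np.tanh(X @ W1) @ head).argmax(axis=1) == y).mean())

tasks = [make_task(), make_task()]
teachers = [train_teacher(X, y) for X, y in tasks]
W1, heads = train_student(tasks, teachers)
student_acc = [accuracy(X, y, W1, h) for (X, y), h in zip(tasks, heads)]
```

Here `student_acc` holds the student's training accuracy on each task. Setting `alpha=0` would train the student on the labels alone, and `alpha=1` on the teacher outputs alone.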