Unsupervised Learning of Semantic Audio Representations

Abstract

Methods are provided for generating training triplets that can be used to train multidimensional embeddings to represent the semantic content of non-speech sounds present in a corpus of audio recordings. These training triplets can be used with a triplet loss function to train the embeddings so that they can be used to cluster the contents of a corpus of audio recordings, to facilitate query-by-example lookup from the corpus, to generalize from a small number of manually labeled audio recordings, or to facilitate some other audio classification task. The triplet sampling methods may be used individually or collectively, and each represents a respective heuristic about the semantic structure of audio recordings.
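The abstract does not state the exact form of the triplet loss, so the following is only a minimal sketch of one common variant: a hinge loss over squared Euclidean distances. The function names `squared_distance` and `triplet_loss`, the `margin` parameter, and the toy two-dimensional embeddings are illustrative assumptions, not details taken from the patent.

```python
from typing import Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two embeddings of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,  # assumed hyperparameter, not specified in the source
) -> float:
    """Hinge-style triplet loss (an assumed, commonly used form).

    The loss is zero once the anchor is closer to the positive than to
    the negative by at least `margin`; otherwise it grows linearly with
    the size of the violation.
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy embeddings: the positive is near the anchor and the negative is far,
# so the margin is already satisfied and the loss is zero.
satisfied = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 0.0], margin=1.0)

# Here the negative is closer to the anchor than the positive,
# so the triplet is violated and the loss is positive.
violated = triplet_loss([0.0, 0.0], [2.0, 0.0], [1.0, 0.0], margin=0.5)
```

In a training loop, a loss of this kind would be averaged over many sampled triplets and minimized with respect to the parameters of the embedding model, which pulls each anchor toward its positive and away from its negative.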