Unsupervised Learning of Semantic Audio Representations

Abstract

Methods are provided for generating training triplets that can be used to train multidimensional embeddings to represent the semantic content of non-speech sounds present in a corpus of audio recordings. These training triplets can be used with a triplet loss function to train the embeddings so that they can be used to cluster the contents of a corpus of audio recordings, to facilitate query-by-example lookup from the corpus, to allow a small number of manually labeled audio recordings to be generalized, or to facilitate some other audio classification task. The triplet sampling methods may be used individually or collectively, and each represents a respective heuristic about the semantic structure of audio recordings.
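
The abstract does not spell out the form of the triplet loss, so the following is only a minimal sketch of one common variant: a hinge loss on squared Euclidean distances. The function names, the `margin` value, and the toy two-dimensional embeddings are all assumptions made for illustration and are not taken from the patent.

```python
from typing import Iterable, Sequence, Tuple

Vector = Sequence[float]
Triplet = Tuple[Vector, Vector, Vector]  # (anchor, positive, negative)


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two embeddings."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimensionality")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor: Vector, positive: Vector, negative: Vector,
                 margin: float = 0.1) -> float:
    """Hinge triplet loss (assumed form, not quoted from the patent).

    The loss is zero once the anchor is closer to the positive than to
    the negative by at least `margin` (in squared distance).
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


def mean_triplet_loss(triplets: Iterable[Triplet], margin: float = 0.1) -> float:
    """Average the hinge loss over a batch of (anchor, positive, negative)."""
    losses = [triplet_loss(a, p, n, margin) for a, p, n in triplets]
    if not losses:
        raise ValueError("at least one triplet is required")
    return sum(losses) / len(losses)


# Toy embeddings: in the first triplet the constraint is already
# satisfied; in the second the positive and negative are swapped,
# so the constraint is violated.
satisfied = ([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
violated = ([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
batch_loss = mean_triplet_loss([satisfied, violated], margin=0.5)
```

In a real training setup the embeddings would be the outputs of a model applied to audio examples, and the triplets would come from the sampling heuristics the abstract mentions; minimizing this loss pulls each anchor toward its positive and away from its negative.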

Bibliographic data

Publication number: US2020349921A1
Publication date: 2020-11-05
Original format: PDF
Applicant/assignee: GOOGLE LLC
Application number: US201816758564
Inventors: AREN JANSEN; MANOJ PLAKAL; RICHARD CHANNING MOORE; SHAWN HERSHEY; RATHEET PANDYA; RYAN RIFKIN; JIAYANG LIU; DANIEL ELLIS
Filing date: 2018-10-26
Classification: G10L15/06; G10L25/18; G10L15/02; G10L25/51; G06N3/04; G06N3/08
Country: US
Date added to database: 2022-08-21 11:21:15
