Audio content fingerprinting based on two-dimensional constant Q-factor transform representation and robust audio identification for time-aligned applications

Abstract

Content identification methods for consumer devices determine robust audio fingerprints that are resilient to audio distortions. One method generates signatures representing audio content based on a constant Q-factor transform (CQT). A 2D spectral representation of a 1D audio signal facilitates generation of region-based signatures within frequency octaves and across the entire 2D signal representation. Also, points of interest are detected within the 2D audio signal representation, and interest regions are determined around selected points of interest. Another method generates audio descriptors using an accumulating filter function on bands of the audio spectrum and generates audio transform coefficients. A response of each spectral band is computed, and transform coefficients are determined by filtering, by accumulating derivatives with different lags, and by computing second-order derivatives. Additionally, time- and frequency-based onset detection determines audio descriptors at events and enhances descriptors with information related to an event.
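The first method in the abstract (a CQT-based 2D representation, then region-based signatures within octaves) can be illustrated with a minimal sketch. This is not the patented procedure: the abstract gives no parameters, so the function names `naive_cqt` and `region_signature`, the default values (`fmin`, `bins_per_octave`, `hop`, `n_time_blocks`), and the median-threshold bit rule are all assumptions chosen for illustration only.

```python
import numpy as np

def naive_cqt(x, sr, fmin=55.0, bins_per_octave=12, n_octaves=4, hop=512):
    """Naive constant-Q transform (hypothetical helper, illustrative only).

    Returns a magnitude array of shape (n_bins, n_frames).
    """
    # Q (center frequency / bandwidth) is identical for every bin.
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    n_bins = bins_per_octave * n_octaves
    freqs = fmin * 2.0 ** (np.arange(n_bins) / bins_per_octave)
    starts = np.arange(0, len(x), hop)
    out = np.zeros((n_bins, len(starts)))
    for k, f in enumerate(freqs):
        # Window length shrinks as frequency rises, which keeps Q constant.
        n_k = int(np.ceil(q * sr / f))
        t = np.arange(n_k)
        kernel = np.hanning(n_k) * np.exp(-2j * np.pi * f * t / sr) / n_k
        for j, s in enumerate(starts):
            seg = x[s:s + n_k]  # may be shorter than n_k near the end
            out[k, j] = np.abs(np.dot(seg, kernel[:len(seg)]))
    return out

def region_signature(cqt_mag, bins_per_octave=12, n_time_blocks=4):
    """One bit per (octave, time block) region (assumed bit rule)."""
    n_bins, n_frames = cqt_mag.shape
    n_octaves = n_bins // bins_per_octave
    edges = np.linspace(0, n_frames, n_time_blocks + 1).astype(int)
    energy = np.zeros((n_octaves, n_time_blocks))
    for o in range(n_octaves):
        band = cqt_mag[o * bins_per_octave:(o + 1) * bins_per_octave]
        for b in range(n_time_blocks):
            energy[o, b] = band[:, edges[b]:edges[b + 1]].sum()
    # Assumption: a region's bit is set when its energy exceeds the median
    # region energy over the whole 2D representation.
    return (energy > np.median(energy)).astype(np.uint8).ravel()

# Usage: a one-second 440 Hz tone sampled at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)
C = naive_cqt(tone, sr)
sig = region_signature(C)
```

With `fmin=55.0` and 12 bins per octave, 440 Hz sits exactly three octaves above the lowest bin, so the tone's energy concentrates in the fourth octave band. A production system would likely use an optimized CQT rather than this direct per-bin correlation, which is written for readability.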

Bibliographic details

Publication number: US9299364B1

Patent type:
Publication date: 2016-03-29

Original format: PDF
Applicant/patentee: JOSE PIO PEREIRA; MIHAILO M. STOJANCIC; PETER WENDT

Application number: US201213647996
Inventors: PETER WENDT; JOSE PIO PEREIRA; MIHAILO M. STOJANCIC

Filing date: 2012-10-09
Classification: G06F17/00; G10L25/51; G10L19/018
Country: US
Date added to database: 2022-08-21 14:29:02