We have shown that mono audio coding with auditory-system-like spectro-temporal block shaping achieves an average bitrate of 3.3 bit per sample (a compression rate of approximately 1/5). The instantaneous-frequency-based tonality estimation employs a high temporal resolution and is able to improve this result to an average bitrate of 3 bit per sample with no perceptual impact. If a minor perceptual impact (PEAQ quality index > -1) is allowed, the average bitrate can be decreased to 1.8 bit per sample, which is a compression rate of nearly 1/10. This is most remarkable because spectral masking across frequency bands was not included in the masking model. Extending the model towards instantaneous-frequency-based spectral masking estimation will be part of subsequent studies and should decrease the bitrate further. Further studies with a larger database of audio samples are required to confirm the presented results.
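The relation between the quoted bitrates and compression rates can be checked with a short calculation. The sketch below assumes that the uncompressed reference is 16-bit PCM; the text does not state the reference resolution, so `PCM_BITS_PER_SAMPLE` and the helper `compression_rate` are illustrative assumptions only.

```python
# Assumption: the uncompressed reference is 16-bit PCM (not stated in the text).
PCM_BITS_PER_SAMPLE = 16


def compression_rate(coded_bits_per_sample, reference_bits=PCM_BITS_PER_SAMPLE):
    """Return the coded size as a fraction of the uncompressed size."""
    return coded_bits_per_sample / reference_bits


# Average bitrates reported in the text, in bit per sample.
for bits in (3.3, 3.0, 1.8):
    rate = compression_rate(bits)
    print(f"{bits} bit/sample -> rate {rate:.3f} (about 1/{1 / rate:.1f})")
```

Under this assumption, 3.3 bit per sample corresponds to roughly 1/5 of the original size and 1.8 bit per sample to roughly 1/9, in line with the approximate rates given above.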