Multi-Stage Encoding Scheme for Multiple Audio Objects Using Compressed Sensing

Open access


Object-based audio techniques have become common because they allow flexible, personalized rendering. This paper proposes a multi-stage encoding scheme for multiple audio objects, based on intra-object sparsity. In the encoding phase, the dominant Time-Frequency (TF) instants of all active object signals are extracted and divided into several stages to form the multi-stage observation signals for transmission. In the decoding phase, the preserved TF instants are recovered via a Compressed Sensing (CS) technique and then used to reconstruct the audio objects. Evaluations show that the proposed encoding scheme can achieve scalable transmission while maintaining the perceptual quality of each audio object.
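The staging idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single frame of one object's spectrum, ranks TF bins by magnitude, keeps the largest ones, and splits them evenly into stages. The function names (`split_into_stages`, `rebuild`), the even split, and the synthetic spectrum are hypothetical choices made for illustration, and the CS measurement and recovery step is left out entirely.

```python
import numpy as np


def split_into_stages(spectrum, keep, n_stages):
    """Keep the `keep` largest-magnitude bins and split them into stages.

    Stage 0 holds the most dominant bins; later stages hold weaker ones.
    (Hypothetical helper: the paper's actual staging rule may differ.)
    """
    order = np.argsort(-np.abs(spectrum), kind="stable")[:keep]
    groups = np.array_split(order, n_stages)
    return [(idx, spectrum[idx]) for idx in groups]


def rebuild(stages, n_bins, upto):
    """Rebuild a spectrum from the first `upto` stages; other bins stay zero."""
    out = np.zeros(n_bins, dtype=complex)
    for idx, vals in stages[:upto]:
        out[idx] = vals
    return out


# Synthetic, approximately sparse spectrum standing in for one audio object.
rng = np.random.default_rng(0)
n_bins = 64
spectrum = 0.01 * (rng.normal(size=n_bins) + 1j * rng.normal(size=n_bins))
active = rng.choice(n_bins, size=8, replace=False)
spectrum[active] += rng.normal(size=8) + 1j * rng.normal(size=8)

stages = split_into_stages(spectrum, keep=12, n_stages=3)

# Reconstruction error when the decoder receives 1, 2, or 3 stages.
errors = [
    float(np.linalg.norm(spectrum - rebuild(stages, n_bins, s)))
    for s in range(1, 4)
]
print(errors)
```

Because each additional stage restores more of the retained bins exactly, the error in this sketch cannot grow as stages are added, which is the sense in which such a transmission is scalable.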



