Perceptual coding of digital audio

Ted Painter; Andreas Spanias

doi:10.1109/5.842996

Engineering, Ira A. Fulton Schools of (IAFSE)

Research output: Contribution to journal › Article › peer-review

623 Scopus citations

Abstract

During the last decade, CD-quality digital audio has essentially replaced analog audio. Emerging digital audio applications for network, wireless, and multimedia computing systems face a series of constraints such as reduced channel bandwidth, limited storage capacity, and low cost. These new applications have created a demand for high-quality digital audio delivery at low bit rates. In response to this need, considerable research has been devoted to the development of algorithms for perceptually transparent coding of high-fidelity (CD-quality) digital audio. As a result, many algorithms have been proposed, and several have now become international and/or commercial product standards. This paper reviews algorithms for perceptually transparent coding of CD-quality digital audio, including both research and standardization activities. This paper is organized as follows. First, psychoacoustic principles are described, with the MPEG psychoacoustic signal analysis model 1 discussed in some detail. Next, filter bank design issues and algorithms are addressed, with a particular emphasis placed on the modified discrete cosine transform, a perfect reconstruction cosine-modulated filter bank that has become of central importance in perceptual audio coding. Then, we review methodologies that achieve perceptually transparent coding of FM- and CD-quality audio signals, including algorithms that manipulate transform components, subband signal decompositions, sinusoidal signal components, and linear prediction parameters, as well as hybrid algorithms that make use of more than one signal model. These discussions concentrate on architectures and applications of those techniques that utilize psychoacoustic models to exploit efficiently masking characteristics of the human receiver. Several algorithms that have become international and/or commercial standards receive in-depth treatment, including the ISO/IEC MPEG family (-1, -2, -4), the Lucent Technologies PAC/EPAC/MPAC, the Dolby AC-2/AC-3, and the Sony ATRAC/SDDS algorithms. Then, we describe subjective evaluation methodologies in some detail, including the ITU-R BS.1116 recommendation on subjective measurements of small impairments. This paper concludes with a discussion of future research directions.
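
The abstract describes the modified discrete cosine transform (MDCT) as a perfect reconstruction cosine-modulated filter bank. The following is a minimal sketch of that property, not code from the paper: it assumes a sine window, 50% overlapped frames of length 2N, and a direct O(N^2) evaluation in pure Python. The function names (`sine_window`, `mdct`, `imdct`, `analysis_synthesis`) are hypothetical and chosen only for illustration.

```python
import math

def sine_window(N):
    """Length-2N sine window; it satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def _basis(n, k, N):
    # MDCT cosine kernel with the usual phase offset of 1/2 + N/2.
    return math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))

def mdct(frame, window):
    """Map a windowed frame of 2N samples to N coefficients."""
    N = len(frame) // 2
    return [sum(window[n] * frame[n] * _basis(n, k, N) for n in range(2 * N))
            for k in range(N)]

def imdct(coeffs, window):
    """Map N coefficients back to 2N windowed, time-aliased samples."""
    N = len(coeffs)
    return [(2.0 / N) * window[n] * sum(coeffs[k] * _basis(n, k, N) for k in range(N))
            for n in range(2 * N)]

def analysis_synthesis(x, N):
    """Run MDCT then IMDCT with hop size N and overlap-add the results."""
    if len(x) % N != 0:
        raise ValueError("len(x) must be a multiple of N in this sketch")
    window = sine_window(N)
    # Zero-pad one block on each side so every input sample lies in two frames.
    padded = [0.0] * N + list(x) + [0.0] * N
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        y = imdct(mdct(padded[start:start + 2 * N], window), window)
        for n, value in enumerate(y):
            out[start + n] += value  # aliasing cancels between adjacent frames
    return out[N:N + len(x)]

# Example: compare a short test signal with its reconstruction.
signal = [math.sin(0.3 * i) + 0.5 * math.cos(1.1 * i) for i in range(32)]
recon = analysis_synthesis(signal, 8)
print(max(abs(a - b) for a, b in zip(signal, recon)))
```

Each hop of N new samples yields N coefficients, so the transform is critically sampled even though consecutive frames overlap by half. A single inverse-transformed frame contains time-domain aliasing; the input is recovered only after adjacent frames are overlap-added.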

Original language	English (US)
Pages (from-to)	451-512
Number of pages	62
Journal	Proceedings of the IEEE
Volume	88
Issue number	4
DOIs	https://doi.org/10.1109/5.842996
State	Published - 2000

Keywords

AC-2
AC-3
ATRAC
Advanced audio coding (AAC)
Audio coding
Audio coding standards
Audio signal processing
Data compression
Digital audio radio (DAR)
Digital broadcast audio (DBA)
Filter banks
High-definition TV (HDTV)
MPEG

ASJC Scopus subject areas

General Computer Science
Electrical and Electronic Engineering

Cite this

@article{e8a365118e5d4090bdfa50d2db3f611c,

title = "Perceptual coding of digital audio",

abstract = "During the last decade, CD-quality digital audio has essentially replaced analog audio. Emerging digital audio applications for network, wireless, and multimedia computing systems face a series of constraints such as reduced channel bandwidth, limited storage capacity, and low cost. These new applications have created a demand for high-quality digital audio delivery at low bit rates. In response to this need, considerable research has been devoted to the development of algorithms for perceptually transparent coding of high-fidelity (CD-quality) digital audio. As a result, many algorithms have been proposed, and several have now become international and/or commercial product standards. This paper reviews algorithms for perceptually transparent coding of CD-quality digital audio, including both research and standardization activities. This paper is organized as follows. First, psychoacoustic principles are described, with the MPEG psychoacoustic signal analysis model 1 discussed in some detail. Next, filter bank design issues and algorithms are addressed, with a particular emphasis placed on the modified discrete cosine transform, a perfect reconstruction cosine-modulated filter bank that has become of central importance in perceptual audio coding. Then, we review methodologies that achieve perceptually transparent coding of FM- and CD-quality audio signals, including algorithms that manipulate transform components, subband signal decompositions, sinusoidal signal components, and linear prediction parameters, as well as hybrid algorithms that make use of more than one signal model. These discussions concentrate on architectures and applications of those techniques that utilize psychoacoustic models to exploit efficiently masking characteristics of the human receiver. Several algorithms that have become international and/or commercial standards receive in-depth treatment, including the ISO/IEC MPEG family (-1, -2, -4), the Lucent Technologies PAC/EPAC/MPAC, the Dolby AC-2/AC-3, and the Sony ATRAC/SDDS algorithms. Then, we describe subjective evaluation methodologies in some detail, including the ITU-R BS.1116 recommendation on subjective measurements of small impairments. This paper concludes with a discussion of future research directions.",

keywords = "AC-2, AC-3, ATRAC, Advanced audio coding (AAC), Audio coding, Audio coding standards, Audio signal processing, Data compression, Digital audio radio (DAR), Digital broadcast audio (DBA), Filter banks, High-definition TV (HDTV), MPEG",

author = "Ted Painter and Andreas Spanias",

note = "Funding Information: Manuscript received November 17, 1999; revised January 24, 2000. This work was supported in part by the NDTC Committee of Intel Corp. under a Grant. The authors are with the Department of Electrical Engineering, Telecommunications Research Center, Arizona State University, Tempe, AZ 85287-7206 (e-mail: spanias@asu.edu; painter@asu.edu). Publisher Item Identifier S 0018-9219(00)03054-1.",

year = "2000",

doi = "10.1109/5.842996",

language = "English (US)",

volume = "88",

pages = "451--512",

journal = "Proceedings of the IEEE",

issn = "0018-9219",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "4",

}

TY - JOUR

T1 - Perceptual coding of digital audio

AU - Painter, Ted

AU - Spanias, Andreas

N1 - Funding Information: Manuscript received November 17, 1999; revised January 24, 2000. This work was supported in part by the NDTC Committee of Intel Corp. under a Grant. The authors are with the Department of Electrical Engineering, Telecommunications Research Center, Arizona State University, Tempe, AZ 85287-7206 (e-mail: spanias@asu.edu; painter@asu.edu). Publisher Item Identifier S 0018-9219(00)03054-1.

PY - 2000

Y1 - 2000

N2 - During the last decade, CD-quality digital audio has essentially replaced analog audio. Emerging digital audio applications for network, wireless, and multimedia computing systems face a series of constraints such as reduced channel bandwidth, limited storage capacity, and low cost. These new applications have created a demand for high-quality digital audio delivery at low bit rates. In response to this need, considerable research has been devoted to the development of algorithms for perceptually transparent coding of high-fidelity (CD-quality) digital audio. As a result, many algorithms have been proposed, and several have now become international and/or commercial product standards. This paper reviews algorithms for perceptually transparent coding of CD-quality digital audio, including both research and standardization activities. This paper is organized as follows. First, psychoacoustic principles are described, with the MPEG psychoacoustic signal analysis model 1 discussed in some detail. Next, filter bank design issues and algorithms are addressed, with a particular emphasis placed on the modified discrete cosine transform, a perfect reconstruction cosine-modulated filter bank that has become of central importance in perceptual audio coding. Then, we review methodologies that achieve perceptually transparent coding of FM- and CD-quality audio signals, including algorithms that manipulate transform components, subband signal decompositions, sinusoidal signal components, and linear prediction parameters, as well as hybrid algorithms that make use of more than one signal model. These discussions concentrate on architectures and applications of those techniques that utilize psychoacoustic models to exploit efficiently masking characteristics of the human receiver. Several algorithms that have become international and/or commercial standards receive in-depth treatment, including the ISO/IEC MPEG family (-1, -2, -4), the Lucent Technologies PAC/EPAC/MPAC, the Dolby AC-2/AC-3, and the Sony ATRAC/SDDS algorithms. Then, we describe subjective evaluation methodologies in some detail, including the ITU-R BS.1116 recommendation on subjective measurements of small impairments. This paper concludes with a discussion of future research directions.

AB - During the last decade, CD-quality digital audio has essentially replaced analog audio. Emerging digital audio applications for network, wireless, and multimedia computing systems face a series of constraints such as reduced channel bandwidth, limited storage capacity, and low cost. These new applications have created a demand for high-quality digital audio delivery at low bit rates. In response to this need, considerable research has been devoted to the development of algorithms for perceptually transparent coding of high-fidelity (CD-quality) digital audio. As a result, many algorithms have been proposed, and several have now become international and/or commercial product standards. This paper reviews algorithms for perceptually transparent coding of CD-quality digital audio, including both research and standardization activities. This paper is organized as follows. First, psychoacoustic principles are described, with the MPEG psychoacoustic signal analysis model 1 discussed in some detail. Next, filter bank design issues and algorithms are addressed, with a particular emphasis placed on the modified discrete cosine transform, a perfect reconstruction cosine-modulated filter bank that has become of central importance in perceptual audio coding. Then, we review methodologies that achieve perceptually transparent coding of FM- and CD-quality audio signals, including algorithms that manipulate transform components, subband signal decompositions, sinusoidal signal components, and linear prediction parameters, as well as hybrid algorithms that make use of more than one signal model. These discussions concentrate on architectures and applications of those techniques that utilize psychoacoustic models to exploit efficiently masking characteristics of the human receiver. Several algorithms that have become international and/or commercial standards receive in-depth treatment, including the ISO/IEC MPEG family (-1, -2, -4), the Lucent Technologies PAC/EPAC/MPAC, the Dolby AC-2/AC-3, and the Sony ATRAC/SDDS algorithms. Then, we describe subjective evaluation methodologies in some detail, including the ITU-R BS.1116 recommendation on subjective measurements of small impairments. This paper concludes with a discussion of future research directions.

KW - AC-2

KW - AC-3

KW - ATRAC

KW - Advanced audio coding (AAC)

KW - Audio coding

KW - Audio coding standards

KW - Audio signal processing

KW - Data compression

KW - Digital audio radio (DAR)

KW - Digital broadcast audio (DBA)

KW - Filter banks

KW - High-definition TV (HDTV)

KW - MPEG

UR - http://www.scopus.com/inward/record.url?scp=0034172308&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0034172308&partnerID=8YFLogxK

U2 - 10.1109/5.842996

DO - 10.1109/5.842996

M3 - Article

AN - SCOPUS:0034172308

SN - 0018-9219

VL - 88

SP - 451

EP - 512

JO - Proceedings of the IEEE

JF - Proceedings of the IEEE

IS - 4

ER -
