2013 MPEG Audio Codecs IRT Kolloquium

AUDIO CODING IN MPEG: CURRENT STATUS AND
OUTLOOK ON FUTURE DEVELOPMENTS

Manfred Lutzky, manfred.lutzky@iis.fraunhofer.de
Nikolaus Rettelbach, nikolaus.rettelbach@iis.fraunhofer.de
Fraunhofer IIS
Outline
MPEG Audio
Nikolaus Rettelbach
Overview of Mature Standards

Recent Standards
Ongoing Standardization
MPEG Audio Communication Codecs
Manfred Lutzky
MPEG-Audio: Mature Standards
[Figure: timeline with axis marks 1992, 1997, 2003, 2007, 2012; standards listed newest first]
MPEG-4: AAC-ELD
MPEG-D: MPEG Surround
MPEG-4 HE-AAC, HE-AACv2
MPEG-4: Audio (AAC-LC/LD/SCAL, TwinVQ, BSAC)
MPEG-2: NBC = AAC
MPEG-1/2: Layer I, II, III
MPEG-Audio: Standards and Applications

Standard (ISO/IEC number): Applications
MPEG-1 Audio (ISO/IEC 11172-3)
  Layer-I: ~PASC = DCC (Digital Compact Cassette)
  Layer-II: DAB, DVB, DVD (Europe)
  Layer-III (mp3): iPod, Internet Streaming, Electronic Music Distribution
MPEG-2 Audio (ISO/IEC 13818-3)
  Layer-I,II,III: adds low sampling rates
  5.1 extension: not used
MPEG-2 Audio (ISO/IEC 13818-7)
  AAC-LC: ARIB (Japan TV)
MPEG-4 Audio (ISO/IEC 14496-3)
  AAC-(E)LD: EBU N/ACIP, FaceTime, VideoConferencing, iOS, Android, MacOSX
  HE-AAC/HE-AACv2: 3GPP, DAB+, DRM, DASH, HbbTV, DVB-T, ATSC, ARIB Brazil, iOS, Android, MacOSX, MS Windows
MPEG-D (ISO/IEC 23003)
  MPEG Surround (23003-1): DAB+, DVB, DASH, DECE
MPEG-Audio: Mature Standards
MPEG-4 HE-AAC, HE-AACv2
MPEG-4 HE-AACv2: ISO/IEC 14496-3

HE-AACv2 is a combination of:
Low Complexity AAC (AAC-LC)
Spectral Band Replication (SBR)
Parametric Stereo (PS)
MPEG-4 HE-AACv2: AAC-LC

Classic Perceptual/Transform Coder
[Figure: block diagram.
Perceptual Encoder: Audio Input -> Time/Frequency (MDCT) -> Quantization (Scaling/Noise Shaping), controlled by Psychoacoustic Control -> Noiseless Coding -> Bit Mux, with the scale-factors carried alongside.
Decoder: Bit Demux -> Noiseless Decoding (Huffman) -> Inv. Quantization/Rescaling, using the scale-factors -> Frequency/Time (IMDCT) -> Audio Output.]
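The time/frequency stage of such a transform coder can be illustrated with a small MDCT/IMDCT round trip. This is a minimal sketch, assuming a sine window and 50% overlap-add; the function names `mdct`, `imdct` and `roundtrip` are made up for the illustration, and the block is not the normative AAC filterbank (no block switching, no quantization).

```python
import numpy as np

def mdct(frame, window):
    """Forward MDCT: 2N windowed time samples -> N coefficients."""
    n = len(frame) // 2
    k = np.arange(n)
    t = np.arange(2 * n)
    basis = np.cos(np.pi / n * (t[None, :] + 0.5 + n / 2) * (k[:, None] + 0.5))
    return basis @ (frame * window)

def imdct(coeffs, window):
    """Inverse MDCT: N coefficients -> 2N windowed, time-aliased samples."""
    n = len(coeffs)
    k = np.arange(n)
    t = np.arange(2 * n)
    basis = np.cos(np.pi / n * (t[:, None] + 0.5 + n / 2) * (k[None, :] + 0.5))
    return (2.0 / n) * window * (basis @ coeffs)

def roundtrip(signal, n):
    """Analyse and resynthesise with hop size n; aliasing cancels in the overlap."""
    # Sine window: w[t]^2 + w[t + n]^2 = 1 (Princen-Bradley condition)
    window = np.sin(np.pi * (np.arange(2 * n) + 0.5) / (2 * n))
    out = np.zeros(len(signal))
    for start in range(0, len(signal) - 2 * n + 1, n):
        coeffs = mdct(signal[start:start + 2 * n], window)
        out[start:start + 2 * n] += imdct(coeffs, window)
    return out
```

Only samples covered by two overlapping frames are reconstructed, so the first and last `n` samples of the output are incomplete. A real encoder would quantize `coeffs` between the two transforms.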
HE-AACv2: Spectral Band Replication (SBR)
Encoder:
Encode lower frequencies only
Describe high frequencies parametrically (Tonality, Energy-Envelope, ...)
Decoder:
Decode lower frequencies
Replicate higher frequencies by copying up and adapting the decoded spectrum according to transmitted SBR parameters
[Figure: amplitude versus frequency at the encoder and at the decoder, split at the crossover frequency (xover)]
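The copy-up principle of SBR can be sketched on a toy spectrum. This is a simplified illustration, assuming a plain array of spectral values, a single energy envelope per band and no tonality handling; `sbr_encode`, `sbr_decode`, `xover` and `n_bands` are hypothetical names, and the standardized SBR tool is considerably more elaborate.

```python
import numpy as np

def sbr_encode(spectrum, xover, n_bands):
    """Keep the low band as is; describe the high band by per-band mean energy."""
    low = spectrum[:xover].copy()
    high_bands = np.array_split(spectrum[xover:], n_bands)
    envelope = np.array([np.mean(np.abs(band) ** 2) for band in high_bands])
    return low, envelope

def sbr_decode(low, envelope, n_total):
    """Copy the low band upwards and scale each band to the transmitted energy."""
    n_high = n_total - len(low)
    patch = np.resize(low, n_high)  # repeats the low band until the high band is filled
    scaled = []
    for band, target in zip(np.array_split(patch, len(envelope)), envelope):
        current = np.mean(np.abs(band) ** 2)
        gain = np.sqrt(target / current) if current > 0 else 0.0
        scaled.append(band * gain)
    return np.concatenate([low, *scaled])
```

The decoder output matches the original only in the low band and in the per-band energies of the high band. That is the point of the parametric description: the fine structure above the crossover is not transmitted.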
HE-AACv2: Parametric Stereo (PS)

Encoder:
Encode Mono Downmix with AAC Encoder
Parametrize Stereo Image (Panning, Correlation)
Decoder:
Decode Mono Downmix
Regenerate Stereo Image based on transmitted parameters
[EBU TECHNICAL REVIEW January 2006: MPEG-4 HE-AACv2 audio coding for today's digital media world; Stefan Meltzer and Gerald Moser]
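The encoder/decoder split of Parametric Stereo can be sketched as follows. This is a minimal illustration, assuming one full-band level-difference (panning) parameter per signal and leaving out the correlation parameter; the names `ps_encode`, `ps_decode` and `iid_db` are hypothetical, and the standardized tool works per frequency band and time slot.

```python
import numpy as np

def ps_encode(left, right, eps=1e-12):
    """Return the mono downmix and the inter-channel level difference in dB."""
    mono = 0.5 * (left + right)
    energy_left = np.sum(left ** 2)
    energy_right = np.sum(right ** 2)
    iid_db = 10.0 * np.log10((energy_left + eps) / (energy_right + eps))
    return mono, iid_db

def ps_decode(mono, iid_db):
    """Regenerate a stereo image from the downmix and the panning parameter."""
    ratio = 10.0 ** (iid_db / 20.0)          # amplitude ratio left/right
    gain_left = 2.0 * ratio / (1.0 + ratio)
    gain_right = 2.0 / (1.0 + ratio)
    return gain_left * mono, gain_right * mono
```

For a single source that is amplitude-panned between the channels, this sketch reproduces both channels. For decorrelated content the panning parameter alone is not enough, which is why the slide also lists correlation.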
MPEG-4 HE-AACv2
State-of-the-art efficient high-quality multi-channel and stereo audio codec for more than 10 years
Used in TV, radio, and streaming worldwide
in more than 5 billion devices today
[ETSI TS 101 154]
MPEG-Audio: Mature Standards
MPEG Surround
MPEG Surround Codec Principle

[Figure: signal flow.
Encoder side: 5.1 PCM -> MPS Encoder (automatic stereo downmix) -> HE-AAC Encoder; output is a stereo bitstream plus an MPS bitstream.
Decoder side: HE-AAC Decoder -> stereo downmix -> stereo playback directly, or via the MPS Decoder to 5.1 playback or binaural playback.]
MPEG Surround
High-Quality Surround Sound at Stereo Bit-Rates
MPEG Surround allows an efficient and backward compatible compression
of high-quality surround sound
Six for the price of two

multi-channel audio at stereo bit-rates
side information (e.g. 4-10 kbps) transparently carried in the audio stream
easy upgrade to any stereo transmission system for multi-channel
applications
One file for all

MPEG Surround files can be played back on stereo and surround
loudspeaker systems, or ordinary headphones
MPEG-Audio: Recent and Future Standards
[Figure: timeline with axis marks 2003, 2007, 2012; standards listed newest first]
MPEG-H: 3-D Audio
Dialogue Enhancement
xHE-AAC, AAC-ELDv2
MPEG SAOC (Spatial Audio Object Coding)
Extended HE-AAC
MPEG xHE-AAC
Extended HE AAC: Motivation/Background
State of the art codecs in 2007:

HE-AACv2 for music and generic audio
AMR-WB+ for speech
Call for Proposals issued in Autumn 2007 with the aim to develop a
codec which
performs comparable to or better than the best coding technology that
might be tailored specifically to coding of either speech or general audio
content
Unified Speech and Audio Coding (USAC)
MPEG-D Part-3 USAC published in early 2012 (ISO/IEC 23003-3)
Extended HE AAC: USAC and the AAC family
Final algorithmic design of the codec is an upgrade of

MPEG-4 High Efficiency AAC v2 (HE-AACv2, enh. AAC+)
New profile Extended HE AAC, includes HE-AACv2 and USAC.
New devices will be backwards compatible by design
Extended HE AAC: General Codec Structure
Well known from HE-AACv2 or comparable codecs:
(Partially) parametric stereo coding stage: Unified Stereo Coding
Enhanced Spectral Band Replication (eSBR) for parametric coding of high audio frequencies: much improved SBR as known from HE-AAC (v2)
Remaining core signal coded with MDCT based transform coder amended with speech coding technologies
[Figure: encoder and decoder block diagrams]
Extended HE AAC: Core Coder
MDCT based transform coder base as known from AAC
Plus speech coding derived technologies:
LPC based spectral shaping
Time domain based ACELP
Bass postfilter (pitch enhancement)
[Figure: Decoder Detailed Structure. Blocks: Bitstream De-Multiplex; Arithm. Dec.; Inv. Quant.; Scalefactors; Scaling; LPC Dec.; LPC to Freq. Dom.; ACELP; LPC Synth Filter; IMDCT; FAC; Windowing, Overlap-Add; Bass Postfilter; Bandwidth Extension; Stereo Processing; output: Uncompressed PCM Audio]
Extended HE AAC: Technical Highlights
Many new tools and improvements throughout the system
Technical Highlights:
Frequency Domain Noise Shaping
Forward Aliasing Cancellation
Context Adaptive Arithmetic Coding
Harmonic Transposition
Unified Stereo Coding
[Figure: decoder block diagram, from Bitstream De-Multiplex through Bandwidth Extension and Stereo Processing to Uncompressed PCM Audio]
Extended HE AAC: Performance Overview

[Figure: two panels, Mono and Stereo. MUSHRA score (0 to 100) versus Bit Rate (kbit/s) for USAC, HE-AAC and AMR codecs]
Extended HE-AAC: Performance II
Low bitrates [8 kbps mono / 16 kbps stereo to ~24/32 kbps]:

better speech quality than any speech codec
Perceived audio quality: up to ~20-40 point improvement on a 100
point scale for speech over HE-AAC at very low bit rates
New state of the art for mono and stereo coding
Intermediate bitrates [32 to ~96 kbps stereo]:
Better performance compared to HE-AAC through e.g.
Unified stereo
Improved entropy coding
Extended HE-AAC: Conclusions
Extended HE-AAC allows coding of arbitrary content (music, speech,

mixed) at bitrates as low as
8 kbit/s mono, 16 kbit/s stereo
while also allowing coding of highest quality at high rates and
multi-channel
Optimal codec for transmission over bandwidth limited channels, e.g.
over-the-air broadcasting, streaming to mobile devices
Outperforms legacy state-of-the-art dedicated speech and audio codecs
Extended HE AAC is an upgrade to HE-AACv2, thus guaranteeing
backwards compatibility with existing services and content
MPEG SAOC and Dialogue Enhancement
MPEG SAOC: Dialogue Enhancement
MPEG SAOC (Spatial Audio Object Coding)
Personalized User Experience
User benefit
Enables users to change the balance between dialogue and
background according to individual preferences
Dialog enhancement for better intelligibility
Adaptation to listening environment
Broadcaster benefit
Same audio mix for all listening environments
No need to send different audio versions
Backwards compatible with existing devices
Cost efficient hearing-impaired audio service
Today's Challenge
Finding the Right Mix
An audio mix with one balance between dialogue and background is always
a compromise
Hearing-impaired people require a higher loudness of the dialogue
Non-native speakers need about 3 dB higher S/N
The listening environment has an influence on the preferred setting of
the mix
Depends on content, e.g.
Sport events: commentary vs. stadium atmosphere
Movies: music & effects vs. dialogue level
MPEG Spatial Audio Object Coding (SAOC)

Efficient transmission and rendering of multiple audio objects with
Interactivity / Personalization
[Figure: SAOC Encoder -> SAOC Decoder signal chain]
Object based with parameterized objects
Each audio element is treated as an object.
The objects are parameterized to allow their manipulation at the
receiver. The parameters are sent with the mix.
Backward compatible transmission
Audio mix is not changed
Parameters embedded into bitstream
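The idea of an unchanged mix plus object parameters can be sketched with two objects, dialog and background. This is a toy illustration, assuming one parameter per time-frequency tile (the dialog's share of the tile energy) and a simple proportional split at the receiver; `saoc_like_encode`, `saoc_like_decode` and `dialog_share` are hypothetical names, and this is not the normative SAOC algorithm.

```python
import numpy as np

def saoc_like_encode(dialog, background, eps=1e-12):
    """Return the unchanged mix and the per-tile dialog energy share in [0, 1]."""
    mix = dialog + background
    dialog_power = np.abs(dialog) ** 2
    background_power = np.abs(background) ** 2
    dialog_share = dialog_power / (dialog_power + background_power + eps)
    return mix, dialog_share

def saoc_like_decode(mix, dialog_share, dialog_gain_db=0.0):
    """Re-balance dialog against background using only the mix and the parameters."""
    gain = 10.0 ** (dialog_gain_db / 20.0)
    dialog_estimate = dialog_share * mix
    background_estimate = (1.0 - dialog_share) * mix
    return gain * dialog_estimate + background_estimate
```

With `dialog_gain_db=0.0` the output equals the transmitted mix, which mirrors the backward-compatibility point on the slide: a receiver that ignores the parameters simply plays the mix.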
Signal Flow Overview
Scenarios
Stereo
Mono Dialog or Stereo Dialog
Stereo Background
5.1 Multi-channel
Mono Dialog, Center-only
Stereo Dialog: all three front channels (Left, Center, Right) contain
Dialog signal parts
5.1 Background
Dialogue Enhancement:
Workflow Integration
Separate objects (sources) are available at the encoder:
Dialog and Background
Mix is done in the SAOC encoder
[Figure: Dialog and Background feed the SAOC Encoder, which outputs the Mix and the Parameters; the Mix passes through the AAC Encoder, and both end up in the Bitstream]
MPEG: Current Standardization
MPEG-H: 3-D Audio
MPEG-H 3D-AUDIO
MPEG-H 3D-Audio
Idea
Universal bitrate-efficient high quality compression format

Format can represent spatial audio from very high spatial fidelity (22.2
loudspeakers) down to stereo
Decoder/renderer is able to render from 22.2 down to stereo or
headphone reproduction
Rendering to non-ideal (misplaced) loudspeaker setups
Bitrates from 1200 kbps down to 256 kbps for 22.2 channel material
Application Scenarios
Home Theatre, Personal 3D TV, TV for Smartphones/Tablets
MPEG-H 3D-Audio
Format
Two format approaches under parallel consideration

Channels + Objects (C+O)
Channels: Waveforms for dedicated loudspeaker signals
Objects: Waveforms where rendering position is variable
Higher Order Ambisonics (HOA)
A recording and storage format for a whole 3D scene
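The difference between a channel and an object is that the object's rendering position is decided at playback. As a minimal sketch, assuming a two-loudspeaker setup and a constant-power panning law (the slides do not specify the renderer, so both are assumptions, and the function names are hypothetical):

```python
import math

def constant_power_pan(position):
    """Gains for a position between 0.0 (left loudspeaker) and 1.0 (right loudspeaker)."""
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must lie in [0, 1]")
    angle = position * math.pi / 2.0
    return math.cos(angle), math.sin(angle)

def render_object(samples, position):
    """Render one object waveform to a left and a right loudspeaker feed."""
    gain_left, gain_right = constant_power_pan(position)
    left = [gain_left * s for s in samples]
    right = [gain_right * s for s in samples]
    return left, right
```

A channel signal, by contrast, would be routed to its dedicated loudspeaker without any position parameter.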
MPEG-H 3D-Audio
Flowchart Channel+Object
[Figure: compressed bitstream (256 ... 1200 kbps) -> Channel+Object Decoder and Object Renderer -> one of: direct loudspeaker output (22.2); format conversion to a reduced number of loudspeakers (e.g. 8.1, 5.1); headphone processing]
MPEG-H 3D-Audio
Timeline
Call for technology proposals since 01-2013

Submission of technology candidates, 06-2013
Selection of reference model 0 technology, 08-2013
Merge of Channel+Object and HOA-technology is envisioned
Work item to be finalized in 2015
MPEG-Audio: Communication Codecs
MPEG-4 AAC-ELDv2
MPEG-4 AAC-ELD
MPEG-4 AAC-LD
ISO/MPEG low delay Audio Coding
[Figure: algorithmic delay [ms] (axis 0 to 240) versus bit rate per channel [kbps] (axis 10 to 128) for AAC-LC, HE-AAC, HE-AAC v2 and MPEG Surround, labeled with years between 1999 and 2006]

Low delay AAC (AAC-LD)
[Figure: algorithmic delay versus bit rate per channel, with AAC-LD (2000) added to AAC-LC, HE-AAC, HE-AAC v2 and MPEG Surround]

Enhanced Low Delay AAC (AAC-ELD)
[Figure: algorithmic delay versus bit rate per channel, with AAC-ELD (2008) added to AAC-LD, AAC-LC, HE-AAC, HE-AAC v2 and MPEG Surround]

AAC-ELDv2 - 3GPP EVS
[Figure: algorithmic delay versus bit rate per channel, with AAC-ELD v2 (2011) added to AAC-ELD, AAC-LD, AAC-LC, HE-AAC, HE-AAC v2 and MPEG Surround; annotation: AAC-ELD v2 stereo operation mode]

[Figure: algorithmic delay versus bit rate per channel, with 3GPP EVS (2014 sched.) added to AAC-ELD v2, AAC-ELD, AAC-LD, AAC-LC, HE-AAC, HE-AAC v2 and MPEG Surround]
ISO/MPEG AAC-ELD
[Figure: AAC-LD, extended with + SBR to AAC-ELD (2008), extended with + MPS to AAC-ELD v2 (2011)]
Status:
AAC-ELD International MPEG Standard Q4/2007
AAC-ELD v2 International MPEG Standard, part of MPEG SAOC
Innovation of AAC-ELD:
Low delay Spectral Bandwidth Replication (SBR)
Delay optimized filterbank/window
Innovation of AAC-ELD v2:
Low Delay MPEG Surround (MPS)
Delay optimized codec structure
iChat & FaceTime
ISO/MPEG AAC-ELD
Main features:
                          AAC-LD                    AAC-ELD                   AAC-ELD v2
Bandwidth                 Full audible bandwidth up to 48 kHz sampling rate
Algorithmic delay         20-40 ms                  15-32 ms                  21-39 ms (stereo mode)
Typical bitrate [kbit/s]  32 (mono) - 128 (stereo)  24 (mono) - 128 (stereo)  24-48 (stereo)
Frequency domain mixing
Josef-von-Fraunhofer Prize 2011
ISO/MPEG AAC-ELD
licensing
patent pool for all relevant AAC family members run by Via Licensing
AAC-LD, AAC-ELD v2 and AAC-LC, HE AACv2 part of unified license
Licensors: AT&T, Dolby, Fraunhofer, Philips, LG, Microsoft, NEC, Nokia, NTT,
Orange, Panasonic, Sony, Ericsson
http://www.vialicensing.com/licensing/aac-overview.aspx
ISO/MPEG AAC-ELD
stereo quality
ISO/MPEG AAC-ELD
AAC-ELD delay-optimized time-to-frequency transforms
AAC-LD: two window shapes
AAC-ELD: one universal low delay window
[Figure: window shapes, with the delay saving of the low delay window marked]
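How a shorter window overlap translates into delay can be checked with simple arithmetic. This is a sketch under stated assumptions: the delay is modeled as frame length plus look-ahead (the overlap reaching into the following frame), and the figures of 480 samples per frame, 480 samples look-ahead for the symmetric window and 240 samples for the low delay window at 48 kHz are illustrative values that are not given on the slides.

```python
def algorithmic_delay_ms(frame_len, lookahead, sample_rate):
    """Delay in milliseconds from framing plus transform look-ahead (both in samples)."""
    return 1000.0 * (frame_len + lookahead) / sample_rate

# Assumed example figures at 48 kHz (not taken from the slides):
symmetric_window_delay = algorithmic_delay_ms(480, 480, 48000)   # full overlap into next frame
low_delay_window_delay = algorithmic_delay_ms(480, 240, 48000)   # reduced overlap into next frame
delay_saving = symmetric_window_delay - low_delay_window_delay
```

Under these assumptions the two results are 20 ms and 15 ms, which coincide with the lower ends of the delay ranges quoted for AAC-LD and AAC-ELD on the main features slide. Lower sampling rates scale both values up proportionally.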
ISO/MPEG AAC-ELD
Low delay Spectral Bandwidth Replication (SBR)
[Figure: AAC-ELD (2008) as AAC-LD + SBR. Amplitude versus frequency for the original spectrum, the AAC spectrum, and the SBR decoder output, which shows the warped copy of the AAC spectrum plus added spectral components]
Bitrate saving coding of high frequencies
AAC-ELD core codec is responsible for low band spectrum
Low delay SBR reconstructs high band
ISO/MPEG AAC-ELD
Low Delay MPEG Surround (MPS)
[Figure: AAC-ELD v2 (2011) as AAC-ELD + MPS]
Parametric multi-channel extension for mono or stereo audio codecs
Encoder derives relevant spatial parameters and pre-processed downmix
Decoder expands downmix to multi-channel by means of the spatial parameters
ISO/MPEG AAC-ELD
Dt. Telekom listening test June 2010 (1)
Independent subjective listening test of low delay communication codecs
MUSHRA listening test procedure
Speech and music content
Test result mono
Excellent quality (MUSHRA points >80) can be achieved with some codecs
at different bitrates
AAC-ELD at 32 kbps
Other codecs need 48 kbps (G.719, G.722.1-C, ...)
Or never achieve excellent quality (G.718, Silk/Skype, ...)
Test result stereo:
AAC-ELD has the best performance, offering excellent quality at
bitrates beginning at 48 kb/s
Source: ftp://ftp.3gpp.org/tsg_sa/WG4_CODEC/TSGS4_59/Docs/S4-100479.zip
ISO/MPEG AAC-ELD
Dt. Telekom listening test June 2010 (2)
Mono bitrates in kbit/s for excellent quality
Codec      Arbeit  Club  Fiedel  Jazzpiano  Rea  Speech  average
AAC-ELD    32      24    48      32         32   32      32
AAC-LD     48      48    32      48         48   48      48
CELT       64      48    64      48         48   48      54
G.719      32      32    48      48         32   48      40
Rows with empty cells (values listed in source order; column positions not recoverable):
G.718: 48, 32
G722.1-C: 48, 32, 24, 32
SILK: 40, 40, 40
G722.2, G.722, Speex: no values
Source: ftp://ftp.3gpp.org/tsg_sa/WG4_CODEC/TSGS4_59/Docs/S4-100479.zip
ISO/MPEG AAC-ELD
deployment
iOS
Natively in OS since 5.0, used for FaceTime
Android:
Natively in OS since Jelly Bean (4.1)
OSX
Natively in OS since Lion
Videoconferencing
De facto standard for high quality audio, TIP
EBU/ACIP broadcast contribution
Recommended codecs AAC-LC/LD

3GPP/SA4
Enhanced Voice Service (EVS) - Objectives
Next generation speech and audio codec for NGN services
5 Objectives:
1. Enhanced quality and coding efficiency for narrowband (NB) and
wideband (WB) speech services
2. Enhanced quality by the introduction of super-wideband (SWB)
speech
3. Enhanced quality for mixed content and music in conversational
applications (for example, in-call music)
4. Robustness to packet loss and delay jitter
5. Backward interoperability to the 3GPP AMR-WB codec
3GPP/SA4
Enhanced Voice Service (EVS) - Design constraints
Design constraints:
Sampling rates: 8-48 kHz
Channels: Mono, stereo
Bitrates: 7.2-128 kbps CBR; 5.9 kbps VBR
SWB 13.2 kbps
Delay: 32 ms
Complexity: 88 wMOPS (=2x AMR-WB)
Features: JBM, rate switching, PLC, VAD/DTX/CNG
3GPP/SA4
Enhanced Voice Service (EVS)
Time schedule:
Submission of qualification executable 11/2012
Qualification 03/2013
Submission of selection executable 11/2013
Selection 04/2014
SA4 finalization of characterization TR 8/2014
SA approval of Characterization TR 9/2014
3GPP/SA4
Enhanced Voice Service (EVS) - Fraunhofer candidate
[Figure: encoder and decoder block diagrams]
3GPP/SA4
Enhanced Voice Service (EVS) qualification rules
Columns: WID objective; requirements; test sets; weight

WID objective 1: Enhanced quality and coding efficiency for NB and WB speech services
  Requirements: NB and WB clean speech and speech under background noise quality requirements
    1. at rates lower than 13.2 kbps (gross rate) with and without DTX
    2. at rates of 13.2 kbps (gross rate) and higher [with DTX ( 24.4 kbps) and] without DTX
  Test sets:
    NB/WB clean and noisy speech (FER=0%) at gross bit rates <13.2 kbps with and without DTX and at 13.2 kbps with DTX: 20%
    NB/WB clean and noisy speech (FER=0%) at gross bit rates >=13.2 kbps without DTX: 10%

WID objective 2: Enhanced quality by the introduction of SWB speech
  Requirements: All SWB speech quality requirements with and without DTX; clean speech and speech under background noise
  Test sets:
    SWB clean speech and speech under background noise with and without DTX (FER=0%): 30%

WID objective 3: Enhanced quality on mixed content and music in conversational applications
  Requirements: Quality requirements for music and mixed content cases capturing the situations and use cases where use of the 3GPP audio codecs would not be possible
  Test sets:
    NB/WB mixed content and music (FER=0%): 10%
    SWB mixed content and music (FER=0%): 10%

WID objective 4: Robustness to packet loss and delay jitter
  Requirements: Quality requirements related to robustness to packet losses and delay jitter
  Test sets:
    NB/WB clean/noisy speech (FER values >0%, MTSI delay-jitter profiles) at gross bit rates <13.2 kbps with and without DTX and at 13.2 kbps with DTX: 5%
    NB/WB clean/noisy speech (FER values >0%, MTSI delay-jitter profiles) at gross bit rates >=13.2 kbps without DTX: 2.5%
    SWB clean/noisy speech (FER values >0%, MTSI delay-jitter profiles): 7.5%
    NB/WB (50%) and SWB (50%) mixed content and music (FER values >0%, MTSI delay-jitter profiles): 5%

WID objective 5: Backward interoperability to AMR-WB
  Requirements: Quality requirements for the AMR-WB interoperable EVS codec mode
  Test sets:
    WB clean speech, noisy speech, mixed content and music (all tested FER values >0%, all [...]): 0%

Annotations on the slide: "50% weight SWB FoM"; "These 4 items will count together in Rule 2a and Rule 2b"
Source: http://ftp.3gpp.org/tsg_sa/WG4_CODEC/TSGS4_70/Docs/S4-121249.zip
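The weights on this slide add up to 100%, which can be checked and turned into a single weighted score. The sketch below is only an illustration: the short labels, the per-test-set scores passed in and the function `weighted_score` are hypothetical, the split of the two 10% weights between the NB/WB and SWB mixed-content sets is read off the slide layout, and the actual 3GPP figure-of-merit rules are defined in the cited source document, not here.

```python
# Weights in percent as listed on the slide; labels are shortened for the example.
WEIGHTS_PERCENT = {
    "NB/WB speech, FER=0%, <13.2 kbps (and 13.2 kbps with DTX)": 20.0,
    "NB/WB speech, FER=0%, >=13.2 kbps without DTX": 10.0,
    "SWB speech, FER=0%": 30.0,
    "NB/WB mixed content and music, FER=0%": 10.0,   # assumed assignment of one 10% entry
    "SWB mixed content and music, FER=0%": 10.0,     # assumed assignment of the other
    "NB/WB speech, FER>0%, <13.2 kbps (and 13.2 kbps with DTX)": 5.0,
    "NB/WB speech, FER>0%, >=13.2 kbps without DTX": 2.5,
    "SWB speech, FER>0%": 7.5,
    "Mixed content and music, FER>0%": 5.0,
    "AMR-WB interoperable mode": 0.0,
}

def weighted_score(scores):
    """Combine per-test-set scores in [0, 1] into one weighted value in [0, 1]."""
    total = sum(WEIGHTS_PERCENT.values())
    return sum(w * scores[name] for name, w in WEIGHTS_PERCENT.items()) / total
```

For example, a hypothetical candidate that meets every test set except the SWB clean-speech set would score 0.7 in this simplified model, reflecting the 30% weight of that set.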
3GPP/SA4
Enhanced Voice Service (EVS) - Qualification test results
[Figure: FoM#1, FoM#2a and FoM#2b (axis 92% to 100%) per candidate, with a "Qualified for selection" marking]
Original PC Label / Blinded PC Label: e-NTT, a-Fra, i-QCI, b-Hua, j-Sam, f-DOC, k-Eri, c-Mot, m-ZTE, d-Nok, h-Pan, g-FTO, l-Voi; further labels u, z, n, q
Source: http://ftp.3gpp.org/tsg_sa/WG4_CODEC/TSGS4_72bis/Docs/S4-130292.zip
Further Information I
MPEG Home Page:
http://mpeg.chiariglione.org/
MPEG-AAC
ISO/IEC MPEG-2 Advanced Audio Coding; Bosi et al; JAES Volume 45
Issue 10 pp. 789-814; October 1997
HE-AACv2:
EBU TECHNICAL REVIEW January 2006: MPEG-4 HE-AACv2 audio
coding for today's digital media world; Stefan Meltzer and Gerald
Moser http://tech.ebu.ch/docs/techreview/trev_305-moser.pdf
MPEG Surround:
http://www.mpegsurround.com/index.html
xHE-AAC:
MPEG Unified Speech and Audio Coding - The ISO/MPEG Standard
for High-Efficiency Audio Coding of All Content Types; Neuendorf et
al; AES Convention:132 (April 2012) Paper Number:8654
Further Information II
Dialogue Enhancement:
http://tech.ebu.ch/docs/techreview/trev_2012-Q2_DialogueEnhancement_Fuchs.pdf
http://www.iis.fraunhofer.de/de/bf/amm/forschundentw/forschaudiomulti/dialogenhanc.html
MPEG-H 3D Audio:
http://mpeg.chiariglione.org/standards/mpeg-h/3d-audio
AAC-ELDv2
http://www.full-hd-voice.com
EVS
Project plan:
http://ftp.3gpp.org/tsg_sa/WG4_CODEC/TSGS4_73/Docs/S4-130521.zip
Design Constraints
http://ftp.3gpp.org/tsg_sa/WG4_CODEC/TSGS4_65/Docs/S4-110710.zip