The introduction of digital audio (Compact Disc) has boosted the audio quality compared to the old analog media. Since then, the term "digital" has become a synonym of quality. This quality is bought at the price of an enormous amount of data - over 700'000 bits per second (700 kbps) are necessary for encoding a mono signal.
If you reduce this data rate, the sound quality suffers, because you simply throw away some information. But this reduction is needed for low-bandwidth channels such as a telephone line, the internet or terrestrial broadcasts.
Standard digital sound is stored as PCM data (pulse code modulation): The audio signal is measured at a fixed rate ("sampling rate") with a given precision ("resolution"). On a CD, each stereo channel is measured 44'100 times a second with a precision of 16 bits, resulting in 705'600 bits per second per channel. The data rate can be decreased by lowering the sampling rate or the resolution, but it is best to lower both. For each bit rate there is a best compromise between reducing the sampling rate and reducing the resolution. These best combinations are called "optimal PCM" (oPCM).
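The PCM arithmetic can be checked with a few lines of Python. This is only a sketch: the function name `pcm_bit_rate` and the reduced settings (22'050 Hz, 8 bits) are assumptions chosen for illustration and are not oPCM combinations from this page.

```python
def pcm_bit_rate(sampling_rate_hz, resolution_bits, channels=1):
    """Bit rate of uncompressed PCM data in bits per second."""
    return sampling_rate_hz * resolution_bits * channels


# CD audio: 44'100 samples per second, 16 bits per sample
cd_mono = pcm_bit_rate(44_100, 16)                  # one channel
cd_stereo = pcm_bit_rate(44_100, 16, channels=2)    # both channels

# Illustrative reductions (hypothetical settings, not oPCM values)
half_rate = pcm_bit_rate(22_050, 16)    # halve the sampling rate only
half_both = pcm_bit_rate(22_050, 8)     # halve sampling rate and resolution

print(cd_mono, cd_stereo, half_rate, half_both)
```

Halving either parameter halves the bit rate, and halving both quarters it. Which of the two reductions is less audible at a given bit rate is what the oPCM combinations try to answer.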
But there are much more intelligent ways of decreasing the data rate with (hopefully!) less audible quality loss. Special audio formats ("codecs") were introduced for this purpose, for example ADPCM (adaptive differential pulse code modulation), MPEG audio ("layer I, II, III") and RealAudio.
How much quality do they gain at a given bit rate compared to uncompressed digital data? In the following table, various codecs are compared with oPCM. From the right to the left, the bit rate decreases (logarithmic scale). At the top of the table, the sound quality is best and decreases to the bottom. The quality parameter is defined by the reference oPCM files on the white diagonal fields.
The bigger the difference between the conventional oPCM files on the diagonal and the test items, the more "intelligent" the codec. The best codecs are on the blue fields.
Compare for yourself: Just click on the test items below.
Mono bit rate axis [kbps], from left to right: 16, 20, 24, 28, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384, 448, 550.

| quality [Q] | test items (left to right: increasing mono bit rate) |
|---|---|
| 9.1 | L2, PCM |
| 8.8 | L3, L3+, L3 |
| 8.6 | PCM |
| 8.3 | L3, L2 |
| 8.0 | PCM |
| 7.8 | L3+, L3+, RA3, L2, ADP |
| 7.6 | L3, L2, PCM |
| 7.3 | L3, L2* |
| 7.0 | L3, L2*, PCM |
| 6.8 | L3 |
| 6.6 | RA3, ADP, PCM |
| 6.3 | L2 |
| 6.0 | PCM |
| 5.8 | L3, L2*, L2 |
| 5.6 | ADP, PCM |
| 5.3 | RA3, ADP |
| 5.0 | PCM |
| 4.8 | |
| 4.6 | PCM |
| 4.3 | |
| 4.0 | PCM |

Sound data compression consists of two steps:
For example, RealAudio 3.0 at 40 kbps ("ISDN mono") sounds roughly like oPCM at 96 kbps (Q=6.6). This means a net compression ratio of 1:2.4 (96/40), or a Q gain of +1.3 (6.6 - 5.3). However, most of the total compression is quality loss, not net gain.
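The two figures in this example follow directly from the definition of Q as the binary logarithm of the oPCM bit rate per channel. A small sketch in Python (the helper name `q_value` is an assumption made for this illustration):

```python
import math


def q_value(rate_kbps):
    """Q of an oPCM reference: binary logarithm of its bit rate in kbps."""
    return math.log2(rate_kbps)


codec_rate_kbps = 40        # RealAudio 3.0, "ISDN mono"
equivalent_opcm_kbps = 96   # oPCM file that sounds roughly the same

net_ratio = equivalent_opcm_kbps / codec_rate_kbps
q_gain = q_value(equivalent_opcm_kbps) - q_value(codec_rate_kbps)

print(f"net compression 1:{net_ratio:.1f}, Q gain = {q_gain:+.1f}")
```

Because Q is logarithmic, the Q gain is simply the binary logarithm of the net compression ratio.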
There is one technical obstacle when generating oPCM files: Standard sound devices do not allow "odd" sampling rates and resolutions. I have therefore simulated the oPCM quality by upsampling to a higher rate and resolution.
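One possible way to simulate an "odd" resolution inside a standard 16-bit file is to round every sample to a multiple of a fixed divisor. The sketch below assumes this is what the divisor column (2^16/levels) of the oPCM table refers to; the function names are hypothetical, and the bandwidth reduction (low-pass filtering and resampling) is deliberately left out.

```python
import math

FULL_SCALE_LEVELS = 2 ** 16   # number of levels in a 16-bit container


def effective_resolution_bits(divisor):
    """Resolution in bits that remains when only every divisor-th level is used."""
    return math.log2(FULL_SCALE_LEVELS / divisor)


def requantize(samples, divisor):
    """Round signed 16-bit samples to the nearest multiple of divisor.

    The result still fits a 16-bit file, but carries fewer distinct levels.
    """
    out = []
    for s in samples:
        q = math.floor(s / divisor + 0.5) * divisor
        out.append(max(-32768, min(32767, q)))   # clip to the 16-bit range
    return out


print(requantize([0, 100, 128, 300, -300, 32767], 256))
print(round(effective_resolution_bits(400), 1))
```

With a divisor of 256 the samples keep 8 bits of resolution; a divisor of 400 gives about 7.4 bits, which is not available as a native sample format.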
In these first tests, the sample sound CEMBALO was used. It has a very broad frequency spectrum and reveals most compression artifacts (distortion, dynamic hiss). It is much more suitable for codec evaluation than narrow-band signals (e.g. string orchestra) that pose no severe problem for codecs. However, CEMBALO has a very limited dynamic range (no silent intervals) and no isolated high frequencies. The spectrum of CEMBALO is given below.

The following preliminary data were obtained in 3 listening tests with headphones (Sennheiser Reference II) and the test file CEMBALO. The tests were performed at 10, 25, 37.5, 50, 75, 100, 150, 200, 300, 400 kbps in steps of 1 bit. The plot shows the interpolated curve. The estimated standard deviation is ± 0.3 bit (1 sigma).
Precision measurements are currently being performed with other test sounds (including speech), higher resolution and more listeners.

    RESOLUTION [bit] 1 kHz   2 kHz   4 kHz   8 kHz   16 kHz
    16 + . . . . . . . : . . . : . . . : . . . : . . . + X
       . . . . . . . . : . . . : . . . : . . . : . . . + CD
    15 + . . . . . . . : . . . : . . . : . . . : . . . +
       . . . . . . . . : . . . : . . . : . . . : . . .O+
    14 + . . . . . . . : . . . : . . . : . . . : . . .O+
       . . . . . . . 80 dB . .:. . . .:. . . .:. . . O+.
    13 + + + + + + + +:+ + + +:+ + + +:+ + + +:+ + + O+.
       . . . . . . . : . . . : . . . : . . . : . . .O+ .
    12 + . . . . . . : . . . : . . . : . . . : . . .O+ .
       . . . . . . .:. . . .:. . . .:. . . .:. . . O+. .
    11 + . . . . . : . . . : . . . : . . . : . . .O+ . .
       . . . . . . : . . . : . . . : . . . : . . O + . .
    10 + . . . . .:. . . .:. . . .:. . . .:. . OO.+. . .
       . . . . . .:. . . .:. . . .:. . . .:. OO. .+. . .
     9 + . . . . : . . . : . . . : . . . :OOO. . + . . .
       . . . . .:. . . .:. .ISDN:. . .OOOO . . .+. . . .
     8 + . . . : . . . : . . . X OOOOO : . . . + . . . .
       . . . . : . . . : . OOOOOO. . . : . . . + . . . .
     7 + . . .:. . . .OOOOO. .:. . . .:. . . .+. . . . .
       . . . : . .OOOO . . . : . . . : . . . + . . . . .
     6 + . .:.OOOO .:. . . .:. . . .:. . . .+. . . . . .
       . . :OO . . : . . . : . . . : . . . + . . . . . .
     5 + .OO . . .:. . . .:. . . .:. . . .+. . . . . . .
       .OO . . . : . . . : . . . : . . . + . . . . . . .
     4 . . . . ::. . . ::. . . ::. . . + .16 kHz . . . .
       . . . ::. . . ::. . . ::. . . + . . . . . . . . .
     3 + . ::. . . ::. . . ::. . . + . . . . . . . . . .
       . ::. . . ::. . . ::. . . + . . . . . . . . . . .
     2 +:. + . +:. + . +:. + . + . + . + . + . + . + . +
       8       16      32      64      128     256     512 -> RATE [kbps]
       3.0     4.0     5.0     6.0     7.0     8.0     9.0 -> Q

    optimal quality at given bitrate   OOOOOOO  (oPCM) (± 0.3 bit)
    perception limit (loud music)      + + + +  (16 kHz / 80 dB)
    lines of constant audio bandwidth  :::::::  (1/2/4/8 kHz)

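Perceptional quality $Q$: $Q = \log_2(\mathrm{RATE} / \mathrm{kbps})$

| Q | RATE [kbps] | RES. [bit] | BW [kHz] | Divisor ($2^{16}$/levels) | |
|---|---|---|---|---|---|
| 4.0 | 16 | 6.0 | 1.333 | 1024 | |
| 4.6 | 24 | 6.4 | 1.888 | 800 | |
| 5.0 | 32 | 6.7 | 2.396 | 640 | |
| 5.6 | 48 | 7.4 | 3.263 | 400 | "telephone" |
| 6.0 | 64 | 7.7 | 4.168 | 320 | "AM/MW" |
| 6.6 | 96 | 8.0 | 6.000 | 256 | |
| 7.0 | 128 | 8.4 | 7.659 | 200 | |
| 7.6 | 192 | 9.4 | 10.261 | 100 | "Dolby B" |
| 8.0 | 256 | 10.0 | 12.800 | 64 | "FM/UKW" |
| 8.6 | 384 | 12.7 | 15.144 | 10 | "DSR" |
| 9.0 | 512 | 15.0 | 17.067 | 2 | |
| 9.1 | 550 | 16.0 | 17.188 | 1 | "CD" |

RATE: bit rate of oPCM data, per channel
RES.: sampling resolution
BW: bandwidth = 1/2 sampling rate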
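The columns of the oPCM table are related by simple formulas, which the following Python sketch re-checks. It assumes that the resolution is the binary logarithm of the number of levels (2^16 divided by the divisor) and that the bit rate is the resolution times the sampling rate (twice the bandwidth); both are my reading of the column legends, and the variable names are chosen for this example only.

```python
import math

# (Q, rate [kbps], resolution [bit], bandwidth [kHz], divisor), copied from the table
OPCM_TABLE = [
    (4.0, 16, 6.0, 1.333, 1024),
    (4.6, 24, 6.4, 1.888, 800),
    (5.0, 32, 6.7, 2.396, 640),
    (5.6, 48, 7.4, 3.263, 400),
    (6.0, 64, 7.7, 4.168, 320),
    (6.6, 96, 8.0, 6.000, 256),
    (7.0, 128, 8.4, 7.659, 200),
    (7.6, 192, 9.4, 10.261, 100),
    (8.0, 256, 10.0, 12.800, 64),
    (8.6, 384, 12.7, 15.144, 10),
    (9.0, 512, 15.0, 17.067, 2),
    (9.1, 550, 16.0, 17.188, 1),
]

for q, rate, res, bw, divisor in OPCM_TABLE:
    q_calc = math.log2(rate)                  # Q = log2(rate / kbps)
    res_calc = math.log2(2 ** 16 / divisor)   # bits per sample
    rate_calc = res_calc * 2 * bw             # bits per sample * samples per ms
    print(f"Q={q_calc:.2f}  res={res_calc:.2f} bit  rate={rate_calc:.1f} kbps  (table: {q}, {res}, {rate})")
```

Under these assumptions every row reproduces its tabulated values to within rounding, which suggests that the resolution column is the divisor-based value rounded to one decimal.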
Example: The sample sound A has a quality of Q=6.8 (or 68 dQ, deci-Q) if it sounds better than an oPCM file at Q=6.6 (96 kbps per channel), but worse than an oPCM file at Q=7.0 (128 kbps per channel). If A is a stereo sound, it must be compared with stereo oPCM sounds. The Q parameter is defined as the binary logarithm of the oPCM bit rate per channel. You must indicate whether the Q value refers to a mono or a stereo sound.
The shortcomings of this method (used by ISO) are obvious: The rating is rather subjective and the only reference item is the perfect CD quality. That is why the listener only knows what the rating "5.0" means. The lower ratings are strongly a matter of taste because they are not strictly defined. As a consequence, the results vary from listener to listener.
The oPCM method overcomes these problems: The listener knows exactly what a rating of e.g. Q=6.7 means: The test item sounds better than the Q=6.6 oPCM reference file, but worse than the Q=6.8 oPCM file.
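The bracketing idea can be written down as a small procedure. This is a sketch under assumptions: the function name, the callback `sounds_better_than` (standing in for the listener's judgement) and the choice of the midpoint between the two bracketing references as the rating are mine; the midpoint rule merely agrees with the two examples given on this page (6.6/7.0 gives 6.8, 6.6/6.8 gives 6.7).

```python
def bracket_rating(reference_qs, sounds_better_than):
    """Rate a test item by bracketing it between oPCM reference files.

    reference_qs: Q values of the available oPCM reference files.
    sounds_better_than: callable taking a reference Q and returning True if
        the listener judges the test item to sound better than that reference.
    Returns the midpoint of the bracketing pair, or None if the item lies
    outside the range covered by the references.
    """
    lower = None
    upper = None
    for q in sorted(reference_qs):
        if sounds_better_than(q):
            lower = q          # item still beats this reference
        else:
            upper = q          # first reference that beats the item
            break
    if lower is None or upper is None:
        return None
    return round((lower + upper) / 2, 1)


# Hypothetical listener: the item beats every reference up to Q=6.6
print(bracket_rating([6.0, 6.6, 7.0, 7.6], lambda q: q <= 6.6))
```

The finer the spacing of the reference files, the finer the rating that can be assigned.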
As an illustration, think about "sound intensity evaluation". A fictitious "intensity scale" could be: "5.0 = as loud as reference item", "4.0 = a little bit weaker", "3.0 = weaker", "2.0 = much weaker", "1.0 = very much weaker". It is obvious that this fictitious scale would produce very subjective results. The objective decibel (dB) scale would be much better. The oPCM Q-scale for quality evaluation is like the decibel scale for intensity evaluation.
Thanks to: Nils Ehlert (Germany) and Bernhard Weber (Germany) for listening tests; Ross Lewis (New Zealand) for some Layer III streams.
Stefan Scheller, webmaster