ABX Testing on Audio Codecs
I have recently been reading about new video and audio codecs, which made me wonder, “what is transparent for a modern audio codec?” I had previously done testing back around 2008, where I found that MP3 required ~192kbps to be transparent, Ogg Vorbis required ~224kbps, and WMA required ~180kbps. I decided it was time for a new test, with the newest contenders.
I performed ABX testing on CD-quality audio. I randomly picked tracks and cut them into 10-second chunks (at random offsets within each track). From these chunks, I selected the first ones I saw for each of these genres: synthesized video game music, classical, metal, pop, and alternative. Each sample was ABX tested after encoding to MP3, Vorbis, AAC, and Opus at different quality settings. Each ABX test consisted of 16 rounds.
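As a rough illustration of the chunking step, here is a minimal sketch of cutting a 10-second chunk at a random offset from a WAV file with Python's standard `wave` module. The function names and file paths are hypothetical; this is not the tooling I actually used.

```python
import random
import wave

CHUNK_SECONDS = 10  # chunk length used in this experiment


def pick_chunk_bounds(total_frames, frame_rate, seconds=CHUNK_SECONDS, rng=random):
    """Return (start, end) frame indices of a randomly placed chunk."""
    chunk_frames = seconds * frame_rate
    if total_frames < chunk_frames:
        raise ValueError("track is shorter than the requested chunk")
    # Any start position that leaves room for a full chunk is allowed.
    start = rng.randrange(total_frames - chunk_frames + 1)
    return start, start + chunk_frames


def extract_chunk(src_path, dst_path, seconds=CHUNK_SECONDS, rng=random):
    """Copy one randomly placed chunk of src_path (WAV) into dst_path (WAV)."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        start, end = pick_chunk_bounds(
            src.getnframes(), src.getframerate(), seconds, rng
        )
        src.setpos(start)
        frames = src.readframes(end - start)
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)    # same channels, sample width, and rate
        dst.writeframes(frames)  # the frame count in the header is fixed up on close
    return start, end
```

Passing a seeded `random.Random` instance as `rng` makes the chosen offset reproducible.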
For the first analysis, I looked at the minimum bitrate needed to have transparency across all the samples. I considered an encoding to be transparent if I misidentified it more than 3 times out of 16, i.e. if I got 12 or fewer rounds correct.
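To put the 3-out-of-16 cutoff in context, the sketch below computes how likely a given score is under pure guessing, using a one-sided binomial test with a 50% chance per round. The function names are my own, and the binomial framing is an assumption about how to interpret the cutoff rather than something I computed at the time.

```python
from math import comb


def guess_p_value(correct, trials=16):
    """Probability of getting at least `correct` rounds right by coin-flip guessing."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


def is_transparent(correct, trials=16, max_misses=3):
    """The rule used in this post: transparent if there are more than 3 misses."""
    return trials - correct > max_misses


for score in (16, 13, 12, 8):
    print(score, round(guess_p_value(score), 4), is_transparent(score))
```

Under these assumptions, 13/16 (the lowest score that still counts as a detectable difference) has about a 1.1% chance of occurring by guessing alone, while 12/16 has about a 3.8% chance.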
There are a few points of note. First, my ears noticeably became fatigued as the experiment went on, as it involved listening to the same 5 tracks a few hundred times each. As such, I believe these numbers to be lower than the actual limits. I made some notes as I went along where I thought there was some sort of difference between the two. Here’s the same graph, with a “subjective” feel added of where I felt the music became transparent.
Finally, I also made notes where I felt the audio quality, although noticeably different, was “good enough” that on a casual listen, I would not think it was compressed.
A couple of interesting notes:
1.) Opus performed incredibly well.
2.) AAC and Vorbis had a large gap between perceived quality and detectable quality. I wonder if this effect is due to listener fatigue.
3.) I previously could tell the difference up to 192kbps MP3, but was only able to detect up to 150kbps on this run. I wonder if the difference is due to improvements in the encoder, loss of hearing fidelity, or simply the different tracks used between the tests.
4.) The transition between transparent and non-transparent was clear for MP3, but quite muddled for the other codecs; the newer ones do a better job of degrading gracefully.
5.) AAC required significantly more bandwidth for transparency than expected.
The remainder of this post contains details of the experiment setup and the raw data.
Experiment Setup
- DAC/AMP: Asus Xonar Essence ST
- Headphones: Beyerdynamic DT990
- ABX software: Foobar2000
- Opus encoder: opusenc opus-tools 0.1.9 (using libopus 1.1)
- AAC encoder: WinAMP Fraunhofer encoder, fhgaacenc version 20120624 by tmkk, modified by Case 20151024
- Vorbis encoder: OggEnc v2.88 (libvorbis 1.3.5)
- MP3 encoder: LAME v3.99.5
- Samples:
  - Synthesized video game music: “The Wounded Shall Advance into the Light”, Xenogears original soundtrack
  - Classical: “Sequenz - 5. Confutatis”, Mozart’s Requiem
  - Metal: “When Rides the Scion of Storms”, Bal-sagoth
  - Pop: “Lay All Your Love On Me”, Abba
  - Alternative: “Twisting”, They Might be Giants
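The quality settings in the raw data (`V9`, `q-1`, `VBR1`, `bitrate 32`) map onto each encoder's quality option. The sketch below builds the corresponding command lines. The executable names, the helper function, and the file names are assumptions for illustration, and the exact invocations I used may have differed.

```python
def encode_command(codec, quality, src, dst):
    """Build an encoder command line (argument list) for one codec/quality pair.

    Executable names are assumptions; adjust them to match the local install.
    """
    if codec == "mp3":     # LAME VBR quality, e.g. V5 -> -V 5
        return ["lame", "-V", str(quality), src, dst]
    if codec == "vorbis":  # oggenc quality, e.g. q3 -> -q 3
        return ["oggenc", "-q", str(quality), src, "-o", dst]
    if codec == "aac":     # fhgaacenc VBR preset, e.g. VBR3 -> --vbr 3
        return ["fhgaacenc", "--vbr", str(quality), src, dst]
    if codec == "opus":    # opusenc target bitrate in kbps
        return ["opusenc", "--bitrate", str(quality), src, dst]
    raise ValueError(f"unknown codec: {codec}")


print(encode_command("opus", 64, "sample.wav", "sample_64.opus"))
```

Each returned list can be handed to `subprocess.run(cmd, check=True)` to perform the actual encode.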
Raw Data and Notes
For each codec, the number correctly identified in the ABX test is provided. Note that all of these numbers are out of 16 trials. E.g., 16/16 means I was always able to distinguish the encoded sample from the original. If I determined that I could not tell the difference, I sometimes ended the test early. These entries are marked with the word “transparent” rather than the number of trials that identified the music correctly. A bit rate is provided in parentheses if it was calculated for that sample.
Sometimes, the words “good enough” appear next to a trial – this indicates that I thought the encoding, although possibly not transparent, was free from compression artifacts. If “good enough” is not listed, it means that encoding artifacts were present until the point of transparency. Also, when I thought an encoding reached transparency, I indicate it by writing “feels transparent” next to the entry. If this phrase is absent from a sample/codec pair, it means that it felt transparent at the actual point of transparency.
The order of the codecs listed represents the order of the codecs tested for that sample. The order was randomized for each sample. Note that the samples were done in the order listed on this page, so the entire set of data can be read chronologically.
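For anyone who wants to re-analyze the numbers, here is a minimal sketch of how one sample/codec series could be represented and reduced to a single transparency bitrate. The representation (a list of `(kbps, correct)` pairs, with `None` for entries marked “transparent”) and the rule of taking the lowest bitrate from which every higher setting is also transparent are my assumptions here, not necessarily how the graphs were produced.

```python
def min_transparent_bitrate(results, trials=16, max_misses=3):
    """Lowest bitrate at and above which every tested setting is transparent.

    results: iterable of (kbps, correct) pairs; correct is None when the
    entry was simply marked "transparent" (test ended early).
    Returns None if the highest tested bitrate is still not transparent.
    """
    threshold = None
    for kbps, correct in sorted(results, key=lambda r: r[0]):
        transparent = correct is None or trials - correct > max_misses
        if transparent:
            if threshold is None:
                threshold = kbps  # start of a transparent run
        else:
            threshold = None      # a detectable setting resets the run
    return threshold


# Opus on the metal sample, taken from the data below.
metal_opus = [(34, 16), (50, 15), (65, 15), (80, 13), (95, 8)]
print(min_transparent_bitrate(metal_opus))  # 95
```

Settings listed without a measured bitrate would need to be dropped or assigned a nominal bitrate before using this.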
Synthesized Video Game Music
- Vorbis
  - q-1 (41kbps): transparent
- LAME
  - V9 (77kbps): transparent
- AAC
  - VBR1: 16/16
  - VBR2 (70kbps): 16/16
  - VBR3 (105kbps): transparent
- Opus
  - bitrate 32: 16/16
  - bitrate 48 (55kbps): 16/16
  - bitrate 64 (75kbps): transparent
Classical
- Vorbis
  - q1: 16/16
  - q2 (87kbps): 16/16 (“good enough” as differences are very small, have to listen carefully to pick them up)
  - q3 (104kbps): 14/16
  - q4 (113kbps): 12/16
  - q5 (145kbps): 8/16
- LAME
  - V6: 16/16
  - V5: 14/16
  - V4 (127kbps): 16/16 (“good enough”, differences are small)
  - V3 (150kbps): transparent
- AAC
  - VBR1 (31kbps): 16/16
  - VBR2 (58kbps): 15/16
  - VBR3 (90kbps): 14/16
  - VBR4 (104kbps): 13/16 (“good enough”, differences are very small)
  - VBR5 (155kbps): 4/16
- Opus
  - bitrate 48 (48kbps): 16/16
  - bitrate 64 (61kbps): 9/16
  - bitrate 80 (77kbps): 8/16
  - bitrate 96 (93kbps): 11/16
Metal
- Opus
  - bitrate 32 (34kbps): 16/16
  - bitrate 48 (50kbps): 15/16 (“good enough”)
  - bitrate 64 (65kbps): 15/16
  - bitrate 80 (80kbps): 13/16
  - bitrate 96 (95kbps): 8/16
- AAC
  - VBR1 (43kbps): 16/16
  - VBR2 (77kbps): 15/16
  - VBR3 (116kbps): 14/16
  - VBR4 (147kbps): 14/16 (“good enough”)
  - VBR5 (217kbps): 12/16
  - VBR6 (266kbps): 7/16 (“feels transparent”)
- MP3
  - V8 (86kbps): 16/16
  - V7 (104kbps): 8/16 (note: ears might be getting fatigued)
  - V6 (121kbps): 12/16 (“good enough”)
  - V5 (135kbps): 6/16
  - V4 (154kbps): 7/16 (“feels transparent”)
- Vorbis
  - q0 (68kbps): 16/16
  - q1 (82kbps): 10/16
  - q2 (99kbps): 11/16 (“good enough”)
  - q3 (115kbps): 10/16
  - q4 (127kbps): 9/16 (note: does not feel transparent, ears are tired; did a second round at this setting with 11/16)
  - q5 (154kbps): transparent
Pop
- Vorbis: Note the unusual dip at the q1 setting to 9/16. I believed I had listener fatigue, and re-ran the test twice more, yielding 12/16 and 8/16. However, I am only listing the original round.
  - q-1 (53kbps): 10/16
  - q0 (69kbps): 12/16
  - q1 (83kbps): 9/16
  - q2 (99kbps): 12/16
  - q3 (115kbps): 10/16 (“good enough”)
  - q4 (130kbps): 9/16 (“feels transparent”)
- AAC
  - VBR1 (42kbps): 16/16
  - VBR2 (74kbps): 8/16
  - VBR3 (112kbps): 10/16 (“good enough”)
  - VBR4 (150kbps): 8/16 (“feels transparent”)
- MP3
  - V8 (91kbps): 16/16
  - V7 (109kbps): 8/16 (“good enough”, “feels transparent”)
- Opus
  - bitrate 32 (34kbps): 16/16
  - bitrate 48 (50kbps): 9/16 (“good enough”)
  - bitrate 64 (66kbps): 8/16
  - bitrate 80 (82kbps): transparent
Alternative
- Opus
  - bitrate 32 (34kbps): 16/16
  - bitrate 48 (51kbps): 6/16 (“good enough”)
  - bitrate 64 (67kbps): transparent
- Vorbis
  - q-1 (50kbps): 13/16
  - q0 (69kbps): 9/16
  - q1 (83kbps): 9/16 (“good enough”)
  - q2 (100kbps): transparent
- AAC
  - VBR1 (43kbps): 13/16
  - VBR2 (77kbps): 8/16 (“good enough”)
  - VBR3 (120kbps): cannot tell difference, but does not feel transparent
  - VBR4 (162kbps): transparent
- MP3
  - V8 (93kbps): 16/16
  - V7 (119kbps): 7/16