ABX Testing on Audio Codecs

I have recently been reading about new video and audio codecs, which made me wonder, “at what bit rate is a modern audio codec transparent?” I last tested this around 2008, when I found that MP3 required ~192kbps to be transparent, Ogg Vorbis required ~224kbps, and WMA required ~180kbps. I decided it was time for a new test, with the newest contenders.

I performed ABX testing on CD quality audio. I picked tracks at random and cut 10-second chunks from them (at random offsets within each track). From these chunks, I selected the first ones I saw that corresponded to the genres: synthesized video game music, classical, metal, pop, and alternative. Each sample was ABX tested against encodings in MP3, Vorbis, AAC, and Opus at different quality settings. Each ABX test consisted of 16 rounds.
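As a rough illustration of the "random offset" step, here is a minimal sketch of picking a 10-second window inside a track. The function name `random_chunk` and its parameters are made up for this example; the actual tooling used to cut the samples is not described here, so treat this only as one plausible way to do it.

```python
import random


def random_chunk(track_seconds, chunk_seconds=10.0, rng=random):
    """Return (start, end) in seconds for a chunk at a random offset.

    Hypothetical helper: assumes the track length is already known and
    that any offset that keeps the chunk inside the track is acceptable.
    """
    if track_seconds < chunk_seconds:
        raise ValueError("track is shorter than the requested chunk")
    # Latest possible start that still leaves room for a full chunk.
    start = rng.uniform(0.0, track_seconds - chunk_seconds)
    return start, start + chunk_seconds


if __name__ == "__main__":
    # Example: a 3-minute track; a seeded generator keeps the pick repeatable.
    start, end = random_chunk(180.0, rng=random.Random(2008))
    print(f"cut from {start:.2f}s to {end:.2f}s")
```

Passing a seeded `random.Random` instance makes the chosen offsets reproducible, which would help if the same samples ever needed to be cut again.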

For the first analysis, I looked at the minimum bitrate needed to have transparency across all the samples. I considered an encoding transparent if I misidentified the sample more than 3 times out of 16, that is, if I got 12 or fewer rounds correct.
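To put the 16-round threshold in perspective, here is a small sketch that applies the rule above and computes how likely a given score would be from pure guessing (a one-sided binomial tail with a 50% chance per round). The function names are my own for illustration; this was not part of the test procedure itself.

```python
from math import comb


def p_value_by_guessing(correct, trials=16):
    """Chance of getting at least `correct` rounds right by coin-flip guessing."""
    favorable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favorable / 2 ** trials


def is_transparent(correct, trials=16, max_misses=3):
    """Rule from the text: transparent if more than `max_misses` rounds were wrong."""
    return (trials - correct) > max_misses


if __name__ == "__main__":
    for correct in (12, 13, 16):
        p = p_value_by_guessing(correct)
        print(f"{correct}/16 correct: transparent={is_transparent(correct)}, "
              f"p(guessing)={p:.4f}")
```

With this rule, the lowest score that counts as "not transparent" is 13/16, which a pure guesser would reach or beat only about 1% of the time (697 out of 65536 possible outcomes).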

Bit Rate to Achieve Transparency over all Samples

There are a few points of note. First, my ears became noticeably fatigued as the experiment went on, as it involved listening to the same 5 tracks a few hundred times each. As such, I believe these numbers to be lower than the actual limits. I made some notes as I went along where I thought there was some sort of difference between the original and the encoded sample. Here’s the same graph, with a “subjective” feel added of where I felt the music became transparent.

Bit Rate to Achieve Transparency over all Samples, with Subjective Score

Finally, I also made notes where I felt the audio quality, although noticeably different, was “good enough” that on a casual listen, I would not think it was compressed.

Subjective Bit Rate to Achieve Transparency over all Samples on Casual Listen

A couple of interesting notes:

1.) Opus performed incredibly well.
2.) AAC and Vorbis had a large gap between perceived quality and detectable quality. I wonder if this effect is due to listener fatigue.
3.) I previously could tell the difference up to 192kbps MP3, but was only able to detect up to 150kbps on this run. I wonder if the difference is due to improvements in the encoder, loss of hearing fidelity, or simply different soundtracks used between the tests.
4.) The transition between transparent and non-transparent was clear for MP3, but quite muddled for the other codecs; the newer ones do a better job of gracefully degrading.
5.) AAC required significantly more bandwidth for transparency than expected.

The remainder of this post contains details of the experiment setup and the raw data.

Experiment Setup

Raw Data and Notes

For each codec, the number of rounds correctly identified in the ABX test is provided. Note that all of these numbers are out of 16 trials. E.g., 16/16 means I was always able to distinguish the encoded sample from the original. If I determined that I could not tell the difference, I sometimes ended the test early. These entries are marked with the word “transparent” rather than the number of trials that identified the music correctly. A bit rate is provided in parentheses if it was calculated for that sample.

Sometimes, the words “good enough” appear next to a trial. This indicates that I thought the encoding, although possibly not transparent, was free from compression artifacts. If “good enough” is not listed, it means that encoding artifacts were present until the point of transparency. Also, when I thought an encoding reached transparency, I indicate it by writing “feels transparent” next to the entry. If this phrase is absent from a sample/codec pair, it means that it felt transparent at the actual point of transparency.

The order of the codecs listed represents the order of the codecs tested for that sample. The order was randomized for each sample. Note that the samples were done in the order listed on this page, so the entire set of data can be read chronologically.

Synthesized Video Game Music

Classical

Metal

Pop

Alternative