128 kb Multiformat listening test...
2003-12-13 14:04:41
Still interested in the performance of lossy encoders at mid/low bitrates with my favorite music genre, I've decided to launch a new personal listening test, based on classical music ONLY.

My previous test was flawed due to some mistakes:
- a bad (old) Nero encoder was used
- faac was used without any knowledge of the encoder, or of its average output bitrate (excessively low last time)
- Vorbis GT3b1 was used

This time I tried to be more careful, and more organised too. I also decided to refocus the test on hardware playback possibilities. MPC is out (more than interesting at this bitrate, but probably rarely used at the --radio profile), and so is WMA PRO (more than good with classical at this bitrate: clear champion last time!). Therefore, I decided to test four lossy formats: mp3, wma, vorbis and aac/mp4, at a bitrate that is friendly to flash memory players.

CHOICE OF ENCODERS

The choice of format is easy, but the choice of encoder is really problematic, and the choice of setting is ultimately open to a lot of criticism. I'll try to explain the reasons for the choices I made for this test.

• MP3 : I took lame, because its output quality is good, it is still being improved, and it seems to be the best. I used the latest stable version (3.93.1).
• WMA : I took the latest encoder, WMA9, because I guess that the latest is the best.
• Vorbis : I had the choice between the official encoder (1.01) and the Garf tuned encoder, based on the 1.00 library and tuned for -b5 [180 kbps on average, less for classical] and above. I preferred the official one: newer, and with no *possible* contraindication for the setting I used. [Note: GT3b2 wasn't released when I performed the test; but at -q4, 1.01 and GT3b1 should be identical, or very close]
• AAC : Here, I had the choice between plenty of encoders. Nero, QuickTime/iTunes, Faac, Compaact, Winamp, PsyTEL, Fraunhofer! This abundance is very exciting, but for testing, it is problematic. All encoders have some of the following qualities: free, open-source, CLI, fast, good reputation, fast evolution...
so it's difficult to rule any of them out. Here are the choices I made:
- QuickTime AAC : very good at 128 kbps (best AAC encoder in my previous test, and best AAC CBR encoder in Roberto's public test). I used the iTunes software, which shares the same engine but has a handier interface and a slightly inferior quality (in theory).
- Nero AAC : very complete, with VBR abilities and a High Efficiency profile, usable with a very large number of input formats and software, old enough to be considered mature and still in heavy development.
- Faac AAC : CLI encoder (and therefore easy to use), fast, and of course an open-source project. Its quality doesn't have a good reputation, but progress is fast, and the encoder seems to be mainly tuned for mid bitrates. Last but not least, I didn't help faac's reputation with my previous test, and I had to make amends.

CHOICE OF SETTINGS

The following rule: VBR>ABR>CBR is a fair basis for the choice of settings. Fair, but a slight overgeneralization...

• MP3 : VBR encoding at ~128 kbps is possible, but according to common opinion, not recommended AT ALL. At this bitrate, ABR superiority is an accepted fact, especially with the --alt-preset tuning. ABR is reliable on size, but with classical, the output bitrate is *systematically* 6-7 kbps lower than the requested value. Therefore, I made the small correction myself, exactly the same one I usually make for my portable player, and used the --alt-preset 134 setting. It works: for the 15 entire tracks I've selected for this test, the average bitrate was slightly above 127 kbps.
• WMA9 : VBR is possible with WMA v.9, but limited to 6 fixed presets. It wasn't possible to use VBR and at the same time stay close to 128 kbps. Therefore, I used ABR 128 (called VBR 2-pass) rather than CBR.
• Vorbis : I was tempted to use the same kind of correction as for lame ABR. -b 4.25 instead of -b4.00 was even used for Roberto's test, following a public request for a bitrate calculation.
But there were some changes with the 1.01 Vorbis encoder, and this correction value may not work anymore. I decided to check before beginning the test, and I encoded all tracks with the 4.00 and 4.25 settings: the first gave slightly less than 128 kbps, whereas the second gave me an excessive average bitrate. Still acceptable, but not really justified. Therefore, I used the most immediate and popular setting for Vorbis 128 output files: -b4.
• iTunes AAC : no problem here. This encoder is CBR only, with no tweaking possible.
• faac AAC : I was discouraged from using VBR. First, because there are no real presets or explicit bitrate scale for faac. Second, because I would have had to perform some preliminary tests before finding a setting close to ~128 kbps for the 15 tracks I've used during this test, and this setting might not work with another sample suite... I didn't use VBR, following some recommendations in THIS DISCUSSION. There's a new ABR mode in faac, and it seems to be more reliable, if not the best setting for classical material. I used the -a 64 setting, and manually fixed a lowpass at 17000 Hz (16000 is sometimes shocking in direct ABX comparison, and 18000 may surpass my hearing abilities).
• Nero AAC : Like WMA, the Nero encoder has a VBR mode with a limited number of steps. As with WMA, it's not always possible to find a preset that will be close to the targeted bitrate. But contrary to WMA, Nero AAC is popular: therefore, you can't use CBR without being criticized by some people for not using VBR (considered as superior); you can't use a VBR setting that produces lower-bitrate files without being flamed for this difference (at least if Nero is badly ranked at the end); and of course, using a large VBR setting will make all Nero fans happy, but certainly not (and for good reason) Vorbis and Lame users (of course Nero AAC would then appear to be better than the other formats)... Fortunately, this WILL NOT happen!
A VBR setting of Nero produces, on the 15 full tracks (1h45'), a bitrate very, very close to 128 kbps (127.6). The situation would be ideal, if there weren't one last problem: the profile name was... "internet", and not the expected "streaming" one. I suppose that the "internet" profile wouldn't produce encodings close to 128 kbps with other musical material (especially loud recordings, where dynamics are heavily compressed at mastering), and I suppose that some people won't be happy if I compare Nero -internet to the other competitors. As a solution to the problem, I couldn't decently opt for the -streaming profile: with my 15 tracks, the average bitrate is 152 kbps (+19% compared to 128 kbps, but only -5% compared to 160 kbps), and last but not least, other people listening to non-classical music noticed an average bitrate above 128 kbps with the -streaming profile and the newer Nero encoder. Therefore, in order to avoid all kinds of complaints, I decided to include in the same test Nero CBR 128 and Nero VBR internet.

P.S.: ABR 128 produces an audio stream identical to CBR 128 kbps.

CHOICE OF SAMPLES

I decided to drop the samples I used last time, and to compose a more balanced suite. I based the choice on a sample CD bundled with a special issue of the magazine "Diapason", still available in France and sold at newspaper kiosks. This CD includes 14 tracks from the best recordings (from a technical and artistic point of view) released during the last year. I took 11 tracks from this disc (I left out three [one piano, and two baroque instrumental pieces], redundant with other tracks), and filled some artistic gaps with four more samples of my own (choral, full-orchestra, male voice, and electronic).
Therefore, the set is divided in the following way:
- medieval: 1 track
- baroque: 8 tracks
- classical/romantic: 5 tracks
- contemporary: 1 track

or

- Orchestral: 3 tracks (2 modern orchestras & one baroque instrumental ensemble)
- Lyrical: 2 tracks (opera: one male and one female voice)
- Choral: 2 tracks (one full choir and one female plainchant)
- Solo instrument: 4 tracks (piano / harpsichord / lute / organ)
- Chamber music: 3 tracks (flute & harpsichord / violin and pianoforte / cello and continuo)
- Electronic: 1 track

Total length is 105 minutes. I used 30-second samples for each track. I generally selected the first 30 seconds (after removing any silence at the beginning); for some tracks, I selected other, more interesting parts (especially for lyrical moments, rarely present during the first 30 seconds of a track).

BITRATE AND SIZE

Mixing CBR, ABR and VBR encodings in the same test generally gives rise to endless discussions. Here are my opinions:

1/ forcing each short sample to xxx kbps is IMO completely stupid, unless, for some reason I can't understand, people usually listen to limited parts of their favorite songs. Personally, I generally prefer listening to full tracks or complete compositions... Therefore, I usually encode full tracks, and if I care about the average bitrate produced by an encoder/setting, I look at the bitrate of the full track (or album), and not at the bitrate of short and arbitrarily selected parts of these tracks.

2/ When I create an MP3 CD-R compilation, or fill my USB flash MP3 player, I never force the encoder to match a defined value. I generally need and ask for an approximate bitrate, and therefore I don't worry:
- about inner variations (if a 150 kbps track is compensated by many others at 125 kbps, that suits me)
- about slight deviations at the end. If the final bitrate is 124 or 131 kbps, I won't waste my time encoding all the files again...
If the bitrate is slightly too high, and if I can't store the last file on my player or media, I'll probably re-encode some files I don't really like, or simply remove another one.

=> in other words, for practical reasons, I look at the bitrate of an ENSEMBLE OF FILES, and certainly not at the size reached by a small portion of a single file. Testing is useless if it doesn't match daily behaviour. Therefore, I accepted for this test the same kind of deviation I would accept when filling a player/media.

Because I don't have in my possession the full album for most tracks used in this test, I can only calculate the bitrate for each track, and not for the full album. By doing this, and with iTunes encodings** considered as 128.00 kbps, I had:

Encoder              Size                          Bitrate        vs. iTunes
AAC FAAC             79.2 MB (83 093 385 bytes)    127.27 kbps    [-0.57%]
AAC iTunes           79.6 MB (83 567 161 bytes)    128.00 kbps
AAC Nero CBR         79.3 MB (83 154 442 bytes)    127.37 kbps    [-0.49%]
AAC Nero internet    79.4 MB (83 297 704 bytes)    127.59 kbps    [-0.32%]
AAC Nero streaming   94.6 MB (99 290 088 bytes)    152.08 kbps    [+18.81%]
MP3 ABR 134          78.6 MB (82 451 163 bytes)    126.29 kbps    [-1.34%]
VORBIS b4            76.6 MB (80 331 707 bytes)    123.04 kbps    [-3.87%]
WMA ABR 128          79.2 MB (83 118 871 bytes)    127.31 kbps    [-0.54%]

** the layout of the iTunes MP4 encodings was first optimized with foobar, in order to save some space and to obtain a real 128 kbps bitrate.

So, if we exclude Nero "streaming", the most dramatic deviation is that of Vorbis, at -3.9% compared to iTunes AAC. Quality probably won't be audibly affected by this small difference. On the other hand, this small difference is the price to pay for the VBR encoding mode, and its superior quality... People complaining about this difference should be happy to note that for the 30-second samples I've tested, the Vorbis average bitrate is 127.7 kbps...

Sorry for being long.
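For anyone who wants to check the figures, here is a minimal Python sketch of how the bitrate and deviation columns of the table above can be derived from the byte counts alone, taking the iTunes encodings as exactly 128.00 kbps as stated. The names `SIZES` and `relative_bitrates` are only illustrative (they don't come from any encoder or tool), and the approach assumes that all encodings cover the same total duration, so that bitrate is proportional to size.

```python
# Total size in bytes of the 15 encoded tracks, copied from the table above.
SIZES = {
    "AAC FAAC": 83_093_385,
    "AAC iTunes": 83_567_161,
    "AAC Nero CBR": 83_154_442,
    "AAC Nero internet": 83_297_704,
    "AAC Nero streaming": 99_290_088,
    "MP3 ABR 134": 82_451_163,
    "VORBIS b4": 80_331_707,
    "WMA ABR 128": 83_118_871,
}


def relative_bitrates(sizes, reference="AAC iTunes", reference_kbps=128.0):
    """Return {encoder: (bitrate in kbps, deviation in %)}.

    Assumption: every encoder processed the same audio, so the bitrate
    scales linearly with the file size relative to the reference encoder.
    """
    ref_size = sizes[reference]
    result = {}
    for name, size in sizes.items():
        ratio = size / ref_size
        kbps = round(ratio * reference_kbps, 2)
        deviation = round((ratio - 1.0) * 100.0, 2)
        result[name] = (kbps, deviation)
    return result


if __name__ == "__main__":
    for name, (kbps, deviation) in relative_bitrates(SIZES).items():
        print(f"{name:<20}{kbps:8.2f} kbps  [{deviation:+.2f}%]")
```

With a different reference (for example a full album encoded in CBR), only the `reference` and `reference_kbps` arguments would need to change.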
I just hope that these explanations will prevent redundant and really boring discussions about "apples and oranges", or other "surprises" such as bitrate variations of... VBR encoders.

Before I give my results, some short but important warnings:
• the following results ARE MINE, depending on my subjectivity and my APPRECIATION of different artifacts. I can't bear some of them, but that doesn't imply that everybody will share my disgust.
• the following results, and ranking, apply to the 128 kbps area, not 112 kbps or 150 kbps (and of course not 64 or 220 kbps). The "winner" of this test may be crappy at 192 kbps. Carl Lewis was a legendary sprinter, but a poor marathon runner.
• the significance of these results is limited to classical, and classical only. Classical means wide dynamic range (up to 70 dB), quiet volume, subtle instruments, non-boomy music, etc... Don't extrapolate these results to other genres: Carl Lewis wasn't good at soccer or Kung-Fu.
• results might be different with other classical samples. 15 samples may give an approximate idea of the encoders' performance, that's all. The results are leads for further tests rather than a final answer, even and especially for me!

RESULTS, and complete BITRATE TABLE, are available on the following webpage: http://membres.lycos.fr/guruboolez/AUDIO/t...al_results.html

DESCRIPTION of all 15 tracks, with expected artifacts: http://membres.lycos.fr/guruboolez/AUDIO/t...description.htm

Comments are welcome