HydrogenAudio

Hydrogenaudio Forum => Uploads => Topic started by: sauvage78 on 2009-03-16 11:35:51

Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-03-16 11:35:51
I searched the forum & various sample databases for the worst problem samples I could find for Vorbis, & I edited the ones that I was able to ABX with Audacity, usually cutting them from 10-30 sec down to 1-2 sec. If the problem was only on the right or left channel, I even duplicated that channel to turn the sample into stereo.

I ended up with 5 very short, truly killer samples, which I ABXed at various levels to see whether higher bitrates were transparent or not. Here are the results:

(http://img185.imageshack.us/img185/3021/resultssummarypart1.png)
(http://img185.imageshack.us/img185/1591/resultssummarypart2.png)
(http://img299.imageshack.us/img299/3548/celt.png)

All these samples come from real CDs, but they are so focused on the artefact that they may give newbies a bad idea of Vorbis,
so let it be clear: aoTuV Beta5.7 is a very good lossy codec. I tested aoTuV because it is IMHO the best overall codec around. (Edit: After adding other codecs, sadly I don't think so anymore)
Using the best DCT codec around was simply a way for me to find the worst killer samples around.
This is more a ritual killing of Vorbis, gathering all my (small) knowledge against it, than a normal ABX test. I hit exactly where it hurts, so it is normal that it is painful for Vorbis.

So if I can ABX aoTuV at up to -q8, it means that other lossy codecs will probably fail even worse (especially MP3) (Edit: not true). It doesn't mean at all that Vorbis sounds bad on average music.
It only means that there is no lossy codec which is transparent on every sample; that simply doesn't exist, no matter the bitrate.

But, in some way, this test is made to prove that people using lossy at overkill bitrates are not protected from problem samples.
I don't even have golden ears, I can't hear above 18 kHz...

You can run the test for yourself; everything is included in the attached archive: every sample at every bitrate. Everything is already encoded & ReplayGained, all you have to do is listen & compare your results against mine. My logs are included (when successful). You can test your hearing very quickly because the samples are very short. I prepared everything to make your life easier.

I didn't discover these samples, none of them is new. But I renamed some of them for my own use (I named each sample after its artist) & also because they have been shortened so much that they are sometimes unrecognizable.
Anyone collecting problem samples should have a look ... these 5 samples are really worth it. I spent a couple of hours just editing them to focus on pure artefact.
You don't have to point other people to the artefact inside the sample, it IS the artefact.
In fact, my goal with this listening test wasn't to test Vorbis, but to start a small collection of killer samples for any DCT codec in order to know where they fail.
So I absolutely don't care if Vorbis sounds good or bad ... I currently use 100% lossless.

Indeed, this test is dedicated to Monty & Aoyumi. Thks for Vorbis, I hope this will help you improve the codec.

PS:
If you know the origin (Artist-Album ...) of the Castanets sample, I am interested in this information.
If you know very good killer samples, I am interested. Plz describe & post them so that I can add them to my test next time.
If ever Gabriel or Roberto reads this, plz fix http://lame.sourceforge.net/quality.php (I get a 403 error when I try to download samples)

Edit1: Inclusion of Average Bitrate in the table for later comparison to other codecs.
Edit2: Added Nero AAC
Edit3: Added Musepack
Edit4: Added Lame MP3
Edit5: Added Itunes AAC
Edit6: Added Abfahrt Hinwil Sample
Edit7: Errata N°1 (http://www.hydrogenaudio.org/forums/index.php?showtopic=70442&st=50&p=623512&#entry623512)
Edit8: Added aoTuV exp-bs1, Errata N°2 (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=70442&view=findpost&p=627617)
Edit9: Added Celt
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: halb27 on 2009-03-16 15:21:16
... It only means that there is no lossy codec which is transparent on every samples, that simply doesn't exist, no matter the bitrate. ...

I see you're using lossyWAV at -P quality. At a bitrate like that or higher, and with a well-designed codec that keeps the straightforward signal path of the PCM data, I think we can believe that any deviation from the original is very subtle, if audible at all.
It's quite astonishing, however, that you are able to ABX Vorbis at 256 kbps with an annoying result.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-03-17 03:35:12
I bump my own thread just to let people know that I added Nero AAC to the comparison.

As you can see, Nero AAC shines, but that doesn't mean Vorbis is bad: I selected problem samples specific to Vorbis & then tested them on Nero AAC, so it is unfair to Vorbis.
I admit I don't like MPEG ... but that said, I was impressed by Nero Q0.55 ...
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: ExUser on 2009-03-17 04:16:13
Thank you for your efforts. Listening tests take time and dedication! I use aoTuV in my Vorbis streaming component for foobar2000, so this affects me as well.

Have you sent aoyumi a message linking to your test? He would probably be interested.

Finally, would you consider testing -q10 as well? I would like to know if these artifacts cover the whole breadth of Vorbis.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: rpp3po on 2009-03-17 11:29:24
Nice work! Thank you!

Still, in my opinion it kind of damages Vorbis to see it failing this way. I know these are specifically targeted problem samples and every lossy codec probably has some skeletons in the closet, but some of them are already quite old (like Castanets) and still don't work. In contrast, the Nero devs, for example, were able to straighten out most critical samples thrown at them here at HA over the years.

I really would have liked to see Vorbis succeed. Even when it produced artifacts at lower bitrates, I always found them more pleasing / less annoying than those of other codecs. I don't know why. Despite being an excellent codec, Vorbis in my opinion missed the wave of momentum it would have needed a couple of years ago and was overtaken by LAME MP3 and AAC. A phone/player manufacturer can probably satisfy far more than 90% of its user base just by supporting those two formats. Vorbis' Tremor showing serious quality problems on some hardware players probably didn't help either. Maybe that all reverses if Vorbis gets included in the next W3C spec, but I don't know about the current state.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: uart on 2009-03-17 14:21:58
Yes a very impressive test sauvage78.

Any chance of adding Lame 3.98 to the comparison?
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: Kim_C on 2009-03-17 19:53:56
Thanks for your effort. Excellent visualisation of results!

If you have time and are interested, I think the latest iTunes / QuickTime AAC would be a good addition to the comparison. It'd be interesting to see how well it performs against Nero AAC now, because in previous tests iTunes was slightly better than Nero.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: dgauze on 2009-03-17 20:14:26
Thanks for conducting this test, sauvage78! I hope your hard work will help contribute to the advancement of both codecs.


Quote
I searched back on the forum & on various samples databases for the worst problem samples I could found for vorbis & I edited with audacity the ones that I was able to ABX, usually from 10-30 sec to 1-2 sec, I even duplicated the channel from mono to stereo if it happens that the problem was only on the right or left channel.

It would be very interesting to see how Vorbis performs on samples known to be problematic for the Nero AAC encoder.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-03-18 06:05:48
I just added Musepack.

I consider this thread a work in progress, so I may add other codecs & other killer samples later as my mood changes or my knowledge increases.

I especially plan to test:
1: Official Vorbis (whole range) & aoTuV Q10
2: Lossy|Flac at --portable & a lower setting near 256Kbps (maybe --zero), but not higher (Add Ginnungagap sample)
3: Lame MP3 (not a priority, but useful as a reference)

... currently I don't plan to test iTunes, for the simple reason that I don't want to install it. Even if it were good, I want a CLI encoder, so iTunes is useless for my personal needs.
If ever my system gets broken, I may test it before I re-install ... but that will be much later (maybe when Windows 7 comes out).
No matter the audio quality of iTunes, I consider Nero AAC better for personal use because Nero is both CLI & already transparent at ~192Kbps, so testing iTunes would only be useful to get an idea of the quality provided by the Music Store, which I don't use. Even if iTunes beat Nero, it would be a marginal quality gain (& maybe only at 128Kbps & below) for the huge pain of using iTunes.

I don't promise anything as I am short of time. I do this for myself; I only publish it because the job is done honestly, to the best of my knowledge, so it may as well be public. I also do it in case I have done something wrong & to help developers. But I must say I don't care much about the opinion of others. I know what I hear, I spot the artefact before I begin ABXing. I don't ABX randomly at all, that's why my score is almost always 100% success or 100% failure. It's never "I think that I hear something", except in the very rare cases noted as yellow. I am 100% sure that I can redo the test & get the exact same result, except for yellow results.

Currently I consider the Kraftwerk & Rush samples (especially the Rush sample) as Vorbis bugs that could be easily fixed IMHO.
So far I have never heard the Rush bug outside of Vorbis. (I hear a toad in the background ...). It happens that I really love that Rush song, so it is important that this particular bug gets fixed.

Concerning Musepack, its huge problem is not its quality at high bitrates but its lack of flexibility. It doesn't matter that you can now put it in MKV, guys from Doom9 would never use a codec which performs so badly at mid/low bitrates. Even at 128Kbps Musepack doesn't compete with Vorbis/AAC (even if the table doesn't show it, as both look almost tied, aoTuV q4 beats Musepack radio because the aoTuV artefact is always softer) ... Musepack is just an improvement over Lame at ~192Kbps IMHO (Edit: After testing Lame MP3, not even true ...). Musepack seems to beat aoTuV in the table, but this is due to the fact that the samples are heavily targeted at Vorbis' flaws. I am confident that once the Vorbis bugs get fixed, Vorbis -q6 will beat Musepack extreme ... or at least be tied at high bitrates & beat it at mid/low bitrates.

Also, the 10/12 on Nero Castanets Q0.35 is due to boredom & lack of focus. I am 100% sure that I can get 8/8 there, as I get 8/8 at Q0.40, which doesn't seem logical. But I don't care about retrying, as I am sure of myself here.

Canar:
For a confirmation at Q10, see the original thread about the Rush sample problem:
http://www.hydrogenaudio.org/forums/index.php?showtopic=44862
I am very confident that I can get the same score.

Edit1: Some thoughts about my methodology.
I chose 8 trials because I wanted a number that would not take too long to ABX & that would protect me from a row of lucky guesses. From my experience, it easily happens that you guess right up to 4 times in a row; starting at 5 or 6 successes in a row it becomes unlikely that you were guessing. At 7 or 8 successes in a row I consider the result valid. I hesitated between 8 & 12 trials, because at 12/12 you don't have even the shadow of a doubt. In the end, I decided to split the difference. First I do 8 trials & see how confident I am in the validity of the result. If I get 8/8, I consider the result valid, especially if I can identify what I am listening to. If I get 5/8 or less, I consider the test a failure, especially if I cannot identify what I am listening to. If I get 6 or 7 out of 8, I go up to 12 trials. Then, if I get at least 10 successes out of 12 trials, I consider the test valid, especially if I know what I am listening to. If I get 9 successes out of 12 trials, I consider the test invalid, especially if I don't know what I am listening to.
It never happened that I got a 9/12 while knowing what I was listening to; I chose & edited my samples specifically so that it would never happen.
Surprisingly, what did happen is that I was able to get 8/8 while not knowing what I was listening to. Such cases were very rare (1 or 2) & are marked as yellow (but not every yellow result is such a case). I consider them valid, as I consider that there is a very slight overall modification in the audio without there being a flaw that you can point out. For me it doesn't necessarily mean that I was guessing, it means that I was very, very close to the transparency point. Also, it sometimes happens that it sounds different but not bad, just slightly different. I may re-test the yellow results later, but yellow means the quality is good anyway, so it may not be worth the headache.
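
The thresholds in the methodology above can be put into numbers with a small sketch. It assumes every trial is an independent 50/50 guess and treats each score as a fixed-length test, so it slightly understates the false-positive rate of the two-stage "8 trials, then maybe 12" procedure. The function names `guess_probability` and `verdict` are made up for this illustration, and treating scores below 9/12 as invalid is an assumption, since the post only mentions 9/12.

```python
from math import comb
from typing import Optional


def guess_probability(successes: int, trials: int) -> float:
    """Chance of at least `successes` correct answers in `trials` by pure guessing (p = 0.5)."""
    favourable = sum(comb(trials, k) for k in range(successes, trials + 1))
    return favourable / 2 ** trials


def verdict(first8: int, after12: Optional[int] = None) -> str:
    """Apply the 8-then-12 rule described in the post.

    `first8` is the score after 8 trials; `after12` is the total score after
    12 trials and is only needed when the first 8 trials gave 6 or 7.
    Assumption: anything below 10/12 is treated as invalid.
    """
    if first8 == 8:
        return "valid"
    if first8 <= 5:
        return "failure"
    if after12 is None:
        return "continue to 12 trials"
    return "valid" if after12 >= 10 else "invalid"


if __name__ == "__main__":
    for s, n in [(4, 4), (8, 8), (5, 8), (9, 12), (10, 12), (12, 12)]:
        print(f"{s}/{n} or better by guessing: {guess_probability(s, n):.4f}")
```

Under these assumptions, 8/8 by guessing has a probability of 1/256 (about 0.4%), 10/12 or better is 79/4096 (about 1.9%), and 9/12 or better is 299/4096 (about 7.3%), which fits the choice to accept 10/12 but reject 9/12.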

Edit2: If ever a moderator reads this, plz edit the title
from Listening Test: aoTuV Beta5.7 on 5 Killer Samples, 96-128-192-256Kbps (17 ABX Log+Files)
to Multi-Codec Listening Test: 96-128-192-256Kbps, Killer Samples targetting vorbis (with logs)
Thks
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: uart on 2009-03-18 15:52:10
Quote
I specially plan to test:
1: Official Vorbis (all range) & Aotuv Q10
2: Lossy|Flac at --portable & a setting lower near 256Kbps (maybe --zero), but not higher (Add Ginnungagap sample)
3: Lame MP3 (not a priority, but usefull as a reference)


Great, looking forward to seeing those. Especially the Lame reference, to put the test into perspective for me (as an MP3 user).
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: Aoyumi on 2009-03-18 16:02:36
sauvage78:
Thank you for the test.

somebody:
I can't fix these problems immediately, because I am slowly wrestling with another problem.
In the meantime, I am writing down my views for anybody who is motivated to work on them.

Kraftwerk/Rush/Autechre
Please improve block switching. These problems occur at high bitrates because the switch from short to long blocks happens too early. But simply changing the threshold is an inefficient fix.

Harlem
It is in fact point stereo that affects the sound. In addition, there are some problems with block switching.

All the best!
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: ExUser on 2009-03-18 17:01:26
Quote
Edit2: If ever a moderator read this plz edit the tittle
from Listening Test: aoTuV Beta5.7 on 5 Killer Samples, 96-128-192-256Kbps (17 ABX Log+Files)
to Multi-Codec Listening Test: 96-128-192-256Kbps, Killer Samples targetting vorbis (with logs)

Done-da-dun-dun-DOOOOONE
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-03-19 10:29:34
As several people requested it, I added Lame MP3 sooner than I thought.
I must say I didn't expect Lame MP3 to compete so well.

I don't plan to add any other tests for some time now, because I am bored & I don't have the time anyway. So I will draw my own personal conclusions.
According to me:

at low/mid bitrate (96/128Kbps)
1: Nero
2: Vorbis/Lame (tied but only due to vorbis bugs, overall I still favor vorbis)
3: Musepack (not even in the competition)

at high bitrate (192/256Kbps)
almost all tied, except vorbis which has serious problems that are not normal at this bitrate.

I don't know if I should be happy or sad that Lame competes so well, because it means that, most likely, nobody will ever code an open source AAC codec as good as what x264 is for AVC in the video codec world. I was secretly wishing that Lame MP3 would be awful.

Overall, I didn't expect that there would be such a big difference between Vorbis & the other codecs. I knew I was hurting Vorbis, but I thought that, as transform codecs, both Nero & Lame would suffer too (particularly Lame) ... It just didn't happen. In the same way, I knew Musepack was bad at low bitrates as it is not designed/optimized for them, but I thought that, because it is a subband codec (& also influenced by 128Kbps comparisons where it was not bad at all), it would maybe not suffer too much at mid bitrates. I noticed a variation in the nature of the artefacts (Musepack smears much more than other codecs at 96/128Kbps), but Musepack was hurt very badly by my samples too. The fact that a codec is a transform or a subband codec is not enough on its own to tell anything about its quality IMHO. As long as the limit of the technology is not reached, the implementation is much more important than the technology used. That's why Lame just doesn't want to die. According to me, the claim that Musepack would be as good as Nero/Vorbis at 128Kbps is not true. I used to think that Musepack was better than Lame, I don't think so anymore. Maybe there was a time when it was true, I don't know.

I also thought that the difference between AAC & MP3 would be bigger. Even if both often seem tied in the table, AAC is always better quality-wise. When both Nero AAC & Lame MP3 are ABXable & tied, the artefact is always softer with Nero AAC.

The only reason for not using Nero (I don't) is that it is patented & closed. Quality-wise it is brilliant. Congratulations to Ivan & Co, I wish Vorbis were as good as Nero AAC.

Personally, before this test I thought that I would maybe encode some of my rips ... after this test I am back to lossless. ... ignorance is bliss.

PS: Thks for the title Canar.

Note: In all honesty, this test was conducted by an anti-MP3 & anti-MPC user. Concerning Lame, I think that it is more than time for a Lame AAC version. Concerning Musepack, I admit I never understood the Musepack fanaticism. Nowadays there is no rational reason to use Musepack. Anyway, I think that I wasn't unfair to any of these codecs, even if I dislike them a lot.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: halb27 on 2009-03-19 11:31:20
I hoped you would also test lossyWAV --portable. (Settings at ~256 kbps, which you wrote you also wanted to test, are not very useful IMO, though it would be useful to test at low -q settings, for instance -q 1.0 or -q 1.5 -V.)
Maybe, in order to keep your workload limited, you can test just -q 1.5 -V. That would be great.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: Nick.C on 2009-03-19 11:35:23
-V is not required for beta v1.1.3e - it is the default (and only) spreading function.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: DigitalDictator on 2009-03-19 12:00:57
Thanks for the test! Very informative and nice table!

Quote
... actually I don't plan to test Itunes for the simple reason that I don't want to install it. Even if it would be good, I want a CLI encoder so Itunes is useless for my personnal needs.

Can't someone encode the iTunes samples and send them to you? Then you wouldn't have to install the program. I'm also interested in how iTunes compares with Nero.

Lame seems to do a pretty good job as well, right?
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-03-19 12:11:47
halb27:
I will test lossyWAV --zero not to see if it sounds good, but to see how different its artefacts sound from the usual DCT artefacts & also to see if DCT problem samples affect it in some way. Following the same idea, I plan to test Ginnungagap to see how DCT codecs react to it. Maybe everything will be transparent due to the different technology, I can't know if I don't test. All I know is that I don't rely on others anymore to tell me what sounds good. I haven't followed lossyWAV development lately as it is too technical for me (especially as I am not a native English speaker).
Nick asked me to test his new spreading function, and I have an interest in doing so. (Edit: well, it seems it's too late) But I currently use neither Vorbis nor lossy|flac on a daily basis, so I will test when I get some time & if it's not too late. The problem with lossyWAV is also that it is very hard to ABX: so far I am only able to ABX it on Ginnungagap or at very low settings, which are not supposed to be transparent anyway. I quickly tried the samples provided by the guy who could ABX --portable; I wasn't able to ABX his samples. The more transparent the codec is, the longer it takes to ABX it, that's why I tested pure lossy first. The yellow & failed ABX trials take much more time & are much more boring than the orange & red results. I am not nuts & paranoid, I don't even try to ABX transparent audio randomly. Autechre & Ginnungagap produce very similar artefacts, that's why I particularly want to compare these two samples. My interest in testing lossy|flac will rise again as my HDD space decreases, but it can be months before I come back to lossyWAV, I have to study ... sorry. I still think lossyWAV is great, even if it's a little too big for my taste.

Quote DigitalDictator:
"Lame seems to do a pretty good job as well, right?"
Yes, I was surprised by its quality (especially at V7, which I expected to be in the red zone).

Well, in all honesty I don't plan to test iTunes anytime soon, even if I had the pre-encoded samples. I will never use this codec for my personal use.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: DigitalDictator on 2009-03-19 14:37:27
Quote
well in all honesty I don't plan to test itunes anytime soon even if I had the pre-encoded samples. I will never use this codec for my personnal use.

Too bad, there are many out here who use iTunes and who are really interested in the comparison between the iTunes codec and others. So if we ask nicely?
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-03-20 14:21:32
I just added iTunes AAC.
I must say I was very disappointed by the results. I expected it to be much better, especially given its reputation at mid bitrates. I was hoping that maybe it would be better than Nero AAC at 128Kbps ... I was far from reality ... overall even Lame MP3 beats iTunes AAC, which was quite a big surprise to me ... also, it is affected by the Kraftwerk sample while Nero is not. Because it's the same technology as Nero AAC, I didn't expect iTunes AAC to fail on this sample.

DigitalDictator:
I didn't test it for you, but because I realized I had never tested the VBR version of the iTunes AAC codec. The last time I installed such terrible software on my machine, it was iTunes version 4.0, with CBR only ...
This is the first & last time I test iTunes AAC. Not only is it software for children (70 MB, 5 firewall alerts, 4 folders in Program Files, dead keys in the registry), but it sounds very average. Not especially bad (except Kraftwerk), not especially good ...

Anyway, it's done & I know what it is worth. It is not such a bad codec overall; after all, at low bitrates it beats Musepack & at high bitrates it beats Vorbis ... but in the context of the AAC codec battle, Nero wins by a good margin quality-wise & its CLI is so much friendlier.

When I started my test I didn't expect it to be such a triumph for Nero AAC (& to some extent for Lame MP3 too) ... to my dismay, Vorbis is losing ground. Vorbis is only very good for streaming, because its artefacts are usually soft at low bitrates. For webmasters Vorbis is great, but for CD archiving it's not such a good option. At least currently ... I hope it can be fixed.

Don't even ask for any codec addition for ages. I didn't count, but it takes more than 3 hours just for one codec. Each time I need to organize the files (encode/rename), do the ABX test, do the table, edit the screenshot, re-upload the PNG to ImageShack, edit the topic ... & I am not counting the time spent finding & editing samples that I could ABX ... I'd rather add new killer samples than add new codecs now.

Just take what I give as it comes & be happy with it ... or leave. I am BORED of ABXing !!!

Edit: I added the result table to the uploads, it's in OpenOffice .odt format. In case anyone is willing to run the same test & publish his results, it will maybe save him some time ...
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: Busemann on 2009-03-20 14:53:56
The bitrates on the Kraftwerk sample on the QT files seem awfully low, and different from what I get when encoding it with QT 7.6.

Edit: I see you used a "custom" sample and I guess you must have used the custom 256 vbr setting rather than iTunes Plus too.. my mistake!
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-03-20 15:03:43
Yes, it is the same sample but edited (shortened & the channel with the artefact duplicated to make it stereo) to focus on the specific artefact that I could hear.
You are right, the bitrate is low, but it's not a problem with my sample, it is an iTunes bug. Musepack also uses a low bitrate (72Kbps) on the same sample & achieves transparency.
You can download all the samples (both lossy & lossless), everything is in the archives.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: Busemann on 2009-03-20 15:16:09
Quote
Yes, it is the same sample but edited (shortened & channel with the artefact duplicated to make it stereo) to focus on the specific artefact that I could heard.
You are right the bitrate is low, but it's not a problem with my sample, it is an Itunes bug. With a low bitrate too (72Kbps) Musepack achieve transparency on the same sample.
You can download all the samples (both lossy & lossless), everything is in the archives.

Yeah, I tried the "regular" Kraftwerk sample and the bitrates were significantly higher on all settings, which is why I got confused. I was also a few kbps off on the 256kbps samples, but I encoded with the default iTunes Plus setting, which uses the maximum quality setting of the encoder. I'm not sure if the Plus files would sound much different, though.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: rpp3po on 2009-03-20 15:39:42
Your Quicktime 7.6 evaluation is definitely flawed. The bitrates don't match and some are totally off. Your ABX findings are also inexplicable.

Attached you'll find the correct encodings (QT 7.6) for target bitrate mode (constrained VBR/iTunes) and, where Quicktime 7.6 really excels, target quality mode (true VBR).

In true VBR mode Quicktime considers the following bitrates sufficient at the highest quality level:

Autechre: 222kbit/s
Castanets: 200kbit/s
Harlem: 216kbit/s
Kraftwerk: 298kbit/s
Rush: 196kbit/s

All are quite high on average, even for the highest Q setting. Using the same setting QT averages at about 185kbit/s over my whole music collection.
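
To make the comparison concrete, here is a minimal sketch that averages the five bitrates listed above and compares each one with the 185 kbit/s collection average. The numbers are the ones quoted in this post; the variable names are only illustrative.

```python
# Bitrates (kbit/s) quoted above for true VBR at the highest quality level.
sample_bitrates = {
    "Autechre": 222,
    "Castanets": 200,
    "Harlem": 216,
    "Kraftwerk": 298,
    "Rush": 196,
}
collection_average = 185  # kbit/s, the whole-collection figure quoted above

# Mean bitrate over the five killer samples.
mean_bitrate = sum(sample_bitrates.values()) / len(sample_bitrates)

# How far each sample sits above the collection average.
above_average = {name: rate - collection_average
                 for name, rate in sample_bitrates.items()}

if __name__ == "__main__":
    print(f"mean over killer samples: {mean_bitrate:.1f} kbit/s")
    for name, delta in above_average.items():
        print(f"{name}: +{delta} kbit/s over the collection average")
```

The five samples average 226.4 kbit/s, and every one of them lands above the 185 kbit/s collection average, with Kraftwerk furthest above it.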

So Quicktime correctly identifies problematic content and adjusts bitrate accordingly. That your own Kraftwerk sample shows 162 kbit/s for the 256kbit/s constrained encode seems quite off the mark.

AAC is inherently a VBR format. Forcing it into certain bitrates is really not helpful. With MP3 the latter at least increased compatibility, but that's not the case for AAC. Just let it flow at Q127 and it will give you both very small and very large files at a very reasonable total average.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-03-20 16:00:38
I didn't use QuickTime, I used the iTunes interface & changed the import settings, which were 256Kbps VBR by default, to 96-128-192-256Kbps VBR ... maybe there is another AAC encoder or advanced settings in QuickTime ... I didn't have many parameters within iTunes. I don't use iTunes at all for my personal use, so maybe I simply used the wrong software ... I didn't have any True-VBR or ABR options.
I am looking into it. Unfortunately I have already got rid of iTunes. My HDD is allergic to it ... 20 dead keys after uninstallation & a Firefox plugin I never asked it to install ...

Edit: Can I do it with the QuickTime installed by iTunes? A few years ago QuickTime wasn't freeware, if I recall correctly.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: rpp3po on 2009-03-20 16:02:01
Yes, sadly iTunes on Windows is a real pain compared to OS X.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: Alex B on 2009-03-20 16:02:01
These are interesting tests, but I don't think any serious conclusions can be drawn because the durations are only a second or two. In the public HA listening tests the first two seconds of the encoded samples have always been cut off because the lossy codecs may first need to adapt to the content. I don't know how severe the problem can be and which codecs & settings are most affected, but that has been the accepted practice.

In addition, as sauvage78 stated, anyone who interprets the results must remember that the results are valid only for these specific samples, which represent a total of 8 seconds of quite unusual sound clips. I'd recommend playing through the original lossless samples once before drawing any conclusions.

EDIT: fixed a typo (adopt > adapt)
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: rpp3po on 2009-03-20 16:06:55
Quote
In the public HA listening tests the first two seconds of the encoded samples have always been cutted off because the lossy codecs may first need to adopt to the content.


Lossy audio codecs don't "adopt" to content over time (n-pass video coding is different). In fact, they don't even have any memory about the past surviving the current frame boundary (except maybe a bit reservoir for bitrate constraints).
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: rpp3po on 2009-03-20 16:11:37
Quote
Edit: Can I do it with the quicktime installed by Itunes ? a few years ago quicktime wasn't a freeware if I recall well.


You can just use my samples or create your own with Quicktime Player's "export" function (and see that they are identical to mine). But there is no batch interface on Windows, yet, so better save your time for the actual testing, if you don't have a Mac available.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-03-20 16:35:56
Alex B:
I can only agree with you, as some people seem to only look at the colored table & say WOW ... that is not the right way to do things, you have to test for yourself to put it in perspective.
That said, the Harlem sample is applause, so every live CD is potentially affected.
In the same way, killer samples are very common in electronic music; songs from NIN, Ministry, Marilyn Manson, Fear Factory ... are full of effects that can be very similar to the Kraftwerk/Rush/Autechre samples.

So overall, even if it's only 10 sec in the ocean of music ... I think you can draw some conclusions from my test, both for live music & for electronic music.
But I agree it definitely needs more samples for other genres.

I can only tell you that this test is very serious & very honest ... I didn't spend two days to do a cheap test. I have better things to do in life, especially as I don't use lossy !!!
Unless you're a real sadomasochist, you don't listen to Autechre for 30 min in a row for fun ... that I can tell you ...
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: Alex B on 2009-03-20 16:57:59
Quote
Lossy audio codecs don't "adopt" to content over time (n-pass video coding is different). In fact, they don't even have any memory about the past surviving the current frame boundary (except maybe a bit reservoir for bitrate constraints).

I just spent half an hour trying to find the original reason for adding 1000-2000 ms of additional offset in the public listening tests, but unfortunately my searches didn't find the correct threads/posts. It may be related to the bit reservoir behavior or something else, but if I recall correctly it has something to do with audio quality in the very beginning of the encoded samples.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: guruboolez on 2009-03-20 17:12:32
The test is really interesting (ABXing Vorbis at -q8 is not common), but the cross-comparison of different audio coders could be misleading. I'll just quote the original poster's words:

Quote
I searched back on the forum & on various samples databases for the worst problem samples I could found for vorbis (…)

Quote
but that doesn't mean Vorbis is bad, because I selected problem samples specific to vorbis & then tested it on Nero AAC, so it is unfair for vorbis.


Maybe a big red warning at the top of the first message would avoid future confusion.

Anyway, thank you for your test (and welcome to the club of people disgusted by ABX procedure  )
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: Alex B on 2009-03-20 17:45:27
I finally found at least one thread (http://www.hydrogenaudio.org/forums/index.php?showtopic=29555) in which Gabriel explains why "additional offset" should be used.

I would like to suggest a little change to the recommended practices in listening tests.

Most modern codecs work based on the recent context. They usually have a way to adapt the bitrate to the content that takes the recent past bitrate (a window) into consideration. Many encoders also have a psychoacoustic model that takes the previous psychoacoustic parameters/results into consideration.

Right now, when listening to samples, we usually encode a short sample with the encoder and listen to the result.
But the encoder needs some time to adapt its models (bitrate and psychoacoustic), and of course will not be able to properly adapt at the very beginning of the sample. If the sample had not been extracted from the full track, the encoder would have had some time to adapt its models. It means that encoding a short sample is not totally representative of how this portion would be encoded in a "real" encode.

That is why I am proposing the following:
When encoding a short sample, allow a 1 second margin at the beginning and at the end  of the sample so the encoder can adapt its models. This should not be 1s of silence, but a real 1s of content.
For ease of use, this could even be taken into consideration by the testing tools.

For video, the vqeg already has a similar recommendation: 1s at the beginning and 1s at the end should not be considered for tests, in order to let encoders stabilize themselves.

And:
Well, I suggested 1s because we have to find a reasonable value.
I do not know about WMA encoders, but even 1s is not optimal for Lame, as the ATH adjustment might need more than 1s to stabilise.
But 1s is still way better than nothing and does not reduce the sample that much.

Regarding the testings themselves, I think that it would be very nice to have the tools automatically restrict the default time by 1s at both ends.
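As a rough illustration of the margin idea quoted above, here is a minimal Python sketch of how a testing tool could restrict the evaluated region by 1s at both ends. The function name `evaluation_window` and its parameters are hypothetical and only for illustration; they are not taken from any existing tool.

```python
def evaluation_window(total_samples, sample_rate, margin_seconds=1.0):
    """Return (start, end) sample indices of the region to listen to,
    skipping margin_seconds of real content at both ends of the clip."""
    margin = int(round(margin_seconds * sample_rate))
    if total_samples <= 2 * margin:
        raise ValueError("clip is too short to leave a margin at both ends")
    return margin, total_samples - margin


# A 10 s clip at 44.1 kHz: only the middle 8 s would be used for the test.
start, end = evaluation_window(10 * 44100, 44100)
print(start, end)  # 44100 396900
```

Note that a 1-2 sec "artefact only" clip would be rejected by such a check, which is exactly the concern raised here.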
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-03-20 18:03:09
It may be true in theory, but my experience shows that it is not true in practice, at least for Vorbis. For a very simple reason ... before I had my "artefact only" 2 sec samples ... I had to test 10 to 30 sec samples to actually find the artefact ... it never happened that there was a variation in what I was hearing between the 1-2 sec sample & the 10-30 sec sample ... This is true for Vorbis at -q2, which is the codec/setting I used to find my artefacts/samples ... I don't know about other codecs. I doubt it, especially as the argument comes from Gabriel & Lame MP3 competes very well; if it was a real problem Lame wouldn't compete so well.

Edit: When I have more time, I will re-encode both the long & short samples at Lame V7, then decode both to WAV & cut the long WAV to match the short sample. If what you say affects audio quality, I should be able to ABX a difference. I think I won't find any, but I prefer to be 100% sure.
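For anyone who wants to repeat that comparison without an audio editor, the cutting step could be sketched like this with Python's standard `wave` module. This is only a sketch under assumptions: `cut_wav` is a hypothetical helper, the file names and offsets in the usage comment are made up, it only handles uncompressed PCM WAV, and it does not compensate for any encoder/decoder delay, so the offsets may need adjusting by ear.

```python
import wave


def cut_wav(src_path, dst_path, start_seconds, duration_seconds):
    """Copy duration_seconds of audio, starting at start_seconds, from one
    PCM WAV file into a new WAV file with the same format."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        src.setpos(int(round(start_seconds * rate)))
        frames = src.readframes(int(round(duration_seconds * rate)))
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(frames)  # the frame count in the header is fixed up on close


# Hypothetical usage: keep 2 s starting 12.5 s into the decoded long sample.
# cut_wav("long_decoded.wav", "long_cut.wav", 12.5, 2.0)
```

The cut file and the decoded short sample can then be loaded into the ABX component as A and B.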
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: IgorC on 2009-03-20 20:20:44
Making statements that codec A is better than B based on ABX results is nonsense. ABC/HR is required.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-03-20 20:57:29
Well, it is an A/B vs. C/D test using the foobar2000 ABX component, so the reference was hidden; I didn't know which file was my reference. I conducted the test in 2 parts: first I determined which one I considered the lossless file between A & B, then I determined between C & D which was the closer one. It is not an "X selected reference vs. A/B random" test. What it really lacks IMHO is statistical validity & a larger sample database ... with time I can fix the first problem, but I cannot force others to test ... any quality claims are to be taken carefully. But if each time someone tests codecs everyone jumps in & says it is not valid for reason XYZ ... it is not surprising that tests like this don't show up more often. Not only is it boring as hell, but the whole world suddenly disagrees with you. The only thing you can do is to test as transparently & as scientifically as possible, & then give your opinion so that others can disagree openly. I have nothing against criticism. All the files are here & my ABX logs too, so the test can be re-run forever by others until it is proven scientifically valid. There is a small part of truth in this test. Readers should just be aware that it is not THE truth. If such a test didn't give a hint/a clue/an orientation, it wouldn't even be worth testing audio for yourself.

I ran the test for myself, & I am very confident of the result for myself. But in the same way that I don't blindly trust tests made by others, I don't expect others to blindly trust me. It is not a problem for me if you disagree; it's a problem for me if I made a mistake ... like not using the optimal settings for iTunes AAC. (But my test is valid within the setting I used, which is the iTunes for Windows default import setting; I will edit the table to make it clearer for Mac OS users.)

For me, it is nonsense to make quality claims within the same class of flaw. I cannot honestly tell which is better between two orange/medium or two yellow/light artefacts. But if you tell me that I cannot say that a sample I marked as red sounds worse than one I marked as green, that is so evidently false that I can only disagree. I didn't rate the samples from 1 to 5 because I consider that too wide a scale to be honest, so I rated them 1/2/3. The fact that my scale is smaller means that the difference between ratings is larger. Trust me, red is awful & yellow shouldn't be ABXable for anyone without ABXing experience.

I am pretty confident that I can make some quality claims because I am very confident that I was able to find the right anchor in the first place. I agree this is very unscientific & personal ... but I cannot disagree with myself, I am not yet schizophrenic  It's your job to disagree !
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: rpp3po on 2009-03-20 21:12:12
@Alex B: Calling all lossy codecs oblivious was maybe too general. MP3 does in fact know frame interdependencies. So frame 3 could be depending on frame 2's content. I doubt that more than a quarter of a second would make a difference, but I don't know the actual implementation. Anything concerning rate control is rather messy compared to AAC anyway, in my opinion. AAC doesn't know frame interdependencies, so you wouldn't need leading silence for this kind of test. I don't know how this is handled in Vorbis.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: menno on 2009-03-20 21:22:58
Quote
@Alex B: Calling all lossy codecs oblivious was maybe too general. MP3 does in fact know frame interdependencies. So frame 3 could be depending on frame 2's content. I doubt that more than a quarter of a second would make a difference, but I don't know the actual implementation. Anything concerning rate control is rather messy compared to AAC anyway, in my opinion. AAC doesn't know frame interdependencies, so you wouldn't need leading silence for this kind of test. I don't know how this is handled in Vorbis.


You're mistaken here, an AAC decoder has very minimal inter frame dependencies (but not none), but nothing is stopping an encoder from keeping track of, and using, a lot of past (or even future) data, as long as the bitstream conforms.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: rpp3po on 2009-03-20 21:28:30
Quote
You're mistaken here, an AAC decoder has very minimal inter frame dependencies (but not none)...


You must know. Just out of curiosity, which would that be?
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: menno on 2009-03-20 21:40:55
Quote
Quote
You're mistaken here, an AAC decoder has very minimal inter frame dependencies (but not none)...

You must know. Just out of curiosity, which would that be?


For LC it is only overlap and add in the filterbank, but this has no influence as long as the frames are presented to the decoder in the same order as the encoder output them. Then there is inter frame prediction for MAIN profile (who uses that?). And for SBR there is a header with some configuration data emitted only once every so many frames, as well as a lot of influence on parameters from previous frames.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: rpp3po on 2009-03-20 22:43:41
I am sad to report that the Kraftwerk sample just miserably failed my ABX for Quicktime AAC in all versions up to 274 kbit/s (256kbit/s constrained VBR/iTunes Plus), so even the ones with corrected bitrate:

Code: [Select]
foo_abx 1.3.3 report
foobar2000 v0.9.6.2
2009/03/20 23:29:36

File A: Y:\Downloads\01- DCT Killer Samples (Lossless)\01- Artefact+Context\QT7.6_VBR_Target-Bitrate\04- Kraftwerk (Artefact+Context) QT7.6_256kbs_VBR_constrained.m4a
File B: Y:\Downloads\01- DCT Killer Samples (Lossless)-1\01- Artefact+Context\04- Kraftwerk (Artefact+Context) Lossless.flac

23:29:36 : Test started.
23:30:33 : 01/01  50.0%
23:30:41 : 02/02  25.0%
23:30:59 : 03/03  12.5%
23:31:12 : 03/04  31.3%
23:31:30 : 04/05  18.8%
23:31:38 : 05/06  10.9%
23:31:58 : 06/07  6.3%
23:32:13 : 07/08  3.5%
23:32:21 : 08/09  2.0%
23:32:39 : 09/10  1.1%
23:32:51 : 10/11  0.6%
23:33:08 : 11/12  0.3%
23:33:20 : 12/13  0.2%
23:33:29 : 13/14  0.1%
23:33:31 : Test finished.

----------
Total: 13/14 (0.1%)


It's instantly noticeable, just try it yourself. The synth sound in the middle part (from 00:01) is completely muffled.

And I can also completely reproduce sauvage78's findings that Nero is already transparent at q .4:

Code: [Select]
foo_abx 1.3.3 report
foobar2000 v0.9.6.2
2009/03/20 23:57:41

File A: Y:\Downloads\03- Nero AAC 1.3.3.0\04- Kraftwerk\04- Kraftwerk (Artefact Only) (Duplicated Right Channel) Lossless.flac
File B: Y:\Downloads\03- Nero AAC 1.3.3.0\04- Kraftwerk\04- Kraftwerk (Artefact Only) (Duplicated Right Channel) Nero AAC 1.3.3.0 Q0.40.mp4

23:57:41 : Test started.
23:58:58 : 00/01  100.0%
23:59:23 : 01/02  75.0%
23:59:47 : 01/03  87.5%
00:00:00 : 01/04  93.8%
00:00:14 : 01/05  96.9%
00:00:28 : 01/06  98.4%
00:00:40 : 02/07  93.8%
00:00:51 : 03/08  85.5%
00:01:01 : 03/09  91.0%
00:01:12 : 04/10  82.8%
00:01:20 : 04/11  88.7%
00:01:27 : 05/12  80.6%
00:01:35 : 05/13  86.7%
00:01:39 : Test finished.

----------
Total: 5/13 (86.7%)
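For readers wondering what the percentage in these logs means: it matches the probability of getting at least that many answers right by pure guessing (a one-sided binomial test with p = 0.5). A minimal Python sketch, where `abx_p_value` is a hypothetical helper name and not something provided by foo_abx:

```python
from math import comb


def abx_p_value(correct, trials):
    """Probability of getting at least `correct` answers right out of
    `trials` by pure guessing (one-sided binomial test with p = 0.5)."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


print(f"{abx_p_value(13, 14):.1%}")  # 0.1%  (the QuickTime log above)
print(f"{abx_p_value(5, 13):.1%}")   # 86.7% (the Nero log above)
```

So 13/14 would happen by luck about once in a thousand runs, while 5/13 is entirely consistent with guessing.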
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: C.R.Helmrich on 2009-03-20 23:40:31
My two cents on the behavior of a codec during the first 1 or 2 seconds of audio, in short form:

At the beginning of an encoding, the bit reservoir is full (since previously, there was no audio)
=> The bit reservoir can be "drained" more aggressively than in the middle of an encoding
=> more than the targeted average bits per time can be spent
=> the first few frames (or tenths of a second) most likely sound better than later audio parts

Of course, this only applies to CBR coding. A VBR codec doesn't have to enforce a very strict bit rate per second (or per x frames), so there is no, or a very lenient, bit reservoir.
=> For VBR, quality should be the same for the first few frames and later frames.

Still, I recommend using samples longer than 1 or 2 seconds because, as said, an encoder might need a few frames to adjust to the input, and because our hearing also needs some time to get accustomed to the stimulus (especially if it's something noisy and transient like sauvage78's test set and there is a distinct click/pop when looping a test item).

sauvage78, which items did you use for ABXing? The artefact+context, or the artefact-only? The former ones are long enough, the latter ones not, in my opinion.
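To make the bit reservoir argument above concrete, here is a toy Python model. It is only a sketch under simplifying assumptions (every frame asks for a fixed number of bits, and the reservoir starts full as described above); `simulate_reservoir` and its parameters are hypothetical and do not describe how any real encoder implements rate control.

```python
def simulate_reservoir(demands, frame_budget, reservoir_size):
    """Toy CBR rate control. Each frame may spend its fixed budget plus
    whatever is left in the reservoir; unused bits go back into the
    reservoir (up to its size). Returns the bits granted to each frame."""
    reservoir = reservoir_size  # starts full: nothing has been coded yet
    granted = []
    for demand in demands:
        available = frame_budget + reservoir
        spent = min(demand, available)
        reservoir = min(reservoir_size, available - spent)
        granted.append(spent)
    return granted


# Every frame "wants" 1500 bits, but the CBR budget is 1000 bits per frame.
print(simulate_reservoir([1500] * 6, frame_budget=1000, reservoir_size=1200))
# [1500, 1500, 1200, 1000, 1000, 1000]
```

In this toy model the first frames get everything they ask for, and once the reservoir is drained every later frame is held to the average budget, which is the "first few frames sound better" effect.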
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: /mnt on 2009-03-21 00:53:50
Wow, I didn't know that Ogg aoTuV can have some really serious pre-echo problems.

Code: [Select]
foo_abx 1.3.3 report
foobar2000 v0.9.6.3
2009/03/21 00:36:40

File A: C:\Downloads\02__aoTuV_Beta5.7\02- aoTuV Beta5.7\04- Kraftwerk\04- Kraftwerk (Artefact Only) aoTuV Beta5.7 256Kbps.ogg
File B: C:\Downloads\02__aoTuV_Beta5.7\02- aoTuV Beta5.7\04- Kraftwerk\04- Kraftwerk (Artefact Only) Lossless.flac

00:36:40 : Test started.
00:36:56 : 01/01  50.0%
00:37:00 : 02/02  25.0%
00:37:18 : 02/03  50.0%
00:37:21 : 03/04  31.3%
00:37:25 : 04/05  18.8%
00:37:30 : 05/06  10.9%
00:37:35 : 06/07  6.3%
00:37:39 : 07/08  3.5%
00:37:42 : 08/09  2.0%
00:37:47 : 09/10  1.1%
00:37:50 : 10/11  0.6%
00:37:55 : 11/12  0.3%
00:37:59 : 12/13  0.2%
00:38:10 : 13/14  0.1%
00:38:13 : 14/15  0.0%
00:38:18 : 15/16  0.0%
00:38:22 : 16/17  0.0%
00:38:27 : 17/18  0.0%
00:38:30 : 18/19  0.0%
00:38:33 : 19/20  0.0%
00:38:38 : 20/21  0.0%
00:38:44 : Test finished.

 ----------
Total: 20/21 (0.0%)

Pre-echo all the way through the synth, causing smearing and making it sound muffled. Pretty bad for a 358kbps file.

I can also confirm that iTunes AAC has the same problem as well.

Code: [Select]
foo_abx 1.3.3 report
foobar2000 v0.9.6.3
2009/03/21 00:30:22

File A: C:\Downloads\04__iTunes_AAC_8.1.0.52\04- iTunes AAC 8.1.0.52\04- Kraftwerk\04- Kraftwerk (Artefact Only) (Duplicated Right Channel) iTunes 8.1.0.52, QuickTime 7.6 256Kbps VBR.m4a
File B: C:\Downloads\04__iTunes_AAC_8.1.0.52\04- iTunes AAC 8.1.0.52\04- Kraftwerk\04- Kraftwerk (Artefact Only) (Duplicated Right Channel) Lossless.flac

00:30:22 : Test started.
00:30:30 : 01/01  50.0%
00:30:34 : 02/02  25.0%
00:30:42 : 02/03  50.0%
00:31:05 : 03/04  31.3%
00:31:09 : 04/05  18.8%
00:31:12 : 04/06  34.4%
00:31:15 : 05/07  22.7%
00:31:19 : 06/08  14.5%
00:31:25 : 07/09  9.0%
00:31:32 : 08/10  5.5%
00:31:36 : 09/11  3.3%
00:31:39 : 10/12  1.9%
00:31:44 : 11/13  1.1%
00:31:50 : 12/14  0.6%
00:31:55 : 13/15  0.4%
00:31:59 : 14/16  0.2%
00:32:03 : 15/17  0.1%
00:32:07 : 16/18  0.1%
00:32:12 : 17/19  0.0%
00:32:15 : 18/20  0.0%
00:32:20 : 19/21  0.0%
00:32:26 : 20/22  0.0%
00:32:29 : 21/23  0.0%
00:32:40 : 22/24  0.0%
00:32:46 : 23/25  0.0%
00:32:52 : 24/26  0.0%
00:32:58 : Test finished.

 ----------
Total: 24/26 (0.0%)

Same problem.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-03-21 12:50:51
rpp3po:
I have re-installed iTunes in order to see if I did anything wrong; here is the encoder/setting that I used:
Preferences/Import Settings/Custom then I only changed the bitrate.



C.R.Helmrich:
I used the short version as marked within the ABX logs. But if this is a real problem I will find it, give me some time. I need to know as I intend to find even more short samples with artefact to make the ABXing time shorter.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: rpp3po on 2009-03-21 13:16:44
I have already cleared this up here (http://www.hydrogenaudio.org/forums/index.php?showtopic=70405). In iTunes you have the choice between VBR constrained and ABR. Maybe the bitrate confusion comes from the fact that your listed bitrates are for artifact only encodes and mine for artifact+context? A too short sample shouldn't really worsen a codec's ability to prevent artifacts, but you can't really compare bitrates for such ultra short clips.

When you want to try the QT pro version just for this test and not for anything else you may google for pablo/nop.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-03-21 13:55:24
Thks a lot, I get what you meant with your googling thing  but it is not my intention to test codecs which are not free for us mortal windows users. I will re-focus on aotuv vs. nero aac because this is where my interest really stands. I will edit VBR to VBR constrained in the table.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: DigitalDictator on 2009-03-22 01:20:20
What are you talking about? They are all free. Maybe not Quicktime Pro, but iTunes is, and so are the rest.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: knucklehead on 2009-03-22 02:12:48
Quote
What are you talking about? They are all free. Maybe not Quicktime Pro, but iTunes is, and so are the rest.


You need to read all the posts.

If you have a Mac, you can actually get all the options in a very convenient way for free.

Some folks seem to have real problems with the idea of using a Mac.

Some of those ideas might be rational,

some perhaps a bit less than rational....

Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: ktf on 2009-03-22 13:10:11
I was playing around with Vorbis today, and I found another sample which you might find interesting. It is lossless, directly rendered from the trial of FL Studio. The problem is, when coded with vorbis (aoTuV, latest beta) at quality 2, you can hear distortion in the foreground synth.

Edit: typo
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-03-22 13:53:03
I tried to catch something but I failed ... can you tell me when & what to listen to more exactly, I focused on synth at the beginning/middle & end, I found nothing.

Don't worry this happens often  this morning after discovering the eig sample from Mo0zOoH, I tried to ABX his two other samples:

Autechre — [Gantz Graf EP #01] Gantz Graf [3:58];
Nine Inch Nails — [Quake OST #01] Quake Theme [5:08].

from this thread:
Mo0zOoH's problem samples (http://www.hydrogenaudio.org/forums/index.php?showtopic=49601)

I cannot ABX the first one at all & I can ABX the second one but only at lame V7 so it wasn't worth it.

That's why it's hard to put such a test together, first you must waste time listening to things that others can hear while you cannot... It's boring & frustrating.
Thks anyway.

Edit:
In the future, I will split the table in two, as I will definitely add new samples to this test (Ministry & Abfahrt Hinwil are planned; I already know that Lame MP3 will have 2 medium artefacts at V2 on these). I do not plan to test the new samples on iTunes & Musepack; for various reasons (not only audio quality) I don't think these codecs are worth my time. I think the same of Lame MP3, but Lame MP3 is actually a good anchor & is useful for identifying the artefacts, so I decided to keep it even if I will never use it personally. I will focus on Vorbis/Nero/Lame & lastly Lossy|Flac. Also, I want people to be able to test for themselves, so I want them to be able to download the samples, but there is an upload limit that I have already almost reached. So when I add new samples I will reorganize my files to remove duplicate lossless files from the archive & gain some space. I want this thread to be heavily oriented toward Vorbis, so that, maybe one day, its flaws get fixed.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: DigitalDictator on 2009-03-22 14:23:30
Quote
Quote
What are you talking about? They are all free. Maybe not Quicktime Pro, but iTunes is, and so are the rest.

You need to read all the posts.

Sure I read all the posts. I just think he confuses codecs with applications and platforms. It's still the same codec AFAIK. Not important anyway.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-03-22 14:38:01
DigitalDictator:
I didn't get what your problem was, but it doesn't matter anyway; the problem was solved thks to rpp3po & I don't plan to test iTunes AAC anymore.
If you think iTunes AAC is great, just use it ... I have no problem with it. I didn't test its true VBR mode so maybe it is really great... I don't know & I don't care.
I didn't confuse anything, it's iTunes for Windows which isn't clear/friendly at all. It doesn't even tell you that you are using the constrained VBR mode, how could I have known ?
Don't be such a ... Dictator
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: ktf on 2009-03-22 14:54:08
Quote
I tried to catch something but I failed ... can you tell me when & what to listen to more exactly, I focused on synth at the beginning/middle & end, I found nothing.


Well, now that I listen back to it myself, it isn't that clear anyway.  Attached: the exact problem and two Vorbis files at q0 and q2, as the title says
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-03-22 15:44:27
I think that I caught what you meant; there is something, but I don't think that this is really a killer sample, it is just a sample that doesn't achieve transparency at low bitrate. I would rate it a light/yellow artefact at Q0, so I guess that at higher bitrates I will be unable to ABX it. I am searching for samples that have problems at 128-192Kbps more than at 96Kbps, because I expect that LOTS of samples will not be transparent at 96Kbps. There are much worse problem cases around, so honestly I doubt that I will add it, thks anyway. If a sample is not transparent at 192Kbps, then it's most likely a good candidate for the job.

Code: [Select]
foo_abx 1.3.3 report
foobar2000 v0.9.6.3
2009/03/22 16:29:11

File A: C:\Documents and Settings\SolidInc__Artefact__Lossless.flac
File B: C:\Documents and Settings\SolidInc__Artefact__Lossless_q0.ogg

16:29:11 : Test started.
16:30:11 : 01/01  50.0%
16:30:52 : 02/02  25.0%
16:31:22 : 03/03  12.5%
16:33:12 : 04/04  6.3%
16:34:43 : 05/05  3.1%
16:35:09 : 05/06  10.9%
16:37:06 : 06/07  6.3%
16:38:02 : 07/08  3.5%
16:38:33 : Test finished.

----------
Total: 7/8 (3.5%)


Codecs have been so tuned for it that I think that even castanets is not a real killer sample anymore. It is just a hard to encode sample that shows that you shouldn't expect transparency at low/mid bitrate. Before this test, my signature was aoTuV -q4; I had to raise my bitrate because I was a little too enthusiastic ...
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: ktf on 2009-03-22 15:45:38
Ah, well, I'll try to find better ones then
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: DigitalDictator on 2009-03-22 17:33:24
Quote
DigitalDictator:
I didn't get what your problem was, but it doesn't matter anyway; the problem was solved thks to rpp3po & I don't plan to test iTunes AAC anymore.
If you think iTunes AAC is great, just use it ... I have no problem with it. I didn't test its true VBR mode so maybe it is really great... I don't know & I don't care.
I didn't confuse anything, it's iTunes for Windows which isn't clear/friendly at all. It doesn't even tell you that you are using the constrained VBR mode, how could I have known ?
Don't be such a ... Dictator

What? What did I do now?? Someone said he didn't like non free codecs. So I just pointed out that AFAIK they're all free. It's just that some applications (i.e. the interface), which perhaps enable more features, aren't free. So I don't see why you are upset. You tested iTunes, for that I thank you.

Well yea, I might be a dictator, but not here at HA, I'm not, haha.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-03-24 20:20:16
I just added the Abfahrt Hinwil sample (also known as eig); this one is very bad for Lame MP3, which is affected no matter the bitrate. Vorbis also has a problem with it at Q2, which sounds different.
This sample leaves Nero AAC alone on top of the mountain. It also shows that, depending on the sample, Musepack Extreme may be better than Lame V2 (only at high bitrate). So it gives some credit to the Musepack sect out there.

I tested dozens of killer samples; the problem is that either I can't ABX them or they only affect a single codec & not Vorbis. I am still searching for something that I could ABX with Nero Q0.55 (I tested the Fear Factory sample but I couldn't ABX it ...)

iTunes files are not available for download anymore as I am reaching the upload limit. I have kept iTunes in the table for this time, but I may remove it next time. (The old results will still be available within the .odt anyway)

Errata: after I discovered this morning that I could easily ABX aoTuV -Q4 on Abfahrt Hinwil (I must have been tired yesterday), I decided to downgrade it from failed to orange; -Q6 switched from X to failed (which doesn't change anything, it's still transparent for me). Then I compared aoTuV -Q4 & Nero Q0.35 and decided to downgrade Nero Q0.35 from yellow to orange in order to be fair to aoTuV. The artefacts are different: I am more sensitive to echo than to snapping, but once you catch it, snapping is more annoying than echo, & once trained both are easy to ABX (snapping is masked within the little explosions). I hesitated to put red for Musepack Thumb on this sample; it's only orange because Lame & Vorbis are worse at ~96Kbps. All 3 are very bad, but Lame & Vorbis hurt, while with Musepack it's just a big echo. The ratings are also made by comparing each sample between codecs in order to be as fair as possible.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: shadowking on 2009-03-24 23:35:00
Nero had trouble with this. The sample is somewhere on my old PC, but it is originally from Guruboolez.

http://www.hydrogenaudio.org/forums/index....showtopic=55339 (http://www.hydrogenaudio.org/forums/index.php?showtopic=55339)
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-03-24 23:45:45
If someone ever uploads it, I will try ...
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: Antonski on 2009-03-25 01:20:21
Quote
Also it shows that depending on the sample Musepack Extreme may be better than Lame V2 (only at high bitrate). So it gives some credit to the Musepack sect out there.

Amen
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: halb27 on 2009-03-25 21:50:15
I guess I found the emese sample shadowking wrote about (a problem not just for Nero AAC).
Upload doesn't work (not sufficient space).

For a limited time I put it on my webspace: emese sample (http://home.arcor.de/horstalb/problems/A03_emese.flac)
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-03-25 22:24:49
Thks ! I have it  I'm gonna get crazy listening to this, it loops "the human blood burns" in french  with a frightening voice in the background
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: ExUser on 2009-03-26 00:12:01
That Abfahrt sample just rapes poor MP3. 8/8 at 320kbps.

8/8 on Kraftwerk and Rush using aoTuV -q8 as well, though both differences were subtle. Failed to scale up to -q10.

This is fun. I haven't ABXed anything in a while. Thanks for the opportunity.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: ktf on 2009-04-12 20:44:12
Quote
Quote
I tried to catch something but I failed ... can you tell me when & what to listen to more exactly, I focused on synth at the beginning/middle & end, I found nothing.

Well, now I listen it back myself, it isn't that clear anyway.  Attached: the exact problem and two vorbis files at q0 and q2, as the title says


To come back to this sample, I played around with it and found out that it isn't really a problem for Vorbis (transparent to me @ 104kbps, q 2,5), but it is a problem with Nero's AAC (transparent to me at q0,43 but not at q0,42, so around 160 kbps) and with LAME, which is only transparent at -V2 or above, 220kbps. The artifact isn't present at all in Musepack, not even at quality 1 (that's below thumb), but then the rest of the song sounds horrible...

Just in case anyone would be interested 
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-04-15 02:03:47
I just added a partial result for aoTuV exp-bs1. Even incomplete, it took me almost 3 hours (in case some people don't realize how time-consuming even such a little test is ...). The good news is that for me the Rush sample is completely fixed; the bad news is that it's the only sample affected by aoTuV exp-bs1. I didn't notice any progression/regression on any other sample (including the 2 samples I didn't strictly ABX & that are greyed in the table). So aoTuV exp-bs1 might be a good fix for some problems, but it is not a magic solution. There is no "overall quality progression"; it's more a targeted patch that achieves its goal on a specific sample. I was a little disappointed: the Rush sample was the first sample I tested, so I was very excited in the beginning, as I had some hope that if all those samples were affected by the same block switching issue & the Rush sample was fixed, then maybe it would improve all the other samples ... it wasn't the case. Still I like aoTuV exp-bs1, because I like Rush as a band & especially New World Man (http://www.youtube.com/watch?v=WNkAtgX-HT4) as a song

It is very likely that Castanet & Harlem results would be the same for aoTuV exp-bs1 & aoTuV Beta5.7, quick non-blind listening test sounded this way.

In the end, it doesn't make Vorbis shine, but it's a good step toward getting -q8 transparent for everyone again; only Kraftwerk remains easy to ABX at -q8.

Errata2:
Because I failed to ABX Autechre aoTuV exp-bs1 -q8, & because aoTuV Beta5.7 -q8 was a yellow result that might have been either a lucky guess or a successful ABX that took all the time necessary to succeed (which means VERY long), I upgraded aoTuV Beta5.7 -q8 from Yellow/Light to Green/Failed. I can't tell a difference between aoTuV Beta5.7 -q8 & aoTuV exp-bs1 -q8, so it was illogical to put the first in yellow & the second in green just because I didn't take the same time to ABX both. At the same time I downgraded aoTuV Beta5.7 -Q6 from orange to red: the red artefact from -Q2 & -Q4 is still here & even if lighter, it doesn't decrease much until -q8, so there is no reason why it wouldn't be red. It's a one result up for one result down trade, & this way the table highlights the artefact even more clearly IMHO.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: Axon on 2009-04-15 02:40:16
3/8, Abfahrt, LAME 3.97 -V0 (236kbps). 

Any suggestions? (besides counting my blessings?)
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: /mnt on 2009-04-15 03:44:12
Quote
3/8, Abfahrt, LAME 3.97 -V0 (236kbps).

Any suggestions? (besides counting my blessings?)

Be glad that you're not that sensitive to pre-echo; IMO it's more annoying than warbling artifacts.

Code: [Select]
foo_abx 1.3.3 report
foobar2000 v0.9.6.4
2009/04/15 03:32:35

File A: C:\Downloads\01__DCT_Killer_Samples__Lossless_\01- DCT Killer Samples (Lossless)\02- Artefact Only\01- Abfahrt Hinwil (Artefact Only) Lossless.flac
File B: C:\Downloads\04__Lame_3.98.2\04- Lame 3.98.2\01- Abfahrt Hinwil\01- Abfahrt Hinwil (Artefact Only) Lame 3.98.2 V0.mp3

03:32:35 : Test started.
03:32:45 : 01/01  50.0%
03:32:52 : 02/02  25.0%
03:32:57 : 03/03  12.5%
03:33:05 : 04/04  6.3%
03:33:10 : 05/05  3.1%
03:33:22 : 06/06  1.6%
03:33:28 : 07/07  0.8%
03:33:36 : 08/08  0.4%
03:33:43 : 09/09  0.2%
03:33:51 : 10/10  0.1%
03:33:57 : 11/11  0.0%
03:34:02 : 12/12  0.0%
03:34:08 : Test finished.

 ----------
Total: 12/12 (0.0%)

The beats make cracking noises. IMO eig sounds less annoying at V0, while the 1980s dance club track from RoboCop has some really bad pre-echo issues that sound bad even at 320 kbps.
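For readers comparing their own scores: the percentage foo_abx prints next to each trial is consistent with the one-sided binomial probability of doing at least that well by pure guessing. Below is a minimal Python sketch under that assumption. The function name `abx_p_value` is made up for illustration and is not part of foo_abx.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` answers right out of
    `trials` when every answer is a 50/50 guess (one-sided binomial)."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count all outcomes with `correct` or more hits, out of 2**trials.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Scores taken from the posts and the log above.
for correct, trials in [(1, 1), (8, 8), (12, 12), (3, 8)]:
    print(f"{correct}/{trials}: {abx_p_value(correct, trials):.1%}")
```

Under this reading, a 12/12 run has roughly a 1 in 4096 chance of being a guess, while a 3/8 score is entirely compatible with guessing.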
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: halb27 on 2009-04-15 09:50:42
Quote
3/8, Abfahrt, LAME 3.97 -V0 (236kbps). 

Any suggestions? (besides counting my blessings?)

My suggestion is to listen to the eig (aka Abfahrt Hinwil) results of other mp3 encoders. Try for instance previous Lame versions (EDITED: I just saw you did. Try 3.98 for a comparison) or a recent Fraunhofer encoder.
You'll find that Lame 3.98.2 does a very good job within the restrictions of mp3.

The sample just shows that mp3 pre-echo can be very bad. As in this case, the really bad problems are expected to come from electronic music, because impulses can be arbitrarily artificial there.
With music originating from natural instruments things usually are fine or at least not very bad even with mp3.

Lovers of electronic music may be happier using another codec.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: /mnt on 2009-04-15 15:45:10
The PTP samples (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=70598&view=findpost&p=622938) also show how bad MP3 can be with electronic music.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-04-19 16:07:37
I just added results for CELT 0.5.2. Due to space restrictions I cannot provide the encoded files (they are decoded WAV packed in FLAC), so only the logs are available. IMHO you cannot draw any conclusion at this early stage of development: it sounds bad, but it's experimental non-frozen code, so I didn't expect it to sound good. I was already very happy that Xiph provided a win32 build so that I could test it. I was also happy that it was not affected at all by 2 samples. Other than that it is absolutely untuned & currently unusable for backup. This is only a test for audio enthusiasts.

Note: The artefacts noted as medium for CELT are often severe medium. I could have noted them as M+ instead of just M, but I marked them as medium because otherwise almost everything would be in the red zone. Actually CELT 256Kbps is clearly worse than Vorbis 128Kbps; I am not even sure that it can compete with Vorbis 96Kbps.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: ktf on 2009-04-19 16:35:47
Quote
PS:
(...)
If ever Gabriel or Roberto reads this plz fix http://lame.sourceforge.net/quality.php (http://lame.sourceforge.net/quality.php) (I get a 403 error when I try to download samples)


I just saw you can download them "by hand". Just paste the displayed name after the link, http://lame.sourceforge.net/download/samples/track7.wv (http://lame.sourceforge.net/download/samples/track7.wv) for instance. Only the first one, labeled Roel's infamous velvet.wv, won't work like that.
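The "by hand" trick can be sketched in a few lines of Python. This is only an illustration: the base URL is taken from the track7.wv example above, the helper name `sample_url` is made up, and percent-encoding the name with `urllib.parse.quote` is an assumption that has not been checked against the server (the first sample reportedly does not work this way at all).

```python
from urllib.parse import quote

# Directory inferred from the working track7.wv link quoted above.
BASE_URL = "http://lame.sourceforge.net/download/samples/"


def sample_url(displayed_name: str) -> str:
    """Build a direct download URL by appending the displayed file name.

    The name is percent-encoded so that spaces or apostrophes do not
    produce an invalid URL; whether the server accepts such names is
    an assumption.
    """
    return BASE_URL + quote(displayed_name)


print(sample_url("track7.wv"))
```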
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-04-19 16:43:21
Thks for the tip ktf 
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-04-20 10:50:48
I have tested all the samples in the lame archive at aotuv -q4 & lossywav -q 1.5:

I can ABX applaud, spahm & testsignal3 at aotuv -q4 & below
I can ABX Fools at lossywav -q 1.5 & below

applaud reminds me of harlem
testsignal3 reminds me of kraftwerk but not as bad

spahm is more specific. I tested all 3 of these samples on aoTuV exp-bs1 in the hope that spahm might be improved, but exp-bs1 has no effect (except on Rush).

Thks again ktf, the Fools sample will be very useful for lossywav tuning.

If someone has information on artist, album, track ... for applaud, spahm & testsignal3, I am interested (I have all the useful information for the Fools sample). When exp-bs1 is released I will remove the Rush sample from the test as it will be fixed, and I may add one of these 3 samples (applaud is unlikely as it would be a duplicate of harlem; I also still have Ministry & Pierre Henry samples in my bag, but those are not targeting Vorbis).

I now have almost all the samples I was looking for except one called Amnesia (acid music) that I heard about from Pio2001, in case someone has it.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: Alex B on 2009-04-20 12:41:01
Here's Amnesia: [attachment=5053:amnesia.flac]

You might also want to try the sample I uploaded here: http://www.hydrogenaudio.org/forums/index....showtopic=54752 (http://www.hydrogenaudio.org/forums/index.php?showtopic=54752)

I think it is problematic for most lossy encoders (at least at low and middle bitrates).

Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-04-20 13:04:05
Thks too Alex B, I have quickly tried these two (plus Track02cut from the same topic) on aotuv -q4 & lossywav -q 1.5:

I can ABX amnesia on aotuv -q4; I failed to catch anything on all the other samples/parameters. I will retry once later today, then I'll give up.

Edit: Attached Fool's Garden for lossywav testing (This is a temp link for Nick.C & co)
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: NullC on 2009-04-23 00:10:09
Quote
I just added results for CELT 0.5.2. Due to space restrictions I cannot provide the encoded files (they are decoded WAV packed in FLAC), so only the logs are available. IMHO you cannot draw any conclusion at this early stage of development: it sounds bad, but it's experimental non-frozen code, so I didn't expect it to sound good. I was already very happy that Xiph provided a win32 build so that I could test it. I was also happy that it was not affected at all by 2 samples. Other than that it is absolutely untuned & currently unusable for backup. This is only a test for audio enthusiasts.

Note: The artefacts noted as medium for CELT are often severe medium. I could have noted them as M+ instead of just M, but I marked them as medium because otherwise almost everything would be in the red zone. Actually CELT 256Kbps is clearly worse than Vorbis 128Kbps; I am not even sure that it can compete with Vorbis 96Kbps.


Awesome!  Thanks for testing.  Is there any chance you could do a quick check of CELT with your "killer samples" in mono mode?  Stereo support in CELT 0.5.2 is pretty much entirely untuned (and the stereo infrastructure will likely be entirely replaced in the next release).  The current version also has some internal limitations at high rates; removing them is on my todo list, and they may be making 256 perform no better than 192. Transparency under critical testing at high bitrates (your backup use case) hasn't yet been an area of active tuning, but it is something that we want to get right.

If you have the patience to do a little more testing than just a single mono check I could build a special encoder/decoder that exposes some internal settings as knobs that you could try tuning.  It seems like your samples are largely block switching stress test cases.  The block switching logic in CELT is pretty simplistic and also pretty much untuned. On the other hand, CELT's long blocks are the size of typical Vorbis shorts, so the switching doesn't need to be too smart (CELT only has shorts at all because without them AAC LD was clearly outperforming it on castanets; likely due to TNS).

Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-04-23 09:35:17
Sorry, but in all honesty I have no interest in tuning CELT, especially in mono; it was just a one-shot test done out of curiosity to see what it is made of. Now I know that CELT requires much tuning, which will take a long time. I am a lossless user, so I will not spend my time helping the development of every lossy codec in the world. I already tried to help lossywav & to a lesser extent aotuv, & I noticed how time-consuming this is compared to the little gain. I have spent hours for almost no gain on lossywav (only discarding useless parameters) & a very little gain on aotuv (only fixing the Rush sample). I have high hopes for CELT but I am short of time. Also I feel that it is a natural process that after some ABXing you finally get bored & lose interest. I will re-test CELT for sure, but after several releases, in something like a year, & for my own selfish needs, not for tuning. The ability to split CELT is a priority for me because I use my music in a particular way: I want only CDImage+cue, so I want to use splittable lossy CDImage+cue. That's why my personal interest currently lies in lossywav & vorbis. But I am not especially a free software defender: if Nero AAC were easily splittable with a cue, I would most likely switch to aac/mp4, as I already use avc for video. Good luck with CELT; it's nice to see that Monty is not travelling alone anymore, even if it seems faster. It's gonna be a looong way to the top
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: jmvalin on 2009-04-23 10:08:45
Quote
Note: The artefacts noted as medium for CELT are often severe medium. I could have noted them as M+ instead of just M, but I marked them as medium because otherwise almost everything would be in the red zone. Actually CELT 256Kbps is clearly worse than Vorbis 128Kbps; I am not even sure that it can compete with Vorbis 96Kbps.


Thanks a lot for spending time doing this test with CELT. From what I understand, CELT does very well on "normal samples" and very badly on "problem samples". There are probably a few explanations for that. The first (and obvious one) is that the other codecs are VBR (and can bump the rate during transients), while CELT is CBR for now (VBR support planned, but not that useful anyway). Now, that can't explain why it does so badly on the problem samples, even at very high bit-rates. Of course, there's a tuning issue, but there are also some features like folding and short blocks that need to be tuned. Another thing I hope will eventually help at high rate is to actually include a real psy model. As surprising as it may sound, CELT currently does the bit allocation without even considering the signal or computing any masking curve. I'm hoping to fix all that (and possibly other problems I don't know about yet) for 1.0.
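The CBR versus VBR remark can be pictured with a toy Python sketch. It is not how CELT, Vorbis or any real encoder allocates bits: the per-frame "difficulty" weights, the function names and the proportional rule are all invented here, purely to show how a variable-rate scheme can spend more of the same total budget on a transient frame.

```python
def cbr_allocation(difficulty, bits_per_frame):
    """Constant bitrate: every frame gets the same budget,
    regardless of how hard it is to encode."""
    return [bits_per_frame] * len(difficulty)


def vbr_allocation(difficulty, bits_per_frame):
    """Toy variable bitrate: same total budget as CBR, but shared
    between frames in proportion to their (hypothetical) difficulty."""
    total_bits = bits_per_frame * len(difficulty)
    total_difficulty = sum(difficulty)
    return [round(total_bits * d / total_difficulty) for d in difficulty]


# Hypothetical difficulty per frame; the fourth frame is a sharp transient.
difficulty = [1, 1, 1, 5, 1, 1]

print("CBR:", cbr_allocation(difficulty, 1000))
print("VBR:", vbr_allocation(difficulty, 1000))
```

In this toy case both schemes spend 6000 bits in total, but the proportional scheme gives the transient frame three times what the constant-rate scheme can.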
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-04-23 10:32:16
I don't know how CELT performs on "normal samples". I didn't even try to encode a full song to see how it sounds on average input, but I wouldn't expect it to perform "very well"; I would rather say that it should sound "acceptable" at high bitrate considering that it is experimental code. I wouldn't currently use it on anything for backup, even on non-problem samples. I consider it a toy for enthusiasts for now, nothing more. I know the quality of Vorbis at various development stages quite well, so if I compared it to old Vorbis I would say that even the old Vorbis RCs likely kick CELT's ass.

It's not that CELT is bad; I am confident that the margin for improvement is huge. It's just that as a codec, for now, it's a baby.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: jmvalin on 2009-04-23 17:46:23
Quote
I don't know how CELT performs on "normal samples". I didn't even try to encode a full song to see how it sounds on average input, but I wouldn't expect it to perform "very well"; I would rather say that it should sound "acceptable" at high bitrate considering that it is experimental code. I wouldn't currently use it on anything for backup, even on non-problem samples. I consider it a toy for enthusiasts for now, nothing more. I know the quality of Vorbis at various development stages quite well, so if I compared it to old Vorbis I would say that even the old Vorbis RCs likely kick CELT's ass.

It's not that CELT is bad; I am confident that the margin for improvement is huge. It's just that as a codec, for now, it's a baby.


I'm getting the impression that you missed something about what CELT is trying (and especially *not* trying) to be. CELT is all about real-time communication with really, really low delay. It's designed so you can play music remotely through a DSL connection or to have perfect quality videoconference, again using little bandwidth. This is something that neither AAC nor Vorbis can do because they have >100 ms delay. Even AAC-LD and G.722.1x have too much delay for network music performances. So if you're waiting for CELT to beat Vorbis or AAC, that will likely never happen. However, if you compare it to low delay codecs (even things like AAC-LD and G.722.1C that have a lot more delay), then it already does quite well (see the paper that I mentioned in the other thread). I'm already surprised that CELT beats MP3 (LAME) in CBR mode in the tests we did. That is far better than I hoped to achieve when starting CELT.

That being said, I'm sure CELT can do a lot better, especially on the problem samples you tried. Now that I know what to test on, I'm hoping to be able to do better. So thanks very much again for doing these tests.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-04-23 18:31:13
I know very well that CELT is supposed to beat all the other codecs on latency, but I have no interest at all in things like low latency, mono or voice. I have read the Xiph website on CELT; if you don't intend to compete with non-streaming oriented codecs, you'd better not say that CELT is supposed to be able to handle music. You advertise on your website that CELT is supposed to be something between vorbis & speex, & when you read that too quickly you can understand that CELT will swallow both speex & vorbis; obviously in your mind it is closer to speex than to vorbis. I have nothing against codecs feeding the needs of webmasters rather than those of audiophiles. I am used to Xiph, I have known you guys for a long time, I used to hang out in your IRC channels back in the days when I had some irrational faith in Monty & vorbis, & I know that your priorities are not mine. That's why I quit using vorbis, & I will most likely never use CELT if it doesn't target transparency: I have learned from my past mistakes. But before I knew what to think of CELT I had to give it a try. I am sorry if CELT doesn't sound as good as you expected. I cannot honestly recommend it to anyone in its current state, & anyway, as I said several times in this topic, I invite anyone to test for himself. Some people are very happy with vorbis -q2, what can I say? I will not fight the whole world: if people are happy with vorbis -q2 I am happy for them, & the same goes for CELT. If CELT improves, be sure I'll be fair with it. I am already happy that my short test helped you realize that CELT wasn't perfect; it's a good start.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: Alex B on 2009-04-23 19:18:30
Quote
Thks too Alex B, I have quickly tried these two (plus Track02cut from the same topic) on aotuv -q4 & lossywav -q 1.5:

I can ABX amnesia on aotuv -q4, I failed to catch anything on all the other samples/parameters. Will retry once later today then I'll give up...

Yeah, my sample appears to be easier than I thought for Vorbis and also for Nero. I was barely able to ABX Vorbis at -q4 (I heard a small tonal difference in the highest frequencies in the quieter part of the sample). I also tried Nero -q 0.45 and failed to quickly ABX it.

However, just for kicks you could try LAME -V2. The problem I noticed with LAME 3.97 is now somewhat less pronounced, but still there with LAME 3.98. It produces clear artifacts. An ABX test is not needed to hear them. -V0 is still easily ABXable. Also Musepack --quality 5 produces a funny and obvious artifact. Musepack --quality 6 is better and I couldn't easily ABX it. (By "easily" I mean a quick test that produces 8/8 without doubt. I didn't feel like trying to seriously test a possibly transparent sample. I need to be in very good shape to try that.)

EDIT

The Wab5s sample is available here: http://www.hydrogenaudio.org/forums/index....showtopic=54752 (http://www.hydrogenaudio.org/forums/index.php?s=&showtopic=54752)
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: jmvalin on 2009-04-23 21:37:55
Quote
I know very well that CELT is supposed to beat all the other codecs on latency, but I have no interest at all in things like low latency, mono or voice. I have read the Xiph website on CELT; if you don't intend to compete with non-streaming oriented codecs, you'd better not say that CELT is supposed to be able to handle music. You advertise on your website that CELT is supposed to be something between vorbis & speex, & when you read that too quickly you can understand that CELT will swallow both speex & vorbis; obviously in your mind it is closer to speex than to vorbis. I have nothing against codecs feeding the needs of webmasters rather than those of audiophiles. I am used to Xiph, I have known you guys for a long time, I used to hang out in your IRC channels back in the days when I had some irrational faith in Monty & vorbis, & I know that your priorities are not mine. That's why I quit using vorbis, & I will most likely never use CELT if it doesn't target transparency: I have learned from my past mistakes. But before I knew what to think of CELT I had to give it a try. I am sorry if CELT doesn't sound as good as you expected. I cannot honestly recommend it to anyone in its current state, & anyway, as I said several times in this topic, I invite anyone to test for himself. Some people are very happy with vorbis -q2, what can I say? I will not fight the whole world: if people are happy with vorbis -q2 I am happy for them, & the same goes for CELT. If CELT improves, be sure I'll be fair with it. I am already happy that my short test helped you realize that CELT wasn't perfect; it's a good start.


Still missing the point. CELT *does* handle music. The fact that it doesn't handle it as well as Vorbis is not the point. If you want to hear a codec that really doesn't handle music, try Speex. Also, the simple fact that you do not need real-time behaviour means you don't need CELT. It's of no use for you, even if it improves a lot. OTOH, for the people who do need very low delay, CELT is really what they need. The only codec besides CELT that can give you a delay below 10 ms is FhG's ULD codec and in the samples we got, CELT has better quality. I think you misread my previous post. CELT has already *exceeded* my original expectations. The low-delay constraint that CELT has really comes with a heavy price, but I always thought that price was even higher than it is. To give you an idea of what we're talking about: imagine taking AAC or Vorbis, forcing it to CBR, forcing it to always use short blocks, preventing it from using any look-ahead for the psychoacoustics (has to be based on the current frame only), and preventing it from using a bit reservoir. That's what CELT has to do to achieve low delay. Of course, it could have a lot higher quality if it didn't have those constraints, but that's not what I'm trying to do.
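To get a feel for why block size and look-ahead matter so much for delay, here is a rough Python sketch of the arithmetic. It only counts the samples an encoder must buffer before it can emit a frame; real codecs add further delay (transform overlap, bit reservoir, network buffering). The frame sizes and look-ahead values below are hypothetical, not the actual parameters of CELT, Vorbis or AAC.

```python
def algorithmic_delay_ms(frame_size: int, lookahead: int, sample_rate: int) -> float:
    """Delay in milliseconds caused by buffering one frame plus
    any look-ahead samples before encoding can start."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return 1000.0 * (frame_size + lookahead) / sample_rate


# Hypothetical configurations, chosen only to show the scale of the effect.
short_frames = algorithmic_delay_ms(frame_size=256, lookahead=128, sample_rate=48000)
long_frames = algorithmic_delay_ms(frame_size=2048, lookahead=2048, sample_rate=44100)

print(f"short frames: {short_frames:.1f} ms")
print(f"long frames:  {long_frames:.1f} ms")
```

With these made-up numbers the short-frame setup stays under 10 ms, while the long-frame setup is already an order of magnitude higher before any other source of delay is counted.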
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: sauvage78 on 2009-04-23 22:22:03
Well, I had some hope that CELT would come in different flavors targeting different uses. I'll put these hopes in the trash. It's not a big problem for me. Sorry about the misunderstanding about CELT handling music, but you marketed it like this ... just sell it as high quality VoIP if you don't want such misunderstandings, because reading the website I thought it was both high quality VoIP & maybe music via a VBR mode (I mean CD quality). You seem to say that VBR will not help much, so I wouldn't call it a codec suited for music, even if it's more than just voice. Obviously handling music doesn't mean the same thing for an audio developer & for an end user.
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: jmvalin on 2009-04-23 23:03:26
Quote
Well, I had some hope that CELT would come in different flavors targeting different uses. I'll put these hopes in the trash. It's not a big problem for me. Sorry about the misunderstanding about CELT handling music, but you marketed it like this ... just sell it as high quality VoIP if you don't want such misunderstandings, because reading the website I thought it was both high quality VoIP & maybe music via a VBR mode (I mean CD quality). You seem to say that VBR will not help much, so I wouldn't call it a codec suited for music, even if it's more than just voice. Obviously handling music doesn't mean the same thing for an audio developer & for an end user.


You seem to have a weird definition of "handling". According to your definition, Speex doesn't handle speech (it's not 100% transparent) and neither does any other speech codec, nor any phone, much less a cell phone. You have the right to care only about lossless audio, but what most people actually want is decent quality with decent bit-rate. Of course, "decent" is different depending on whether we're talking about music players, web streaming, cell phones, ...
Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: NullC on 2009-04-24 23:15:27
Quote
I know very well that CELT is supposed to beat all the other codecs on latency, but I have no interest at all in things like low latency, mono or voice. I have read the Xiph website on CELT, if you don't intend to compete with non-streaming oriented codecs you'd better not say that CELT is supposed to be able to handle music.


Only lossless is unconditionally guaranteed to be indistinguishable to all listeners on all samples in all circumstances. Of course, the price you pay for that is higher bitrates.

Of course, we would like to have CELT as transparent as possible at sufficiently high bitrates, and I'll reiterate my thanks for testing and pointing out test cases where it is not… but if you define failing ABX with carefully selected torture samples at high rate to be 'not handling' music then you really have unreasonable expectations for lossy codecs. You might think that lossy codec X is unconditionally transparent at Y bits per second, but it's highly likely that with a different sample, different listening conditions, or a better listener you'd be able to ABX them.

Quote
Because you advertise on your website that CELT is supposed to be something between vorbis & speex & when you read that too quickly you can understand that CELT will swallow both speex & vorbis,


The webpage states "The CELT codec is meant to bridge the gap between Vorbis and Speex for applications where both high quality audio and low delay are desired".  Speex can't produce music quality high enough to fool even the most tin-eared layman, but it offers fairly low latency. Vorbis offers decent quality but requires high latency. CELT is intended to bridge the gap or, in other words, to fill in the space *between* Vorbis and Speex.  This is almost the opposite of how you read the text on the webpage.  If you can suggest language which is less likely to confuse people, it could be included.

Cheers.



Title: Multi-Codec Listening Test: 96-128-192-256Kbps
Post by: ktf on 2009-04-27 12:38:35
Quote
PS:
(...)
If ever Gabriel or Roberto reads this plz fix http://lame.sourceforge.net/quality.php (http://lame.sourceforge.net/quality.php) (I get a 403 error when I try to download samples)


Looks like it's fixed now