Topic: 80 kbps personal listening test (summer 2005)

Reply #50
Quote
I do understand you are mentioning Vorbis here, but is there really any large-scale testing to support the claim that any implementation of AAC performs better than LAME at 128 kbps? Roberto's multiformat test showed iTunes and LAME to be practically tied at this bitrate (both beaten by Vorbis and MPC).

The problem is, everybody tells me that AAC is theoretically much better than MP3, but I haven't seen much reliable testing of AAC implementations to substantiate this...


The next 128 kbps listening test will be interesting indeed. iTunes 4.2/QT6.4 was used in the last test, but one of guru's [a href="http://www.foobar2000.net/divers/tests/2004.12/aac_global_results.png"]classical music tests[/a] showed that the encoder improved nicely with iTunes 4.7/QT6.5.2. How QT7.0.1 with VBR would perform is anyone's guess, but it should be closer to MPC/Ogg Vorbis than last time.

Reply #51
Quote
Quote
mp3@128 is the best choice for mobile devices


Dude, that's the high anchor.

I know.

I didn't say that because it had the best results, but rather because other codecs bring more 'problems' (compatibility, possibly higher battery use, ...) than quality or space gain.

Reply #52
Quote
Quote
Simply a m a z i n g  Guru, as usual 

My understanding of the results: mp3@128 is the best choice for mobile devices, no need to bother with other codecs

If you want 128 kbps encodings for your player, Vorbis and AAC are probably better than MP3.
And if you want LAME 128 kbps quality, you can probably reach it at a lower bitrate (90...120 kbps) with other formats, and therefore fit more music on your player.

MP3 at 128 kbps took part in the test as an anchor, not as a competitor. It's there as a reference.

See my previous post.

Reply #53
About 96 kbps and MP3

Using LAME and WMP10-Fhg isn't a problem: ABR is the best-suited encoding mode for LAME, and Fhg only offers CBR encoding. The situation is very different for iTunes and Audition. I don't know if VBR mode can be safely used at this bitrate with these encoders.

iTunes VBR is apparently protected against possible VBR flaws by a fixed bitrate floor: 96 kbps. It's not a joke. I've set VBR 96 at highest quality in iTunes, and the encoder is supposed to stay close to the 96 kbps target without allowing any frame below the target! Consequently, the average bitrate will necessarily be higher than the target. In fact, the average bitrate I obtained for the 150 short classical samples is 100 kbps, which is really close to the target. It implies that the VBR mode rarely goes beyond 96 kbps. LAME ABR encodings fluctuate more than iTunes VBR (at least at 96 kbps), which can be checked with EncSpot.
I've tried to evaluate, with quick and unreliable comparisons, which encoding mode (CBR or VBR) could be considered the best. First point: both are very close, and neither revealed problems that the other was spared. I've just noted that VBR offers some improvement over CBR with some samples. Second point: they're both crap. It's much worse than what I heard during the 80 kbps test. Obviously, neither of these encodings would pass the pool.
I think I'll go for VBR.
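The inference above (a 96 kbps floor plus a 100 kbps average means few frames above the floor) can be put into numbers. This is only a rough sketch: it assumes the standard MPEG-1 Layer III frame bitrate table, assumes no frame is ever encoded below the floor, and ignores the bit reservoir. The function name is hypothetical and nothing here is taken from iTunes itself.

```python
# Standard MPEG-1 Layer III frame bitrates, in kbps.
MP3_FRAME_BITRATES = [32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]


def max_fraction_above_floor(floor_kbps, average_kbps):
    """Upper bound on the share of frames encoded above the floor.

    Assumes every frame is at or above floor_kbps. The bound is reached
    when every frame above the floor uses the next available step only.
    """
    higher_steps = [b for b in MP3_FRAME_BITRATES if b > floor_kbps]
    if not higher_steps:
        return 0.0
    next_step = min(higher_steps)
    return (average_kbps - floor_kbps) / (next_step - floor_kbps)


if __name__ == "__main__":
    share = max_fraction_above_floor(96, 100)
    print(f"At most {share:.0%} of frames can exceed the 96 kbps floor")
```

Under these assumptions, a 100 kbps average over a 96 kbps floor leaves room for at most a quarter of the frames at 112 kbps or more, which fits the observation that the encoder rarely leaves the floor.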

Adobe Audition: I have many options: CBR and VBR, different encoding modes, and also different sampling rates (for CBR). I would prefer the new encoder and avoid the legacy one (do people agree?). VBR is probably the wisest choice. The strongest argument in favor of VBR is the lowpass (> 14000 Hz), clearly less irritating than the 11000 Hz one used by default with CBR. I've made a quick comparison, and indeed, VBR encodings are immediately much more enjoyable: the sound is less poor, and the VBR encodings don't reveal terrible distortions as a consequence of the higher lowpass (the sound is far better than with the iTunes encoder!). VBR 30 gives me a nice 96 kbps average bitrate (luck! but I have to check the bitrate with the second group). Another good point for VBR: it doesn't allow internal 32000 Hz encoding, which might otherwise help the encoder to reduce some artefacts. That's one thing less I have to test.
My choice for Audition would be VBR, Current encoder, VBR q30 (probably; it could be adjusted precisely). It implies:
- 14780 Hz lowpass
- 44100 Hz sampling rate
- joint stereo
- intensity stereo
- CRC writing
All other options are unchecked.


The pool would look like:

- LAME 3.97a11 ABR
- Windows Media Player - Fraunhofer ACM CBR 96
- iTunes VBR 96 kbps BEST
- Audition - Fraunhofer VBR Q30

The average bitrate for the classical group will apparently fall between 96 kbps (WMP CBR) and 100 kbps (iTunes). Does it sound OK? Comments? Criticism?

Reply #54
Quote
I didn't say that because it had the best results, but rather because other codecs bring more 'problems' (compatibility, possibly higher battery use, ...) than quality or space gain.


Mobile phones are all AAC compatible these days so compatibility isn't an issue. For HD players there's no need to use such a low bitrate.

Quote
iTunes VBR is apparently protected against possible VBR flaws by a fixed bitrate floor


For some reason, that's how iTunes VBR works. The bitrate you select is a guaranteed minimum according to the documentation.

Reply #55
Quote
but is there really any large-scale testing to support the claim that any implementation of AAC performs better than LAME at 128 kbps? Roberto's multiformat test showed iTunes and LAME to be practically tied at this bitrate (both beaten by Vorbis and MPC).


LAME was tested in VBR mode. It's a nice improvement over CBR/ABR, and indeed LAME was very close to iTunes AAC. But the performance of VBR is dramatically reversed with classical music. See, for example, the Debussy sample in this test: the rating was very low. The current low VBR settings can't handle quiet passages very well. On the other hand, iTunes AAC has no problems here. I've already tested -V5 and compared it to ABR on a small classical suite (15 samples, somewhere in the alpha testing thread), and indeed VBR was inferior to ABR.
LAME -V5 can therefore reach very good quality, but the results are unstable, and encodings can in some situations reveal severe artefacts you won't get with AAC. iTunes AAC is much more reliable, and can apparently handle more situations than LAME VBR.

Reply #56
Quote
Quote
I didn't say that because it had the best results, but rather because other codecs bring more 'problems' (compatibility, possibly higher battery use, ...) than quality or space gain.


Mobile phones are all AAC compatible these days so compatibility isn't an issue. For HD players there's no need to use such a low bitrate.

Well, if 128 is too low, what should I say about 80 kbps?

Reply #57
Cheers, guruboolez.

I tried to do this test myself at 80 kbps, comparing aoTuV b4 with the latest official one, and well... I couldn't really hear the difference between the two encoders, and the one time I really did hear a clear difference, I couldn't decide which one actually sounded better... figures...

Oh well.

Reply #58
Did you try with all samples? Hearing a difference isn't necessarily that easy. Sometimes good training is needed to evaluate the progress (or regressions). With aoTuV (and Vorbis in general), I'd say that it is really important to focus on the usual problems: fatness, noise, HF boost... The listener should also pay attention to pre-echo and other forms of artefacts.

Reply #59
Well, it seems that iTunes MP3 encoder would make a perfect choice... as low anchor:

- Reference
- LAME ABR
- iTunes VBR


Reply #60
Quote
- iTunes VBR


Whoa! Sounds like a badly tuned radio 

Reply #61
Quote
AAC-HE screeching artefacts are probably more acceptable when compared to other forms of distortion audible with non-SBR products

I guess it's just a matter of taste… Personally, I can't stand that awful SBR sound. It just beats the high frequencies to a violent death, as I can hear it.

Edit: BTW, what else do we need to make the merged 1.1.1+b4 version the recommended one? It seems that it is at least not inferior to 1.1…
Infrasonic Quartet + Sennheiser HD650 + Microlab Solo 2 mk3. 

Reply #62
Quote
Edit: BTW, what else do we need to make the merged 1.1.1+b4 version the recommended one?


Probably additional listening tests, or maybe calling into question the way recommendations are done.

Reply #63
Back to the upcoming MP3 96 kbps Pool.

I have encoded the second group of 35 various samples, and bitrate was significantly higher than the average one obtained with the classical group.

iTunes classical = 100 kbps
iTunes various = 104 kbps
=> with iTunes, the bitrate for the second group is within the +/- 10% tolerance I've set. I'd say VBR is usable.

Audition q30 classical = 96 kbps
Audition q30 various = 112 kbps
=> with Audition, the bitrate for the second group is about 17% higher than the target bitrate (112 vs. 96 kbps). Even if I admit that the average bitrate for these 35 short samples doesn't entirely correspond to the average bitrate of full albums, it's clearly too high.


As a consequence, I've lowered the setting to VBR q20. Lowpass is automatically adjusted, but doesn't significantly drop (from 14780 to 14440 Hz). Bitrate:
Audition q20 classical = 89 kbps
Audition q20 various = 102 kbps
=> bitrate is now within the acceptable range for both groups. However, the bitrate for classical now approaches the critical limit of 87 kbps (96 kbps -10%) and is not fully comparable with the bitrate obtained with iTunes (100 kbps).

Then I've tried VBR q25. It's a manual preset, and there's no default lowpass value for manual VBR settings. I've therefore chosen 14600 Hz. Bitrate:
Audition q25 classical = 92 kbps
Audition q25 various = 107 kbps
=> excessive deviation with the second group (+11...12%)


At this stage, I have four possibilities:

1/ Using Q20: bitrate is OK for group 2, and within tolerance for group 1, but too low when compared to iTunes at 100 kbps.
2/ Using Q25: bitrate is OK for classical, but too high for group 2
3/ Trying Q22..23 in order to obtain an unlikely better compromise
=> in all three cases, the selected setting can't be the optimal one for either musical category.

4/ Using two different settings for the two different groups.
To me, this possibility makes sense. Someone planning to encode classical only wouldn't choose anything other than VBR Q30, which matches the desired bitrate. Someone planning to encode something different probably won't be happy with Q30 (~110 kbps) and will certainly go for Q20, or maybe even a slightly lower setting.
The dual bitrate problem will also occur with other listening tests. No VBR encoder outputs the same bitrate with different kinds of samples. It can be observed with faac, Nero AAC, LAME MP3, Fhg MP3, MPC, WMA9 and WMA9Pro. In every case, I would have to make compromises which probably don't correspond to the users' choices. Using two different settings, each one corresponding to the rational choice of someone listening to either "various music" (yes, the concept sucks) or "classical music", avoids such compromises.
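The bitrates reported in this post can be checked against the +/- 10% tolerance mechanically. The sketch below uses only the figures quoted above; the dictionary layout and the function names are illustrative assumptions, not part of any encoder or test tool.

```python
TARGET_KBPS = 96
TOLERANCE = 0.10  # +/- 10% around the target, as set for this test

# Average bitrates (kbps) per setting and sample group, as reported above.
measured = {
    "iTunes VBR 96": {"classical": 100, "various": 104},
    "Audition q30": {"classical": 96, "various": 112},
    "Audition q25": {"classical": 92, "various": 107},
    "Audition q20": {"classical": 89, "various": 102},
}


def deviation(bitrate, target=TARGET_KBPS):
    """Relative deviation from the target, e.g. 0.083 for +8.3%."""
    return (bitrate - target) / target


def within_tolerance(bitrate, target=TARGET_KBPS, tol=TOLERANCE):
    """True if the bitrate lies inside target +/- tol."""
    return abs(deviation(bitrate, target)) <= tol


if __name__ == "__main__":
    for setting, groups in measured.items():
        for group, kbps in groups.items():
            flag = "ok" if within_tolerance(kbps) else "out of range"
            print(f"{setting:14s} {group:9s} {kbps:3d} kbps {deviation(kbps):+.1%} {flag}")
```

With these figures, only Audition q30 and q25 on the "various" group fall outside the window, which matches the reasoning behind possibilities 1/ to 4/.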


I could also play a dangerous game: testing iTunes and Audition VBR encodings at an excessive bitrate, crossing my fingers and hoping to see LAME win. The scenario is possible, I'd say. iTunes clearly has no chance to pass the pool even with an advantageous bitrate; but I'm less confident with a contender such as Audition.



I feel that solution 2 (Q25 for both groups) and solution 4 (Q20 for various, Q30 for classical) are the two most pertinent. Which would be the best in your opinion?

Reply #64
Quote
I feel that solution 2 (Q25 for both groups) and solution 4 (Q20 for various, Q30 for classical) are the two most pertinent. Which would be the best in your opinion?

I think you should go for solution 4.

First and foremost because you are doing the listening tests to satisfy your own curiosity regarding the various codecs' performance with the music you love. Never choose a solution that does not make you happy, or motivation might slowly go away.

Secondly because your listening tests are helping put classical music on the agenda when comparing the performance of the various codecs. I believe you are doing classical listeners a big favour with your insight and very qualified comments.

Thirdly because the "various music" category is "just" an added bonus to the main objectives of your listening tests, although interesting to follow. 

Hopefully someone will be inspired by your work and create samples of different pop / rock / jazz instruments, like different guitars, drums, cymbals, electronic organs etc. from first class recordings.

Reply #65
Quote
I feel that solution 2 (Q25 for both groups) and solution 4 (Q20 for various, Q30 for classical) are the two most pertinent. Which would be the best in your opinion?


The question is: do you want to compare encoders at a given bitrate, or do you want to compare encoders using given settings? In the first case, solution 4 will come closest to the intention; in the second case, solution 2 would be better.

Personally I think that it makes the most sense to compare at a given bitrate. In a perfect world, this would actually mean that you would have to find the perfect VBR setting for each sample to get the perfect bitrate. However, this would generate a lot of work prior to the testing (unless it could be done via a program/script?), and would not be consistent with normal use or be very helpful for the normal user. I therefore find the idea of finding the perfect VBR setting for each group of samples to be a good compromise.
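On the "program/script" idea: a per-sample search for the quality setting closest to a target bitrate could look like the bisection below. This is a sketch under stated assumptions. `measure_kbps` is a hypothetical callback that would encode one sample at a given quality and return its average bitrate, and the search assumes that bitrate never decreases as quality increases. `toy_encoder_kbps` is a made-up linear model used only to make the example runnable; it does not describe any real encoder.

```python
from typing import Callable


def find_quality_for_bitrate(
    measure_kbps: Callable[[int], float],
    target_kbps: float,
    q_min: int = 0,
    q_max: int = 100,
) -> int:
    """Return the integer quality in [q_min, q_max] whose bitrate is
    closest to target_kbps, assuming bitrate is non-decreasing in quality."""
    lo, hi = q_min, q_max
    # Narrow the interval until lo and hi are neighbours.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if measure_kbps(mid) < target_kbps:
            lo = mid
        else:
            hi = mid
    # Pick whichever neighbour lands closer to the target.
    return min((lo, hi), key=lambda q: abs(measure_kbps(q) - target_kbps))


def toy_encoder_kbps(quality: int) -> float:
    """Hypothetical stand-in for 'encode the sample and measure its bitrate'."""
    return 70.0 + 1.2 * quality


if __name__ == "__main__":
    best = find_quality_for_bitrate(toy_encoder_kbps, 96)
    print(best, toy_encoder_kbps(best))
```

A real run would need one encode per probe and per sample (about seven probes for a 0..100 scale), which gives an idea of the extra work mentioned above.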

Conclusion: I'm a supporter of solution 4!

Edit: In brackets

Reply #66
Quote
Personally I think that it makes the most sense to compare at a given bitrate. In a perfect world, this would actually mean that you would have to find the perfect VBR setting for each sample to get the perfect bitrate. However, this would generate a lot of work prior to the testing (unless it could be done via a program/script?), and would not be consistent with normal use or be very helpful for the normal user. I therefore find the idea of finding the perfect VBR setting for each group of samples to be a good compromise.

Conclusion: I'm a supporter of solution 4!

I am not too happy with solution 4 - after all, VBR modes target a certain quality, and not a certain bitrate. If the VBR mode allocates too little bits for certain samples and maybe thereby creates artefacts, it is a fault of the encoder/psymodel and should be treated and evaluated as such.
I know that there is no perfect solution to this problem, but I think combining the two sample sets, calculating the average bitrate of all samples and using it to select the VBR mode might be the least bad solution. It would be even less bad if the numbers of "classical" and "various" samples were approximately equal.
Proverb for Paranoids: "If they can get you asking the wrong questions, they don't have to worry about answers."
-T. Pynchon (Gravity's Rainbow)
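The pooled average suggested in this reply is easy to work out from figures given earlier in the thread: 150 classical samples, 35 various samples, and the Audition bitrates per group. The sketch below weights each group by its sample count, which assumes all samples have roughly the same duration; the function name is illustrative.

```python
def pooled_average_kbps(groups):
    """Average bitrate over all samples.

    groups is a list of (sample_count, average_kbps) pairs; each group is
    weighted by its number of samples (equal sample durations assumed).
    """
    total_samples = sum(count for count, _ in groups)
    return sum(count * kbps for count, kbps in groups) / total_samples


# (sample_count, average_kbps) for the classical and various groups.
audition = {
    "q20": [(150, 89), (35, 102)],
    "q25": [(150, 92), (35, 107)],
    "q30": [(150, 96), (35, 112)],
}

if __name__ == "__main__":
    for setting, groups in audition.items():
        print(f"Audition {setting}: {pooled_average_kbps(groups):.1f} kbps pooled")
```

Because the classical group is about four times larger, the pooled figure stays close to the classical bitrate (roughly 94.8 kbps for q25), which illustrates why equal group sizes would matter for this approach.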

Reply #67
Quote
I am not too happy with solution 4 - after all, VBR modes target a certain quality, and not a certain bitrate.

That's true. But everyone is counting on an approximate bitrate with every VBR setting. MPC --standard is something like ~180 kbps, MP3 --standard close to ~200 kbps, etc... We are all using this kind of correspondence, for our own purposes or for recommendations (someone using CBR 192 is advised to use VBR --standard instead).
In other words, we're all thinking in terms of bitrate.


Quote
If the VBR mode allocates too little bits for certain samples and maybe thereby creates artefacts, it is a fault of the encoder/psymodel and should be treated and evaluated as such.


Why "too little"? The fact is that bitrate is very different from one group to another. Here, classical needs less bitrate than 'various'. Therefore, someone listening to classical and looking for a VBR setting which offers 96 kbps would be tempted to use Q30, not Q20 or Q25. He will use the VBR preset matching the target.
But this setting, which works well with classical, won't work with other kinds of music. People looking for 96 kbps with various won't use Q30 (-> 110 kbps), nor Q25 (105 kbps). There's no point for either kind of listener to make a compromise, as long as they don't mix both kinds of music.

Quote
I know that there is no perfect solution to this problem, but I think combining the two sample sets, calculating the average bitrate of all samples and using it to select the VBR mode might be the least bad solution.


I understand the logic, but does it really correspond to real usage?
I'm tempted to make an analogy with video encoding. Take XviD as an example. There's a dedicated mode to improve quality with cartoons. If you want to test the quality of XviD with both movies and cartoons, it would be senseless to use one and only one parameter set for both kinds of content. It would also be surprising if the tester tried to find an unlikely hybrid setting: it would certainly lower the quality for both genres, and handicap the contender with this compromise. If I remember correctly, Doom9 adapted the encoder's parameters according to the kind of movie. Futurama was encoded with cartoon mode, but not Matrix. Did people complain? I don't know.

Contrary to previous collective tests, I'm using a wide gallery of samples, wide enough to make a distinction between both kinds of music. The second group is maybe a "bonus" (I said it in my first post), but by "bonus" I don't mean something minor that should be neglected. I'd like to test the reaction of various encoders to different kinds of music, and see if some of these encoders are unbalanced in favor of either classical or something more popular. From my own experience, some renowned encoders have built their reputation on one specific kind of music or sample, and one only. The problem is that these reputed encoders are sometimes recommended to people listening to something totally different. LAME -V5, for example, works very well with many kinds of music, but with classical at least, it's not trustworthy.
That's why I'm very interested in testing two separate kinds of samples. And I'm more and more convinced that testing both categories with one setting 1/ is not optimal and 2/ won't correspond to the real usage of possible listeners.

For example, with AAC at 128 kbps, it would be impossible for me to test Nero's VBR internet profile (which appeared to be the best AAC solution on classical in a previous test I made last December). Why? The bitrate is ~140 kbps. With the second group, the average bitrate doesn't have this problem. By discarding VBR for bitrate issues with one group, I'd be forced to use CBR with both groups, and I let you imagine the reaction of many people, who would probably shout about the usage of suboptimal settings, etc...

Same goes for MPC. If --radio seems to be close to 128 kbps with the second group, it's not the case for classical. I could try to find an average setting, and in the end I'd be forced to use something within the --thumb profile. I let you also imagine the reaction of some people (a couple of names immediately come to mind...).

For all these reasons, it looks preferable to me to evaluate both groups independently. You probably noticed that I didn't propose any mixed results for the final test, and kept both categories totally independent.

Quote
It would be even less bad if the number of "classical" and "various" samples would be approximately equal

I don't have enough material to build a coherent gallery similar to the classical one. And I don't plan to restrict the number of tested situations with classical. I can't solve the imbalance between the two categories, unless someone plans to build something similar with 'various' music.

Reply #68
Quote
I am not too happy with solution 4 - after all, VBR modes target a certain quality, and not a certain bitrate. If the VBR mode allocates too little bits for certain samples and maybe thereby creates artefacts, it is a fault of the encoder/psymodel and should be treated and evaluated as such.
I know that there is no perfect solution to this problem, but I think combining the two sample sets, calculating the average bitrate of all samples and using it to select the VBR mode might be the least bad solution. It would be even less bad if the numbers of "classical" and "various" samples were approximately equal.

The objective of the testing is essential. I assume that you, in principle, want to see a comparison of encoders using given settings. (As guruboolez says, you probably have an idea about bitrates, anyway - you wouldn't throw in LAME --aps in this test, right...?)

In that case, the result of the testing will not be sufficient to answer the question: "If I'm willing to spend 96 kbps of my flash space to store this track, which encoder would probably give me the best sound in that particular space?" The "perfect-world solution" could have answered that... On the other hand, if the winner was a VBR encoder/setting, this solution would have left it up to each user to find the setting on the winning encoder that actually produced a 96 kbps file out of their particular track.

Have you seen those TV shows where people are invited to make the best dinner possible for a small amount of money, let's say $10? I really hate it when someone brings ingredients for $10.98...

Reply #69
@guruboolez & a_aa:
I appreciate your arguments. Maybe I'm too much of a VBR enthusiast concerning my encoding habits. If I want a certain quality level, I just use whatever preset or quality setting corresponds to this level and am not too concerned about the resulting bitrates, as long as they roughly are in line with what I have in mind. I find that in practice a few KB more or less are irrelevant, unless you are restricted by a really small flash player etc.
But as I said, there is no perfect solution with regard to a listening test that has a certain bitrate as its target. Whatever you choose, people will complain.

BTW, I find the analogy with XviD does not really hold. AFAIK, its cartoon mode does not use more bits per se, it's just more tuned for cartoons, which are very different from "normal" movies with regard to the demands they pose to an encoder. So you get better quality without increasing the bitrate. Using cartoon mode for normal movies would probably be disastrous. It's like tuning an encoder especially for classical music in a way that would probably decrease its performance if used with other music.

Reply #70
Quote
(...) am not too concerned about the resulting bitrates, as long as they roughly are in line with what I have in mind.


I've fixed a tolerance margin at the beginning of the test.
I agree with you: as a user, I won't really be annoyed if the final bitrate deviates a bit too much from what I expected. But as a tester, the situation is really problematic. Testing something at 142 kbps (IIRC the bitrate for Nero AAC VBR with classical samples) and comparing it with another contender at 124 kbps (IIRC the bitrate for WMA9pro VBR) is irrelevant. It's like comparing a movie encoded at 700 MB to another one encoded at 500 MB. The comparison is not necessarily uninteresting (at 500 MB, some encoders might perform as well as another one with more bitrate), but in most cases it won't answer the question: at a given bitrate, is encoder x better than encoder y or not?

Quote
Whatever you choose, people will complain

I agree, but I'm trying to find the most diplomatic solution. For the 80 kbps test I heard no complaints. But with further tests, I really fear that I can't find the ideal compromise.

Quote
BTW, I find the analogy with XviD does not really hold. AFAIK, its cartoon mode does not use more bits per se, it's just more tuned for cartoons

Right, but in both cases of my analogy, we have a tester who tried to adapt the setting to the content. It's not usual. For the past collective tests, one single setting was used for all samples. It's probably what I would do if my tests were limited to ~15 samples. But here, it is obvious that VBR encoders don't react the same way to all samples, and that the classical group and the various group could, or maybe should, be divided into two different categories, with adapted (and optimal) settings for each of them. It looks more pertinent, for the user as well as for the tester.

Reply #71
BTW Guruboolez, what is your listening set-up?

Sound card, speakers / headphones.

Reply #72
Quote
BTW Guruboolez, what is your listening set-up?

Sound card, speakers / headphones.

I hope that's not Lynx Two with Sennheiser Orpheus or something.
Just kidding though.

Reply #73
I used a Creative Audigy2, a Beyerdynamic DT-531 headphone and between them, a basic Onkyo amp. Nothing unusual, but the headphone (~120 euros) plays a very important role.

Reply #74
Guru, I'm assuming that since you encoded the QT AAC sample through iTunes, the resulting sample rate was 44.1 kHz? I ask because it might be interesting to see if there is any improvement when encoding directly in QT, which would let you resample to 32 kHz. There might be a little more pre-echo as a result, but at this bitrate, it's all a trade-off anyway...

Excellent investigation, as usual