Topic: Using insane settings with mp3 (Read 79428 times)

Using insane settings with mp3

Reply #200
halb27, what do you have to say about this encoder at 320 kbps?
MP3sxEnc
Lame 3.97: -V2 --vbr-new

Reply #201
Quote
halb27, what do you have to say about this encoder at 320 kbps?
MP3sxEnc

I tried Fraunhofer's surround encoder when it came out and was pretty happy with the results, which turned out to be a little better than those of Fraunhofer's WMP10 codec.
I don't remember the exact circumstances, but I'm pretty sure I abxed the trumpet sample, which at that time was the most problematic sample for me with Lame encodings.
Now that we have Gabriel's problem sample thread and his Lame_attack thread, there are even harder samples out there. To me the most outstanding samples now are herding-calls and Atem-lied.

Atem-lied is rather easily abxable with FhG surround encoder at -br 270.
Edited:
With -br 320 it's still abxable, but not easily.

I personally will not go the Fraunhofer way but stick with Lame because of Lame's quality as well as because of the enthusiasm of the people who do it the non-commercial way.

In my experience, abr270 with a 224 kbps minimum bitrate can yield very high quality even with problematic samples.
For my current production encodings I use Lame 3.91 --abr 270 -b224 -h --lowpass 18600.
But current Lame development is about to improve a lot (see lame_attack). Hopefully soon we will have a completely satisfying new Lame version.
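
For anyone who wants to script the production setting quoted above, here is a minimal sketch that only builds the argument list and does not run anything. The flags are copied as written in the post; the executable name `lame` and the file names are assumptions for illustration.

```python
def lame_abr_command(infile, outfile, abr=270, min_bitrate=224, lowpass_hz=18600):
    """Build the argument list for the quoted settings (nothing is executed here)."""
    return [
        "lame",                      # assumed executable name; adjust to your install
        "--abr", str(abr),           # average bitrate target
        f"-b{min_bitrate}",          # minimum bitrate, written as in the post
        "-h",
        "--lowpass", str(lowpass_hz),
        infile,                      # hypothetical input file
        outfile,                     # hypothetical output file
    ]

cmd = lame_abr_command("track.wav", "track.mp3")
print(" ".join(cmd))
```

The list form can be handed to `subprocess.run(cmd, check=True)` once the path to the encoder is known.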
lame3995o -Q1.7 --lowpass 17

Reply #202
I just finished a small multiformat listening test on very problematic samples which brought some interesting aspects apart from well known things.

A preliminary remark:
I intended to use 5 problem samples and wanted to learn about the quality achieved with various encoders and settings. I started with harp40_1. After some days of intensive listening tests I came to testing Lame -b320 -h, and the result was that it was not transparent. This was not what I expected, especially as I had tested Helix the day before at 320 kbps and rated it transparent. I retested Helix, and now I found it to be no more transparent than Lame. Both results were very good, but not transparent.
This showed me two things. First, demanding transparency to such a high degree as I have done is not very healthy when using mp3 or a similar format. (Yes, I've been told before, especially with respect to pre-echo samples.) Second, my listening tests are not very reproducible. I was aware before that abxing is not a totally objective thing (though it is the best thing to do), but I hadn't imagined it could be that serious. On the positive side, it showed that learning to distinguish an encoding from the original is possible. Serious progress can be achieved within a day.

As a result I changed my goal. For practical purposes transparency, especially with very problematic samples, is not so important. Getting to the 'perceptible but not annoying' quality level should be sufficient. In practical listening situations, even when listening carefully, concentration is a lot lower than when abxing, and we don't compare to the original. So this goal is reasonable, at least to me.

As a direct consequence I concentrated on the samples

- harp40_1
- herding_calls
- trumpet

and skipped castanets and Atem-lied because it's not very difficult to bring them to my non-annoyance level.

Suddenly I realized these samples match pretty well what headphone amp maker Dr. Xin once wrote: the most difficult musical material for electronic equipment is piano, trumpet, and female voice.
Well, harpsichord isn't piano, but it's very similar. herding_calls of course isn't a typical female voice. That made me search for a typical female voice sample which is hard to encode. I tried hard, looking through the sample pages and my own collection (which includes a lot of female singers), only to end up with eight samples I classify as 'slightly annoying' with Lame 3.90.3 -b128. Going to -b160 -h brought them to the non-annoying class. So these are not really very problematic samples. But I can confirm Dr. Xin: I checked several hundred spots in the encodings which were suspicious to me because they sounded distorted, only to find out that usually the distortions are in the original as well. And it was only with the vocals; instruments usually had very good quality at the same time.

Another thing that stands out with these samples is that they are all tonal, and the artifacts can be classified as 'rough distortions'. Maybe this is why harpsichord seems to be more problematic than piano. harpsichord has an attack characteristic similar to piano (a hammer hitting a string), but at the same time it has a more tonal character than piano. Maybe it's this combination that can fool encoders.

As for the test:

I started with Fraunhofer from MusicMatch Jukebox 6.1.
With this encoder herding_calls is the quality limiting problem.
The non-annoying level for all the three samples was reached at cbr224.
vbr was fooled, and even vbr100% didn't provide the demanded quality (though it's not really bad).

I went over to Fraunhofer Pro 3.3.2 (build 44) from WMP10.
As with the MMJB encoder, the non-annoying quality is achieved at cbr224.

With Helix -X2 -SBT450 -TX0 -HF2 the desired quality with cbr was achieved also with cbr224. It was Helix' VBR mode which really did shine: -V120 (which takes an average bitrate of about 200 kbps with 'normal samples') provided non-annoying results. Even more astonishing: it's only the restricted temporal resolution of harp40_1 which kept me from judging the samples as 'not annoying' at somewhat lower vbr quality levels - herding_calls and trumpet were encoded well.

With Lame 3.90.3 GPSYCHO (I used -bxxx -h) I arrived at the non-annoyance level at cbr 224 as well. --abr 192 -h wasn't sufficient, so it took --abr 224 too (I didn't try a smaller step size). The -Vx levels were not successful, as was known before.

With 3.90.3 NSPSYTUNE it's exactly the same thing: --alt-preset cbr 224 as well as --alt-preset 224 provided the non-annoyance quality, whereas --alt-preset extreme failed.

As for current Lame, the new 3.98a3 provided better quality with these samples than 3.97b2 did. It's not only the vastly improved VBR mode; abr results were better too. With current Lame trumpet is the limiting problem, and with cbr even cbr256 doesn't bring trumpet to the non-annoyance level. abr is better, and abr256 can be considered non-annoying. As for -Vx, 3.98a3 -V0 provides better quality than 3.97b2 does.
So with Lame 3.98a3 the common paradigm 'VBR>ABR>CBR' is correct in the high bitrate area as well.
Ignoring trumpet, --abr 224 was sufficient to get to the non-annoyance level.
cbr 224, however, was sufficient only with 3.98a3; 3.97b2 had slight problems with harp40_1. With 3.98a3, -V1 had slight problems with herding_calls, so it took -V0 even when ignoring trumpet.

Out of curiosity I tried 3.98a3 GPSYCHO -Vx usage and got excited. Even -V3 yielded excellent quality with these problem samples. But that lasted only until I did real-life encodings: -V3 led to bitrates in the 250-300 kbps range! So it's really no good wildly mixing these things.

So much for mp3.
As the main result it's interesting to see that most encoders arrived at the non-annoying quality level at 224 kbps with these hard problem samples.
And I should add even cbr does the job. vbr mode can be counter-productive as can be seen by Lame before 3.98a3 and MMJB 6.1. Sure a vbr mode considered safe is the best choice.
Looks a bit like when going for robustness mp3 has its steepest slope concerning the quality/bitrate ratio in the say 190-250 kbps range (whereas it's say 90-150 when going for the 'quality most of the time'-target).
Another interesting thing is that the latest encoder development doesn't necessarily improve quality. I was told not to care about a 4 year old Lame version, and especially not about old GPSYCHO. As I have said so many times before, I don't have any doubts about the remarkable improvements in the low to moderate bitrate range that current Lame development brought us. But I can't see why people carry this over to the high bitrate range so easily. When I look at Serge Smirnoff's SoundExpert System test results, 3.97b2's cbr 320 result is tied with 3.90.3 GPSYCHO's cbr 320 result. So: where is the progress? Sure, this test (as does any other) can only give an approximation to the truth, because nobody can test the universe of music. And when it comes to robustness, old 3.90.3 (no matter whether GPSYCHO or NSPSYTUNE usage) looks a bit more robust than current Lame. (I'm sure I'll get banned right now: too much said against holy current Lame.)
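
The mp3 CBR findings above can be collected in a small table. This is only a sketch that transcribes the bitrates stated in this post; since the tests were done in coarse steps, each number is the lowest tested CBR setting rated not annoying, not an exact threshold. The dictionary and variable names are made up for illustration.

```python
# Lowest tested CBR bitrate (kbps) rated "not annoying" on harp40_1,
# herding_calls and trumpet, as reported in the post.
# None means the level was not reached at the highest CBR setting reported.
lowest_cbr_ok = {
    "Fraunhofer (MusicMatch Jukebox 6.1)": 224,
    "Fraunhofer Pro 3.3.2 (WMP10)": 224,
    "Helix": 224,
    "Lame 3.90.3 GPSYCHO": 224,
    "Lame 3.90.3 NSPSYTUNE": 224,
    "Lame 3.98a3": None,  # trumpet still annoying at cbr256
}

# Encoders that reached the level exactly at the 224 kbps step.
at_224 = [name for name, kbps in lowest_cbr_ok.items() if kbps == 224]
print(f"{len(at_224)} of {len(lowest_cbr_ok)} encoders reached the level at cbr224")
```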

As I had arrived in the bitrate range beyond 200 kbps, it's a natural idea to consider mp2, whose temporal resolution is considered superior to that of mp3.

I tried QDesign mp2 which came out very well in Serge Smirnoff's SoundExpert System test.
For QDesign trumpet is the limiting problem too, and it takes cbr 320 to achieve non-annoyance.

I also tried mp2eenc (the link I took it from said it came from the CDex page).
With this herding_calls is the worst sample, and it was still slightly annoying at cbr320 as well as with the best quality variable bitrate setting.

So for all those who intend to use mp3 at cbr320 QDesign's mp2 may be the better alternative, especially for people sensitive to pre-echo.
mp2 encoded tracks have a good chance of being playable on mp3 DAPs. I tried with my Rockbox-enhanced iRiver H140, and everything was fine except for the bitrate being displayed wrongly.
Moreover with mp2 384 kbps can be used by those who are out for extremely good quality.

Once in this bitrate range it was interesting to test WavPack lossy, which is also usable on Rockbox based DAPs.
herding_calls was the most problematic sample and it took -b352x to get at the non annoying level. Trading the -h option for lower bitrate didn't help: -b320hx did not provide the required quality.

Leaving the very high bitrate range I checked formats which hopefully performed well at bitrates lower than those necessary with mp3.

With Vorbis aoTuVb4.51 -q5 was enough to make the tracks non-annoying.
The limiting problem here is harp40_1. It's only the restricted temporal resolution of this sample which keeps me from giving the 'non-annoying' judgment even to lower q-levels.

As an encoder for AAC I decided to use Nero. I had Nero 6 reloaded on my system but wanted to use the new codec which came out quite well in the latest 128 kbps listening test. So I upgraded to Nero 7, but only to find out that it doesn't contain the new codec. Luckily I found a link to this codec. So I was able to test two different codecs.

As for 4.2.4.8 which came with Nero7:
This version produces a special artifact with trumpet at 3.2 sec which takes cbr 256 or vbr extreme to make the samples not-annoying. Guess it's a bug.

As for 3.1.0.2 (encoding quality: high):
The most problematic sample here is herding_calls.
cbr160 and vbr streaming are not annoying (with vbr streaming being perhaps a bit on the edge).

So Vorbis and AAC really shine in the moderate bitrate range even with these problematic samples.
lame3995o -Q1.7 --lowpass 17

Reply #203
...

never mind, i misunderstood what i read. i will ask about this in another thread. sorry.

Reply #204
I was asked to include MPC 1.14 and 1.15v into my little multiformat not-at-all-annoying-level listening test.

Some 3 weeks after that test, it was essential to get close again to what precisely I meant by not-at-all-annoying and what it meant to be at a quality level a little bit worse. So as a warmup and for comparison purposes I redid the test for lame 3.90.3 --abr 224 and --abr 192.

As for MPC I started with --radio quality. Quality is already very high, but on trumpet and harp40_1 it falls a little short of what I call not-at-all-annoying. To me 1.15v sounded a tiny bit better than 1.14 on harp40_1 and trumpet.

With MPC --standard, no matter whether 1.14 or 1.15v, the not-at-all-annoying level was reached for me.

So as for these samples and my test conditions, MPC is great already to me in the moderate bitrate range, just as Vorbis aoTuv and current pre-release Nero AAC.
lame3995o -Q1.7 --lowpass 17

Reply #205
Serge Smirnoff asked me to apply my little not-at-all-annoying test to Winamp 5.2's aacPlus (HE-AAC) High Bitrate Encoder v1.2 cause it came out extremely well in his 320 kbps test.

Again for a warmup and to get close to my test conditions a month ago I first redid the test with lame 3.90.3 abr at 224 kbps (not annoying) and 192 kbps (slightly annoying).

With Winamp's high bitrate HE-AAC encoder I used the setting from Serge's test (independent stereo, mpeg-4). Because of the good results in that test I started with 128 kbps. harp40_1 and trumpet were a little bit annoying. For trumpet the usual problem areas were ok to me, but there is a small problem in the 3-3.5 sec-area. The most problematic sample among those tested however was harp40_1.
160 kbps is considerably better, but it took 192 kbps to get to the level I call not-at-all-annoying.

Out of curiosity I tried 'stereo' mode instead of 'independent stereo' on harp40_1 with 160 kbps, but I wouldn't call the result 'not-at-all-annoying' either.

So as for this test Winamp's High Bitrate aacPlus encoder is not very interesting compared to vorbis, lc-aac or mpc as far as the bitrate is concerned, and it is also not very attractive against mp3, which arrives at the same quality level with only a slightly increased bitrate.
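
To put "only a slightly increased bitrate" into numbers, here is a small worked calculation comparing 192 kbps (where the HE-AAC encoder reached the level) with 224 kbps (lame 3.90.3 abr). It is plain arithmetic; the function names are made up, and container and tag overhead are ignored.

```python
def megabytes_per_minute(kbps):
    """Approximate audio stream size per minute in MB (10**6 bytes)."""
    bytes_per_second = kbps * 1000 / 8
    return bytes_per_second * 60 / 1_000_000

def relative_increase(low_kbps, high_kbps):
    """Fractional bitrate increase when going from low_kbps to high_kbps."""
    return (high_kbps - low_kbps) / low_kbps

for kbps in (192, 224):
    print(kbps, "kbps ->", round(megabytes_per_minute(kbps), 2), "MB per minute")
print("224 vs 192 kbps:", round(relative_increase(192, 224) * 100, 1), "% more")
```

So the step from 192 to 224 kbps costs roughly a sixth more space.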

My test is on lowest bitrate to achieve the not-at-all-annoying level, and Serge's test is on perception of encoding errors when these are seriously amplified. As these things are very different I think there is no contradiction in the test results.
lame3995o -Q1.7 --lowpass 17

Reply #206
Quote
I was asked to include MPC 1.14 and 1.15v into my little multiformat not-at-all-annoying-level listening test.

With MPC --standard, no matter whether 1.14 or 1.15v, the not-at-all-annoying level was reached for me.

Try the FSOL, Amnesia and hex samples. They are annoying even outside of abx - amnesia rings hard (less on 1.15), hex is stuffed up M/S stereo and FSOL is hideous!

Reply #207
Quote
... I was asked to include MPC 1.14 and 1.15v into my little multiformat not-at-all-annoying-level listening test. ....

Try the FSOL, Amnesia and hex samples. They are annoying even outside of abx - amnesia rings hard (less on 1.15), hex is stuffed up M/S stereo and FSOL is hideous!

Thanks for the important remark.

I appreciate the samples I used for my small test because they seem to be problematic for many different encoders, but of course my test results can make only a very small contribution to judging the quality of a specific encoder.
lame3995o -Q1.7 --lowpass 17

Reply #208
Quote
160 kbps is considerably better, but it took 192 kbps to get to the level I call not-at-all-annoying.

Thank you, halb27, for the test. I think if the Hi-HE-AAC encoder is really optimized for high bitrates and gets your “not-at-all-annoying” mark @192 kbit/s, then its very good performance @320 kbit/s becomes less surprising.
keeping audio clear together - soundexpert.org

Reply #209
I do not completely understand what the overall value is of focusing so hard on 3 particular samples and then trying to draw general conclusions from them. I am sure there are a few other samples that would shift the conclusion very much (and shadowking already gave examples of them).

The term "blind listening test" probably wasn't meant in that way.
"We cannot win against obsession. They care, we don't. They win."

Reply #210
Quote
(...)Dr. Xin once wrote: most difficult musical samples to electronic equipment is: piano, trumpet, and female voice.
Well, harpsichord isn't piano, but it's very similar. (...)
harpsichord has an attack characteristic similar to piano (a hammer hitting a string), (...)

This is far from true. Piano and harpsichord (also add organ and synthesiser) are all keyboard instruments, but the mechanism (and therefore the sound generation) is completely different:
- piano is a percussive instrument (cousin of xylophone!)
- organ is a wind instrument (a monstrous cousin of flute, oboe, trumpet...)
- harpsichord is a plucked-string instrument (close to guitar, lute,...)
- synthesiser is an electronic instrument.

Saying that piano and harpsichord are similar is like saying that a tom-tom is close to the cithara.

There's no hammer on a harpsichord. Otherwise it would be possible to create dynamic effects with a harpsichord, but it's impossible (or very hard). In that respect the clavichord is closer to piano & pianoforte.

If piano is considered hard to reproduce on electronic devices, it's I suppose a consequence of the wide dynamic range of this instrument: great performers master it with a high degree of subtlety. With harpsichord, the dynamic range is very poor.


Quote
herding_calls of course isn't typical female voice. Which made me search for a typical female voice sample which is hard to encode. I tried hard, looking up the sample pages and my own collection (wich includes a lot of female singers), but it was only to end up with eight samples I classify as 'slightly annoying' with Lame 3.90.3 -b128. Going -b160 -h brought them to the non-annoying class. Not really very problematic samples. But I can confirm Dr. Xin: I checked several hundred spots in the encodings which were suspicious to me because they sounded distorted, but only to find out usually that the distortions are in the original as well. And it was only with the vocals, Instruments usually had a very good quality at the same time.


I'm not sure I understand what you're saying.
1/ There's someone called Dr. Xin who said that female voice and piano are hard to reproduce on electronic equipment.
2/ You drew a connection between this difficulty and lossy encoder issues.
3/ You looked long and hard among hundreds of female voice samples and found nothing hard to encode; not one single piano sample is used in your test, only a harpsichord one that you consider close enough to piano...
4/ But basing your opinion on this failure, you agree with Dr. Xin's first theory?

Did I understand correctly?

Reply #211
@ stephanV:
As you say, this test is of very limited value, of course.

As for how I did the test: I used the foobar abxing tool, listened once or twice to the a and b results, and then decided upon the x and y results. I started at the low bitrate end, where it was obvious whether x or y was the original. No need for a real abx procedure. At the not-at-all-annoying level, finding the original was sometimes hard, and I remember there were cases where I went through to 8/8, that is, I behaved as in real abxing. But usually it wasn't necessary to go all the way to 8/8 because it was clear which was the encoding, and I could decide whether I called the result 'slightly annoying' or not.
The more problematic thing to me is that I can't really be sure that I used the same meaning of the 'not-at-all-annoying' level for all the tests. I really tried to be fair, but the problem remains. But this is a subjective thing anyway, and other people certainly have a different understanding of it.
So it's best to look at the results in a rather rough way, and seen that way there is some value in the results to me. For instance, for these samples mp3 has an interesting quality slope in the 192...256 kbps range (as opposed to the say 100...192 kbps range for more 'normal' samples).
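
As a side note on what an 8/8 abx run means statistically, here is a minimal sketch of the usual one-sided binomial calculation: the chance of getting at least that many trials right by pure guessing. The function name is made up for illustration; it is not part of the foobar tool.

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of at least `correct` right answers in `trials` when guessing (p = 0.5)."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

print(abx_p_value(8, 8))  # 8 of 8: one chance in 256 of guessing this
print(abx_p_value(7, 8))  # 7 of 8 or better: 9 chances in 256
```

Stopping a run early because the difference seems obvious, as described above, gives no such number, which is one more reason to read the results only roughly.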
lame3995o -Q1.7 --lowpass 17

Reply #212
Quote
... But basing your opinion on this failure, you agree with Dr. Xin first theory? ....

Well, you put more value into this than I do. When I wrote about it I only remembered that, when I was once looking for a headphone amp, headphone amp manufacturer Dr. Xin wrote this. I found it quite interesting that part of the samples I had in mind correspond with his statement. Take it as a side comment. Sorry for being so wrong about harpsichord. As for female voice and the many samples I checked, I really found that voice is often distorted in recordings whereas instruments aren't.
lame3995o -Q1.7 --lowpass 17

Reply #213
Quote
As for female voice and the many samples I checked I really found: voice is often distorted in recordings whereas instruments aren't.

It's interesting... too bad that you don't give more details on that point.

As far as I know, critical samples (for transform coders) are not specifically female voice ones. I can't even name one killer sample corresponding to that kind of signal (Waiting is male voice): castanets, fsol, awe_32, fatboy, Kraftwerk... are for the most part (if not all) instrumental or artificial (=synthesized) samples. Your experience is apparently opening new territory for tuning encoders.

My own experience also differs from yours. I split my 150 classical samples into 4 categories:
- 2 for instrumental music
- 1 for voice

And in all the tests I performed with these 150 samples during the last year, the VOICE group (30 samples) was always "easier" to encode (i.e. less distorted) than instruments.
My test results can be found at:
http://forum.hardware.fr/hardwarefr/VideoS...0-1.htm#t922005
http://www.hydrogenaudio.org/forums/index....showtopic=37973
http://www.hydrogenaudio.org/forums/index....showtopic=35438
http://www.hydrogenaudio.org/forums/index....showtopic=38792

 

Reply #214
Quote
... My own experience also differ from yours ...

I do not think so. As I wrote, female voice wasn't a serious problem for encoding. With the 8 samples I found that were slightly annoying to me with mp3 cbr128, the annoyance was gone with cbr160.
If you're interested I've uploaded one in the upload section.
And I can't say female voice is worse to encode than male voice. I was just looking for female voice samples (yes, driven by Dr. Xin's statement), and as my favorite singers are women most of the time, I got more female voice than male voice samples anyway.

What I found was that voices are rather often distorted in the original, in contrast to the instruments. I guess we are more sensitive to errors in human voice than to errors in instrumental reproduction, and this seems to be especially true for errors introduced when recording.
I didn't record these samples as they were irrelevant for encoding problems. But I recorded the 8 samples with voices a little bit difficult for encoders (that's why I could upload one of these samples). Within Carla Bruni's "Quelqu'un m'a dit" I found 3 tracks containing slight problem spots for Lame 3.90.3 (Tout le monde, L'amour, Chanson triste), and I do remember well that with this cd I found a lot of vocal spots which were distorted to me in the original. As was the case with many cds I don't remember right now.
I can't imagine that you don't know about the issue. I guess you misunderstood what I wrote; maybe I didn't write it clearly enough.
lame3995o -Q1.7 --lowpass 17