Experimental CBR switch

This is from Takehiro.  Note that it is currently for CBR only.  The message is addressed to the developer list, but I believe that the informed listener community should be more tightly coupled with the development team.  So:

Quote
FROM: Takehiro Tominaga
DATE: 11/25/2001 07:15:46
SUBJECT:  [Lame-dev] new noise shaping

Hi all.

some of you already know that, but I am just implemented the new noise shaping algorithm into the current CVS LAME tree.

the algorithm is named "pseudo-half step quantization", and enables with -q0 option (currently only when CBR).

# The only difference between -q0 and -q1 is this new algorithm.

mp3x shows the new algorithm brings more "cutting-edge" noise shaping.  We could more tune the algorithm(quality and speed), but I want to everyone to review it.

tell me your opinion.

I will explain detail of this algorithm later.
-- 
Takehiro TOMINAGA // may the source be with you!


I haven't looked at the CVS source, so I can't say at the moment which version it's in or if it works with both gpsycho and nspsytune or with one or the other.  Perhaps others could follow up.

ff123

Reply #1
Has anyone checked this?
Does it indeed offer an increase in quality?

I can't tell anything with my $15 speakers...

Reply #2
I've very briefly experimented with this (with VBR by enabling it internally to work that way) and I didn't see any real increases in quality on the problem clips, especially the ones that noise shaping 2 has typically had problems with.  It might be more helpful for lower bitrate CBR (128 kbps or so) than for anything else.

Really though, I don't think noise shaping itself is actually the big problem in LAME.  It's fixing the problems with masking calculations (tonality estimation, etc.), probably issues with the spreading function, short block handling, and the functions that determine the best quantization that I believe will really lead to significant improvements.  This conclusion is based on my experience tweaking different areas of LAME to try and make --alt-preset standard better.

Anyway, it would be interesting to see this tested more, especially at lower bitrates, to see what people find.  I'm just not expecting huge improvements in higher bitrate VBR encoding from what I've seen.
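
For anyone who wants to run that lower-bitrate comparison, here is a minimal sketch that builds matching CBR command lines for two -q levels so the resulting files can be compared blind.  It relies only on the standard `-b` (bitrate) and `-q` (quality level) switches; the helper names and the output file naming are made up for illustration, and whether -q0 actually selects the new algorithm depends on having a CVS build that includes Takehiro's change.

```python
import shlex


def build_lame_cmd(wav_path, out_path, bitrate_kbps=128, quality=0):
    """Return an argv list for a CBR encode at the given -q level.

    Hypothetical helper: it only assembles the command, it does not run LAME.
    """
    if not 0 <= quality <= 9:
        raise ValueError("LAME -q levels run from 0 to 9")
    return ["lame", "-b", str(bitrate_kbps), "-q", str(quality),
            wav_path, out_path]


def comparison_cmds(wav_path, bitrate_kbps=128, qualities=(0, 1)):
    """Build one command per -q level, with the level encoded in the file name."""
    # Strip the extension (if any) to derive the output file names.
    stem = wav_path.rsplit(".", 1)[0]
    return [
        build_lame_cmd(wav_path, f"{stem}_b{bitrate_kbps}_q{q}.mp3",
                       bitrate_kbps, q)
        for q in qualities
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them, so they can be inspected.
    for cmd in comparison_cmds("castanet.wav"):
        print(shlex.join(cmd))
```

Since -q0 and -q1 are said to differ only in the new algorithm, keeping the bitrate and input identical should isolate its effect; the outputs can then go into an ABX tool.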

Reply #3
Quote
Originally posted by Dibrom
I've very briefly experimented with this (with VBR by enabling it internally to work that way) and I didn't see any real increases in quality on the problem clips, especially the ones that noise shaping 2 has typically had problems with. 
I've also tested it very briefly. It seems that -q0 is actually worse than -q2 (-h) if nspsytune and noise shaping type 2 (scalefac_scale enabled) are used (higher vbr line). For example, the guitar echo at 4 sec in castanet.wav gets worse. I didn't hear any quality increase with this.
Juha Laaksonheimo

Reply #4
I don't think it's made for use with nspsytune.

Reply #5
Well, Takehiro hasn't said that anywhere, but I suspect the same.
He has said it works with vbr-old though.
Juha Laaksonheimo

Reply #6
Takehiro said it himself: the new mode is meant to do more "cutting-edge" noise shaping. Read that as more possibilities to fail. In my opinion LAME's noise shaping works quite well, but the psychoacoustics need improvement.

Reply #7
[deleted]

Reply #8
Quote
Originally posted by TrNSZ
I have a somewhat silly question... would it be possible for LAME to switch back and forth from nspsytune and gpsycho on a frame-per-frame basis, based on some sort of analysis?


This would require 2 pass encoding but I guess it could be possible.  It wouldn't be very practical at all though.  How would you decide which psymodel to use?  And besides, I'm not convinced gpsycho should be used at all...

Quote
There still are a few samples that gpsycho handles better than nspsytune.


Hrmm.. I'm not so sure about this.  Nspsytune is superior to Gpsycho in just about every way possible.  Are you thinking of some specific clips?

Realistically, the only cases I know of where nspsytune has sometimes not performed as well as gpsycho are those where noise shaping 2 was too aggressive.  Nspsytune itself is not actually a part of noise shaping 2; it just happens to turn it on by default, and you can toggle it back off with -Z of course.  At any rate, in my latest --alt-preset standard I have implemented many techniques which change things around internally on the fly, and one of them is to switch between noise shaping 1 and 2 depending on the situation.  It will also switch between -X1 and an -X3-like mode on the fly.  With these things implemented I don't think I know of any clip which sounds better with gpsycho, and I suspect that the problems you may have spoken of were directly related to noise shaping 2 anyway.

Quote
Also, whats the possibility of support for intensity stereo and mixed blocks sometime soon?


This was being discussed a while ago, but I think Naoki and Takehiro are working on other things now... at least in the case of mixed blocks.

Quote
If I remember correctly, someone wrote an encoder that used mixed blocks that showed good results on fatboy.


I'm not sure mixed blocks would really help so much here.  According to Naoki, mixed blocks have the problem that you can't transition between them and short blocks.  He also said that short blocks with a high enough bitrate should essentially sound as good as long blocks for a particular situation.  Considering that, it appears mixed blocks are only good for saving bits and aren't going to lead to big quality improvements at higher bitrates.  What he did mention might help on this clip is better temporal masking... I have no idea what the status on that is though.

A little OT, but just for the hell of it, you should give some of these samples which you believe may still sound better with gpsycho a whirl with the latest --alt-preset standard.