HydrogenAudio

Lossy Audio Compression => MP3 => MP3 - Tech => Topic started by: Makaki on 2013-06-01 00:21:13

Title: LAME Q Switch
Post by: Makaki on 2013-06-01 00:21:13
Out of curiosity, I was looking for all the information that I could find on LAME's Q switch when using CBR. Other than:
-q 0: Highest quality, very slow
-q 9: Poor quality, but fast

I wanted to know a little more of what was happening with each setting (without having to go into the source code myself). So I used the verbose flag, and then compared the output and encoding times. I did a few tests and decided to post the results here in case someone else was curious too. All tests were done with -b 320 (CBR 320). If someone has more insight, any information would be welcome.
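
For anyone wanting to repeat this kind of test, here is a rough sketch of how it could be scripted. It assumes a `lame` binary on the PATH and a placeholder input file called `input.wav`; the helper names `build_cmd` and `time_encode` are made up for illustration. Only the `-q`, `-b` and `--verbose` switches come from the thread.

```python
import subprocess
import time


def build_cmd(q, wav="input.wav", bitrate=320):
    """Build a LAME command line for a CBR encode at the given -q level."""
    # The file names are placeholders chosen for this sketch.
    return ["lame", "--verbose", "-b", str(bitrate), "-q", str(q),
            wav, f"out_q{q}.mp3"]


def time_encode(q, wav="input.wav"):
    """Run one encode and return (elapsed seconds, captured console text)."""
    start = time.perf_counter()
    result = subprocess.run(build_cmd(q, wav), capture_output=True, text=True)
    elapsed = time.perf_counter() - start
    # Collect both streams so the verbose report is kept wherever it is written.
    return elapsed, result.stdout + result.stderr
```

Calling `time_encode(q)` for each `q` in `range(10)` would give one timing and one verbose report per level, which can then be compared by hand or with a diff tool.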

differences with previous level
observations

-q 0

-q 1 same output, but faster (unknown difference)

-q 2

-q 3 same output, but faster (unknown difference)

-q 4

-q 5
-q 6

-q 7
-q 8
-q 9


NOTE: 5-6 seem identical; the same goes for 7-9.
Title: LAME Q Switch
Post by: lvqcl on 2013-06-01 00:32:45
You can also look at lame_init_qval() function here - http://lame.cvs.sourceforge.net/viewvc/lam...athrev=lame3_99 (http://lame.cvs.sourceforge.net/viewvc/lame/lame/libmp3lame/lame.c?view=markup&sortby=date&pathrev=lame3_99) (line #360)


Added: sorry, didn't mention your "without having to go into the source code myself"
Title: LAME Q Switch
Post by: db1989 on 2013-06-01 01:06:30
Thanks for the pointer. Well, I’ll take the plunge. Here’s what I could make of the relevant code. The summaries begin at -q0 and ascend numerically, with all parameters not mentioned being identical with the previous number.

-q0 uses noise-shaping, sub-step-shaping = 2, noise-shaping amplitude = 2, noise-shaping stop = 1, sub-block gain = 1, best Huffman encoding, and the full outer loop search
-q1 drops the full outer loop search
-q2 drops noise-shaping amplitude to 1
-q3 drops sub-step shaping (it seems)
-q4 drops both noise-shaping amplitude and noise-shaping stop to 0
-q5 sets best Huffman to 0
-q6 is identical to -q5
-q7 drops noise-shaping and, if the mode is vbr-mt or vbr-mtrh, uses the full outer loop search
-q8 is identical to -q7
-q9, unlike -q7 and -q8, does not allow the full outer loop search
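
The list above can be written down as a small lookup, which makes it easier to see which levels end up with identical settings. This is only a transcription of the summary in this post, not LAME's own code. The key names are informal labels (only `use_best_huffman` and `full_outer_loop` are identifiers mentioned in the thread), and the `-q3` entry carries the same "it seems" caveat as the list.

```python
# Settings at -q0, as read from the summary above.
BASE_Q0 = {
    "noise_shaping": 1,
    "substep_shaping": 2,
    "noise_shaping_amp": 2,
    "noise_shaping_stop": 1,
    "subblock_gain": 1,
    "use_best_huffman": 1,
    "full_outer_loop": 1,
}

# Changes relative to the previous level; an empty dict means "identical".
DELTAS = {
    1: {"full_outer_loop": 0},
    2: {"noise_shaping_amp": 1},
    3: {"substep_shaping": 0},  # "it seems"
    4: {"noise_shaping_amp": 0, "noise_shaping_stop": 0},
    5: {"use_best_huffman": 0},
    6: {},
    7: {"noise_shaping": 0},
    8: {},
    9: {},
}


def settings_for(q, vbr_mt=False):
    """Accumulate the per-level changes from -q0 up to -q<q>."""
    settings = dict(BASE_Q0)
    for level in range(1, q + 1):
        settings.update(DELTAS[level])
    # -q7 and -q8 use the full outer loop only for vbr-mt / vbr-mtrh;
    # -q9 never does.
    if vbr_mt and q in (7, 8):
        settings["full_outer_loop"] = 1
    return settings
```

With this, `settings_for(5) == settings_for(6)` and, outside vbr-mt/vbr-mtrh, `settings_for(7) == settings_for(9)`.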

In particular, note the differences in the use of the outer loop when compared to LAME’s own reports in the OP: LAME is programmed to report use of the “outside loop” whenever use_best_huffman==1, even if use_outer_loop==0. Go figure!

So, what is this “full outer loop search”? Judging by an adjacent comment, it seems to be involved in the Huffman encoding. In -q0, it replaces a “type 2” of best Huffman encoding, which apparently was too slow. That “type 2” was programmed to be reported as “inside loop”, just to confuse things even further.

For VBR, -q5 to -q9 act identically and use quantisation = x^3/4, whereas -q0 to -q4 act identically and add the best Huffman dividing code.

In all modes, the default is -q3. At some point in the past, this was -q2.
Title: LAME Q Switch
Post by: BFG on 2013-06-01 02:14:38
Interesting info (even though I lack technical expertise and so don't understand half of it).  I had thought from my reading of the LAME documentation that the -q setting also affected short/switch/long frame size decisions.
Title: LAME Q Switch
Post by: Makaki on 2013-06-01 02:45:01
The code seems kind of clear after someone points you to the right line. Thanks for that. Looking at the code, seems that -q 9's difference applies only to VBR, and in the case of CBR, would be the same variables. That would match the results I got, as q 9 didn't seem any different than 7 or 8 (with CBR).

Nice observation on the verbose output not covering "use_outer_loop".
Title: LAME Q Switch
Post by: db1989 on 2013-06-01 14:50:00
Quote
Interesting info (even though I lack technical expertise and so don't understand half of it). I had thought from my reading of the LAME documentation that the -q setting also affected short/switch/long frame size decisions.
Having not studied the source exhaustively, I can’t completely rule out the possibility that it might. However, at least within lame.c, all significant references to gfp->quality are used to assign values to other variables, not directly.

Quote
The code seems kind of clear after someone points you to the right line. Thanks for that.
Glad to contribute!

Quote
Looking at the code, seems that -q 9's difference applies only to VBR, and in the case of CBR, would be the same variables.
True for CBR. VBR is supposed to be simpler, but judging from this segment of code, something is amiss:
Code: [Select]
/* The newer VBR code supports only a limited
   subset of quality levels:
   9-5=5 are the same, uses x^3/4 quantization
   4-0=0 are the same 5 plus best huffman divide code
 */
if (gfp->quality < 0)
    gfp->quality = LAME_DEFAULT_QUALITY;
if (gfp->quality < 5)
    gfp->quality = 0;
if (gfp->quality > 7)
    gfp->quality = 7;
Thus values of -q would be re-mapped as follows: 0–4 = 0; 5 = 5; 6 = 6; 7–9 = 7.
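
To double-check that mapping, the three `if` statements quoted above can be mirrored in a few lines of Python. This is a re-implementation for illustration only. It assumes `LAME_DEFAULT_QUALITY` equals 3, going by the default stated earlier in the thread, and the function name is made up.

```python
DEFAULT_QUALITY = 3  # assumption: stands in for LAME_DEFAULT_QUALITY


def remap_vbr_quality(q):
    """Mirror the three checks from the quoted lame.c excerpt."""
    if q < 0:
        q = DEFAULT_QUALITY
    if q < 5:
        q = 0
    if q > 7:
        q = 7
    return q


print([remap_vbr_quality(q) for q in range(10)])
# prints [0, 0, 0, 0, 0, 5, 6, 7, 7, 7]
```

Note that 6 passes through untouched, which is why the excerpt alone yields four distinct values rather than the two described in its comment.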

This does not accord with what the documentation claims, or even the comment directly above in the source itself. Perhaps VBR follows a different codepath that does result in the documented behaviour, but I would appreciate information from a developer on what is really going on here. Otherwise, the code is not doing what it claims to do. I’ll try running a few little tests on this.

Quote
That would match the results I got, as q 9 didn't seem any different than 7 or 8 (with CBR).
I meant to ask before with reference to your comparisons: have you compared the output streams? You did mention “output” at one point, but that could have referred to the text in the console rather than the actual encoded data.

Quote
Nice observation on the verbose output not covering "use_outer_loop".
Yeah, another possible fix for the source, but again, I might be missing something.
Title: LAME Q Switch
Post by: db1989 on 2013-06-01 15:21:02
OK, here we are. I performed bit-comparisons of different -q values under -V0.

As documented, -q0 to -q4 are identical. However, -q5 to -q9 are not identical as the documentation and comments claim. Specifically, -q5 and -q6 are identical, but -q7 to -q9 form a separate group (internally identical).

Thus, in reality, VBR seems to provide (at least) three choices of -q, not the two claimed. Again, I would appreciate clarification on this from a developer. If it is intended, the documentation should be updated.

Code: [Select]
All tracks decoded fine, no differences found.

Comparing:
"C:\Users\?\Downloads\piano2q0.mp3"
"C:\Users\?\Downloads\piano2q1.mp3"
No differences in decoded data found.

Differences found in 1 out of 1 track pairs.

Comparing:
"C:\Users\?\Downloads\piano2q4.mp3"
"C:\Users\?\Downloads\piano2q5.mp3"
Differences found: 7518 sample(s), starting at 0.3370000 second(s), peak: 0.0023986 at 2.0611458 second(s), 1ch

All tracks decoded fine, no differences found.

Comparing:
"C:\Users\?\Downloads\piano2q5.mp3"
"C:\Users\?\Downloads\piano2q6.mp3"
No differences in decoded data found.

Differences found in 1 out of 1 track pairs.

Comparing:
"C:\Users\?\Downloads\piano2q6.mp3"
"C:\Users\?\Downloads\piano2q7.mp3"
Differences found: 605424 sample(s), starting at 0.0000000 second(s), peak: 0.0077936 at 2.4563750 second(s), 1ch

All tracks decoded fine, no differences found.

Comparing:
"C:\Users\?\Downloads\piano2q7.mp3"
"C:\Users\?\Downloads\piano2q8.mp3"
No differences in decoded data found.

All tracks decoded fine, no differences found.

Comparing:
"C:\Users\?\Downloads\piano2q8.mp3"
"C:\Users\?\Downloads\piano2q9.mp3"
No differences in decoded data found.
I then captured the verbose output of the console to get a better idea of what is causing the differences. There are even more differences in applied settings than the bit-comparisons reveal! Specifically, -q5 and -q6, whilst apparently producing identical output (at least on this sample), invoke different sets of parameters for noise-shaping. Again, data below ascend numerically, and data not shown for higher values of -q are identical to those used by the previous setting.

Code: [Select]
-q0 to -q4:
misc:
        scaling: 1
        ch0 (left) scaling: 1
        ch1 (right) scaling: 1
        huffman search: best (outside loop)
        experimental Y=0
psychoacoustic:
        using short blocks: channel coupled
        subblock gain: 1
        adjust masking: -6.8 dB
        adjust masking short: -6.8 dB
        quantization comparison: 9
        ^ comparison short blocks: 9
        noise shaping: 1
        ^ amplification: 2
        ^ stopping: 1
        ATH: using
        ^ type: 5
        ^ shape: 1 (only for type 4)
        ^ level adjustement: -7.1 dB
        ^ adjust type: 3
        ^ adjust sensitivity power: 1.000000
        experimental psy tunings by Naoki Shibata
          adjust masking bass=-0.5 dB, alto=-0.25 dB, treble=-0.025 dB, sfb21=8.25 dB
        using temporal masking effect: no
        interchannel masking ratio: 0

-q5
misc:
        huffman search: normal
psychoacoustic:
        noise shaping: 1
        ^ amplification: 2
        ^ stopping: 1

-q6
psychoacoustic:
        noise shaping: 1
        ^ amplification: 0
        ^ stopping: 0

-q7
psychoacoustic:
        subblock gain: -1
        noise shaping: 0
        ^ amplification: 0
        ^ stopping: 0
Devs? Anyone?
Title: LAME Q Switch
Post by: robert on 2013-06-01 17:34:20
The current (new) VBR mode uses different code paths, where some of the q-val controlled settings have no effect, like all of those noise_shaping* vars.
9-7: vbr code uses PSY model, but does not calculate quantization noise, just takes a quick guess.
6-5: vbr code enables quantization noise calculation and enables subblock gain feature
4-0: vbr code enables best huffman search
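
Put as a rough sketch, the three groups described above look like this. The function and key names are invented for illustration, and it assumes the features are cumulative, i.e. that 4-0 keep what 6-5 enable and add the best Huffman search on top.

```python
def new_vbr_features(q):
    """Return which features the new VBR code enables at -q<q> (0..9)."""
    if not 0 <= q <= 9:
        raise ValueError("q must be between 0 and 9")
    return {
        # 9-7 only take a quick guess at the quantization noise.
        "calc_quant_noise": q <= 6,
        # 6-5 (and, by assumption, 4-0) enable the subblock gain feature.
        "subblock_gain": q <= 6,
        # 4-0 enable the best Huffman search.
        "best_huffman": q <= 4,
    }
```

Grouping the ten levels by the returned dictionary gives exactly three sets: 0-4, 5-6 and 7-9.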
Title: LAME Q Switch
Post by: Makaki on 2013-06-01 21:16:14
I didn't want to compare the output files themselves because I thought that different "algorithms" may still yield the same results. And I might have detected equality in cases where different roads were taken to get there.

But I guess with the different data coming from the other angles, it could complement the results. I'll see to compare the output files later.

EDIT: (Results)

Using Windows' FC tool, the only files that were truly identical were -q 7 and -q 8 (because 8 is literally hard-coded to 7).
I was getting a minor difference between 5-6 and 8-9. I assumed it had to do with the LAME header, so I tried again with it disabled (-t).
Now files 5-6 are binary identical, and the same goes for 7 through 9. All the other files had major differences at the binary level.
Again this test was using CBR.

Note that I'm not comparing waveform, I'm using a binary comparison.
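
A binary comparison like this can also be done for a whole batch of files at once by hashing them. Below is a minimal sketch, not the FC tool: the function name is made up, and the file names in the usage note are placeholders.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_identical(paths):
    """Group files whose bytes are identical, using SHA-256 digests."""
    groups = defaultdict(list)
    for path in paths:
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        groups[digest].append(str(path))
    # Sort the groups so the result does not depend on digest order.
    return sorted(groups.values())
```

For example, `group_identical([f"out_q{q}.mp3" for q in range(10)])` would list each set of byte-identical encodes together. As noted above, files encoded without -t can differ in the header alone, so they may land in separate groups even when the audio frames match.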
Title: LAME Q Switch
Post by: db1989 on 2013-06-01 22:54:44
robert: Thanks very much for the info!  Can we see the docs updated to reflect this?  Also, what about LAME always reporting that the outside loop is in use: is that incorrect, or does it refer to something other than use_outer_loop?

Makaki: Thanks for doing the latest test. The results are much as expected, although I wonder if -q9 would ever produce different output to -q8 and -q7.
Title: LAME Q Switch
Post by: robert on 2013-06-02 19:00:14
Quote
Also, what about LAME always reporting that the outside loop is in use: is that incorrect, or does it refer to something other than use_outer_loop?

I don't know what you mean by use_outer_loop. LAME prints "huffman search: best (outside loop)" at q-vals 4-0, not always.
Title: LAME Q Switch
Post by: db1989 on 2013-06-02 19:18:03
Sorry, my mistake: it should be full_outer_loop. What I meant was that LAME does indeed print that message for -q0 to -q4, but in the code, full_outer_loop seems to be set only for -q0 (and possibly -q7). So, I was asking whether the message and the variable refer to different things. If they mean the same thing, the printing does not reflect what is actually happening in the code. So, I guess they are different.

I think I kept mistakenly thinking the variable was called use_outer_loop, but its actual name suggests that it can be separate: one could use the outer loop, but it would not be necessary to use it fully. I should probably go and read the code in more depth.
Title: LAME Q Switch
Post by: robert on 2013-06-02 19:41:33
Well, verbose doesn't print everything imaginable; the full_outer_loop control has no textual representation.
In CBR/ABR/VBR(old): 0 => search ends as soon as possible, else it further tries to minimize quantization noise.
With current VBR, this var controls calculating (>= 0) or guessing ( < 0) quantization noise.



ref: quantize.c around line 1151 and vbrquantize.c line 1283
Title: LAME Q Switch
Post by: db1989 on 2013-06-02 19:45:28
Thanks for the info, and sorry for the confusion.  It was mainly due to me getting the name mixed up, thinking it was use_outer_loop and thus thinking it was the same thing reported by --verbose, which it isn’t.

And more generally, of course, thanks for all the work you do on LAME!
Title: LAME Q Switch
Post by: Makaki on 2013-06-02 20:06:54
Do we agree the HTML documentation could use an update, or is it OK as it is?

And, if so, what would be the suggested update?

I know the text documentation needs to be updated about the default:
http://lame.cvs.sourceforge.net/*checkout*/lame/lame/USAGE (http://lame.cvs.sourceforge.net/*checkout*/lame/lame/USAGE)
Title: LAME Q Switch
Post by: db1989 on 2013-06-02 23:46:27
Certainly it requires updating to reflect the three sets of -q values under VBR. It would also be nice to see the summarised details from this thread included in a table or something similar in the detailed documentation, if only because they’re quite interesting!

Updates might be needed for CBR, too, depending on whether the picture is as simple as that painted by lame_init_qval() and its switch block. The main question here, I think: is -q8 actually identical to -q7, or do processes elsewhere in the code take differing paths depending on these values of gfp->quality?
Title: LAME Q Switch
Post by: Makaki on 2013-06-03 14:56:24
Work in progress:

http://wiki.hydrogenaudio.org/index.php?title=LAME_Q_Switch (http://wiki.hydrogenaudio.org/index.php?title=LAME_Q_Switch)


EDIT:
I've noticed -q 9 is no different than -q 7. In all cases.
In the documentation and the comments in the code it seems it was meant to disable the psy-model, but this never happens? The internal variables used, seem to be the same as q 7

Quotes:
/* no psymodel, no noise shaping */
Disables almost all algorithms including psy-model. Poor quality.
Title: LAME Q Switch
Post by: db1989 on 2013-06-03 16:27:58
I held off from saying it for the title of this thread, but now that it is on the wiki: I advise against referring to the switch as Q (capital). This is not one of the cases where two differently cased versions of the same letter are used as different switches (e.g. -b and -B), but nonetheless, it is only right to refer to it with a small q to avoid ambiguity and keep it future-proof. I suggest having an administrator rename the page.

But having said that, although I appreciate your effort, I feel that such ventures should, at a minimum, be left until the developers create an updated description with guaranteed correct knowledge. For non-developers with incomplete knowledge to try to pre-emptively guess what developers will write seems like it could easily lead to confusion, to say the least. Anyway, whether we can justify having our own guide depends on what level of detail goes into the official documentation. If it would basically say the same thing, it would be redundant. If, on the other hand, the official documentation ends up having less than the info in this thread, a page on the Knowledgebase would be nicely complementary. In any case, waiting until the former is released seems prudent IMO.

I also wondered about the relationship of -q9 and the psymodel. Another question for Robert!
Title: LAME Q Switch
Post by: Makaki on 2013-06-03 16:40:35
Since it's a work in progress, I could create the new page and mark the old one for deletion, or maybe set a redirect in the meantime. (EDIT: No admin was needed, the wiki has a move function)

I created the page as a work in progress, in order to ease the changes for the developers. They can in the end proofread our conclusions, and incorporate them on the official documentation. That said, I think the docs are pretty close to what they should be except for the USAGE file being outdated and -q 9 "possibly" not being any different, in contrast to what it says.

But I think that as detailed as the explanation may be, they may decide not to cover in detail each option. And because the verbose option doesn't state EVERYTHING either, this wiki may be a good place for the curious user to find more information. See, for example, the Y switch, where this wiki has a better explanation than the detailed usage doc.
Title: LAME Q Switch
Post by: db1989 on 2013-06-03 16:52:11
Quote
No admin was needed, the wiki has a move function
Well, I learned something new today.

Quote
But I think that as detailed as the explanation may be, they may decide not to cover in detail each option. And because the verbose option doesn't state EVERYTHING either, this wiki may be a good place for the curious user to find more information.
My thoughts exactly. This would definitely be good.

Once we have a better idea of what is going on internally from Robert, we can be more confident that the statements on that page are correct. As it is, some of them are just lifted directly from the current documentation and therefore might not reflect recent versions in reality. I might run some more tests later to confirm some things. Anyway, at some point, my inner editor will pay a visit to the article.
Title: LAME Q Switch
Post by: Makaki on 2013-06-03 16:56:50
With the information I have so far, which would need to be verified by a developer, I would suggest the following description for the detailed usage and the USAGE file:

-q 0: Use slowest & best possible version of all algorithms.
-q 2: Recommended. Same as -h. -q 0 and -q 1 are slow and may not produce significantly higher quality.
-q 3:   Default value. Good speed, good quality.
-q 5-6: Very fast, average quality. (Use faster huffman encoding, aka noise shaping)
-q 7-9: Same as -f. Fastest, lowest quality. (Psychoacoustics are used for pre-echo & M/S, but no noise shaping is done.)

EDIT:
The VBR descriptions could be the same as above:
-q 0-4: Default value. Use slowest & best possible version of all algorithms.
-q 5-6: Very fast, average quality. (Use faster huffman encoding, aka noise shaping)
-q 7-9: Same as -f. Fastest, lowest quality. (Psychoacoustics are used for pre-echo & M/S, but no noise shaping is done.)
Title: LAME Q Switch
Post by: db1989 on 2013-06-03 17:22:47
See my edits, now at http://wiki.hydrogenaudio.org/index.php?title=LAME_-q_switch (http://wiki.hydrogenaudio.org/index.php?title=LAME_-q_switch) . It still needs quite a bit of work, but I have at least fixed some of the previous grammatical/syntactic issues and added a couple of brief explanations of what is actually happening.

Quote
-q 2: Recommended.
Where is this recommendation made? The general advice here is that the default -q3 is fine and need not be changed.
Title: LAME Q Switch
Post by: lvqcl on 2013-06-03 17:36:05
AFAIK previously -q 5 was the default for CBR/ABR, and -q 2 was the default for VBR and --alt-presets.
Title: LAME Q Switch
Post by: Makaki on 2013-06-03 17:40:49
Quote
Where is this recommendation made? The general advice here is that the default -q3 is fine and need not be changed.


http://lame.cvs.sourceforge.net/viewvc/lame/lame/USAGE (http://lame.cvs.sourceforge.net/viewvc/lame/lame/USAGE)

I made the descriptions based on a mixture of both official documents.

EDIT:

AND AFAIK, 3 is currently always the default, but since 3 is mapped to 0 for VBR the default of VBR becomes 0, which is the same as 3.
5 was in fact the default at some point but has since changed.

Title: LAME Q Switch
Post by: Makaki on 2013-06-03 18:12:18
In order for me to update my "latest" suggestion as information becomes available, and not have to re-post each time, I've made my suggestion on the talk page:

http://wiki.hydrogenaudio.org/index.php?ti...:LAME_-q_switch (http://wiki.hydrogenaudio.org/index.php?title=Talk:LAME_-q_switch)

I'm trying to preserve the old wording as much as possible, but at the same time make it clearer.
Title: LAME Q Switch
Post by: [JAZ] on 2013-06-03 19:31:47
Makaki: The most up-to-date information is here: http://lame.cvs.sourceforge.net/viewvc/lam...l/detailed.html (http://lame.cvs.sourceforge.net/viewvc/lame/lame/doc/html/detailed.html)

That said, it has almost the same information for this setting, because that's what was documented.
Title: LAME Q Switch
Post by: db1989 on 2013-06-03 22:23:07
The official documentation, even the latest, defines only two groups of -q setting for VBR rather than the three that exist in reality, and it does not divulge that certain pairs like -q8/7 and -q5/6 are seemingly identical. Things like this are why we would be happy to see the developers update it officially with the latest information rather than us having to ask them lots of questions and guess things that we are not told.