Topic: Public Listening Test [2010]


Public Listening Test [2010]

Reply #227
Why not figure out what the bitrate of the chunk is by doing what people normally do: encode the entire track?

For some items, the test sample actually is the entire track. And for some other items, the test sample equals the first few seconds of the track. Only relatively few items are an excerpt of the middle part of a song.

That being said, I agree that kicking out iTunes CVBR is problematic, since that's what many people on this globe are using. Remember, not everyone knows about (and how to use) qtaacenc.

How about using iTunes 128kb/s CVBR (default quality, available even on Windows), loop each sample, check the bit rates, and take the loop giving the lowest bit rate (as the above plot shows, there obviously is a clear lower bit rate border)?

Chris
If I don't reply to your reply, it means I agree with you.

Public Listening Test [2010]

Reply #228
How about using iTunes 128kb/s CVBR (default quality, available even on Windows), loop each sample, check the bit rates, and take the loop giving the lowest bit rate (as the above plot shows, there obviously is a clear lower bit rate border)?

Shouldn't the middle bitrate be chosen, since it has the highest probability (mean value)?

Public Listening Test [2010]

Reply #229
Shouldn't the middle bitrate be chosen, since it has the highest probability (mean value)?

Does the mean bit rate really have the highest probability? In alexander's post #181, the minimum bit rate is most likely to occur.

I still prefer my above proposal. 128 kb/s is a bit on the high side, but taking the minimum loop bit rate levels things out a bit. Plus it's reproducible using only iTunes on Windows.

Chris
If I don't reply to your reply, it means I agree with you.

Public Listening Test [2010]

Reply #230
Yes, for the emese sample looped 100 times, the chunk with the lowest bitrate is very close to the average bitrate.

Average bitrate for the emese sample x 100: 136 kbps.
The chunk with the lowest bitrate: 135 kbps.

Since CVBR 128 produces slightly higher bitrates than TVBR 60, selecting the chunk with the lowest bitrate is more appropriate.

The settings are then:
1. Nero -q 0.41
2. Apple --tvbr 65 --highest
3. Apple --cvbr 124 128 --highest .
4. Pre-test:
4a. Divx CBR 128
4b. CT CBR 130



Public Listening Test [2010]

Reply #232
Yes, Chris. It should be --normal.

I had a problem with the yamb/mp4box splitter (http://yamb.unite-video.com/download.html).
It doesn't cut precisely: 29.206 sec is cut to 29.00, i.e. only integer values. I tried making a 29.000 sec WAV, but somehow it was still cut imperfectly.

CVBR bitrate distribution for samples looped x16:
I tried the fatboy_30sec sample.  h*tp://www.mediafire.com/?nz1cmjoj1y5
1st chunk: 133 kbps.
2nd to 16th chunks: 140 kbps.

It wouldn't be right to choose the chunk with the minimal bitrate. I propose choosing the average bitrate (the chunk with 140 kbps in this case).

Public Listening Test [2010]

Reply #233
It wouldn't be right to choose the chunk with the minimal bitrate. I propose choosing the average bitrate (the chunk with 140 kbps in this case).

OK, let's take the median then. Similar to average (also around 140 kbps for fatboy and emese samples), but much less sensitive to outliers. Example: [100 100 100 100 100 100 100 300] has average 125 and median 100.
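The outlier example above can be checked with a few lines of Python. This is only a minimal sketch using the standard `statistics` module; the list holds the eight chunk bitrates from the example, in kbps.

```python
from statistics import mean, median

# Chunk bitrates in kbps from the example: seven typical values and one outlier.
bitrates = [100, 100, 100, 100, 100, 100, 100, 300]

avg = mean(bitrates)    # pulled upward by the single 300 kbps outlier
med = median(bitrates)  # middle of the sorted values, unaffected by the outlier

print(f"average = {avg} kbps, median = {med} kbps")
```

With one outlier in eight values the average moves by 25 kbps while the median stays at the typical value, which is the property being argued for.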

Chris

P.S.: I was planning to create test samples which are exactly 15 seconds long, so the yamb/mp4box issue shouldn't cause any problems.
If I don't reply to your reply, it means I agree with you.

Public Listening Test [2010]

Reply #234
Settings
1. Nero -q 0.41
2. Apple --tvbr 60 --highest
3. Apple --cvbr 128 --normal *
4. Pre-test:
4a. Divx CBR 128
4b. CT CBR 130

*
CVBR 128 produces slightly higher bitrate than TVBR
http://www.hydrogenaudio.org/forums/index....st&p=686112
http://www.hydrogenaudio.org/forums/index....st&p=683265

Also, CVBR has a non-constant bitrate distribution.
The samples for CVBR will be looped 32 times, and the chunk with a bitrate slightly below the median will be chosen to compensate for the extra 2-3 kbps compared to TVBR.

I think it's a fair workaround for all competitors.

Public Listening Test [2010]

Reply #235
... CVBR bitrate distribution for samples looped x16:
I tried the fatboy_30sec sample.  h*tp://www.mediafire.com/?nz1cmjoj1y5
1st chunk: 133 kbps.
2nd to 16th chunks: 140 kbps.

It wouldn't be right to choose the chunk with the minimal bitrate. I propose choosing the average bitrate (the chunk with 140 kbps in this case).

Actually, the fatboy_30s sample is from the beginning of Fatboy Slim's Kalifornia track. I have the CD. Out of curiosity, I did some testing. I cut three samples of various durations (6, 15 and 30 s), encoded the samples and the complete track (5+ min) with qtaacenc @ --cvbr 128 --normal, and measured the AAC frame sizes with the modified FAAD decoder that guruboolez used earlier in this thread. The results were a bit surprising. This time all samples produced exactly identical AAC frames, i.e. all four files were identical up to 6 s, three files were identical up to 15 s, and two files were identical up to 30 s. Only the last two or three frames in each cut sample differed from the complete track. If fatboy_30s is going to be included, it would be absolutely correct to encode just the sample without any looping. I will post an Excel table of the measured data in the uploads section.
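A comparison like the one described above could be sketched as follows. This is an illustration only: it assumes the per-frame sizes have already been extracted into plain lists (the modified FAAD decoder is not reproduced here), the function names are hypothetical, and the frame sizes in the example are made up.

```python
def identical_prefix_frames(frames_a, frames_b):
    """Count how many leading AAC frames have the same size in both encodes."""
    count = 0
    for size_a, size_b in zip(frames_a, frames_b):
        if size_a != size_b:
            break
        count += 1
    return count


def frames_to_seconds(num_frames, frame_length=1024, sample_rate=44100):
    """Convert a number of AAC frames (1024 samples each) to seconds."""
    return num_frames * frame_length / sample_rate


# Made-up frame sizes in bytes: a cut sample versus the complete track.
full_track = [371, 402, 389, 410, 395, 380]
cut_sample = [371, 402, 389, 410, 377, 12]

shared = identical_prefix_frames(full_track, cut_sample)
print(f"{shared} identical frames, about {frames_to_seconds(shared):.3f} s")
```

Equal frame sizes do not strictly prove identical frame contents; comparing the frame payloads byte for byte would be the stricter check.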

I think the encoder's bitrate behavior must be checked sample by sample after the samples are selected.

However, a perhaps more worrying thing is the encoder's tendency to alter the file's volume level. Here is the difference between the source sample and the encoded sample. (I used the version I cut myself. Its duration is exactly 30 s (1323000 samples). For decoding the M4A file I used foobar, which appears to produce a file of accurate length. The two images in the animated gif file are screenshots from Audition.):



The difference by numbers (from Audition):

Code:
ORIGINAL

                      Left        Right
Min Sample Value:     -32768      -32768
Max Sample Value:     32718       32766
Peak Amplitude:       0 dB        0 dB
Possibly Clipped:     1           4
DC Offset:            -.016       -.004
Minimum RMS Power:    -73.09 dB   -84.87 dB
Maximum RMS Power:    -5.45 dB    -7.44 dB
Average RMS Power:    -17.05 dB   -16.79 dB
Total RMS Power:      -16.12 dB   -15.8 dB
Actual Bit Depth:     16 Bits     16 Bits

Using RMS Window of 50 ms

CVBR

                      Left        Right
Min Sample Value:     -31040      -32768
Max Sample Value:     32767       32767
Peak Amplitude:       0 dB        0 dB
Possibly Clipped:     1           9
DC Offset:            -.009       -.004
Minimum RMS Power:    -77.87 dB   -84.53 dB
Maximum RMS Power:    -5.95 dB    -8 dB
Average RMS Power:    -17.34 dB   -17.11 dB
Total RMS Power:      -16.45 dB   -16.18 dB
Actual Bit Depth:     16 Bits     16 Bits

Using RMS Window of 50 ms

The difference in the Replay Gain value is about 0.5 dB (measured by foobar), but I don't think the problem can be fixed simply by adjusting the playback gain because the encoder seems to adjust some AAC frames more than others.
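For a quick look at the level change per channel, the RMS figures from the two Audition tables can be subtracted directly. The sketch below only restates numbers already listed above; the dictionary layout and variable names are chosen for illustration.

```python
# (left, right) RMS levels in dB, copied from the Audition statistics above.
original = {
    "Maximum RMS Power": (-5.45, -7.44),
    "Average RMS Power": (-17.05, -16.79),
    "Total RMS Power": (-16.12, -15.80),
}
cvbr = {
    "Maximum RMS Power": (-5.95, -8.00),
    "Average RMS Power": (-17.34, -17.11),
    "Total RMS Power": (-16.45, -16.18),
}

# Level change introduced by the encode (negative means quieter than the source).
deltas = {
    name: tuple(round(enc - src, 2) for src, enc in zip(original[name], cvbr[name]))
    for name in original
}

for name, (left, right) in deltas.items():
    print(f"{name}: left {left:+.2f} dB, right {right:+.2f} dB")
```

The change is not the same for every statistic, which fits the observation that a single playback gain offset would not undo it.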

EDIT:

The Excel table is available here: http://www.hydrogenaudio.org/forums/index....st&p=695191

Public Listening Test [2010]

Reply #236
Thanks a lot for the bit rate experiments, Alex! I'll take a look at the Excel sheet later. Igor and I are currently also investigating the looped bit rate issue on the emese sample.

Regarding the volume alterations, I think this is the limiter [edit: compressor] being triggered in Apple's encoder. It will hopefully be circumvented in the listening test by not allowing sample values higher than about -2.5 dBFS (75%), i.e. lowering the volume of affected files. I will match the loudness levels of all samples under test, so that the listeners won't have to adjust the playback volume during the test (one less source of distraction).
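The relation between the -2.5 dBFS ceiling and the 75% figure follows from the usual amplitude conversion, 20 * log10(amplitude). A minimal sketch, with helper names chosen here for illustration:

```python
import math


def dbfs_to_linear(level_db):
    """Convert a peak level in dBFS to a fraction of full scale."""
    return 10 ** (level_db / 20)


def linear_to_dbfs(amplitude):
    """Convert a fraction of full scale (0 < amplitude <= 1) to dBFS."""
    return 20 * math.log10(amplitude)


ceiling = dbfs_to_linear(-2.5)
print(f"-2.5 dBFS is about {ceiling:.3f} of full scale")
print(f"75% of full scale is about {linear_to_dbfs(0.75):.2f} dBFS")
```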

Chris
If I don't reply to your reply, it means I agree with you.

Public Listening Test [2010]

Reply #237
Regarding using a looped sample, I think it would be difficult to cut it accurately without first decoding it. Perhaps the sample should be decoded and the wave file cut in order to get it right. Then that sample should be provided in a lossless format.

Edits to my above reply:
-- It appears to produce a file of accurate length.
-- with qtaacenc @ --cvbr 128 --normal
-- I used the version I cut by myself
(I don't particularly like the current forum policy that doesn't allow fixing mistakes after one hour.)

Public Listening Test [2010]

Reply #238
OK, then the fatboy_30s sample will be encoded as is, without loops.

As Alex has already said about decoding to WAV first: encoded AAC files have a slightly different length from the lossless files. That's a complication when it comes to splitting.

A possible solution for CVBR is:
1. Add some silence to the beginning and end of each lossless reference sample.
2. Loop it x16.
3. Encode it to AAC.
4. Choose the chunk with the desired bitrate.
5. Decode it to WAV and align it for comparison with the reference.

It works.

Public Listening Test [2010]

Reply #239
Regarding using a looped sample, I think it would be difficult to cut it accurately without first decoding it. Perhaps the sample should be decoded and the wave file cut in order to get it right. Then that sample should be provided in a lossless format.

It shouldn't be a problem when the item length is an integer multiple of 1024 samples, the frame length. That's why I propose 15-second samples: 15*44100/1024 = 645.996. Add 4 samples (one millisecond), and we get exactly 646.
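The frame arithmetic above can be reproduced as follows. This is a small sketch assuming 44.1 kHz audio and the 1024-sample frame length mentioned in the post; the function name is hypothetical.

```python
FRAME_LENGTH = 1024   # samples per AAC frame, as stated above
SAMPLE_RATE = 44100   # Hz


def padding_to_frame_multiple(num_samples, frame_length=FRAME_LENGTH):
    """Number of samples to append so the length is a whole number of frames."""
    return (-num_samples) % frame_length


num_samples = 15 * SAMPLE_RATE                  # 15-second item
pad = padding_to_frame_multiple(num_samples)    # samples to add
frames = (num_samples + pad) // FRAME_LENGTH    # whole frames after padding

print(f"{num_samples} samples + {pad} padding = {frames} frames")
```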

Alex or Igor (or both), can you take your 15-second fatboy sample, add 1msec silence so that it's 661504 samples long, loop it, encode it with iTunes CVBR, then split the looped MP4 encode into 15-second chunks, and report on the file size of each chunk? That would be interesting.

Chris
If I don't reply to your reply, it means I agree with you.

Public Listening Test [2010]

Reply #240
Add 4 samples (one millisecond), and we get exactly 646.

Is the period of one sample equal to 1/44100 ~ 22.68 us?
us - microseconds

That is a problem. We can't cut the reference WAV with microsecond precision, only to +/- 1 ms.

The possible solution for CVBR from my previous post is still an option.

Public Listening Test [2010]

Reply #241
This shouldn't be a problem. MP4 bitstreams can only be cut at frame borders, anyway, so if you specify a cut point which doesn't coincide with a frame border, I assume it will round the cut point to the nearest frame.

Chris
If I don't reply to your reply, it means I agree with you.

Public Listening Test [2010]

Reply #242
Good.
But 4 samples aren't 1 ms.
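For reference, the duration of a given number of samples at 44.1 kHz can be worked out like this (a minimal sketch; the helper name is chosen for illustration):

```python
SAMPLE_RATE = 44100  # Hz


def samples_to_ms(num_samples, sample_rate=SAMPLE_RATE):
    """Duration of num_samples in milliseconds."""
    return 1000.0 * num_samples / sample_rate


four_samples_ms = samples_to_ms(4)
samples_per_ms = SAMPLE_RATE / 1000

print(f"4 samples last about {four_samples_ms:.4f} ms")
print(f"1 ms corresponds to {samples_per_ms} samples")
```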

Public Listening Test [2010]

Reply #243
Ooops, true

Chris
If I don't reply to your reply, it means I agree with you.

Public Listening Test [2010]

Reply #244
Maybe I'm starting to get annoying, but I think my solution is good enough, as it gets rid of the precision issues.

Public Listening Test [2010]

Reply #245
You're not annoying, and your solution might be fine, but it seems you haven't understood what I'm trying to find out. I'm trying to check whether it's necessary to loop in the first place. I'm not convinced that this is really the case. It can be verified as follows:

  • Create a test item which is exactly 661504 samples long. Take for example the first 15.0001 seconds of the fatboy_30s item. If your audio editor doesn't support this time resolution, send me a message, and I'll create the item for you. Or ask Alex, he seems to have Audition as well.
  • Loop this item, e.g. 16 times as you proposed.
  • Encode the looped item, i.e. the concatenation of the 16 loops.
  • Use mp4box or whatever to split the looped item into its individual loops, i.e. chunks of 15.0001 seconds = 646 frames.
  • Check whether all loops have the same total or average bit rate. If not, take the one with the median bit rate of all chunks, and decode this chunk. If they do have the same (or extremely similar) bitrate, then we don't need to do the loop thing and just encode and decode the initial unlooped item.
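The last step of the procedure above could be sketched like this. It assumes the looped encode has already been split into chunks and that only the chunk file sizes are known. The function names, the 0.5 kbps tolerance for "extremely similar", and the example sizes are all assumptions made for illustration, not measured data.

```python
from statistics import median_low

SAMPLE_RATE = 44100
CHUNK_SAMPLES = 661504  # 646 frames of 1024 samples each


def chunk_bitrates_kbps(chunk_sizes_bytes, chunk_samples=CHUNK_SAMPLES,
                        sample_rate=SAMPLE_RATE):
    """Average bitrate of each chunk in kbps, derived from its size in bytes."""
    duration_s = chunk_samples / sample_rate
    return [size * 8 / duration_s / 1000 for size in chunk_sizes_bytes]


def pick_chunk(bitrates, tolerance_kbps=0.5):
    """Return None if looping is unnecessary, else the index of the median chunk."""
    if max(bitrates) - min(bitrates) <= tolerance_kbps:
        return None  # all loops behave alike: encode the unlooped item instead
    # median_low always returns a value that is present in the list.
    return bitrates.index(median_low(bitrates))


# Made-up chunk sizes in bytes: a larger first chunk followed by 15 equal ones.
sizes = [296000] + [285000] * 15
rates = chunk_bitrates_kbps(sizes)
choice = pick_chunk(rates)
print(f"median chunk index: {choice}")
```

Using `median_low` rather than `median` guarantees that the selected bitrate belongs to an actual chunk even when the number of loops is even.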

So, I'd like to repeat my request:
Quote
Alex or Igor (or both), can you [...] loop it, encode it with iTunes CVBR, then split the looped MP4 encode into 15-second chunks, and report on the file size of each chunk? That would be interesting.

Please?

Thanks,

Chris
If I don't reply to your reply, it means I agree with you.

Public Listening Test [2010]

Reply #246
I tried to do this test:

1st chunk - 158 kbps.  2nd and all other chunks: 152 kbps.

Chunks 2, 3, 4... are identical. The difference between 1st and 2nd chunks is mostly within first 2.6 seconds.

Public Listening Test [2010]

Reply #247
Thanks, lvqcl! I see. The difference in the bit rate during the first seconds is due to 1. encoder delay (one or two more frames so that the first frame can be decoded correctly) and 2. full bit reservoir. Both phenomena are unimportant if we concatenate all 16 test items and encode/decode them in one go (only the first item in the concatenation will be affected, but we will probably chop off the first 2 seconds of that item after decoding, so everything should be fine). So from my point of view, we don't need the loop thing.

Can anyone else confirm lvqcl's findings, preferably with a different item, e.g. the first 15.0001 seconds of Human_Disease?

Thanks,

Chris
If I don't reply to your reply, it means I agree with you.

Public Listening Test [2010]

Reply #248
Chris or Alex or somebody else:
Please send me 15.0001 sec of Human_Disease or any other sample. I only have Nero 6.0 Wave Editor, which has a resolution of only +/- 1 ms.

Is there a free wave editor with high (microsecond) resolution, like Audition?

Public Listening Test [2010]

Reply #249
Yes, Chris.
You are right. CVBR behaves normally on sources whose length is a multiple of 1024 samples.
I tried sources with 661504 samples (15 sec * 44100 + 4 samples), and all chunks for emese and human disease have the same bitrate.

If we are going to use CVBR 128 without looping, then we should raise the Nero -q setting a little: 0.41 -> 0.415