SE listening test @96kbit/s

The following codecs are going to be added to the 96 kbit/s section for listening tests:

AAC VBR@89.8 (Winamp 5.63) - CVBR, LC-AAC
AAC Encoder v1.04 (Fraunhofer IIS) from Winamp 5.63: variable Bitrate, preset: 3

AAC VBR@92.1 (QTime 7.7.3) - TVBR, LC-AAC
QuickTime (7.7.3) AAC Encoder via qaac 2.18 (CoreAudioToolbox 7.9.8.2): qaac -V45 ref.wav

AAC VBR@90.0 (NeroRef 1540) - CVBR, LC-AAC
Nero AAC Encoder 1.5.4.0 (build 2010-02-18): neroAacEnc.exe -q 0.34 -if ref.wav -of out.mp4

Vorbis VBR@90.4 (Xiph 1.3.3)
OggEnc v2.87 (libVorbis 1.3.3): oggenc2 -q2.2 ref.wav

Opus VBR@90.7 (libopus 1.0.2)
opusenc --bitrate 90 ref48.wav (44.1/16 -> 48/24 by Audition CS6)

mp3 VBR@89.1 (Lame 3.99.5)
encode: lame -V7 ref.wav
decode: MAD 32kHz/32bit -> 44.1kHz/24bit by Audition CS6

mp3 VBR@90.2 (Lame 3.99.5)
encode: lame -V6.9 ref.wav
decode: MAD 32kHz/32bit -> 44.1kHz/24bit by Audition CS6

I'm not sure which variant of the lame settings to choose: the first uses the more usual "-V7" setting, while the second has a more appropriate target bitrate.
Other suggestions/remarks are welcome as well.
keeping audio clear together - soundexpert.org

      Reply #1
      At this bitrate you should be using (or at least including) Oggenc2.87 using aoTuVb6.03 which is tuned for lower bitrates/presets.

      Reply #2
      Why don't you use ABR 96 on all of them?

      Reply #3
      At this bitrate you should be using (or at least including) Oggenc2.87 using aoTuVb6.03 which is tuned for lower bitrates/presets.

      Why don't you use ABR 96 on all of them?

      The idea is to use the encoders with their default settings where possible, assuming those are the ones recommended by the developers. It is also good to run this test as a follow-up to the previous one @64.
      keeping audio clear together - soundexpert.org

      Reply #4
      What do you mean by "default settings"? You are not using any "default setting" ("qaac file.wav", for example); almost every AAC CLI defaults to about 128 kbps. The way you are preparing it, this is not a 96 kbps listening test.

      Not trying to be rude but I've never liked your lack of expertise and you call yourself Sound Expert.

      Reply #5
      TOS #2?

      Anyway, there is plenty of precedent for using VBR settings in listening tests even when those settings do not produce the desired bitrate on the test material. Historically, the settings have been chosen by testing them on a large corpus of music.
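To illustrate what testing a VBR setting on a corpus can look like in practice, here is a minimal sketch. It assumes you already have the encoded file size and duration for each track; the function name and the three example tracks are made up for illustration and are not taken from any actual test corpus.

```python
def average_bitrate_kbps(files):
    """Duration-weighted average bitrate of a corpus.

    files: iterable of (size_bytes, duration_seconds) pairs, one per
    encoded track. Container overhead is counted as part of the size.
    """
    total_bits = sum(size * 8 for size, _ in files)
    total_seconds = sum(duration for _, duration in files)
    return total_bits / total_seconds / 1000.0


# Hypothetical corpus: (encoded size in bytes, duration in seconds).
corpus = [
    (2_250_000, 200.0),
    (3_400_000, 300.0),
    (1_100_000, 100.0),
]

corpus_kbps = average_bitrate_kbps(corpus)
print(f"corpus average: {corpus_kbps:.1f} kbit/s")
```

Weighting by duration rather than averaging per-file bitrates means long tracks count for more, which matches how much storage the setting would use on a real collection. A single test sample can still land well above or below this figure.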

      There has been plenty of discussion about the usefulness of the results of these kinds of listening tests. His approach to conducting the test is scientifically valid, even though there are doubts over whether the methodology produces results that correlate with human hearing. And the use of VBR over ABR is recommended for a variety of reasons.

      Reply #6
      Default Settings = Recommended by Developers. The settings that they tune every release.
      Sorry for my bad English.

      Reply #7
      LOL WHAT? Please explain further and post references, links, or discussions where developers "tune" only one setting instead of the encoder's full capabilities.

      Reply #8
      This is typical practice.

      http://lame.cvs.sourceforge.net/viewvc/lam...amp;view=markup

      Edit: I should clarify. There are typically a handful of presets and settings that are highly tuned. Anything in between these values is often derived by interpolating parameters between neighboring presets. This means that, in practice, the "preset" values are going to produce the best quality:bitrate ratio. Certainly there have been plenty of improvements that affect an encoder's overall quality. But the point remains that anything other than the built-in presets will result in lesser quality (per bit) than the default preset settings.
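As a rough illustration of the interpolation idea, here is a minimal sketch of how an in-between setting such as "-V6.9" could be derived from two tuned neighbors. The parameter names and numbers are invented for the example; they are not LAME's actual preset tables, and a real encoder may interpolate differently.

```python
import math

# Hypothetical tuned presets: quality level -> parameter values.
# Names and numbers are made up for illustration only.
PRESETS = {
    6: {"lowpass_hz": 17500.0, "masking_adjust_db": 2.0},
    7: {"lowpass_hz": 16500.0, "masking_adjust_db": 4.0},
}


def interpolate_preset(level):
    """Linearly interpolate every parameter between the two nearest presets."""
    lo, hi = math.floor(level), math.ceil(level)
    if lo == hi:
        # An exact preset: return the tuned values unchanged.
        return dict(PRESETS[lo])
    t = (level - lo) / (hi - lo)
    return {
        name: (1 - t) * PRESETS[lo][name] + t * PRESETS[hi][name]
        for name in PRESETS[lo]
    }


print(interpolate_preset(7))    # a tuned preset
print(interpolate_preset(6.9))  # derived from presets 6 and 7
```

Only the values at 6 and 7 were "tuned" in this toy model; everything in between is a straight blend, which is why an in-between setting need not be as efficient per bit as its tuned neighbors.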

      Reply #9
      I wonder how well interpolating between, e.g., just 2 presets over the whole bitrate spectrum would work: one for acceptable quality at a low bitrate and the other for transparency with maximum compression. Shouldn't this theoretically give more or less ideal efficiency if these presets were perfectly tuned?

      Reply #10
      • The following codecs were added to the 96 kbit/s section for crowd-testing:
        AAC VBR@89.8 (Winamp 5.63) - CVBR, AAC LC
        AAC VBR@92.1 (QTime 7.7.3) - TVBR, AAC LC
        AAC VBR@90.0 (NeroRef 1540) - CVBR, AAC LC
        Vorbis VBR@90.4 (Xiph 1.3.3)
        Opus VBR@90.7 (libopus 1.0.2)
        mp3 VBR@90.2 (Lame 3.99.5)
      • At the moment all codecs from the 96 kbit/s section are under test, though the probability of getting test files for the newly added ones is higher, as they have fewer grades.
      • Listening tests in the 96 kbit/s section, including this one, are performed without artifact amplification.
      • Besides the usual live ratings on the 96 kbit/s page, there will be a full report similar to the one for the previous 64 kbit/s test. This will be standard practice from now on.
      keeping audio clear together - soundexpert.org

       

      Reply #12
      Since when has the Winamp AAC encoder CVBR?

      Winamp uses Fraunhofer AAC Codec with VBR encoding support since v5.62 and AFAIK the only aac encoder that utilizes TrueVBR is by Apple.
      keeping audio clear together - soundexpert.org

      Reply #13
      Winamp's AAC VBR is somewhat between a CVBR and a TVBR. At high bit rates it's more like Apple's TVBR, at low bit rates it's more CVBR. Just call it VBR.

      Chris
      If I don't reply to your reply, it means I agree with you.

      Reply #14
      Thanks. Corrected.
      keeping audio clear together - soundexpert.org

      Reply #15
      Sergei,

      I visit the Russian audio-related sites. Your tests get no reception there, just as anywhere else. Even people who share your mother tongue strongly disagree with your tests.

      Let's see. http://soundexpert.org/encoders-64-kbps
      Code:
      MP3 22.050 kHz - 2.79
      Opus - 2.77
      Vorbis - 2.49

      So, are you suggesting that MP3 at 22.050 kHz, 64 kbps is better than Opus and Vorbis at 48/44.1 kHz, 64 kbps?
      That is neither possible nor realistic.


      A good design is a necessary condition for avoiding 90-100% of future flaws. It's impossible to correct anything with math or statistics after the test has ended.

      You can continue to ignore the people who dislike your tests, or you can start from scratch and work out a good design for future tests.

      Reply #16
      I uploaded the three streams that were used to produce the test files for the codecs you picked: http://www.hydrogenaudio.org/forums/index....st&p=832552
      If you encode the SE test samples with a recent Lame, you'll be even more surprised.

      Also I would be more careful saying "better" when comparing such close values of codec averages.

      In short - yes, those codecs have comparable sound quality.
      keeping audio clear together - soundexpert.org

      Reply #17
      Again, this is how your test is designed, not what can happen in a real scenario.

      Your set of samples is >50% strongly tonal and has only one transient sample?

      5 of 9 samples are strongly tonal.
      1 awkward synthetic sound and only 1 transient sample? And that's all?

      It's very unrepresentative and waaaay out of any real scenario.

      And that's only the start. There are much grosser flaws.

      Please listen to what people are trying to tell you. They disagree with your tests. ALL of them. Don't just hear them; actually listen and try to understand them.

      -хзнч

      Reply #18
      There are multiple "real scenarios". I doubt you can find any finite representative set of test samples. SE has tested codecs with these 9 sound samples since 2001; you can think of this test as the Big Mac Index of audio. The latter is not representative either, but it is still meaningful.
      keeping audio clear together - soundexpert.org

      Reply #19
      Let's talk about what we've got on the table rather than jumping into endless philosophical talk.

      5 tonal, 1 transient -> not representative. Nada.
      Why in the world would you do that? It's 5 vs 1. It's not balanced. Why?

      Of course it's impossible to find a 100% ideal representative set of samples. But why not just include different types of samples?

      A set with equal shares of different sample types (20% tonal, 20% transient, 20% speech, 20% mixed, 20% stereo, etc.) would already be much more representative.

      Reply #20
      These sound samples were chosen more than 10 years ago. They represent different types of audio material. They will not be changed in the near future at least. Period.
      keeping audio clear together - soundexpert.org

      Reply #21
      Wunderbar,

      Good luck with your tests.

      Reply #22
      Detailed results of this listening test are available.



      In an attempt to make the graphical representation of the results more informative and easier to interpret for an average user, I tried a slightly different approach to calculating the resulting mean scores of codecs and to comparing codecs in general. The main ideas of the approach are below.

      Analysis of the collected grades ends at the point when means and confidence intervals for the sound excerpts have been computed. The set of such means for each codec (the colored ones) characterizes its performance on the chosen sound material most completely. Further averaging of grades into a single parameter, the overall mean, discards too much information, and its use for comparing codecs is very questionable, whereas comparing sets of means allows more comparison techniques to be developed and applied.

      The simplest way to compare different sets of means with each other is to compare their averages. This gives the first crude (preliminary) integral estimator of codec performance: the average of the means in the set. The confidence interval for this average (computed by bootstrapping only, as the distribution of means in a set is not normal and varies from set to set) has a clear and useful meaning from the user's perspective: if more sound samples were used in a listening test, their average would fall into that interval with high probability. Comparing such confidence intervals is therefore another meaningful method for comparing sets of means. Different estimators of the variance of the means in a set could also be helpful (the range of means in a set is shown on the SE rating bar graphs).

      As the most important question for an end user of codecs is something like "which codec is better?", even a simple direct comparison of sets of means could give a clear answer. For example, it could be conventionally defined that codec A is better than codec B only if all means in the set of codec A are higher than the corresponding means in the set of codec B (in other words, every sound sample of codec A must be graded higher in the listening test). The probability that this could happen by chance is very low and depends on the number of sound excerpts in a set.

      So keeping and comparing the full sets of means seems more revealing, has stronger research potential and in most cases is simpler to perform and interpret than trying to draw helpful inferences by comparing over-aggregated overall means of grades.
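The two comparison methods described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the per-excerpt means are made-up numbers, a plain percentile bootstrap stands in for whatever bootstrap variant SE actually uses, and the chance estimate assumes independent excerpts with 50/50 odds each when the codecs do not really differ.

```python
import random


def bootstrap_ci(means, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the average of per-excerpt means."""
    rng = random.Random(seed)
    n = len(means)
    # Resample the excerpts with replacement and average each resample.
    averages = sorted(
        sum(rng.choices(means, k=n)) / n for _ in range(n_resamples)
    )
    low = averages[int((alpha / 2) * n_resamples)]
    high = averages[int((1 - alpha / 2) * n_resamples) - 1]
    return low, high


def dominates(means_a, means_b):
    """True only if codec A scores higher than codec B on every excerpt."""
    return all(a > b for a, b in zip(means_a, means_b))


def chance_of_dominance(n_excerpts):
    """Probability of winning on every excerpt by luck alone (assumed 50/50 each)."""
    return 0.5 ** n_excerpts


# Hypothetical per-excerpt mean grades for two codecs, 9 excerpts each.
means_a = [4.2, 3.9, 4.5, 4.0, 3.7, 4.4, 4.1, 3.8, 4.3]
means_b = [4.0, 3.6, 4.4, 3.9, 3.5, 4.1, 3.9, 3.7, 4.0]

ci_low, ci_high = bootstrap_ci(means_a)
print(f"codec A average of means: 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
print("A better than B on every excerpt:", dominates(means_a, means_b))
print("chance of that with 9 excerpts:", chance_of_dominance(9))
```

With 9 excerpts the all-excerpts rule is strict: under the 50/50 assumption it would be met by luck in about 0.2% of cases, but it also declares no winner whenever a single excerpt goes the other way.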

      The figure above helps to make such comparisons visually and shows resulting averages with confidence intervals computed according to the described approach.

      The raw grades collected for this listening test are in the article: http://soundexpert.org/news/-/blogs/opus-a...-kbit-s#results
      The article about the previous @64 listening test was also supplemented with a similar graph, so you can compare both old and new: http://soundexpert.org/news/-/blogs/opus-a...n#update11-2013
      keeping audio clear together - soundexpert.org

      Reply #23
      Following this very painful but insightful discussion, it became clear that the above calculation of overall confidence intervals (the wide ones) using sample means is not correct. Any arbitrary set of samples (especially a small one) chosen for a listening test is representative of some unknown/undefined general population of music. Consequently, the confidence intervals calculated for this unknown population have little or no meaning. The results of such a test can't be generalized beyond this set of samples. Giving up that generalization makes it possible to drop the separation into samples from the analysis of overall means and to treat all grades of all samples as a single, indivisible entity. Confidence intervals of overall means calculated from the grades turn out to be small. This increase in test power is the reward for giving up the generalization of the test.
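A small sketch of why the two calculations give such different interval widths. The grades are invented, and a normal approximation (z = 1.96) is used for both intervals purely to keep the comparison simple; it is an assumption of this example, not SE's actual procedure.

```python
import math
import statistics

Z_95 = 1.96  # normal-approximation multiplier for a 95% interval


def half_width(values):
    """Half-width of an approximate 95% confidence interval for the mean."""
    return Z_95 * statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical grades: 3 sound samples, 20 grades each.
grades_by_sample = {
    "sample_1": [2, 3] * 10,
    "sample_2": [3, 4] * 10,
    "sample_3": [4, 5] * 10,
}

# Wide interval: treat the per-sample means as the data (n = number of samples).
sample_means = [statistics.mean(g) for g in grades_by_sample.values()]
wide = half_width(sample_means)

# Narrow interval: pool every grade into one set (n = number of grades).
all_grades = [g for grades in grades_by_sample.values() for g in grades]
narrow = half_width(all_grades)

print(f"from sample means: +/- {wide:.2f}")
print(f"from pooled grades: +/- {narrow:.2f}")
```

The pooled interval is narrower because it only reflects listener-to-listener variation on these particular samples; it says nothing about how the codec would score on other material, which is the generalization being given up.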

      Taking all this into account, SE returns to the initial/standard calculation of overall confidence intervals. The correct version is below.

      keeping audio clear together - soundexpert.org