Topic: What is the recommended setting for speeches?

Reply #25
Probably would be transparent, although you really should listen to some results and then decide if they are suitable for your needs.

As for making sense ... well.  --vbr is the default.  You can specify it without harm but it is unnecessary.  Similarly, --comp 10 is also the default; the option is provided only so that the high CPU load (aka SLOW) of Opus encoding can be reduced in situations where it would be a problem.  I would not specify --ignorelength unless you are actually getting problems without it.  In most cases, opusenc will work without this option, but for streaming audio through stdin it is probably needed, and occasionally for input files that don't specify the data length appropriately.  With it, you may get undefined results from bad input files.

Lastly, ignoring everything that has been said so far, you have specified a non-standard frame size.  I see nothing to indicate that you need this frame size, or that it will be of any benefit to you.  The default is 20ms; use it unless you really know why you need some other value.  At a bitrate of 70kbps for mono speech, you will almost certainly get straight CELT coding, and it does not even support a frame size above 20ms, so you are fudging together weird composite packets, increasing latency and complexity, for no good reason.


Yeah, just using most settings to be "safe"; it feels better when I see the settings.

I am also listening to the results, and I would say 60+ is where it starts to get interesting.

I will do without --ignorelength then. I only used it because it seems pretty common, and if I have problems I will use it, I guess.

Not sure about the frame size. From what I have read, latency only really matters for streaming, where you have to send a packet of 20ms (the frame size) before it can play, meaning a delay of one frame size is always added to the latency.

And at lower bitrates, frame size directly affects quality: a larger frame size will make lower bitrates sound better (pretty much like increasing the bitrate, from my understanding).

No idea what you mean by "CELT coding" and not supporting 20ms, though?

Reply #26
And at lower bitrates, frame size directly affects quality: a larger frame size will make lower bitrates sound better (pretty much like increasing the bitrate, from my understanding).

At 70kbps, you won't get any 60ms frames. Instead, you'll get groups of three 20ms frames glued together. This is why lithopsian and I suggest using 20ms frames.

Reply #27
Larger frames can be helpful in reducing the overhead at very low bitrates, which leaves more bits for maintaining quality audio.  Very low in this context being around 10kbps.  At 70 kbps there is nothing to be gained by forcing packets beyond 20ms.
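
To put rough numbers on the point above, here is a minimal Python sketch (not from any poster in this thread) that estimates Ogg container overhead for a given bitrate and frame size. It assumes a 27-byte Ogg page header, one lacing byte per started 255 bytes of packet, constant packet sizes, and roughly one Ogg page per second; the one-page-per-second figure and the function name `estimated_ogg_overhead` are assumptions for illustration. It also ignores the bytes Opus spends inside each packet, so treat the results as ballpark figures only.

```python
import math

OGG_PAGE_HEADER_BYTES = 27   # fixed part of every Ogg page header
OGG_SEGMENT_BYTES = 255      # one lacing byte covers at most 255 packet bytes


def estimated_ogg_overhead(bitrate_kbps, frame_ms, pages_per_second=1.0):
    """Estimate container overhead as a fraction of the total file size.

    pages_per_second=1.0 is an assumption, not a measured value.
    """
    packets_per_second = 1000.0 / frame_ms
    payload_per_second = bitrate_kbps * 1000.0 / 8.0
    packet_bytes = payload_per_second / packets_per_second
    # A packet of n bytes needs floor(n / 255) + 1 lacing bytes.
    lacing_per_packet = math.floor(packet_bytes / OGG_SEGMENT_BYTES) + 1
    overhead_per_second = (pages_per_second * OGG_PAGE_HEADER_BYTES
                           + packets_per_second * lacing_per_packet)
    return overhead_per_second / (payload_per_second + overhead_per_second)


for kbps in (10, 70):
    for ms in (20, 60):
        print(f"{kbps} kbps, {ms} ms frames: "
              f"{estimated_ogg_overhead(kbps, ms):.2%} overhead")
```

Under these assumptions the estimate drops noticeably from 20ms to 60ms packets at 10 kbps, while at 70 kbps it comes out the same for both, because a packet larger than 255 bytes needs extra lacing bytes anyway.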

Reply #28
Yeah, just using most settings to be "safe"; it feels better when I see the settings.

Yeah no, you are just risking overriding default settings which you don't understand.
Quote
No idea what you mean by "CELT coding" and not supporting 20ms, though?

Uh huh.  Use the defaults.

Reply #29
Yeah, just using most settings to be "safe"; it feels better when I see the settings.

Yeah no, you are just risking overriding default settings which you don't understand.
Quote
No idea what you mean by "CELT coding" and not supporting 20ms, though?

Uh huh.  Use the defaults.


Answering both of you.

I will ignore the framesize and let it stay at its default (20ms).

As for overriding: it shouldn't really matter with those settings. VBR is the mode, nothing weird there.
Comp sets the "preset" (the slowest one here), nothing more.
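
For what it's worth, the "leave the defaults alone" advice can be sketched as a small Python helper that builds an opusenc command line and only adds an option when it differs from the defaults discussed in this thread (VBR, complexity 10, 20ms frames). The helper name `build_opusenc_command` and the file names are hypothetical; the flags are the ones already mentioned above plus `--bitrate`. This only builds the argument list and does not run the encoder.

```python
import shlex


def build_opusenc_command(infile, outfile, bitrate_kbps=None,
                          framesize_ms=None, ignore_length=False):
    """Build an opusenc argument list, omitting options left at their defaults."""
    cmd = ["opusenc"]
    if bitrate_kbps is not None:
        cmd += ["--bitrate", str(bitrate_kbps)]
    # 20 ms is the default frame size, so only pass other values explicitly.
    if framesize_ms is not None and framesize_ms != 20:
        cmd += ["--framesize", str(framesize_ms)]
    # Only needed when the input length is unknown or unreliable (e.g. stdin).
    if ignore_length:
        cmd.append("--ignorelength")
    cmd += [infile, outfile]
    return cmd


# Hypothetical file names, 70 kbps as discussed in the thread.
cmd = build_opusenc_command("speech.wav", "speech.opus", bitrate_kbps=70)
print(shlex.join(cmd))
```

The resulting list could be handed to `subprocess.run(cmd, check=True)` on a machine where opusenc is installed.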

Reply #30
Larger frames can be helpful in reducing the overhead at very low bitrates, which leaves more bits for maintaining quality audio.  Very low in this context being around 10kbps.  At 70 kbps there is nothing to be gained by forcing packets beyond 20ms.


I'm encoding speech at 16 kbps. No latency concerns, no VoIP.  Here's what I'm seeing for overhead with framesize 20:

      Encoded: 1 hour, 13 minutes, and 0.82 seconds
      Runtime: 1 minute and 30 seconds
               (48.68x realtime)
        Wrote: 9063069 bytes, 219041 packets, 4383 pages
      Bitrate: 15.9329kbit/s (without overhead)
Instant rates: 5.2kbit/s to 25.6kbit/s
               (13 to 64 bytes per packet)
     Overhead: 3.73% (container+metadata)

So if I increase the frame size, I decrease overhead, correct? At framesize 40, I'd get an overhead of ~1.5%+ (I realize it's not linear).

Shouldn't I _always_ set the framesize to 40 or 60 when I don't care about latency? I gain some more bits for audio (though fewer and fewer) at no cost. If not, when would you use framesizes of 40 or 60?

On another topic, why would I ever specify 10 kbps? I thought Opus only had fixed bitrates, so if I specified 10 kbps I'd get 16 (or 8?).

Thanks
sean

Reply #31
Frame size doesn't impact overhead directly, but the Ogg container overhead increases steadily when the packet size drops below 255 bytes.  At 16 kbps that would happen with frames around 120ms.  So you should be able to significantly drop your overhead with larger frames.  The obvious thing to do is to try it, you can use 60ms frames (effectively, don't worry about the internals) at this bitrate.
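
As a rough check of where that overhead comes from, the following Python sketch recomputes it from the packet and page counts in the encoder output quoted above. It assumes a 27-byte header per Ogg page plus one lacing byte per packet, which holds here because every packet is reported as 13 to 64 bytes, well under 255. The function name `container_overhead` is made up for this example, and the small amount of header/tag data is ignored.

```python
OGG_PAGE_HEADER_BYTES = 27  # fixed part of every Ogg page header


def container_overhead(total_bytes, packets, pages, lacing_per_packet=1):
    """Fraction of the file spent on Ogg page headers and lacing bytes."""
    overhead_bytes = pages * OGG_PAGE_HEADER_BYTES + packets * lacing_per_packet
    return overhead_bytes / total_bytes


# Figures from the framesize 20 run: 9063069 bytes, 219041 packets, 4383 pages.
share = container_overhead(9063069, 219041, 4383)
print(f"estimated container overhead: {share:.2%}")
```

This lands at about 3.7%, close to the 3.73% that opusenc reported, and most of it is the one lacing byte per packet. Fewer, larger packets therefore cut the overhead roughly in proportion to the packet count.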

P.S.  I assume you don't have actual metadata (i.e. tags) in there?

You would use 10 kbps if you wanted to save space and the quality is acceptable for your use.  Much bigger gain than worrying about shaving a percent off your overhead.  You can specify essentially any bitrate you want in Opus (min = 6 kbps?  max 510 kbps).  Certainly it isn't limited to powers of 2.  With the most common quality settings you won't be guaranteed the exact bitrate you asked for, but that can be arranged too if it is important.

 

Reply #32
Interesting results using framesize 60:

               framesize 20                               ::  framesize 60
Encoded:       1 hour, 16 minutes, and 15.84 seconds      ::  1 hour, 16 minutes, and 15.84 seconds
.........
Wrote:         9448175 bytes, 228792 packets, 4578 pages  ::  8959943 bytes, 76264 packets, 4769 pages
Bitrate:       15.9009kbit/s (without overhead)           ::  15.305kbit/s (without overhead)
Instant rates: 4.8kbit/s to 27.2kbit/s                    ::  5.6kbit/s to 21.7333kbit/s
               (12 to 68 bytes per packet)                ::  (42 to 163 bytes per packet)
Overhead:      3.74% (container+metadata)                 ::  2.3% (container+metadata)

There's a 5% reduction in file size, which is good, but that seems to be primarily caused by a 4% decrease in bitrate - which is not good.

So increasing framesize will decrease file size, but, oddly, at the expense of quality.

Curious.
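
To see how the saving splits up, here is a small Python sketch that separates audio bytes from container overhead using only the "Wrote" and "Overhead" figures reported above. The helper name `split_bytes` is made up for this example, and the reported overhead percentages are rounded, so the result is approximate.

```python
def split_bytes(total_bytes, overhead_fraction):
    """Split a file size into (audio_bytes, overhead_bytes)."""
    overhead = total_bytes * overhead_fraction
    return total_bytes - overhead, overhead


# Reported figures: framesize 20 vs framesize 60.
audio_20, over_20 = split_bytes(9448175, 0.0374)
audio_60, over_60 = split_bytes(8959943, 0.023)

saved_total = 9448175 - 8959943
saved_audio = audio_20 - audio_60
saved_over = over_20 - over_60

print(f"total saved:        {saved_total} bytes "
      f"({saved_total / 9448175:.1%} of the framesize 20 file)")
print(f"from audio payload: {saved_audio / saved_total:.0%}")
print(f"from container:     {saved_over / saved_total:.0%}")
```

On these numbers roughly 70% of the saving comes from fewer audio bytes and roughly 30% from reduced container overhead, which fits the observation that the lower bitrate accounts for most of the difference.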