Are there switches in OpusEnc not in FFMPEG?

Are there switches in OpusEnc not in FFMPEG?

2023-01-08 01:05:54

I can't find the --speech argument in the ffmpeg documentation or by googling.

I have tried everything to get rid of the hint of annoying peak grain when encoding speech to Opus files at <24 kbps with ffmpeg. I am wondering whether opusenc has some switch that I overlooked by only ever encoding Opus files with ffmpeg.

Re: Are there switches in OpusEnc not in FFMPEG?

Reply #1 – 2023-01-08 01:38:41

Add -application voip

Re: Are there switches in OpusEnc not in FFMPEG?

Reply #2 – 2023-01-08 10:08:58

https://ffmpeg.org/ffmpeg-codecs.html#libopus-1

Code:

application (N.A.)
    Set intended application type. Valid options are listed below:

    ‘voip’
        Favor improved speech intelligibility. 

    ‘audio’
        Favor faithfulness to the input (the default). 

    ‘lowdelay’
        Restrict to only the lowest delay modes.

Re: Are there switches in OpusEnc not in FFMPEG?

Reply #3 – 2023-01-09 15:15:13

-frame_duration 60 intrigues me, but I am reading conflicting advice, which I suspect comes from people repeating the same advice regardless of whether the content is speech, movies, or music. The descriptions are also fuzzy, using undefined language like "fairly low bitrate". I consider 16 kbps and below a low bitrate, and anything above 20 kbps inefficient for audiobooks, while many people consider anything under 32 kbps or even 64 kbps low. Also, I am making Opus files, which is another variable that is cited vaguely as a reason not to use a custom frame duration, possibly because files are usually movies or music in most people's minds.

I can see the benefit of letting the encoder choose the frame duration, but I can also see how increasing it might improve compression at the same quality. I would greatly appreciate every kbps saved in the 18-22 kbps range, because each one translates into several more hours of speech audio per gig.
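As a rough check on what a saved kbps is worth, here is a minimal sketch of the arithmetic. It assumes 1 GB = 10^9 bytes and ignores Ogg container overhead; hours_per_gb is a hypothetical helper name, not part of any tool.

```shell
# Hours of audio that fit in one gigabyte at a given average bitrate.
# Assumptions: 1 GB = 10^9 bytes (8 * 10^9 bits), no container overhead.
hours_per_gb() {
    kbps=$1
    # bits in one GB divided by bits per second gives seconds of audio
    seconds=$(( 8000000000 / ($kbps * 1000) ))
    # integer division, so the result is rounded down to whole hours
    echo $(( seconds / 3600 ))
}

for rate in 16 18 20 22 24; do
    echo "$rate kbps: about $(hours_per_gb "$rate") hours per GB"
done
```

Under those assumptions, going from 22 kbps down to 18 kbps gains roughly 22 hours per GB, or about 5 to 6 hours for each kbps saved.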

So, I don't know if I would benefit from using that argument. I did one half-hour or hour-long test, but I was using -application audio with ffmpeg. I don't currently have a working pair of good headphones, and I can't trust my current cheap headphones to tell me empirically whether changing the frame duration would help quality or size, especially if the improvement were well under 20 or 30 percent, which is in the placebo range and uncertain without further tests.

Re: Are there switches in OpusEnc not in FFMPEG?

Reply #4 – 2023-01-09 19:20:56

Quote from: degarb on 2023-01-09 15:15:13

-frame_duration 60 intrigues me, but I am reading conflicting advice, which I suspect comes from people repeating the same advice regardless of whether the content is speech, movies, or music. The descriptions are also fuzzy, using undefined language like "fairly low bitrate".

The description is fuzzy because the exact bitrate where it can have a positive impact depends on the other encoding settings and the music/speech detection. In my own testing, I usually saw the encoder taking advantage of 60ms frames below 16kbps, but my tests were not thorough.

Re: Are there switches in OpusEnc not in FFMPEG?

Reply #5 – 2023-01-10 08:09:13

I'm not sure what "annoying peak grain" amounts to. The more detail you can give about what you mean, the better someone may be at making that diagnosis.

Opusenc's documentation lists the options available. Some niche uses benefit from the advanced set-ctl options, but those aren't very useful for normal use cases.

See my previous reply on the topic of long frames. In sum: Long frames exist primarily to help reduce wasted bandwidth from transport overhead when streaming. Their impact on "at-rest" encoded files is small. Savings are on the order of 0.2 kbps, and the frames are only actually longer if you're using SILK-only modes. (Unless you tell the encoder otherwise it may use hybrid mode for anything above about 12kbps.) Otherwise it just packs 20ms frames together in a way that reduces container or transport overhead. And there are minor downsides.

Re: Are there switches in OpusEnc not in FFMPEG?

Reply #6 – 2023-01-10 13:59:48

Quote from: jensend on 2023-01-10 08:09:13

I'm not sure what "annoying peak grain" amounts to. The more detail you can give about what you mean, the better someone may be at making that diagnosis.

Opusenc's documentation lists the options available. Some niche uses benefit from the advanced set-ctl options, but those aren't very useful for normal use cases.

See my previous reply on the topic of long frames. In sum: Long frames exist primarily to help reduce wasted bandwidth from transport overhead when streaming. Their impact on "at-rest" encoded files is small. Savings are on the order of 0.2 kbps, and the frames are only actually longer if you're using SILK-only modes. (Unless you tell the encoder otherwise it may use hybrid mode for anything above about 12kbps.) Otherwise it just packs 20ms frames together in a way that reduces container or transport overhead. And there are minor downsides.

If you compare an Opus file with Exhale, xHE-AAC, or LC at <30 kbps, the Opus file has a definite grain or scratchiness that reminds me of a weak FM signal (noise-reduced, so clean on silence), or a speaker with pneumonia and a chest rattle. Exhale and FDK don't have this.

Now, if I resample a WAV file to 22050 Hz and feed it to exhale.exe at the non-HE setting of 0 (Exhale HE can't handle input resampled below 44.1 kHz), I get a similar annoying scratchiness to the Opus files.

I have tried to tweak my Opus encoding to get rid of the scratchiness and grain on peak-volume syllables: adjusting the frame duration, forcing -application to voip, lowering the lowpass to as low as 7800 Hz, resampling, and using the slowest compression level. No luck yet. I suspect Opus is doing something wrong in the source code, because there is no reason we can't get speech as clean (without the annoying hint of grain or static on syllable volume peaks) as what is possible with FDK LC or Exhale LC; the Opus technology should be on par with AAC. If the source code resampled files under 27 kbps to 22050 Hz, it would explain the annoying grain or scratchiness, which isn't immediately apparent to a new user but is definitely present for everyone.

I would rather use Opus than Exhale, because Exhale can only encode so many WAV samples per encoding, which affects recording with ffmpeg (piped to exhale.exe) from the sound card and limits my speech encoding with Exhale to 18 hours before it gets scratchy like Opus. I also like the finer control over the VBR bitrate with Opus, without getting absurdly inefficient like FDK's ABR limitation.

I would rather Opus speech files used a lowpass, down to 7800 Hz, than resample and get scratchiness or grain. I strongly suspect that Opus, when it detects speech on <27 kbps files, internally resamples the WAV to 24000 Hz, encodes, then resamples the output to 48000 Hz, because it sounds like it. They could probably do better with a lowpass than with resampling. But this is speculation on my part. I don't read or compile source code, because of the days I have wasted trying and failing to do so in the past.

Re: Are there switches in OpusEnc not in FFMPEG?

Reply #7 – 2023-01-10 23:42:05

Hello,

I have tried some Opus encoding and I think I (may?) understand what you mean by grain. I have encoded some interviews and get a warble effect when the recorded signal quality drops slightly.

Anyway, I propose this setting for ffmpeg, not transparent but good enough for me:

Code:

-ac 1 -c:a libopus -b:a 20k -cutoff 12000 -application voip -frame_duration 40

You can play with cutoff, but it is restricted to the values 4000, 6000, 8000, 12000, or 20000.

Surprisingly, the bitrate can be lowered a bit by resampling the input before feeding it to the Opus encoder:

Code:

-ac 1 -ar 24000 -af aresample=resampler=soxr:precision=28 -c:a libopus -b:a 20k -cutoff 12000 -application voip -frame_duration 40

AiZ

Re: Are there switches in OpusEnc not in FFMPEG?

Reply #8 – 2023-01-11 09:18:42

If you really want to force Opusenc to produce wideband rather than super-wideband or fullband, you can do something like this:

Code:

opusenc --set-ctl-int 4004=1103 --bitrate 16 sample.wav sample.opus

That set-ctl is actually saying MAX_BANDWIDTH_REQUEST=OPUS_BANDWIDTH_WIDEBAND.
For those who might fiddle with this: the magic numbers are in the main project's opus_defines.h.

I don't know what you'd have to do for ffmpeg.

Re: Are there switches in OpusEnc not in FFMPEG?

Reply #9 – 2023-01-11 11:46:05

Hi,

Quote from: jensend on 2023-01-11 09:18:42

I don't know what you'd have to do for ffmpeg.

Actually, it is the cutoff parameter.

Bandwidth      opusenc --set-ctl-int 4004=   ffmpeg -cutoff
NARROWBAND     1101                          4000
MEDIUMBAND     1102                          6000
WIDEBAND       1103                          8000
SUPERWIDEBAND  1104                          12000
FULLBAND       1105                          20000
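
For anyone scripting both tools, the table above can be sketched as a small shell lookup. cutoff_to_ctl is a hypothetical helper name; the values are just the ones listed in the table.

```shell
# Map an ffmpeg -cutoff value to the matching value for
# opusenc --set-ctl-int 4004=... (values copied from the table above).
cutoff_to_ctl() {
    case "$1" in
        4000)  echo 1101 ;;  # NARROWBAND
        6000)  echo 1102 ;;  # MEDIUMBAND
        8000)  echo 1103 ;;  # WIDEBAND
        12000) echo 1104 ;;  # SUPERWIDEBAND
        20000) echo 1105 ;;  # FULLBAND
        *)     echo "unsupported cutoff: $1" >&2; return 1 ;;
    esac
}
```

Usage would look something like opusenc --set-ctl-int 4004="$(cutoff_to_ctl 8000)" --bitrate 16 sample.wav sample.opus, which should be equivalent to the wideband example earlier in the thread.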

Thanks for the opusenc trick.

AiZ

Re: Are there switches in OpusEnc not in FFMPEG?

Reply #10 – 2023-01-12 07:33:58

Oops,

Quote from: AiZ on 2023-01-10 23:42:05

Surprisingly, the bitrate can be lowered a bit by resampling the input before feeding it to the Opus encoder:

Bad copy-paste: I forgot to remove cutoff, which is useless because of the resampling.

Code:

-ac 1 -ar 24000 -af aresample=resampler=soxr:precision=28 -c:a libopus -b:a 20k -application voip -frame_duration 40
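
The options above are only a fragment, so here is a minimal sketch of a complete invocation wrapped in a shell function. encode_speech_opus and the file names are hypothetical, and the soxr resampler assumes an ffmpeg build that includes libsoxr.

```shell
# Sketch: encode a speech recording to mono Opus with the options above.
# Arguments: input file, output file, optional bitrate (defaults to 20k).
encode_speech_opus() {
    input=$1
    output=$2
    bitrate=${3:-20k}
    # Resample to 24 kHz with soxr, then encode with libopus in voip mode.
    ffmpeg -i "$input" \
        -ac 1 -ar 24000 -af aresample=resampler=soxr:precision=28 \
        -c:a libopus -b:a "$bitrate" -application voip -frame_duration 40 \
        "$output"
}
```

For example, encode_speech_opus interview.wav interview.opus 18k would try the same settings at 18 kbps.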

AiZ
