Topic: [split] Help with long files (over 8 hours) using exhale encoder (topic derived from exhale - Open Source ...)

Re: [split] Help with long files (over 8 hours) using exhale encoder

Reply #25

I am getting 23 to 25 kbps, resampling speech to 22050 Hz, lowpass 10300, and Exhale 0.

I would like to play around with speech bit rates closer to 21 through 18 kbps, to see if it is possible to get reasonable files at those bit rates... I did only one speech file so far with Exhale on the A mono setting, and got 20 kbps, which didn't sound too bad.
That sounds reasonable, but you shouldn't need the extra lowpass with preset 0 then, since the audio bandwidth is only 11 kHz anyway with a 22050 Hz sampling rate (for up to 27 hours and 3 minutes of encoding at once, if I calculated correctly). Modern encoders are much more intelligent in choosing how many bits to spend on high-frequency coding, so you don't have to worry about these things anymore (it's basically a relic from the "old mp3 ways of audio coding").

Chris


I have just played around with resampling to 16000. Surprisingly, it doesn't sound too different from the 22050 sampling rate. The bit rate was about 1 or 2 kbps lower, I think. (With the same speaker and the same lecture, I got samples of 27 kbps with 44.1 kHz at B, 25 kbps at 22050 Hz with 0, and 24 kbps at 16000 Hz with q0, which is fine, but I think it is possible to get lower with functionally the same quality. A -1 and -2 Exhale q setting would be nice to play with, especially on stuff I would only listen to once, where I might not even enjoy the topic of discussion.) I suspect a tad over 36 hours in one encoding would be possible at 16000, if my math is correct.
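For anyone who wants to redo the duration math: the 27 hours and 3 minutes quoted above for 22050 Hz is consistent with a cap of 2^31 sample frames per encode. That cap is an inference from the quoted figure, not something stated in this thread, so treat the sketch below as an estimate under that assumption rather than a documented exhale limit.

```python
# Assumption (inferred, not confirmed in the thread): one encode can hold
# at most 2**31 sample frames, regardless of the channel count.
MAX_FRAMES = 2 ** 31


def max_duration_seconds(sample_rate_hz):
    """Longest recording, in seconds, that fits under the assumed cap."""
    return MAX_FRAMES / sample_rate_hz


def as_hours_minutes(seconds):
    """Split a duration into whole hours and whole minutes (rounded down)."""
    total_minutes = int(seconds // 60)
    return divmod(total_minutes, 60)


for rate in (48000, 44100, 32000, 22050, 16000):
    hours, minutes = as_hours_minutes(max_duration_seconds(rate))
    print(f"{rate:>6} Hz: about {hours} h {minutes:02d} min")
```

Under that assumption, 22050 Hz comes out at 27 h 03 min and 16000 Hz at 37 h 16 min, so "a tad over 36 hours" is in the right neighborhood.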

I will try to stop my lowpass filter tendencies, but I might need to be weaned off the habit. No cold turkey, as I fear withdrawal symptoms with some special gum to chew or something. :-)

Re: [split] Help with long files (over 8 hours) using exhale encoder

Reply #26
Hello,

Resampling with ffmpeg: Since I am using ffmpeg, I think it only allows -ar resampling at 48000, 44100, 32000, 22050, 16000, 11025. If I put in 24000, I recall it rounding to 22050. There is supposed to be an option, -af "aformat=sample_fmts=s16:sample_rates=%Resampling%", but it doesn't seem to work, since the console reports a regular sample rate, and the playback of the Exhale file is 44.1 or 48 kHz (I don't recall which).

ffmpeg has no problem resampling to 24 kHz. In fact you can input whatever value: 24000, 12345, 501, etc. Whether the container/codec supports these unusual rates is another problem (wav is OK).

If you'd like some placebo extra quality, add
Code:
-af aresample=resampler=soxr:precision=28
after your -ar %Resampling% parameter.
FFmpeg Resampler Documentation
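If it is unclear which rate a resampled file really ended up with, one option is to write the wav to disk once and read the rate back from its header instead of trusting the console output. A minimal sketch with Python's standard wave module follows; the file name and the helper names are made up for illustration, and it assumes plain 16-bit PCM wav as produced with -c:a pcm_s16le.

```python
import os
import tempfile
import wave


def wav_sample_rate(path):
    """Return the sample rate (Hz) stored in a PCM wav file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def write_silent_wav(path, rate_hz, seconds=1):
    """Write a mono 16-bit wav of silence, used here only as a stand-in file."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(rate_hz)
        wav_file.writeframes(b"\x00\x00" * rate_hz * seconds)


# Stand-in for a file that ffmpeg would have written at an unusual rate.
demo_path = os.path.join(tempfile.mkdtemp(), "resampled_24k.wav")
write_silent_wav(demo_path, 24000)
print(wav_sample_rate(demo_path))
```

The same wav_sample_rate call can be pointed at a short test file written by ffmpeg before switching back to piping.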

    AiZ

I am still getting interleaving complaints or rounding when I try a non-typical sample rate. But I am piping the wav to Exhale. It might be that Exhale only accepts certain sample rates.

I added your placebo sox resample af filter, just in case it helps.   At least, I don't see any console complaints, so I think it must be working splendidly.   I may even get around to googling what it does, someday.

Re: [split] Help with long files (over 8 hours) using exhale encoder

Reply #27

Hello Chris,
 
    You should add a paragraph to https://gitlab.com/ecodis/exhale under the section "Third-party stdin (foobar2000)", called "Third-party Piping (ffmpeg recording)":
 
“After downloading ffmpeg from https://ffmpeg.org/download.html, find the sound device you wish to record from by running this line in the command console:
ffmpeg -list_devices true -f dshow -i dummy
This lists your audio devices with names something like "Stereo Mix (Realtek(R) Audio)", which ffmpeg.exe can use as the input to pipe to exhale.exe. Useful example console lines are:

ffmpeg -list_devices true -f dshow -i dummy
(Lists the valid audio devices you can record from and pipe to exhale in real time.)

ffmpeg -f dshow -i audio="Stereo Mix (Realtek(R) Audio)" -c:a pcm_s16le -f wav -t 3600 - | exhale b output.m4a
(A simple stereo 3600-second, i.e. one-hour, recording example with the b quality. Roughly 12 hours is the maximum recording time for a non-resampled wav in one encoding session.)

ffmpeg -f dshow -i audio="CABLE Output (VB-Audio Virtual Cable)" -ac 1 -c:a pcm_s16le -f wav -t 72000 -ar 20050 -af aresample=resampler=soxr:precision=28 - | exhale b output.m4a
(A 20-hour mono recording, with resampling to 20050 Hz and audio filter arguments. Quality is b in the example.)

ffmpeg -f dshow -i audio="CABLE Output (VB-Audio Virtual Cable)" %monovar% -c:a pcm_s16le -f wav -t %seconds% -ar %Resampling% -af "lowpass=%lowpass%,aresample=resampler=soxr:precision=28,silenceremove=stop_duration=45:stop_threshold=-50dB" - | exhale %ExhaleQuality% %album%.m4a
(A line that uses variables, as set or calculated in a larger batch file for automation. The filters are joined with commas in a single -af, because ffmpeg only applies the last -af given for a stream. Note that I am not fully confident that the silenceremove filter doesn't have a flaw and actually stops the recording of silence at the end of the recording, but I include it because it doesn't hurt and is an example of where audio filter arguments go in the command line.)”
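One detail worth spelling out for the batch-file variant: when several -af options are given for the same output, ffmpeg uses only the last one, so filters are normally joined with commas into a single chain. Below is a small sketch of how a script could assemble that chain and the recording arguments. The function names are hypothetical, the silenceremove settings are copied from the line above and are just as untested here, and the soxr resampler needs an ffmpeg build that includes libsoxr.

```python
import subprocess


def audio_filter_chain(lowpass_hz=None, use_soxr=True, trim_silence=False):
    """Join the selected filters into one comma-separated -af value."""
    filters = []
    if lowpass_hz is not None:
        filters.append(f"lowpass=f={lowpass_hz}")
    if use_soxr:
        filters.append("aresample=resampler=soxr:precision=28")
    if trim_silence:
        # Copied from the thread; not verified to trim trailing silence.
        filters.append("silenceremove=stop_duration=45:stop_threshold=-50dB")
    return ",".join(filters)


def record_args(device, seconds, rate_hz, mono, filters):
    """Build the ffmpeg argument list that writes a wav stream to stdout."""
    args = ["ffmpeg", "-f", "dshow", "-i", f"audio={device}"]
    if mono:
        args += ["-ac", "1"]
    args += ["-ar", str(rate_hz)]
    if filters:
        args += ["-af", filters]  # exactly one -af, holding the whole chain
    args += ["-c:a", "pcm_s16le", "-f", "wav", "-t", str(seconds), "-"]
    return args


chain = audio_filter_chain(lowpass_hz=10300, trim_silence=True)
args = record_args("CABLE Output (VB-Audio Virtual Cable)", 72000, 22050, True, chain)
# list2cmdline applies Windows-style quoting, matching the console lines above.
print(subprocess.list2cmdline(args) + " | exhale b output.m4a")
```

The printed line is only for inspection; nothing is recorded or encoded by this sketch.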
Chris, I initially googled "xHE-AAC ffmpeg plugin", which gave a page (I don't recall which) where, I believe, you decided a plugin wasn't necessary because of MainConcept's plugin. I rejected that as an option because of the high price and the target audience of professional video encoders. Adding to that thread the possibility of piping an ffmpeg live stream to exhale (and possibly on to an online streaming radio server with a triple-pipe command line) would obviate the need to write any ffmpeg plugin.

Unfortunately, I spent a good long day googling piping on Windows and trying different things before I gave up, having only achieved a command line that piped unaltered stereo to exhale, while none of the ffmpeg arguments worked. I posted on Hydrogenaudio and 3-4 days later, someone posted a working pipe line. I had bad syntax and ordering in my ffmpeg line. His post saved me from having to run a RAM disk, which cut down on my usable RAM and risked losing hours of recording in a power outage or accidental hibernation.

If you add the suggested pipe option (or rewrite my instructions with your added knowledge) to https://gitlab.com/ecodis/exhale, it would save other people who wish to record with exhale lots of hours and frustration. (If the old thread discussing the ffmpeg plugin can be dug up again, necromanced and appended with the option to pipe to Exhale, with the example line, that thread wouldn't be a dead end for people googling how to do this. I think the thread was closed by you 2 years ago, here, https://gitlab.com/ecodis/exhale/-/issues/7, without any real solution for amateurs. I can see my professional video editor brother buying the MainConcept plugin, since he rents software for $2000 a month that he bills to his clients. But I can't justify taking the money from the family budget just to get a better 24 kbps speech file, plus the MainConcept plugin looks like it has a steep learning curve and is probably not useful for what I wish to use xHE-AAC for. Transcoding doesn't fit my needs either, and takes forever too.)


 

Re: [split] Help with long files (over 8 hours) using exhale encoder

Reply #29
I will note that I hear a hint of peak scratchiness with Exhale q0 when resampling to 22050 Hz to get 24 hours of recording, which reminds me of the peak grain in <30 kbps Opus files, which I was using Exhale to avoid.

I don't think it is as obvious with 32000 Hz resampling in Exhale files at q0. This would mean I could get acceptable files from 13.6 to 18.2 hours, resampling to 32000 Hz using the pipe method. Anything over 18 hours probably should be broken into 2 recording sessions (tracks). Else, just record a larger-bitrate Opus or FDK ABR file in one track.

I am consistently getting about 2 kbps larger files using Exhale at quality 0 vs. B, with the same apparent sound quality. I have done about a dozen and a half comparison files so far, which isn't enough to declare that my results hold for all samples. And I don't claim to have the best headset currently. If I need to do ABX, then the quality difference is not enough to worry about, as far as I am concerned, for speech. I am just interested in not getting obvious artifacts that might become annoying down the road, as I become more conscious of the artifact.

To recap my personal speech compression philosophy, which has worked well for several decades of listening to tens of thousands of hours of speech: I personally don't believe there is anything worth encoding above a 10300 Hz lowpass on most speech files, but I would accept, and have accepted, a lowpass as low as 7800 Hz in order to get much smaller encoded speech files (if such a tradeoff were possible in modern encoders), since clarity and information reside below 8 kHz in human speech. 8 kHz to 10.3 kHz adds a feeling of openness, which is not even noticed in a loud work area using a passive noise-canceling earpiece. But content above 10.3 kHz, in the right playback circumstances, is great for background sounds and for storytelling with stereo background sounds or acting out a story. I do realize that I may be more conscious of (and annoyed by) inefficiency and unsustainability than most Americans, due partly to my impressionable formative years in the 1970s, when efficiency and sustainability were driven into students, like carbon is today. Plus, I have managed to fill up one 128 gig phone with audio.