New Audio File Conversion Tool

New Audio File Conversion Tool

2016-04-27 04:01:17

Hi all,

I have been a long-time lurker of this forum, and have benefited greatly from the resources here.
Now I'm hoping that I can contribute something of value back to the community.

I have been in the process of developing a sample-rate conversion tool. The goal of this is to achieve very clean sample-rate conversions, with a minimum of artifacts. This project started out primarily as a coding exercise to help better understand some DSP principles, but it slowly seems to be growing into a potentially very useful tool. By virtue of the fact that it uses libsndfile, it can convert between around 20 different sound file formats.

The binaries, source code, and documentation are available on GitHub. There are two components:

resampler - CLI program which does the conversion
Ferocious File Converter - a GUI for resampler.exe (copy of resampler.exe included in package)

I have only managed to build it for Windows so far, but I have tried to keep the code reasonably portable for future Linux or Mac builds - just haven't had time to look at that yet. I would also like to add mp3/m4a/opus/wavpack support in the not too distant future.

I encourage everyone to try it out. Any critiques / bug reports / suggestions / contributions you have would be most welcome.

best regards,
Judd

Re: New Audio File Conversion Tool

Reply #1 – 2016-04-28 02:26:27

How does one set the output file in the GUI? I cannot seem to.

Also, would be nice to highlight a bunch of files at once. What src does it use? Wondering how it compares to SoX

Re: New Audio File Conversion Tool

Reply #2 – 2016-04-28 03:14:09

Quote from: ron spencer on 2016-04-28 02:26:27

How does one set the output file in the GUI? I cannot seem to.

Hi Ron - thanks for checking it out. I think I have found what you are referring to:

If you browse to a folder somewhere, and then type a new filename, it fails because the file doesn't exist (yet). Is that what you mean ? ~~i think i can fix that ...~~

Edit: This should be fixed now. The "Select Output File" Dialog will now happily accept files that don't exist.

Note also that with the output filename, if you first choose an input file, and then click into the blank "Output File" box, it will automatically generate a default one for you by appending "(Converted)" to the name (before the extension). You can also leave the output file box blank, and it will create the default (converted) name as well.

Quote from: ron spencer on 2016-04-28 02:26:27

Also, would be nice to highlight a bunch of files at once.

Agreed. batch-processing is definitely on my to-do list.

Quote from: ron spencer on 2016-04-28 02:26:27

What src does it use? Wondering how it compares to SoX

I wrote my own algorithms to do the SRC, but they basically use the same interpolation/decimation techniques that other SRCs use. When using a "simple" ratio (e.g. 1:2 or 2:1), it uses a 511-tap Kaiser-windowed FIR lowpass filter. When using "complicated" ratios (e.g. 96k->44.1k), it uses a much larger filter (32767 taps). This is of course slower, but the results are very clean, with detectable aliasing suppressed to under -150dB (below the noise floor of 16- and 24-bit).
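For readers who want to see what a Kaiser-windowed FIR lowpass looks like in practice, here is a minimal pure-Python sketch. It is not ReSampler's actual code: the cutoff (0.225 cycles per sample), the 150 dB design target and the helper names are assumptions chosen for illustration, and the beta value comes from Kaiser's standard empirical formula.

```python
import cmath
import math
from fractions import Fraction


def bessel_i0(x: float) -> float:
    """Modified Bessel function I0, summed from its power series."""
    total = term = 1.0
    k = 1
    while term > 1e-12 * total:
        term *= (x / (2.0 * k)) ** 2
        total += term
        k += 1
    return total


def kaiser_beta(atten_db: float) -> float:
    """Kaiser's empirical beta for a stopband attenuation above 50 dB."""
    return 0.1102 * (atten_db - 8.7)


def kaiser_lowpass(num_taps: int, cutoff: float, beta: float) -> list:
    """Kaiser-windowed sinc lowpass; cutoff is in cycles per sample (0 to 0.5)."""
    mid = (num_taps - 1) / 2.0
    i0_beta = bessel_i0(beta)
    taps = []
    for n in range(num_taps):
        t = n - mid
        # Ideal (infinite) lowpass impulse response, sampled at offset t.
        ideal = 2.0 * cutoff if t == 0 else math.sin(2.0 * math.pi * cutoff * t) / (math.pi * t)
        r = t / mid
        window = bessel_i0(beta * math.sqrt(max(0.0, 1.0 - r * r))) / i0_beta
        taps.append(ideal * window)
    dc_gain = sum(taps)
    return [h / dc_gain for h in taps]  # normalise to unity gain at DC


def gain_db(taps, freq: float) -> float:
    """Magnitude response in dB at freq (cycles per sample)."""
    h = sum(c * cmath.exp(-2j * math.pi * freq * n) for n, c in enumerate(taps))
    return 20.0 * math.log10(abs(h))


# 96k -> 44.1k reduces to 147/320, which is why it counts as a "complicated" ratio.
ratio = Fraction(44100, 96000)

# Illustrative 511-tap design for a 2:1 down-conversion (assumed parameters).
taps = kaiser_lowpass(511, cutoff=0.225, beta=kaiser_beta(150.0))
print(ratio, len(taps), gain_db(taps, 0.1), gain_db(taps, 0.35))
```

The taps come out symmetric about the centre, which is what gives this kind of filter its linear phase.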

Anyway, this is the response I get for a 0-48kHz sweep when down-converting from 96kHz to 44.1kHz:
https://github.com/jniemann66/ReSampler/blob/master/Swept_24.png

You can compare that to other converters (SoX included) on this site: http://src.infinitewave.ca/
(note: mine isn't submitted there yet - want to do even more tuning first)

cheers,
Judd

Re: New Audio File Conversion Tool

Reply #3 – 2016-04-28 23:27:47

Very nice fix. I like it!

Suggestions:

1. Maybe save settings for default output format? I would think this should be WAV in most cases.
2. In conjunction with 1, have no need to specify output name when specifying output folder. In this case, one could just set the output folder and the output name would automatically be the same as the input name (input_name.wav).
3. Perhaps echo the normalization and dither settings in the GUI output screen so the user can verify that the options are being used properly.
4. Do you think normalization should default to 0 and not 1, since most may convert to FLAC later with replaygain tags? Not sure on this.

Notes:

-It is not clear to me how the dither should work...what is triangular in this case?
-Will the GUI (ferocious) always have the latest SRC in it? It is a separate download from GitHub. It may be useful to have an option to set the location of the ReSampler as well, other than the first run.

I really like this!! Congrats!

Re: New Audio File Conversion Tool

Reply #4 – 2016-04-29 07:39:03

that's great feedback, Ron. Much appreciated. I'm working on implementing your suggestions ...
cheers,
Judd

Re: New Audio File Conversion Tool

Reply #5 – 2016-04-29 17:00:30

That is awesome...I'll keep checking github..are you file versioning there?

If you get a chance, can you go over in more detail your dither options (1 to 6) and how they relate to triangular, etc.

Re: New Audio File Conversion Tool

Reply #6 – 2016-04-29 19:07:51

OK,

I added the following changes:

1. Added an "Output File Options" menu item, which allows user to set:

suffix to be appended to output filename (or choose no suffix)
output directory
output file type

If, for example, you wanted the output file to have the same file name, but different folder, and force it to be a .wav file, you would set it like this:

Append Suffix: no (unchecked)
Output Directory: yes (checked), <your folder location>
Default to this file type: wav

Note: these options set the rules for the generation of the default output file name. You are of course free to manually change the output filename to whatever you want at any time
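The naming rules above can be sketched roughly as follows. This is only an illustration in Python (Ferocious itself is a Qt application), and the function name and parameters are hypothetical; `PureWindowsPath` is used so the example behaves the same on any OS.

```python
from pathlib import PureWindowsPath
from typing import Optional


def default_output_name(input_file: str,
                        suffix: Optional[str] = "(Converted)",
                        output_dir: Optional[str] = None,
                        file_type: Optional[str] = None) -> str:
    """Build a default output filename from the three 'Output File Options'."""
    src = PureWindowsPath(input_file)
    stem = src.stem + (suffix or "")                       # suffix goes before the extension
    ext = "." + file_type.lstrip(".") if file_type else src.suffix
    folder = PureWindowsPath(output_dir) if output_dir else src.parent
    return str(folder / (stem + ext))


# Default behaviour: same folder, same type, "(Converted)" appended.
print(default_output_name(r"C:\in\test.flac"))

# No suffix, fixed output directory, forced to .wav.
print(default_output_name(r"C:\in\test.flac", suffix=None,
                          output_dir=r"H:\music", file_type="wav"))
```

A GUI would only call something like this while the output box is empty, so a name typed in by hand is never overwritten.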

2. Added Menu item (Converter location ... ) to allow user to select the location of resampler.exe at any time.

These settings are persistent.

I have tried to test this as much as I can, but given the amount of fancy string manipulation involved, there may still be a couple of weird bugs in there. I think things will get interesting once I start porting it to Linux, due to the differences in file systems between the two OSes - no doubt, it will take some tweaking.

In regard to your other points:

Quote

Do you think normalization should default to 0 and not 1, since most may convert to FLAC later with replaygain tags? Not sure on this.

Notes:

-It is not clear to me how the dither should work...what is triangular in this case?
-Will the GUI (ferocious) always have the latest SRC in it? It is a separate download from GitHub. It may be useful to have an option to set the location of the ReSampler as well, other than the first run.

With Normalization, 0 means silence and 1.0 means the peaks will be amplified to the maximum possible level for the output format. Often, when mastering for CD, you would normalize to 0.98 or something like that, to prevent clipping on some older DACs which have a bit of overshoot. Normalizing controls the peak output level, but the overall loudness of the track is another thing again.
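As a rough illustration of peak normalization (a sketch only, assuming floating-point samples where 1.0 is full scale; the function name is made up for this example):

```python
def normalize_peak(samples, target=1.0):
    """Scale samples so that the largest absolute peak equals target."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)          # pure silence: nothing to scale
    gain = target / peak
    return [s * gain for s in samples]


quiet = [0.10, -0.50, 0.25]
cd_master = normalize_peak(quiet, target=0.98)
print(cd_master)
```

Every sample is multiplied by the same gain, so only the peak level changes; the relative dynamics (and hence the perceived loudness relationships within the track) are untouched.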

With dithering, "triangular" refers to a particular type of noise, called triangular probability density function (TPDF) noise, being deliberately added to the signal to reduce distortion. If you can imagine a game of Monopoly, or backgammon, where you have two dice - when you roll the two dice together, the probability of getting a "medium-sized" number is higher than that of getting a really low or really high number. In fact, the probability curve for this resembles a triangle, hence the term "triangular". With dithering, it is similar, and the noise is based on adding two random numbers together. The dither noise is then "shaped" (EQ'd) to distribute the noise in a manner that is least audible (or objectionable) to our ears. This particular type of noise has some desirable properties, such as being decorrelated from the original signal. The "dither amount" controls the magnitude of dither added, in bits. You use this whenever you convert from 24-bit to 16-bit or from 16-bit to 8-bit, to help preserve the really quiet passages which would otherwise be below the threshold of the target format. It increases the dynamic range at the expense of having a bit of added noise. I would recommend using a value between 1 and 2. There are some good resources on the web that can explain it better than I can. Nigel Redmon's video is quite good.
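The dice analogy is easy to check numerically. The sketch below (illustrative Python, not the converter's code; the function names are invented here) counts the outcomes of two dice and then builds a TPDF noise sample the same way, by adding two uniform random numbers. Noise shaping is deliberately left out.

```python
import random
from collections import Counter
from itertools import product


def two_dice_distribution() -> Counter:
    """Count how many of the 36 rolls give each total from 2 to 12."""
    return Counter(a + b for a, b in product(range(1, 7), repeat=2))


def tpdf_noise(rng: random.Random, amplitude_lsb: float = 1.0) -> float:
    """One TPDF sample: the sum of two centred uniform random numbers."""
    return amplitude_lsb * ((rng.random() - 0.5) + (rng.random() - 0.5))


dist = two_dice_distribution()
for total in range(2, 13):
    print(f"{total:2d} {'#' * dist[total]}")   # bar chart rises to 7, then falls

rng = random.Random(0)
noise = [tpdf_noise(rng) for _ in range(20000)]
print(min(noise), max(noise), sum(noise) / len(noise))
```

The printed bar chart is the "triangle": one way to roll a 2 or a 12, six ways to roll a 7.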

With the distribution of the GUI, I have included the converter as a convenience, mainly to make it easy for people to try it out, because I've found that if people can't get it to work straight away, they may just give up. But, yes, they are two separate projects. Usually, when I update resampler.exe, I also update Ferocious. I don't know if I'll continue to bundle them like that in the future - it may get too tricky to maintain. I think there is a better way on GitHub using submodules etc, but I need to explore that.

I started versioning, too - this latest build is 1.0.1.

Cheers,
Judd

Re: New Audio File Conversion Tool

Reply #7 – 2016-04-30 00:07:29

It is really getting better! Just a couple of points.

1. The options menu has a font colour of white. I did not see it right away because it is on a nearly white "menu bar" itself. Perhaps change the font to black?

2. Upon entering my preferences in the options bar and exiting, all was ok and remembered. But when I chose a file to convert, nothing was populated in the output file menu. If I clicked straight on convert, it did not save into the preferred folder.

For example, if I have chosen the output file options as directory H:\music\ and file type as wav, the output file box is not populated with H:\music\test.wav once I have chosen my input file as test.flac

3. Maybe put the version number in the GUI up top? For example: Convert Sample Rate 1.0.1? Or Ferocious Audio Converter V1.0.1 (it is not really a file converter)....just nit picky....only relevant if you think it is.

Nice job so far!

Re: New Audio File Conversion Tool

Reply #8 – 2016-04-30 02:23:48

Quote from: ron spencer on 2016-04-30 00:07:29

1. The options menu has a font colour of white. I did not see it right away because it is on a nearly white "menu bar" itself. Perhaps change the font to black?

2. Upon entering my preferences in the options bar and exiting, all was ok and remembered. But when I chose a file to convert, nothing was populated in the output file menu. If I clicked straight on convert, it did not save into the preferred folder.

For example, if I have chosen the output file options as directory H:\music\ and file type as wav, the output file box is not populated with H:\music\test.wav once I have chosen my input file as test.flac

3. Maybe put the version number in the GUI up top? For example: Convert Sample Rate 1.0.1? Or Ferocious Audio Converter V1.0.1 (it is not really a file converter)....just nit picky....only relevant if you think it is.

Thanks Ron. I made the following changes (now v1.0.2):

1. Updated CSS in menus to render them properly - it was an issue anytime you used "aero" style windows (I use old-school).
2. Previously, you had to click into the output filename textbox to trigger it to update. I have now added code to update it immediately upon editing the input filename, or browsing to a file. This feels much more responsive
3. I'm still mapping out the design for a proper versioning system. And, yes - maybe I should call it "Ferocious File conversion" instead of converter. Thanks for your feedback - it's been very helpful !

Re: New Audio File Conversion Tool

Reply #9 – 2016-04-30 04:32:10

Quote from: judd on 2016-04-30 02:23:48

2. Previously, you had to click into the output filename textbox to trigger it to update. I have now added code to update it immediately upon editing the input filename, or browsing to a file. This feels much more responsive

Note: just one more thing - output filenames are only auto-generated when the output filename box is empty. This is to prevent it overwriting a filename that the user may have intentionally put there.

Re: New Audio File Conversion Tool

Reply #10 – 2016-04-30 21:22:23

Looks good!!

One thing you may want to add in options is the default flac compression value. I use 5, but others like 8. This would likely be useful for those making flac output.

I am able to enter dither values from 0 to 8. Values above 8 revert to 1.0 and negative numbers revert to 0. Are you able to explain what happens as the values change from 0 to 8 (I have not watched the movie you suggested yet...sorry if it explains it...does it?).

Also, can you explain what your resample method does? I also have voxengo r8brain pro, which does not dither when it truncates. It offers Linear phase, minimum phase and an ultra steep mode option. Is yours more like a linear phase?

To be clear, as long as I set normalize less than 1 there will be no clipping right?

excellent tool!

Re: New Audio File Conversion Tool

Reply #11 – 2016-05-01 15:46:06

Quote from: ron spencer on 2016-04-30 21:22:23

Looks good!!

One thing you may want to add in options is the default flac compression value. I use 5, but others like 8. This would likely be useful for those making flac output.

ok, I'll look into it.

Quote

I am able to enter dither values from 0 to 8. Values above 8 revert to 1.0 and negative numbers revert to 0. Are you able to explain what happens as the values change from 0 to 8 (I have not watched the movie you suggested yet...sorry if it explains it...does it?).

Yes, I limited the range to be from 0-8. The higher the number, the louder the dither ...
1 is the default. I can't imagine any practical reason for going higher than 8, except for "research" purposes. All it means is that if you set it to 1, it will fill the lowest bit of your output file with dither. If you set it to 4, it will fill the lowest 4 bits with dither etc.

If you create an 8-bit output file, and use 8-bits of dither, then the entire sound file would be dither (it would just be a bunch of noise). If you have a 16-bit file, with 8 bits of dither, then the lowest 8 bits of the file would be all dither etc. Note that you can also have "in-between" values, such as 0.7 or 3.2 - it's a continuum. This is possible because the dither magnitude is set prior to the final quantization stage when the file is being written out to disk. The best level to set it at depends on the program material. In practice (with 16-bit files), it is a challenge to hear the difference because it is all happening down around -90dB ! The most obvious place to listen for the effects of dither (or lack of dither) is in the fade-out at the end of a song.
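The "-90dB" figure can be reproduced with a little arithmetic, assuming it means the level of one least-significant bit relative to the full-scale peak of a signed format (that convention is my reading, not something stated in the thread). A quick Python check:

```python
import math


def lsb_level_db(bits: int) -> float:
    """Level of 1 LSB relative to the full-scale peak of a signed format."""
    return 20.0 * math.log10(2.0 ** -(bits - 1))


for bits in (8, 16, 24):
    print(bits, round(lsb_level_db(bits), 1))
# 8 -> -42.1, 16 -> -90.3, 24 -> -138.5
```

Each extra bit lowers that floor by about 6.02 dB, which is why the same effect is far easier to hear in an 8-bit file than in a 16-bit one.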

I am still tweaking the dither algorithm to make it sound better. One process I have been using to test it, is to convert down to 8-bit, where the noise floor is still relatively loud (-42dB) and the effects are quite audible, as follows:

1. Take a passage of music in a 16- or 24-bit file and, with a wave editor, reduce the overall level to be too quiet for 8-bit (ie quieter than -42dB)
2. Convert that to 8-bit without any dither and verify that it is being truncated (will typically sound very distorted with lots of "sputtering" and "splatting" as the music goes below the threshold of the quietest levels 8-bit can normally produce)
3. For comparison, convert the same file to 8-bit, but this time with some amount of dither, and verify that the (quiet) music can now be heard properly in 8-bit, albeit with a wash of dither noise mixed in - ideally, it should be inoffensive and sound a bit like tape hiss - and not harsh or fatiguing to the ears.
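The three steps above can be imitated in a few lines of Python. This is a simplified sketch under stated assumptions: a 440 Hz test tone instead of music, plain rounding, unshaped TPDF dither with a peak of 1 LSB, and a hypothetical `quantize` helper. It is not the converter's own algorithm.

```python
import math
import random


def quantize(x, bits, dither_lsb=0.0, rng=None):
    """Round x (full scale = 1.0) to a signed integer code of the given width.

    dither_lsb is the peak TPDF dither amplitude in LSBs of the target
    format (an assumed mapping, for illustration only).
    """
    scale = 2 ** (bits - 1)
    v = x * scale
    if dither_lsb > 0.0:
        rng = rng or random
        v += dither_lsb * ((rng.random() - 0.5) + (rng.random() - 0.5))
    code = math.floor(v + 0.5)
    return max(-scale, min(scale - 1, code))   # clamp to the legal code range


# Step 1: a tone that is too quiet for 8-bit (0.4 LSB peak, roughly -50 dB).
rate, freq = 8000, 440.0
amplitude = 0.4 / 128
tone = [amplitude * math.sin(2 * math.pi * freq * n / rate) for n in range(rate)]

# Step 2: no dither. Step 3: with dither.
rng = random.Random(1)
plain = [quantize(s, 8) for s in tone]
dithered = [quantize(s, 8, dither_lsb=1.0, rng=rng) for s in tone]


def correlation(codes):
    """How much of the original tone survives in the quantized codes."""
    return sum(c * s for c, s in zip(codes, tone))


print(set(plain), correlation(plain), correlation(dithered))
```

Without dither every output code is zero, so the tone is simply gone; with dither the codes flicker between neighbouring values and the tone is still present on average, buried in noise.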

By using the process above, I have been iteratively going back and auditioning and tuning the noise-shaping filter to produce the best results. This is somewhat subjective, and it is no surprise that there already exists a whole variety of different noise-shaping curves amongst all the commercial products out there. A lot of products offer a choice of several different noise-shaping curves, and I will probably do that too, once I have found a couple of formulas that I am really happy with.

Quote

Also, can you explain what your resample method does? I also have voxengo r8brain pro, which does not dither when it truncates. It offers Linear phase, minimum phase and an ultra steep mode option. Is yours more like a linear phase?

Yes, it's linear phase. I'm working on minimum phase, but to be honest, the maths is doing my head in at the moment :-)
Interestingly, Aleksey Vaneev (author of r8brain and r8brain pro) has made the source code for his resampling routines available on GitHub. It would be tempting to use his library, but I like the challenge of working it out myself, so I'm going to keep slugging it out.

Quote

To be clear, as long as I set normalize less than 1 there will be no clipping right?

That's correct - actually, you are safe setting it to be less than or equal to 1. (At 1, the loudest peak will just touch the highest possible level). Any higher than 1 is guaranteed to clip (GUI won't let you do it, but command-line version will, and it will give you a warning...)

Quote

excellent tool!

Thank you very much.

Re: New Audio File Conversion Tool

Reply #12 – 2016-05-01 17:59:30

thanks for the reply..

I'll have to play with the dither values, as I am not sure how to relate them to some of the standard ones, such as triangular, MBIT in izotope and POWR in yet again others. With your converter I have 8 to choose from LOL. Perhaps renaming the dither setting that best matches TPDF could help users. If it was, say, 3, then 3 in the drop down menu would be listed as 3 (TPDF).

Since you seem interested in tackling the maths, I notice that iZotope offers an Auto-blanking option, whereby dither is stopped for sections of audio where "the input signal is completely silent (0 bits of audio) for at least 0.7 seconds." This could be a neat option on the main GUI page. Not sure if it is easy to implement or not as I do not know how to code...I can only provide suggestions LOL.

I hope others are trying this tool...it is cool and fun to use.

Re: New Audio File Conversion Tool

Reply #13 – 2016-05-02 00:57:19

Quote from: ron spencer on 2016-05-01 17:59:30

thanks for the reply..

I'll have to play with the dither values, as I am not sure how to relate them to some of the standard ones, such as triangular, MBIT in izotope and POWR in yet again others. With your converter I have 8 to choose from LOL. Perhaps renaming the dither setting that best matches TPDF could help users. If it was, say, 3, then 3 in the drop down menu would be listed as 3 (TPDF).

It's always TPDF, just with an adjustable volume. You may be confusing the PDF with the noise shaping curve - these are two different aspects of dithering. The PDF refers to the distribution of random numbers used to make the noise. My program always uses triangular (TPDF). You may see other PDFs in other products such as rectangular or Gaussian. (Rectangular is analogous to throwing 1 die, triangular is analogous to throwing 2 dice, Gaussian is analogous to throwing many dice ...). However, TPDF is generally considered to be the most effective for audio.

The noise shaping refers to how the noise is boosted or attenuated at specific frequencies to make the noise either less audible or less objectionable to human ears. This old paper shows measurements of the noise shaping curves of various commercial products, and compares them. It's pretty cool. The curve I have been working with sort of resembles the Cool-Edit "C1" curve - sort of - but is still quite unique.

Quote

Since you seem interested in tackling the maths, I notice that iZotope offers an Auto-blanking option, whereby dither is stopped for sections of audio where "the input signal is completely silent (0 bits of audio) for at least 0.7 seconds." This could be a neat option on the main GUI page. Not sure if it is easy to implement or not as I do not know how to code...I can only provide suggestions LOL.

yep, I have already done this, but only in the CLI (using the --autoblank switch).
I just haven't got around to it in the GUI yet. With autoblank switched on, it will blank after 30,000 samples of silence (0.68 seconds for 44.1k, 0.31 seconds for 96k). Any input less than -193 dB is considered silence.
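To show where the 0.68 and 0.31 second figures come from, and what this kind of blanking rule looks like, here is a small Python sketch. The constants are the ones quoted above; the function names and the exact comparison (`>=` on the run length) are assumptions, not a transcription of the --autoblank implementation.

```python
SILENCE_THRESHOLD_DB = -193.0   # input below this level counts as silence
BLANK_AFTER_SAMPLES = 30_000    # run of silent samples needed before blanking


def blanking_delay_seconds(sample_rate: int) -> float:
    """How long the input must stay silent before dither is switched off."""
    return BLANK_AFTER_SAMPLES / sample_rate


def autoblank_mask(samples, threshold_db=SILENCE_THRESHOLD_DB, hold=BLANK_AFTER_SAMPLES):
    """Return one bool per sample: True where dither would be muted."""
    threshold = 10.0 ** (threshold_db / 20.0)
    mask, run = [], 0
    for s in samples:
        run = run + 1 if abs(s) < threshold else 0   # any signal resets the count
        mask.append(run >= hold)
    return mask


print(round(blanking_delay_seconds(44100), 2), round(blanking_delay_seconds(96000), 2))
# 0.68 0.31

# Tiny demonstration with a short hold time so the result is easy to read.
demo = [0.5] + [0.0] * 10 + [0.5]
print(autoblank_mask(demo, hold=4))
```

Because the threshold is a fixed number of samples, the delay in seconds shrinks as the sample rate goes up.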

Quote

I hope others are trying this tool...it is cool and fun to use.

Me too. As you can see, I am willing to take feedback, so everyone who tries it and makes suggestions will be helping to make it better.

Cheers, Judd

Re: New Audio File Conversion Tool

Reply #14 – 2016-05-02 18:45:18

I think I'm getting it for your 1-8 settings for dither...I'll just play around and see, but I'll likely stay at 1.

I really like how you can have multiple input types (flac, etc). r8brain always bothered me because you could only have wav in and wav out. Your program really saves time, has no installer and the resulting output sounds excellent.

Re: New Audio File Conversion Tool

Reply #15 – 2016-05-04 20:16:39

I noticed that change to 1.0.5 of the GUI with the Help-About tab. When using Help-About there is an error. See attached screen shot

Re: New Audio File Conversion Tool

Reply #16 – 2016-05-05 01:49:42

Quote from: ron spencer on 2016-05-04 20:16:39

I noticed that change to 1.0.5 of the GUI with the Help-About tab. When using Help-About there is an error. See attached screen shot

Thanks Ron - I'll check it out.

Re: New Audio File Conversion Tool

Reply #17 – 2016-05-05 12:48:53

Quote from: ron spencer on 2016-05-04 20:16:39

I noticed that change to 1.0.5 of the GUI with the Help-About tab. When using Help-About there is an error. See attached screen shot

Hi Ron - have you checked that Ferocious is pointing to the latest version of resampler.exe ?
(The reason I ask is that versions of resampler.exe prior to 1.0.3 don't understand the --version command, and will give the error you have shown above).
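A front end could guard against this by comparing version numbers before relying on the flag. The following is only a sketch of that comparison in Python, with hypothetical helper names; it assumes plain dotted version strings such as "1.0.5" and does not show how Ferocious actually performs the check.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version string such as '1.0.5' into (1, 0, 5)."""
    return tuple(int(part) for part in text.strip().split("."))


def understands_version_flag(converter_version: str) -> bool:
    """Per the note above, --version is understood from 1.0.3 onwards."""
    return parse_version(converter_version) >= (1, 0, 3)


for v in ("1.0.1", "1.0.3", "1.0.5"):
    print(v, understands_version_flag(v))
```

Comparing tuples of integers rather than raw strings avoids surprises such as "1.0.10" sorting before "1.0.3".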

Let me know how you go.

PS - version 1.0.5 of resampler.exe is now up. I also refreshed the Ferocious packages to include it.
Ferocious and resampler are both aligned at version 1.0.5 now

Cheers,
Judd

Re: New Audio File Conversion Tool

Reply #18 – 2016-05-05 15:31:41

yes...it works fine now!

Looking good...not sure what else I can add at this point. If I remember the last things I suggested were

1. Ability to choose many files (batch)
2. Add GUI check for auto-blanking
3. Add option for FLAC compression level.

Again just suggestions...not demands.

Quite useful!

Re: New Audio File Conversion Tool

Reply #19 – 2016-05-05 16:13:32

Resampler.exe doesn't run on Windows XP, because of this:
https://hydrogenaud.io/index.php/topic,109766.msg906816.html#msg906816

I had to do the edit at 158H and 160H for Resampler.exe to be able to run on XP.
So it's just a flag to turn on in the compiler.

More info about that:
https://software.intel.com/en-us/articles/linking-applications-using-visual-studio-2012-to-run-on-windows-xp
http://www.msfn.org/board/topic/172970-why-not-a-valid-win32-application-xp-programs/

Re: New Audio File Conversion Tool

Reply #20 – 2016-05-06 05:47:52

Quote from: Brazil2 on 2016-05-05 16:13:32

Resampler.exe doesn't run on Windows XP, because of this:
https://hydrogenaud.io/index.php/topic,109766.msg906816.html#msg906816

I had to do the edit at 158H and 160H for Resampler.exe to be able to run on XP.
So it's just a flag to turn on in the compiler.

More info about that:
https://software.intel.com/en-us/articles/linking-applications-using-visual-studio-2012-to-run-on-windows-xp
http://www.msfn.org/board/topic/172970-why-not-a-valid-win32-application-xp-programs/

Thanks very much for this - I'll see what I can do to fix it. I'm using VC2015, but from what I can see in the intel.com article you provided, it should be possible to change the build options to make it work on XP.

Also, I noticed that Qt (which is used to build Ferocious) will be dropping support for XP in the forthcoming 5.7 version (I'm still using 5.5, though ... )

Cheers,
Judd

Re: New Audio File Conversion Tool

Reply #21 – 2016-05-06 06:20:27

Quote from: Brazil2 on 2016-05-05 16:13:32

Resampler.exe doesn't run on Windows XP, because of this:
https://hydrogenaud.io/index.php/topic,109766.msg906816.html#msg906816

Rightio - I've just rebuilt it on my system to work with Windows XP onwards, but I just want to do a bit of regression testing first to make sure I haven't broken anything before I commit it. I don't have an XP machine handy at the moment, either - I might have to crank up a VM.

I also did a check on the PE header for ferocious.exe, and I note that it sets "Operating System Version" and "Subsystem Version" to 4.0 , so I assume this means that it is intended to run on NT 4.0 hehe ! (No way I can actually test that, right now, though)

Re: New Audio File Conversion Tool

Reply #22 – 2016-05-06 08:04:23

Current ReSampler 1.0.5 works on WinXP after editing the header (it requires SP2 or later). Actually only Subsystem matters. These fields are seldom set to the exact OS version required.
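For anyone curious which header fields are being discussed, the sketch below reads the "Operating System Version" and "Subsystem Version" fields from a PE image using the documented layout (PE header offset stored at 0x3C, a 4-byte signature, a 20-byte COFF header, then the version fields at offsets 40 to 51 of the optional header). The helper names and the fake image are illustrative only; changing these fields does not by itself make a program compatible with an older Windows if it depends on newer APIs.

```python
import struct


def _optional_header_offset(image: bytes) -> int:
    """Locate the PE optional header inside an executable image."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    return pe_offset + 4 + 20          # skip the signature and the COFF header


def pe_versions(image: bytes) -> dict:
    """Read the OS and subsystem version fields as (major, minor) pairs."""
    opt = _optional_header_offset(image)
    os_major, os_minor, _, _, sub_major, sub_minor = struct.unpack_from("<6H", image, opt + 40)
    return {"os": (os_major, os_minor), "subsystem": (sub_major, sub_minor)}


def with_versions(image: bytes, major: int, minor: int) -> bytes:
    """Return a copy with both version fields rewritten (e.g. 5, 1 for XP)."""
    opt = _optional_header_offset(image)
    patched = bytearray(image)
    struct.pack_into("<2H", patched, opt + 40, major, minor)
    struct.pack_into("<2H", patched, opt + 48, major, minor)
    return bytes(patched)


# Fake, header-only image for demonstration (not a runnable executable).
fake = bytearray(0x200)
fake[0:2] = b"MZ"
struct.pack_into("<I", fake, 0x3C, 0x80)
fake[0x80:0x84] = b"PE\0\0"
struct.pack_into("<6H", fake, 0x80 + 24 + 40, 6, 0, 0, 0, 6, 0)

print(pe_versions(bytes(fake)))
print(pe_versions(with_versions(bytes(fake), 5, 1)))
```

With a real file you would pass `open(path, "rb").read()` instead of the fake buffer.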

Re: New Audio File Conversion Tool

Reply #23 – 2016-05-06 08:19:00

I just put a new build up for XP - so let me know how that goes. (The only changes are to the binaries, so I'm not bumping the version number - there are no changes to functionality, at least not intentionally, anyway ...)

Re: New Audio File Conversion Tool

Reply #24 – 2016-05-06 13:03:58

Quote from: judd on 2016-05-06 08:19:00

I just put a new build up for XP - so let me know how that goes.

Works

Thanks for that
