HydrogenAudio

Lossy Audio Compression => MP3 => MP3 - Tech => Topic started by: GeorgeFP on 2009-08-02 12:58:05

Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-08-02 12:58:05
Hello,

I'd like to introduce my project "fpMP3Enc" - a multicore MP3 encoder based on LAME 3.98.2. It is a sample application/case study for my multicore library "Fiber Pool".

The source code can be downloaded here: http://www.fiberpool.de/en/downloads.html

The current version is still mostly single-threaded when encoding a single file, but it scales when encoding multiple files.

The multicore tweaks used in this version are:
- parallel/asynchronous conversion/scaling of PCM data to float samples
- parallel/asynchronous computation of replay gain
- I/O ordering (WAV files are read into memory first before MP3 files are written to disk)
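
As an illustration of the first tweak in the list above, here is a minimal Python sketch of converting 16-bit PCM to float samples chunk by chunk on a thread pool. It is not taken from the fpMP3Enc source: the function names, the chunk size, and the scale factor of 1/32768 are assumptions, and Python threads only show the structure here rather than a real speedup.

```python
import struct
from concurrent.futures import ThreadPoolExecutor

# Assumed scale: map 16-bit samples into [-1.0, 1.0). The real encoder may scale differently.
SCALE = 1.0 / 32768.0


def pcm16_to_float(chunk: bytes) -> list[float]:
    """Convert little-endian signed 16-bit PCM bytes to scaled floats."""
    count = len(chunk) // 2
    samples = struct.unpack(f"<{count}h", chunk[: count * 2])
    return [s * SCALE for s in samples]


def convert_parallel(data: bytes, chunk_bytes: int = 4096, workers: int = 4) -> list[float]:
    """Split the PCM data into chunks and convert them as independent tasks."""
    if chunk_bytes <= 0 or chunk_bytes % 2:
        raise ValueError("chunk_bytes must be a positive multiple of 2")
    chunks = [data[i:i + chunk_bytes] for i in range(0, len(data), chunk_bytes)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in submission order, so sample order is preserved.
        converted = pool.map(pcm16_to_float, chunks)
        return [sample for part in converted for sample in part]


if __name__ == "__main__":
    pcm = struct.pack("<4h", 0, 16384, -16384, 32767)
    print(convert_parallel(pcm, chunk_bytes=4))
```

Each chunk is independent of the others, which is what makes this step easy to run asynchronously while the rest of the encoder is busy.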

With the following test system
- Intel Q9450@2.66GHz (Quad Core)
- 8 GiB RAM
- Windows Vista x64 (Superfetch disabled)
- 61 WAV files (2.99 GiB, about 5 hours play time)

I've got the following results:
LAME 3.98.2 (x32; rarewares): 24.6x
LAME64 3.98 (x64; mp3tech): 20.8x
fpMP3Enc (x64; single): 23.3x
fpMP3Enc (x64; multi): 80.2x

Single-file encoding is slower than the original LAME because I haven't SSE-optimized the code.
The speedup for multi-file encoding is 3.3x compared to LAME and 3.4x compared to single-file fpMP3Enc, which is quite good for a first version.
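
The two factors follow directly from the measured speeds. A small Python check, using only the numbers from the results list (the dictionary keys and the helper name are made up for illustration):

```python
# Encoding speeds (multiples of real time) as quoted in the results above.
speeds = {
    "lame_x32": 24.6,
    "lame64": 20.8,
    "fpmp3enc_single": 23.3,
    "fpmp3enc_multi": 80.2,
}


def speedup(new_speed: float, baseline: float) -> float:
    """Ratio of two encoding speeds."""
    return new_speed / baseline


if __name__ == "__main__":
    print(f"vs LAME:   {speedup(speeds['fpmp3enc_multi'], speeds['lame_x32']):.1f}x")
    print(f"vs single: {speedup(speeds['fpmp3enc_multi'], speeds['fpmp3enc_single']):.1f}x")
```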

The next steps will be to perform the psychoacoustic computation for each frame in parallel, then the MDCT, and so on.

Compile and usage:
- You need Visual Studio 2008 (and obviously Windows) to compile the project.
- SSE4 is enabled by default. To disable: #undef USE_SSE4
- Memory control is disabled by default in this version. This can lead to pagefile swapping if too many files are to be encoded. You can enable memory control by using a different 'MEMORY_MODE' macro (experimental).
- Win32 version: The total WAV file size sum must not exceed 400MiB.
- Win64 version: The total WAV file size sum must not exceed 400GiB.
- Input: WAV 16-bit stereo, 44.1kHz
- Output: MP3 Stereo/Joint-Stereo, 44.1kHz

I'd like to know what you think about this project so contact me if you have any comments or feedback.
George
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: Fandango on 2009-08-02 14:51:25
Quote
I'd like to know what you think about this project so contact me if you have any comments or feedback.

I think it's great. It's nice to see multi-threaded encoding finally starting to appear here and there. It's also good, IMHO, that there isn't just one single effort for one codec. Different approaches, mostly in the form of different MT libraries, it seems, need to be tried out and tweaked to see which one is ultimately superior in certain situations.

PS: Don't forget to translate the software license to English.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: Kitsuned on 2009-08-02 15:49:49
Doesn't foobar2000 already do this if you have more than one core in your computer?  I was getting similar numbers on a quad-core machine when I had to do some coding for my dad.  My Core Duo goes about 43x if it's not running warm.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: nazgulord on 2009-08-02 15:55:10
I'm not sure, but isn't it that foobar2000 just starts 2 instances of the encoder rather than one multithreaded instance? Out of curiosity, would the multithreaded one be better?
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: Fandango on 2009-08-02 16:56:37
foobar2000 does not use multi-threading encoders, but it could. (Actually out of the box it uses no encoders, but what is meant is that most users will use single-threaded encoders, simply because those are the ones available.) nazgulord is correct.

A multi-threaded encoder would be better if you have fewer files to encode than you have cores, or if your hard drive becomes the bottleneck when encoding multiple files at the same time. There might also be other, more code-related advantages that make encoding files sequentially with an MT-enabled encoder faster than running several instances of a single-threaded encoder. For example, when four instances of the same executable are running, the exact same code is executed four times in parallel, even code that might only be needed once in a real multi-threaded encoder. Modern CPUs share their cache, so there is some automatic internal optimisation, but a less generic MT library can still do better than that or even trigger hardware optimizations more effectively.

But IMHO, the most apparent drawbacks of using multiple instances of single-threaded encoders show up when you process very big files (whole CD images instead of track-based files). Towards the end of a batch encode, when fewer files are left than you have cores, efficiency drops because not all cores are fully used anymore, yet finishing the big CD rips that are left might still take some minutes. It's even more apparent when you only want to transcode a single CD image: it's as (in)efficient as on a single-core system. Also, when you use very fast encoding settings on systems with more than two cores but a normal non-RAID disk setup, the disk access of, for instance, four I/O streams may be more than a single hard disk can handle, so the CPU load drops far below 98-100%. And you want your CPU to be the bottleneck, not your hard drive, because more CPU-intensive settings usually mean higher quality or better compression.

For example, because of the latter I started to use the extra modes of the WavPack encoder in foobar2000. Transcoding four lossless audio files at once using -hh is simply more than my hard drive can handle.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: saratoga on 2009-08-02 17:21:22
Quote
Out of curiosity, would the multithreaded one be better?


If you only have 1 file, then yes.  Otherwise, running two encodes in parallel will almost certainly be faster due to less overhead from threading, synchronization, etc.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: Fandango on 2009-08-02 17:23:29
Quote
Otherwise, running two encodes in parallel will almost certainly be faster due to less overhead from threading, synchronization, etc.
Are you sure?
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: saratoga on 2009-08-02 17:38:16
Quote
Otherwise, running two encodes in parallel will almost certainly be faster due to less overhead from threading, synchronization, etc.
Are you sure?


Running two encodes in parallel results in essentially perfect parallelization.  The only overhead comes from disk contention, which is still a problem for the multithreaded single process case anyway.

Running one process will encounter additional overhead due to thread synchronization, lack of granularity in parallelism, overhead for inter-thread communication, etc.  In order to make up for this, there would have to be additional work saved by running in one process and I don't see what that would be for MP3.

Of course, if you have 4 cores and only 3 files, for instance, the second approach may still be faster, simply because it finds a use for the fourth core. For instance, when I multithreaded libmad, it gave a ~90% speedup, so it was actually slower than just running two files in parallel. Of course, in my case I only had one file to decode, so it ended up being the best solution.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-08-02 19:18:35
Quote
Doesn't foobar2000 already do this if you have more than one core in your computer?  I was getting similar numbers on a quad-core machine when I had to do some coding for my dad.  My Core Duo goes about 43x if it's not running warm.

I cannot confirm this on my quad core. Converting the test set with foobar2000 took me about 6 minutes which gave a 47.4x speed.  (BTW, the test was encoding to CBR 128.)

Quote
Running two encodes in parallel results in essentially perfect parallelization.  The only overhead comes from disk contention, which is still a problem for the multithreaded single process case anyway.

This is not a problem in "fpMP3Enc". I/O processing is a separate task and is controlled by a file I/O scheduler that sorts and serializes I/O operations on the same drive. The encoder tasks work only on memory.

That's the reason why foobar2000 has a 47.4x performance while "fpMP3Enc" has 80.2x. I think, on an i7 it's possible to get 150x and above... unfortunately I don't have such a system to test it.

Quote
Running one process will encounter additional overhead due to thread synchronization, lack of granularity in parallelism, overhead for inter-thread communication, etc.  In order to make up for this, there would have to be additional work saved by running in one process and I don't see what that would be for MP3.

As you mentioned above, concurrent disk access is a problem when you execute multiple encoder processes in parallel. If you run them as tasks in one process you can use a file I/O scheduler that takes care of it, as I did.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: saratoga on 2009-08-02 19:23:11
Quote
Running one process will encounter additional overhead due to thread synchronization, lack of granularity in parallelism, overhead for inter-thread communication, etc.  In order to make up for this, there would have to be additional work saved by running in one process and I don't see what that would be for MP3.

As you mentioned above, concurrent disk access is a problem when you execute multiple encoder processes in parallel. If you run them as tasks in one process you can use a file I/O scheduler that takes care of it, as I did.


Why is there a speedup for scheduling sequential reads vs. letting the OS's file caching schedule?
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-08-02 20:04:24
Quote
Why is there a speedup for scheduling sequential reads vs. letting the OS's file caching schedule?

For example, if you have four processes where each one tries to read a file sequentially on the same disk you won't get sequential reads. The OS will split the I/O operations into pieces in order to feed each process.

My scheduler does not interrupt I/O operations. For example, if you want to read two 50 MiB files that reside on the same drive in 50 1 MiB chunks each, first the 50 I/O operations of the first file are performed and then the 50 I/O operations of the second file.
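
The ordering described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Fiber Pool scheduler: the `IORequest` type, the `schedule` function, and the "first file requested goes first" rule are all assumptions made for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class IORequest(NamedTuple):
    """Hypothetical description of one read operation."""
    drive: str
    path: str
    offset: int
    size: int


def schedule(requests: list[IORequest]) -> list[IORequest]:
    """Order requests so that, per drive, each file is read start to finish
    before the next file on that drive is touched."""
    per_drive: dict[str, list[IORequest]] = defaultdict(list)
    first_seen: dict[tuple[str, str], int] = {}
    for req in requests:
        per_drive[req.drive].append(req)
        # Remember the order in which files were first requested.
        first_seen.setdefault((req.drive, req.path), len(first_seen))
    ordered: list[IORequest] = []
    for reqs in per_drive.values():
        reqs.sort(key=lambda r: (first_seen[(r.drive, r.path)], r.offset))
        ordered.extend(reqs)
    return ordered


if __name__ == "__main__":
    MIB = 1 << 20
    a = [IORequest("C:", "a.wav", i * MIB, MIB) for i in range(50)]
    b = [IORequest("C:", "b.wav", i * MIB, MIB) for i in range(50)]
    # Two encoder tasks issue their reads alternately ...
    interleaved = [r for pair in zip(a, b) for r in pair]
    # ... but the drive sees all of a.wav first, then all of b.wav.
    plan = schedule(interleaved)
    print(plan[0].path, plan[49].path, plan[50].path, plan[99].path)
```

A real scheduler would issue the operations asynchronously and handle several drives concurrently; the sketch only shows the serialization per drive.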

Of course, my library supports setting a memory limit.

I wrote an article in my blog about what I call "Parallel File Processing" (unfortunately it's only in German). You can find the article here (http://blog.thinkmeta.de/2009/03/parallele-dateiverarbeitung/). This is the theory behind it.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: saratoga on 2009-08-02 20:40:35
So you're getting a huge speed up by essentially just buffering the entire file into memory before processing?
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: Axon on 2009-08-02 21:15:17
Yeah, that was probably a bad example.

In the more general case, an optimal scheduler would read just enough from one file before it needs to switch to reading for another file. This would lead to pretty big buffers, but not necessarily whole-file buffering.

No idea yet if this encoder is implementing something like that, but if it is, that sort of scheduler would be extremely valuable for other open-source applications.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-08-02 21:28:34
Quote
So you're getting a huge speed up by essentially just buffering the entire file into memory before processing?

To be precise, in this version ALL files are buffered into memory WHILE processing.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-08-02 22:13:34
Quote
In the more general case, an optimal scheduler would read just enough from one file before it needs to switch to reading for another file. This would lead to pretty big buffers, but not necessarily whole-file buffering.

This is possible with my framework (see SequentialVirtualMemory classes).

In "fpMP3Enc" I've not determined the optimal memory strategy yet because I'm still working on parallelizing the encoder. It will depend on the final performance.

Quote
No idea yet if this encoder is implementing something like that, but if it is, that sort of scheduler would be extremely valuable for other open-source applications.

The framework can be used for free in (totally) non-commercial applications.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: Fandango on 2009-08-02 22:50:30
Quote
In "fpMP3Enc" I've not determined the optimal memory strategy yet because I'm still working on parallelizing the encoder. It will depend on the final performance.

Will it be optimised for fpMP3Enc or will it be more adaptable to different encoders or even decoders?

There are codec settings that decode very slowly, Monkey's Audio's extra high for example. The encoding tasks might run idle if the I/O scheduler reads too much data from such files. Maybe when the scheduler adjusts the buffer size dynamically it should look in both directions, and if decoding takes longer than encoding even give the decoding tasks higher priority?

I know fpMP3Enc currently just supports WAV, but that might change in the future, or not be the case in other projects that might want to use your library.

Anyway, it's an interesting (and logical) approach to optimize an encoder for multi-cores without parallelizing that much of the original code, I guess.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-08-03 07:00:02
Quote
Will it be optimised for fpMP3Enc or will it be more adaptable to different encoders or even decoders?

The memory strategy is not set by the framework, it's set by the application and each memory object in the application can use a different strategy.

Quote
There are codec settings that decode very slowly, Monkey's Audio's extra high for example. The encoding tasks might run idle if the I/O scheduler reads too much data from such files. Maybe when the scheduler adjusts the buffer size dynamically it should look in both directions, and if decoding takes longer than encoding even give the decoding tasks higher priority?

This can be done by having a small buffer for the input files and a large buffer for the output files. Then the scheduler would fill the buffer for the first file first, switch to the next and so on and return to the first file to read the next chunk.
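
With a small input buffer per file, the read order becomes a round-robin over the open files. A minimal Python sketch of that order, where one "chunk" stands for one buffer fill (the function name and the chunk-count input are assumptions for the example, not part of the framework):

```python
from typing import Iterator


def round_robin_chunks(chunk_counts: dict[str, int]) -> Iterator[tuple[str, int]]:
    """Yield (file name, chunk index) pairs, filling one buffer per file in turn.

    chunk_counts maps each file name to the number of buffer-sized chunks it has.
    """
    progress = {name: 0 for name in chunk_counts}
    pending = list(chunk_counts)
    while pending:
        still_pending = []
        for name in pending:
            if progress[name] < chunk_counts[name]:
                yield name, progress[name]
                progress[name] += 1
            # Keep the file in the rotation only while it has chunks left.
            if progress[name] < chunk_counts[name]:
                still_pending.append(name)
        pending = still_pending


if __name__ == "__main__":
    for name, index in round_robin_chunks({"slow.ape": 2, "fast.wav": 3}):
        print(name, index)
```

The smaller the chunks, the sooner every decoding task has data to work on, at the cost of more switching between files on the disk.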

Quote
I know fpMP3Enc currently just supports WAV, but that might change in the future, or not be the case in other projects that might want to use your library.

For me, fpMP3Enc is just a case study or proof of concept for my multicore framework so you cannot expect new features from me. But the fpMP3Enc source code will become available under an open source license (probably GPL) for other developers to extend it.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-08-05 19:53:08
Hi again,

I've updated the application and added three command-line switches that show how it scales under different memory conditions:

--mem-use-ignore (default)
This switch should be used if you know that you have enough physical memory free for the task. If more memory is needed than expected, page file swapping may occur.

--mem-use-file <size in MiB>
With this switch you can specify the maximum buffer size to use for each file.

--mem-use-system-free <size in MiB>
This switch can be used to specify the physical memory size that should be kept free for other applications. fpMP3Enc will stop committing memory if the available physical memory falls below this value. To avoid a deadlock, the minimum committed memory for each memory object is 128 KiB. Page file swapping will not occur unless another application needs more memory than specified.
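
One way to read the "--mem-use-system-free" rule is as a simple admission check before each commit. The Python sketch below is an interpretation of the description above, not the actual implementation; the function name, its parameters, and the exact comparison are assumptions.

```python
KIB = 1024
MIB = 1024 * KIB

# Each memory object may always hold this much, so no task can starve completely.
MIN_COMMIT_PER_OBJECT = 128 * KIB


def may_commit(available_physical: int, keep_free: int,
               committed_by_object: int, request: int) -> bool:
    """Decide whether a memory object may commit `request` more bytes.

    available_physical: free physical memory reported by the OS (bytes)
    keep_free:          value passed to --mem-use-system-free (bytes)
    committed_by_object: bytes this memory object has already committed
    """
    if committed_by_object + request <= MIN_COMMIT_PER_OBJECT:
        return True  # guaranteed minimum, granted regardless of system state
    return available_physical - request >= keep_free


if __name__ == "__main__":
    print(may_commit(6000 * MIB, 5500 * MIB, 1 * MIB, 1 * MIB))
    print(may_commit(5500 * MIB, 5500 * MIB, 1 * MIB, 1 * MIB))
```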

I've made some tests with these switches. The results are:
--mem-use-file 1: 58.5x
--mem-use-file 2: 71.7x
--mem-use-file 3: 80.5x
--mem-use-file 4: 78.8x
--mem-use-file 10: 80.2x
--mem-use-system-free 5500: 80.9x (peak mem usage was about 1 GiB on my 8 GiB system)
--mem-use-ignore: 77.8x

So, on my system a 3 MiB buffer per file gives the best results with "--mem-use-file". These values will vary on other systems.
I recommend using the "--mem-use-system-free" switch with a value that's OK for your system.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-08-26 22:03:29
Hello, it's me again...

For the last three weeks I have been working on optimizations of the vbr-new algorithm. You can download the updated version from my web site.

The benchmarks (Vista x64, Intel Q9450@2.66GHz, 8 GiB RAM):
LAME 3.98.2 (x32): 32.3x
fpMP3Enc (x64; single file encoding): 60.3x
fpMP3Enc (x64; multi file encoding): 109.7x

This means that fpMP3Enc is about 87% or 1.87x faster than LAME in single file encoding, while the speedup is 3.4x in multi file encoding.

The following optimizations were performed:
- A frame buffer (20 MiB) is used to hold the work data for about 1000 frames
- Psychoacoustics are computed frame by frame in a separate task without data dependencies
- MDCT is computed frame by frame almost in parallel with psychoacoustics (right after attack detection, which is at the very beginning)
- MDCT (left channel) and MDCT (right channel) are computed in separate tasks (for details see here (http://blog.thinkmeta.de/2009/08/multicore-mp3-encoder-mdct-im-asynchronen-fluss/); English translation will follow)
- A small part of the "VBR_encode_frame" function is split into four tasks

ABR and vbr-old should also benefit from these changes. CBR is disabled in this version because this mode has some data dependencies that I have not handled yet.

To get best results you should not use more than 3 threads on quad cores or higher for single file encoding. The value can be set by using the "--threads" option.

There is still so much to optimize in the LAME code. Let's see how far it can get...
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: spoon on 2009-08-26 23:09:46
x80 encoding speed equates to 14MB a second read (from an uncompressed file, if lossless then half that). A modern HDD should be able to do 3x that without breaking a sweat, so I am not sure where the speed differentials come from when buffering to memory. Windows can have radically different read speeds depending on the transfer buffer size selected, off the top of my head on XP 64KB was the optimum size.
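
The 14MB/s figure can be checked from the input format stated earlier in the thread (16-bit stereo at 44.1kHz). A short Python calculation, with helper names chosen for the example:

```python
def pcm_bytes_per_second(sample_rate: int = 44_100, channels: int = 2,
                         bits_per_sample: int = 16) -> int:
    """Data rate of uncompressed PCM audio in bytes per second of play time."""
    return sample_rate * channels * bits_per_sample // 8


def read_rate_mb_per_s(encoding_speed: float) -> float:
    """Disk read rate (in 10^6 bytes/s) needed to sustain a given encoding speed."""
    return encoding_speed * pcm_bytes_per_second() / 1e6


if __name__ == "__main__":
    print(pcm_bytes_per_second())            # bytes per second of audio
    print(f"{read_rate_mb_per_s(80):.1f} MB/s at 80x")
```

This covers the read side only; the MP3 output adds a much smaller write stream on top.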
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: Alexxander on 2009-08-27 08:41:41
Quote
The benchmarks (Vista x64, Intel Q9450@2.66GHz, 8 GiB RAM):
LAME 3.98.2 (x32): 32.3x
fpMP3Enc (x64; single file encoding): 60.3x
fpMP3Enc (x64; multi file encoding): 109.7x

This means that fpMP3Enc is about 87% or 1.87x faster than LAME in single file encoding, while the speedup is 3.4x in multi file encoding.

Thanks for sharing your work. I'm no expert on programming but would like some things to be cleared up.

How many threads did you use with the presented results of fpMP3Enc ? I suspect the multi file encoding is done with 4 cores. If so, Lame encoding 4 or more files with one file per core would result in about 4 x 32.3=129.2 times encoding speed.

Also, should a native well designed 64 bits encoder be nearly twice as fast as its 32 bits counterpart or do the Core2Duo chips handle this well through some kind of emulation?

I'm trying to understand what is measured here and where the speed gain is coming from.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-08-27 09:45:21
Quote
x80 encoding speed equates to 14MB a second read (from an uncompressed file, if lossless then half that). A modern HDD should be able to do 3x that without breaking a sweat, so I am not sure where the speed differentials come from when buffering to memory. Windows can have radically different read speeds depending on the transfer buffer size selected, off the top of my head on XP 64KB was the optimum size.

If you consider that 8 files (4x read, 4x write) are processed in parallel then 14MB/s is pretty good on a 100MB/s drive.

With the I/O strategy used in fpMP3Enc you could use 8 threads processing 16 files in parallel and get almost the same throughput on a quad-core. On an 8-core the throughput should be far above 20MB/s.

In contrast, if you execute 4 LAME processes for parallel encoding, the concurrent disk access will lead to an I/O performance below 14MB/s.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-08-27 10:12:42
Quote
How many threads did you use with the presented results of fpMP3Enc ? I suspect the multi file encoding is done with 4 cores. If so, Lame encoding 4 or more files with one file per core would result in about 4 x 32.3=129.2 times encoding speed.

I used 3 threads in single and 4 threads in multi file encoding for the CPU bound tasks. I/O always needs two threads, one for the I/O scheduler and one for listening to the I/O completion port.

For the reason why it's not possible to get a 129.2x speed with 4xLAME, see my previous post.

Quote
Also, should a native well designed 64 bits encoder be nearly twice as fast as its 32 bits counterpart or do the Core2Duo chips handle this well through some kind of emulation?

AFAIK, the CPU intensive parts of LAME are SSE-optimized using 64- and 128-bit registers on both systems, x32 and x64. So, I don't think that an x64 encoder would be much faster than a x32 encoder.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-09-30 07:04:03
Hello,

the "final" version of fpMP3Enc is out. The work on it is finished.

The new web site is: www.thinkmeta.de (http://www.thinkmeta.de)

There, you will find detailed information about the optimizations I did. The direct link to the benchmark page is: [Benchmark] (http://www.thinkmeta.de/en/fiberpool_case_studies_fpmp3enc_benchmarks.html)

The source code is now GPLv3 for further improvements (e.g. ID3 tagging) by other developers.

Have fun!
George
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: itisljar on 2009-09-30 07:48:05
I am sorry, but where are compiled binaries? I don't really want to compile the code myself.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-09-30 08:44:30
Quote
I am sorry, but where are compiled binaries? I don't really want to compile the code myself.


The main reason for not providing the binaries is that MP3 is not free and you have to pay for it. See here (http://mp3licensing.com/royalty/software.html).
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: PatchWorKs on 2009-09-30 10:51:29
Quote
The main reason for not providing the binaries is that MP3 is not free and you have to pay for it. See here (http://mp3licensing.com/royalty/software.html).

Why not Vorbis (http://www.vorbis.com/) (or even better Theora (http://www.theora.org/)) then ?
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: skamp on 2009-09-30 13:15:05
Quote
Why not Vorbis (http://www.vorbis.com/)

Especially since Vorbis doesn't have MP3's bit reservoir constraint (from what I understand).

Quote
(or even better Theora (http://www.theora.org/)) then ?

Well, Theora is a video codec, not really on topic…
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-09-30 13:31:04
Quote
Why not Vorbis (http://www.vorbis.com/) (or even better Theora (http://www.theora.org/)) then ?

I'm thinking of making Theora multicore-capable but have not decided yet. Other projects are also very interesting, e.g. FLAC.

You should know that I'm neither an audio nor a video codec developer. And to be honest, my knowledge of MP3 is very poor.

What I wanted to show with these case studies was how algorithms can be parallelized even if everybody thinks it's too difficult or impossible. MP3 encoding was a good example for this... so I took that.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: twist3d on 2009-10-02 12:27:25
Any chance of getting a compiled build onto RareWares?
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-10-28 09:02:45
Hi again,

meanwhile, I've added ID3 tagging support to fpMP3Enc to make it usable for frontends. If there's something missing, please let me know.

I was also able to make the "vbr-new" algorithm even faster: Running with 3 threads on my quad-core system the speedup was 2.59x compared to LAME and 2.75x to single-threaded fpMP3Enc which gives a 3-core efficiency of 92%.
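
The 92% figure is the speedup over the single-threaded build divided by the number of threads. In Python (the function name is chosen for the example):

```python
def parallel_efficiency(speedup: float, threads: int) -> float:
    """Fraction of the ideal linear speedup that was actually achieved."""
    if threads <= 0:
        raise ValueError("threads must be positive")
    return speedup / threads


if __name__ == "__main__":
    # 2.75x over single-threaded fpMP3Enc, measured with 3 threads.
    print(f"{parallel_efficiency(2.75, 3):.0%}")
```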

If you download (http://www.thinkmeta.de/en/fiberpool_download.html) the file, you will find another sample application inside, called "fpStream", which in my eyes is the next level of multicore stream processing:

While fpMP3Enc is "single/multi file -> single encoding", fpStream is "single/multi file -> single/multi encoding". This means you can take one or more input files and perform one or more encodings on each of them.

For example, you can take one WAV file and encode one CBR MP3 and one VBR MP3 file.

I've also added Vorbis support which is in initial state because you cannot set any parameters. You will need x64 versions of libogg.dll and libvorbis.dll to use it.

fpStream is the framework that is responsible for CPU, memory and disk usage. The actual stream processing is done in the plugins, provided by me or by others. Unfortunately, the code is not well documented.

Next, I will add some parameters for Vorbis encoding and implement a plugin for FLAC encoding and decoding.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: punkrockdude on 2009-10-28 10:43:09
Anyone who has compiled this encoder that could send it to me? I have zero knowledge about compiling and therefore I can't do it myself and instead I ask. I use x64 Win7 if that helps. Regards
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-10-28 19:02:59
Quote
Anyone who has compiled this encoder that could send it to me? I have zero knowledge about compiling and therefore I can't do it myself and instead I ask. I use x64 Win7 if that helps. Regards


I asked Roberto from RareWares if he can put the binaries on his site.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-10-29 09:53:07
Quote
I asked Roberto from RareWares if he can put the binaries on his site.

Good news. The binaries will be available for download soon.

With the initial version of the FLAC plugin it's possible to batch-convert FLAC files (44.1kHz/2ch/16bit) directly to MP3.

The command-line for FLAC to MP3 VBR is:

FPSTREAM filemask *.flac ( readfile flac*i -f "*fp" + flacdec fdec*i -s flac*i + fpmp3enc menc*i -s fdec*i --vbr-new + fpwritemp3file mp3*i -s menc*i -f "*n.mp3" )

You can also add another MP3 encoding task by extending the command line before the ')':

+ fpmp3enc <an ID1>*i -s fdec*i [MP3 options] + fpwritemp3file <an ID2>*i -s <an ID1>*i -f <file name>

Performance is quite good...
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-11-03 05:39:42
The download link is: http://www.rarewares.org/mp3-others.php#fpmp3enc

Many thanks to Roberto from RareWares.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: Cokemonkey11 on 2009-11-03 06:08:59
Any chance in a GUI frontend? I'm interested in using this, and I think it will fill a gap in the market.

Vorbis/FLAC/MP3 should be the top priority!

Cheers,
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: sld on 2009-11-03 06:43:09
The figures thrown around here are fantastic. Many thanks to George for reviving LAME development with regards to speed.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-11-03 06:46:25
Quote
Any chance in a GUI frontend? I'm interested in using this, and I think it will fill a gap in the market.

Yes, but it will take some time.

Quote
Vorbis/FLAC/MP3 should be the top priority!

Yes, I have already planned full support for these formats in 2009.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: punkrockdude on 2009-11-03 07:10:07
Quote
Any chance in a GUI frontend? I'm interested in using this, and I think it will fill a gap in the market.

Vorbis/FLAC/MP3 should be the top priority!

Cheers,

A GUI would be awesome. It would be enough (for me at least) to have a version that lets us write the normal LAME switches and translates them into the switches used here.

Best Regards
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: Cokemonkey11 on 2009-11-06 01:45:12
Quote
Any chance in a GUI frontend? I'm interested in using this, and I think it will fill a gap in the market.

Yes, but it will take some time.

Quote
Vorbis/FLAC/MP3 should be the top priority!

Yes, I have already planned full support for these formats in 2009.


Awesome! If your encoder is cross-platform I'd highly recommend GTK+ for GUI development.

I'll be sure to donate to the project as a way of saying thanks when the time comes.

Cheers,
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-11-06 09:30:18
Quote
Awesome! If your encoder is cross-platform I'd highly recommend GTK+ for GUI development.

Right now, the encoder supports only Windows OSes (recommended Vista x64, 7 x64) so I will probably use the .NET/WPF framework for the GUI. But Linux (x64) support is planned for Q1/2010.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: Brent on 2009-11-06 12:32:24
Have you thought about using .net/mono with gtk#? Will save you time when porting to linux.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: Xire on 2009-11-06 12:58:00
Any plans to support STDIN/STDOUT?
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-11-06 13:57:42
Quote
Have you thought about using .net/mono with gtk#? Will save you time when porting to linux.


That will not work because I want to use "Blend" for designing the GUI, for self-education.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-11-06 14:08:43
Quote
Any plans to support STDIN/STDOUT?

I will add stdin/stdout support in one of the next versions.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: itisljar on 2009-11-06 16:58:35
C2Q, 32-bit encoder, 32-bit Windows 7. I know it says not supported... but:

Code:
Problem signature:
  Problem Event Name:    APPCRASH
  Application Name:    fpMP3Enc.exe
  Application Version:    0.0.0.0
  Application Timestamp:    4ae7fca8
  Fault Module Name:    fpMP3Enc.exe
  Fault Module Version:    0.0.0.0
  Fault Module Timestamp:    4ae7fca8
  Exception Code:    c0000005
  Exception Offset:    00016a50
  OS Version:    6.1.7600.2.0.0.256.48
  Locale ID:    1050
  Additional Information 1:    7dbb
  Additional Information 2:    7dbb9888de14eff1acc0059d88e00f09
  Additional Information 3:    2a64
  Additional Information 4:    2a64067aa8f6e15f007b8e41b2bd4c3e


It worked at home on a C2D processor, with the same OS, even from the same installation media.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-11-06 17:22:01
Quote
C2Q, 32 bit encoder, 32 bit windows 7, i know it says not supported... but:
...
It worked at home on C2D processor, the same OS, even from the same installation media

From my initial post:

Quote
- Win32 version: The total WAV file size sum must not exceed 400MiB.

Maybe this was the problem?

The reason why 32-bit is not supported is that the application reserves a lot of virtual memory and often exceeds the 2 GiB limit for 32-bit.
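The 400 MiB rule for the Win32 build can be checked before starting a batch. This is only a minimal sketch and not code from fpMP3Enc: `fitsWin32Limit` and `kWin32LimitBytes` are hypothetical names, and the caller is assumed to have collected the WAV file sizes already.

```cpp
#include <cstdint>
#include <vector>

// 400 MiB, the limit quoted for the Win32 version in the initial post.
const std::uint64_t kWin32LimitBytes = 400ull * 1024ull * 1024ull;

// Returns true if the summed WAV sizes do not exceed the 400 MiB limit.
// (Hypothetical helper; the real encoder may account for memory differently.)
inline bool fitsWin32Limit(const std::vector<std::uint64_t>& wavSizesInBytes) {
    std::uint64_t total = 0;
    for (std::uint64_t size : wavSizesInBytes) {
        total += size;
    }
    return total <= kWin32LimitBytes;
}
```

With this check, a single 500 MiB image would be rejected up front, while the same album split into tracks could be fed in batches that each stay below the limit.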
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: Alexxander on 2009-11-06 17:58:14
It worked at home on C2D processor, the same OS, even from the same installation media

It didn't work on my 32-bit Win7 on an Intel C2D with WAV files of about 40 MB; an MP3 file of 0 bytes is created.

Code:
Faulting application name: fpMP3Enc.exe, version: 0.0.0.0, time stamp: 0x4ae7fca8
Faulting module name: fpMP3Enc.exe, version: 0.0.0.0, time stamp: 0x4ae7fca8
Exception code: 0xc000001d
Fault offset: 0x0000bfe7
Faulting process id: 0x120c
Faulting application start time: 0x01ca5c677d812eb7
Faulting application path: F:\Progs\fpMP3Enc\x86 (not supported)\fpMP3Enc.exe
Faulting module path: F:\Progs\fpMP3Enc\x86 (not supported)\fpMP3Enc.exe
Report Id: bc443a97-c85a-11de-bb11-000ea6f7d2d1
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-11-06 19:57:56
Exception code: 0xc000001d

Now I see: The binary was compiled with SSE4 support, but C2Ds (except "Wolfdale") don't support this instruction set. Exception code 0xc000001d means "illegal instruction".

I will think about a solution.

People who have VS2008 installed can solve this problem by building the project without SSE4 support (see "#define USE_SSE4" in stdafx.h).
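An alternative to a separate non-SSE4 build would be a runtime check at startup. The following is a minimal sketch under assumptions, not the fpMP3Enc code: `hasSse41FromEcx` and `cpuSupportsSse41` are hypothetical names. The only fact relied on is that CPUID leaf 1 reports SSE4.1 in bit 19 of ECX.

```cpp
#include <cstdint>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
#define FP_HAVE_CPUID 1
#elif (defined(__GNUC__) || defined(__clang__)) && \
      (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define FP_HAVE_CPUID 1
#else
#define FP_HAVE_CPUID 0
#endif

// CPUID leaf 1 reports SSE4.1 support in bit 19 of ECX.
inline bool hasSse41FromEcx(std::uint32_t ecx) {
    return ((ecx >> 19) & 1u) != 0;
}

// Hypothetical helper: true only if the running CPU reports SSE4.1.
inline bool cpuSupportsSse41() {
#if FP_HAVE_CPUID && defined(_MSC_VER)
    int regs[4] = {0, 0, 0, 0};
    __cpuid(regs, 1);  // regs = {EAX, EBX, ECX, EDX}
    return hasSse41FromEcx(static_cast<std::uint32_t>(regs[2]));
#elif FP_HAVE_CPUID
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        return false;  // leaf 1 not available
    }
    return hasSse41FromEcx(ecx);
#else
    return false;      // not an x86 build, so no SSE4.1
#endif
}
```

A program could call `cpuSupportsSse41()` once and either select a plain code path or exit with a readable message instead of crashing. This only helps if the SSE4 instructions are confined to functions that are not executed on unsupported CPUs.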
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: itisljar on 2009-11-06 21:11:07
From my initial post:
Quote
- Win32 version: The total WAV file size sum must not exceed 400MiB.

Maybe this was the problem?
The reason why 32-bit is not supported is that the application reserves a lot of virtual memory and often exceeds the 2 GiB limit for 32-bit.


Yep, that is the problem - the file was 500+ MB. OK. Too bad it doesn't work with bigger WAV files - well, the option is to encode tracks instead of images.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: smeargol on 2009-11-09 23:34:05
Very good work!
I would like to try it out, but so far the compiled version from Roberto has crashed on me every time I tried to use it   
I think the problem is my CPU, as I have an AMD Phenom2, which doesn't support the SSE4 instruction set. 

So, is there any possibility of getting a non-SSE4 binary?

I have been trying to compile it myself, but I have pretty much no idea what I'm doing, I just wanted to try this really badly ^^

Thank you very much in advance!


/me is off to uninstall MSVC++ Express Edition...
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-11-10 06:08:48
So, is there any possibility of getting a non-SSE4 binary?

I have been trying to compile it myself, but I have pretty much no idea what I'm doing, I just wanted to try this really badly ^^

I must admit that building the binaries from version 17 is a bit difficult since some 3rd party files are missing. I will add them in the next release.

I will also add a text file with step-by-step instructions for non-programmers that will help to build the project - starting with the MSVC++ download 
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: smeargol on 2009-11-10 13:01:56
I must admit that building the binaries from version 17 is a bit difficult since some 3rd party files are missing. I will add them in the next release.

I will also add a text file with step-by-step instructions for non-programmers that will help to build the project - starting with the MSVC++ download 


Thank you very much for the quick reply!

It is very nice of you to include instructions on how to compile, I just hope the next release is soon, I can't wait to try it out


So, thanks again, looking forward to the next release 
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-11-18 17:01:12
It is very nice of you to include instructions on how to compile, I just hope the next release is soon, I can't wait to try it out

Bad news: The Visual C++ 2008 Express Edition does not accept the solution file that was created by the Professional Edition.

But it's possible to download a trial version of the Professional Edition. I will try it in the next few days.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: smeargol on 2009-11-20 21:58:10
Bad news: The Visual C++ 2008 Express Edition does not accept the solution file that was created by the Professional Edition.

But it's possible to download a trial version of the Professional Edition. I will try it in the next few days.


Thanks for keeping us updated! Even bad news is news, and that you've already found an alternative is great to hear too!

I hope the trial isn't as restricted as the Express Edition, as in "dear god I hope it will compile on that" 

Good luck!
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: jamesbaud on 2009-12-03 07:29:14
I asked Roberto from RareWares if he can put the binaries on his site.

Good news. The binaries will be available for download soon.

With the initial version of the FLAC plugin it's possible to batch-convert FLAC files (44.1kHz/2ch/16bit) directly to MP3.

The command-line for FLAC to MP3 VBR is:

FPSTREAM filemask *.flac ( readfile flac*i -f "*fp" + flacdec fdec*i -s flac*i + fpmp3enc menc*i -s fdec*i --vbr-new + fpwritemp3file mp3*i -s menc*i -f "*n.mp3" )

You can also add another MP3 encoding task by extending the command line before the ')':

+ fpmp3enc <an ID1>*i -s fdec*i [MP3 options] + fpwritemp3file <an ID2>*i -s <an ID1>*i -f <file name>

Performance is quite good...


Could you provide a screenshot to configure foobar to use fpMP3Enc?

See my other request also:
http://www.hydrogenaudio.org/forums/index.php?showtopic=76193&view=findpost&p=671021

Thanks.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: sandhuatha on 2009-12-05 20:59:13
Thanks a lot for this utility. I am very happy to see my Core i7 / 6GB machine used to its max.

How do I get this tool to recursively go into folders, find the *.flac files and convert them to mp3 in the same folder?
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-12-05 22:51:26
How do I get this tool to recursively go into folders, find the *.flac files and convert them to mp3 in the same folder?

The command-line has to be extended for that. I've added it to my list.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: edwardar on 2009-12-30 16:16:54
Any chance someone could give a foobar2000 command line for this?  I can never work out the whole pipeline thing!

I want to go from FLAC to MP3 as fast as possible!

Thanks.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2009-12-31 09:08:10
Any chance someone could give a foobar2000 command line for this?  I can never work out the whole pipeline thing!

I want to go from FLAC to MP3 as fast as possible!

Thanks.

The tool 'fpFLAC2MP3' is the one you need. AFAIK you cannot use it with foobar2000.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: -sanb- on 2010-01-02 14:06:52
How can I use them with Winamp?
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: flapane on 2010-01-18 21:00:23
Will you add mp3 transcoding (mp3 to mp3) support?
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2010-01-18 21:41:20
Will you add mp3 transcoding (mp3 to mp3) support?

Currently, there are no plans to add new features to MP3.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: LordCorvin on 2010-01-23 19:06:05
Sorry if this has been asked already, but are there any objective tests of the quality of the MP3s produced? As I understand it, the process touches very low levels of the encoder. Is it safe to use for everyday purposes, and not just for testing?

Thanks.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2010-01-23 19:46:07
Sorry if this has been asked already, but are there any objective tests of the quality of the MP3s produced? As I understand it, the process touches very low levels of the encoder. Is it safe to use for everyday purposes, and not just for testing?

The quality should be exactly the same as LAME since I didn't modify the algorithm. I just split the serial code into asynchronous parts and let them execute in parallel. All encoding features including bit reservoir are still present.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: LordCorvin on 2010-01-23 21:27:20
Sorry if this has been asked already, but are there any objective tests of the quality of the MP3s produced? As I understand it, the process touches very low levels of the encoder. Is it safe to use for everyday purposes, and not just for testing?

The quality should be exactly the same as LAME since I didn't modify the algorithm. I just split the serial code into asynchronous parts and let them execute in parallel. All encoding features including bit reservoir are still present.


So, in theory, I should get bitwise-equal output from your version and the original one? If that's true, it's indeed safe to use it without any quality tests.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2010-01-23 22:04:20
So, in theory, I should get bitwise-equal output from your version and the original one?

Yes, if you compare the floats (with a small error) instead of the bits.
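Comparing "with a small error" can be sketched as follows. This is an illustration only, assuming both MP3 files have already been decoded to float PCM. The function name `samplesMatch` and the choice of an absolute tolerance are assumptions, not something taken from fpMP3Enc or LAME.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Returns true if both buffers have the same length and every pair of
// samples differs by at most `tolerance` (absolute error).
inline bool samplesMatch(const std::vector<float>& a,
                         const std::vector<float>& b,
                         float tolerance) {
    if (a.size() != b.size()) {
        return false;
    }
    for (std::size_t i = 0; i < a.size(); ++i) {
        // Written as !(x <= tol) so that a NaN difference counts as a mismatch.
        if (!(std::fabs(a[i] - b[i]) <= tolerance)) {
            return false;
        }
    }
    return true;
}
```

A suitable tolerance depends on the sample scale. For samples normalized to [-1, 1], something near the 16-bit quantization step would be a plausible starting point, but that value is a guess.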
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: q_b6 on 2010-01-24 01:12:09
Doesn't foobar2000 already do this if you have more than one core in your computer?  I was getting similar numbers on a quad-core machine when I had to do some coding for my dad.  My Core Duo goes about 43x if it's not running warm.

I cannot confirm this on my quad core. Converting the test set with foobar2000 took me about 6 minutes which gave a 47.4x speed.  (BTW, the test was encoding to CBR 128.)

Running two encodes in parallel results in essentially perfect parallelization.  The only overhead comes from disk contention, which is still a problem for the multithreaded single process case anyway.

This is not a problem in "fpMP3Enc". I/O processing is a separate task and is controlled by a file I/O scheduler that sorts and serializes I/O operations on the same drive. The encoder tasks work only on memory.

That's the reason why foobar2000 has a 47.4x performance while "fpMP3Enc" has 80.2x. I think, on an i7 it's possible to get 150x and above... unfortunately I don't have such a system to test it.

Quote
Running one process will encounter additional overhead due to thread synchronization, lack of granularity in parallelism, overhead for inter-thread communication, etc.  In order to make up for this, there would have to be additional work saved by running in one process and I don't see what that would be for MP3.

As you mentioned above, concurrent disk access is a problem when you execute multiple encoder processes in parallel. If you run them as tasks in one process you can use a file I/O scheduler that takes care of it, as I did.


I think fb2k's relatively poor performance is somehow due to Windows's poor pipe performance.
Could you write an fb2k-compatible version and compare it with LAME?
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2010-01-24 18:11:26
I think fb2k's relatively poor performance is somehow due to Windows's poor pipe performance.

I don't know how foobar2000 handles concurrent file access, but it seems that I've found a better way by using an I/O scheduler.
Quote
Could you write an fb2k-compatible version and compare it with LAME?

In the test, I compared fpMP3Enc with foobar2000+LAME.
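The I/O scheduler idea (sort the operations and serialize those that hit the same drive) can be sketched roughly like this. It is not the Fiber Pool implementation. The types `IoRequest` and `IoKind`, the function `buildDriveQueues`, and the "reads before writes, then by path" ordering are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

enum IoKind { IoRead = 0, IoWrite = 1 };

struct IoRequest {
    char drive;        // e.g. 'C'; hypothetical way to identify a drive
    IoKind kind;       // read (WAV input) or write (MP3 output)
    std::string path;
};

// Assumed order within one drive: all reads first, then writes; ties by path.
inline bool ioBefore(const IoRequest& a, const IoRequest& b) {
    if (a.kind != b.kind) {
        return a.kind < b.kind;
    }
    return a.path < b.path;
}

// Groups requests per drive and sorts each group. One worker per drive can
// then process its queue strictly one request at a time, so requests on the
// same drive never compete, while different drives can run in parallel.
inline std::map<char, std::vector<IoRequest> >
buildDriveQueues(const std::vector<IoRequest>& requests) {
    std::map<char, std::vector<IoRequest> > queues;
    for (const IoRequest& request : requests) {
        queues[request.drive].push_back(request);
    }
    for (auto& entry : queues) {
        std::stable_sort(entry.second.begin(), entry.second.end(), ioBefore);
    }
    return queues;
}
```

The encoder tasks would then only exchange in-memory buffers with these queues and never touch the disk themselves.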
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: flapane on 2011-06-18 17:55:34
Hi,
could you explain to me why it runs single-threaded when encoding a single file?
Is it related to how LAME works?
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2011-06-18 20:14:15
By design, the encoder will run with at least three threads (1 CPU, 2 I/O). If you don't set the number of CPU threads in the command-line, the number of threads will be <number of processors> + 2.

If the encoding feels single-threaded, then either you're using CBR or ABR, or the number of threads is too high. Currently, using more than three CPU threads would slow down the encoding.
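The thread-count rule described above (at least 1 CPU thread plus 2 I/O threads, defaulting to the number of processors plus 2) amounts to the following arithmetic. This is a sketch with hypothetical names (`totalThreads`, `kIoThreads`), and treating 0 as "not set on the command line" is an assumption.

```cpp
#include <thread>

// Two I/O threads, as described in the post.
const unsigned kIoThreads = 2;

// requestedCpuThreads == 0 means "not set on the command line" (assumption).
inline unsigned totalThreads(unsigned requestedCpuThreads, unsigned processors) {
    unsigned cpu = (requestedCpuThreads != 0) ? requestedCpuThreads : processors;
    if (cpu == 0) {
        cpu = 1;  // always keep at least one CPU thread
    }
    return cpu + kIoThreads;
}

// Default case: use the processor count reported by the standard library,
// which may be 0 if it cannot be determined.
inline unsigned defaultTotalThreads() {
    return totalThreads(0, std::thread::hardware_concurrency());
}
```

On a quad core this gives 6 threads by default. Limiting the CPU threads to 3, as suggested above, gives 5.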
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: JJZolx on 2012-05-21 02:10:26
Any thoughts to updating this to make it based on LAME 3.99.5?
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2012-05-21 19:17:27
Any thoughts to updating this to make it based on LAME 3.99.5?

Ha, coincidentally, two days ago I started porting LAME 3.99.5 to C++ using my new multicore framework. It will take some time...
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: lar1r on 2012-08-10 21:48:33
Any thoughts to updating this to make it based on LAME 3.99.5?

Ha, coincidentally, two days ago I started porting LAME 3.99.5 to C++ using my new multicore framework. It will take some time...


Interesting.  Will it be able to compile using the latest VS2012?
Really looking forward to this.
Any idea how it does on newer CPUs (like my FX CPU)?

Thx
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: GeorgeFP on 2012-08-11 11:33:24
Interesting.  Will it be able to compile using the latest VS2012?
Really looking forward to this.
Any idea how it does on newer CPUs (like my FX CPU)?


Currently, I'm working with VS2010 but I will switch to VS2012 when it's available.

At the moment I cannot say anything about the performance but the plan is to get a 4x speedup on an Intel 2600K.
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: lar1r on 2012-08-13 15:24:50
Interesting.  Will it be able to compile using the latest VS2012?
Really looking forward to this.
Any idea how it does on newer CPUs (like my FX CPU)?


Currently, I'm working with VS2010 but I will switch to VS2012 when it's available.

At the moment I cannot say anything about the performance but the plan is to get a 4x speedup on an Intel 2600K.


Sounds impressive. I'll be checking back daily on this thread! Can't wait.

FYI - VS2012 RC is available here: http://www.microsoft.com/visualstudio/11/en-us/downloads
if you ever want to try it. (It'll be ideal for Win8 and newer CPU architectures.)
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: JJZolx on 2014-01-17 15:25:17
George, has there been any progress on an update to use LAME 3.99.5?
Title: fpMP3Enc: a multi-core MP3 encoder based upon LAME 3.98.2
Post by: goa pride on 2014-01-17 21:48:11
George, has there been any progress on an update to use LAME 3.100a?