FLAC unicode patch: some help wanted

Hi all,

As some of you might know, FLAC and Unicode support on Windows have a bit of a past. Over the last few years, various Windows-specific bits were added and changed. Some of these changes were visible from the outside (i.e. in the functions exposed by the DLL, the interface); most of them were not.

Anyway, I've proposed yet another change that I hope will be the last one visible on the DLL interface. With this change, the libFLAC interface should be the same on all platforms. However, it involves a change that might not be fully backward compatible: filenames passed to libFLAC have always been UTF-8 on all platforms except Windows. On Windows, the encoding depended on the active code page. With this change, filenames passed to libFLAC are always UTF-8, on Windows as well.

I don't think this change will affect a lot of programs, because Unicode-aware programs probably weren't using these functions at all (they weren't usable with Unicode on Windows), and I expect non-Unicode-aware programs using libFLAC are extinct.
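To make the code page versus UTF-8 difference described above concrete, here is a minimal Python sketch. It is not libFLAC code; it only shows which bytes a caller would end up passing as a filename under each convention. The helper name `encode_filename` is hypothetical, and cp1252 merely stands in for "the active code page", which differs from system to system.

```python
from typing import Optional


def encode_filename(name: str, encoding: str) -> Optional[bytes]:
    """Return the bytes a caller would hand over as a char* filename,
    or None if the name cannot be represented in that encoding."""
    try:
        return name.encode(encoding)
    except UnicodeEncodeError:
        return None


if __name__ == "__main__":
    # cp1252 is only an example of a legacy Windows code page (assumption).
    for name in ["track.flac", "Hliðskjálf.flac", "青木ヶ原樹海.flac"]:
        print(name, encode_filename(name, "cp1252"), encode_filename(name, "utf-8"))
```

Pure ASCII names come out identical either way, which is why many programs would not notice the change. Names with accented letters produce different bytes, and names outside the code page cannot be expressed in it at all.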

That's why I'd like to ask for your help: I've prepared a libFLAC.dll with quite a few patches that are not yet applied to git. Patches included are:
- UTF-8 on Windows change (which is what this topic is about)
- Compression improvement patch, discussed on HA here
- Bug fix, discussed on HA here
- Fixed subframe speed improvement
- Some build system improvements (which made it possible for me to build this DLL with MinGW)

If you know a program that uses libFLAC as a DLL, please replace that DLL with this one, test whether the program can still encode and decode files with non-ASCII characters (Cyrillic, Greek, Hanzi, emoji, etc.) in their filenames, and let me know.
Music: sounds arranged such that they construct feelings.

Re: FLAC unicode patch: some help wanted

Reply #1
Is it a 64-bit DLL? Would it be possible to build and share a 32-bit one?


Re: FLAC unicode patch: some help wanted

Reply #3
Yes, the one in the first post was 64-bit. Didn't occur to me that a 32-bit version is probably very handy to test with programs that are a little less recent. Here is the 32-bit version.

Re: FLAC unicode patch: some help wanted

Reply #4
Quote
- UTF-8 on Windows change (which is what this topic is about)
- Compression improvement patch, discussed on HA here
- Bug fix, discussed on HA here
- Fixed subframe speed improvement
- some build system improvement (which made it possible for me to build this DLL with MinGW)

What's taking so long for these to be merged, anyway?

Also, could you provide flac.exe as well instead of just a .dll?

Re: FLAC unicode patch: some help wanted

Reply #5
The point of this discussion is to test the DLL interface for compatibility, so providing an exe would be off-topic and probably derail the discussion.

It seems the FLAC maintainer hasn't had the time or motivation to work on FLAC. He merged a bunch of PRs on March 15, 2021, and his last commits before that are from May 14, 2020. Someone else involved with Xiph/Mozilla has merged a few PRs, but the last activity (merging, commenting, etc.) by anyone with write access to the FLAC repository was well over 6 months ago.

Re: FLAC unicode patch: some help wanted

Reply #6
Quote
Seems the FLAC maintainer hasn't got time/motivation to work on FLAC. He merged a bunch of PRs on March 15 2021, and his last commits before that are from May 14 2020. Someone else involved with Xiph/Mozilla has merged a few PRs, but the last activity (merging, commenting etc.) by anyone with write access to the FLAC repository has been well over 6 months ago.

Well that's a shame. Might be time for a fork.

Re: FLAC unicode patch: some help wanted

Reply #7
I tested the 32-bit DLL with qaac, refalac, freac and xrecode 3 on 32-bit Windows 7 with a file named "青木ヶ原樹海-Hliðskjálf-Δ δ-λάμβδα" and it works.
The DLL also works with SoX (for non-Unicode filenames only, because SoX itself doesn't support Unicode).

Re: FLAC unicode patch: some help wanted

Reply #8
Thanks for posting the results. I myself tested CDex and CUETools, both of which worked fine. It seems various developers have taken the warning in the API documentation seriously and have not used FLAC__stream_decoder_init_file and FLAC__stream_encoder_init_file.

Quote
If POSIX fopen() semantics are not sufficient, (for example, with Unicode filenames on Windows), you must use FLAC__stream_decoder_init_FILE(), or FLAC__stream_decoder_init_stream() and provide callbacks for the I/O.

With this patch it becomes

Quote
On Windows, filename must be a UTF-8 encoded filename, which libFLAC internally translates to an appropriate representation to use with _wfopen
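The translation step mentioned in that note can be illustrated with a short Python sketch. This is not libFLAC's implementation (which is in C and not shown in this thread); it only demonstrates the general idea of turning UTF-8 bytes into the 16-bit code units that wide-character functions such as _wfopen work with. The function name `utf8_to_utf16_units` is hypothetical.

```python
import struct
from typing import List


def utf8_to_utf16_units(filename_utf8: bytes) -> List[int]:
    """Decode a UTF-8 filename and return its UTF-16 code units.

    Raises UnicodeDecodeError if the bytes are not valid UTF-8, which is
    what typically happens with names encoded in a legacy code page.
    """
    text = filename_utf8.decode("utf-8")
    raw = text.encode("utf-16-le")
    # Each UTF-16 code unit is one little-endian unsigned 16-bit value.
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))


if __name__ == "__main__":
    for name in ["Δ δ.flac", "\U0001F3B5.flac"]:
        units = utf8_to_utf16_units(name.encode("utf-8"))
        print(name, [hex(u) for u in units])
```

Characters outside the Basic Multilingual Plane, such as most emoji, become two code units (a surrogate pair). A filename passed in a legacy code page containing non-ASCII bytes will usually fail the UTF-8 decoding step, which is the backward-compatibility risk discussed in the first post.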

Re: FLAC unicode patch: some help wanted

Reply #9
Quote
Seems the FLAC maintainer hasn't got time/motivation to work on FLAC. He merged a bunch of PRs on March 15 2021, and his last commits before that are from May 14 2020. Someone else involved with Xiph/Mozilla has merged a few PRs, but the last activity (merging, commenting etc.) by anyone with write access to the FLAC repository has been well over 6 months ago.
Did you try just asking those guys (rillian and erikd) to simply commit whatever you want? You seem to be the person most involved in FLAC now, and competent enough to prepare the next release, and they probably know this.

Re: FLAC unicode patch: some help wanted

Reply #10
Quote
Did you try just asking those guys (rillian and erikd) to simply commit whatever you want? You seem to be the person most involved in FLAC now, and competent enough to prepare the next release, and they probably know this.

Agreed. @ktf, if the previous/current maintainers don't have the time or motivation to maintain the repository but you do, they should just make you the new maintainer. I have no idea how one goes about requesting to become the FLAC repository maintainer, but the amount of work and dedication you've put into developing FLAC over the past several months is exceptional and should be qualification enough.

Re: FLAC unicode patch: some help wanted

Reply #11
I've been busy reaching out and have high hopes a new release will be out soon.

Re: FLAC unicode patch: some help wanted

Reply #12
About unicode...
Is it possible to make the decimal separator (dot or comma) in flac.exe options independent of the system locale, so that the dot is the only accepted variant? This dependence seems very stupid, given that most command-line tools don't have such a dependence nowadays.

Re: FLAC unicode patch: some help wanted

Reply #13
Quote
Is it possible to make the decimal separator (dot or comma) in flac.exe options independent of the system locale, so that the dot is the only accepted variant?

flac.exe understands e.g. 4e-1 for zero point four.

That should be in the documentation.
Last two months' worth of foobar2000.org ad revenue has been donated to support war refugees from Ukraine: https://www.foobar2000.org/

Re: FLAC unicode patch: some help wanted

Reply #14
Quote
Is it possible to make the decimal separator (dot or comma) in flac.exe options independent of the system locale, so that the dot is the only accepted variant?
I suspect there are users that depend on this behaviour, I'd rather not change it.

Quote
This dependence seems very stupid, given that most command-line tools don't have such a dependence nowadays.
I think it is very stupid to ignore the locale. Why would a user set a locale if applications ignore it anyway? The real problem here is that programs are inconsistent: a bunch of them ignore the locale and others do not. I'd rather let users decide what they want; if they don't want the comma as a decimal separator, they can change their locale.

As @Porcus said, using scientific notation provides a locale-independent way to set this.
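Why scientific notation sidesteps the issue can be shown with a small Python sketch. It does not reproduce flac's actual option parser; `parse_number` is a hypothetical helper that imitates a locale-aware parser by accepting only one decimal separator, and unlike a real C parser it rejects the wrong separator outright instead of stopping at it.

```python
def parse_number(text: str, decimal_point: str) -> float:
    """Parse text as a locale-aware parser might, accepting only
    decimal_point ("." or ",") as the decimal separator."""
    if decimal_point not in (".", ","):
        raise ValueError("decimal_point must be '.' or ','")
    other = "," if decimal_point == "." else "."
    if other in text:
        raise ValueError("unexpected separator %r in %r" % (other, text))
    # Normalise to a dot so Python's float() can do the actual conversion.
    return float(text.replace(decimal_point, "."))


if __name__ == "__main__":
    print(parse_number("0.4", "."))   # dot locale
    print(parse_number("0,4", ","))   # comma locale
    print(parse_number("4e-1", "."))  # no separator at all
    print(parse_number("4e-1", ","))  # same result in either locale
```

Because 4e-1 contains no decimal separator, it is read the same way whichever separator the locale expects, whereas 0.4 and 0,4 each work under only one of the two settings.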

Re: FLAC unicode patch: some help wanted

Reply #15
So maybe it is worth at least making this clear in flac --explain? Both the locale dependence and the "scientific notation" workaround.

 

Re: FLAC unicode patch: some help wanted

Reply #16
Yeah, both flac --explain and https://xiph.org/flac/documentation_tools_flac.html should have e.g.
Code: [Select]
  -6, --compression-level-6          Synonymous with -l 8 -b 4096 -m -r 6
                                        -A tukey(5e-1) -A partial_tukey(2)
  -7, --compression-level-7          Synonymous with -l 12 -b 4096 -m -r 6
                                         -A tukey(5e-1) -A partial_tukey(2)
  -8, --compression-level-8, --best  Synonymous with -l 12 -b 4096 -m -r 6
                    -A tukey(5e-1) -A partial_tukey(2) -A punchout_tukey(3)

... and that is what they are actually synonymous with  ;)

(Whether or not the functions should be given with multiple -A or with semicolon ...)