"Latinize" or "Deaccent" filenames?

Hi,

I searched the wiki and the forum but couldn't find anything. Either the functionality isn't there or I used the wrong keywords.

Is there a way to make CUETools and CUERipper use only standard Latin ASCII characters in the names of the files they create? Some filesystems still have problems handling national characters in filenames.

Re: "Latinize" or "Deaccent" filenames?

Reply #1
'Force ANSI filenames' is enabled by default in CUETools (in CUERipper it is FilenamesANSISafe=1 in \CUERipper\settings.txt).
To also remove special characters that ANSI allows, tick 'Remove special characters except:' in CUETools (RemoveSpecialCharacters=1 in \CUERipper\settings.txt).
Any characters you want to keep can be added, without spaces, in the box below the setting (SpecialCharactersExceptions=-() in \CUERipper\settings.txt). Note: < > : " / \ | ? * are restricted by Windows, so they cannot be kept as exceptions.
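
Put together, the three CUERipper settings named above would look roughly like this in \CUERipper\settings.txt. This is only a sketch assembled from the values quoted in this reply; it assumes one key=value pair per line and that the rest of the file is left untouched.

```text
FilenamesANSISafe=1
RemoveSpecialCharacters=1
SpecialCharactersExceptions=-()
```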

The 'Remove special characters' setting replaces every character other than a-z, A-Z, 0-9, space or underscore with the underscore '_' character. There are no configurable substitutions (currently).
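
To make the rule concrete, here is a minimal C# sketch of the behaviour as described in this reply. It is not taken from the CUETools source: the class and method names are hypothetical, and it assumes the exceptions string is simply a list of extra characters to keep.

```csharp
using System;
using System.Text;

static class FilenameSketch
{
    // Hypothetical helper (not CUETools code): keeps a-z, A-Z, 0-9, space,
    // underscore and any character listed in 'exceptions'; everything else
    // becomes '_'.
    public static string RemoveSpecial(string name, string exceptions = "")
    {
        var sb = new StringBuilder(name.Length);
        foreach (char c in name)
        {
            bool keep = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                     || (c >= '0' && c <= '9') || c == ' ' || c == '_'
                     || exceptions.IndexOf(c) >= 0;
            sb.Append(keep ? c : '_');
        }
        return sb.ToString();
    }

    static void Main()
    {
        // "Dvořák - Symphony No. 9 (Live)" with the exceptions "-()"
        Console.WriteLine(RemoveSpecial("Dvo\u0159\u00E1k - Symphony No. 9 (Live)", "-()"));
        // prints: Dvo__k - Symphony No_ 9 (Live)
    }
}
```

Accented letters are not mapped to their base letters here; they turn into underscores like any other non-listed character.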
korth

Re: "Latinize" or "Deaccent" filenames?

Reply #2
Thanks for your answer.

Removing special characters is a workaround of sorts, but the resulting filenames aren't "pretty".

I used the following simple code to "deaccent" strings in C#. I am not sure how easy it would be to port this to CUETools:

Code:
        // Requires: using System.Globalization; using System.Text;
        public string DoDeaccent(string aVal)
        {
            if ( aVal == null )
                return null;

            // FormD splits each accented letter into a base letter plus combining marks.
            var normalizedString = aVal.Normalize(NormalizationForm.FormD);
            var stringBuilder = new StringBuilder(normalizedString.Length);

            foreach ( var c in normalizedString ) {
                // Keep everything except the combining (non-spacing) marks.
                if ( CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark ) {
                    stringBuilder.Append(c);
                }
            }

            // Recompose whatever is left into the usual precomposed form.
            return stringBuilder.ToString().Normalize(NormalizationForm.FormC);
        }
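
One limit of this approach is worth showing: letters that have no canonical decomposition (for example Ł, ø or ß) pass through unchanged, so the result is not guaranteed to be pure ASCII. The sketch below repeats the same technique as a static method and adds a fallback step that replaces any remaining non-ASCII character with '_'. The names Deaccent and ToAsciiOnly, and the choice of '_' as the replacement, are assumptions made for illustration, not anything CUETools does.

```csharp
using System;
using System.Globalization;
using System.Text;

static class DeaccentDemo
{
    // Same technique as DoDeaccent above, made static for a console demo.
    public static string Deaccent(string s)
    {
        var sb = new StringBuilder(s.Length);
        foreach (char c in s.Normalize(NormalizationForm.FormD))
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                sb.Append(c);
        }
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }

    // Hypothetical fallback: whatever is still outside ASCII becomes '_'.
    public static string ToAsciiOnly(string s)
    {
        var sb = new StringBuilder(s.Length);
        foreach (char c in Deaccent(s))
            sb.Append(c <= '\u007F' ? c : '_');
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Deaccent("Dvo\u0159\u00E1k"));        // Dvořák -> Dvorak
        Console.WriteLine(ToAsciiOnly("\u0141\u00F3d\u017A"));  // Łódź   -> _odz
    }
}
```

A transliteration table, as in the Unidecode family of libraries, is the usual way to turn such leftovers into readable ASCII instead of underscores.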

Re: "Latinize" or "Deaccent" filenames?

Reply #3
The library that I used in C# is UnidecodeSharpFork:
https://bitbucket.org/DimaStefantsov/unidecodesharpfork/src/master/

The transliteration tables were originally written in Perl, and the code is now maintained in Python (as noted on that web page):
Quote
History

Original character transliteration tables on Perl:
2001, Sean M. Burke sburke@cpan.org
http://search.cpan.org/~sburke/Text-Unidecode-0.04/lib/Text/Unidecode.pm

Python code and later additions:
2011, Tomaz Solc tomaz.solc@tablix.org
http://pypi.python.org/pypi/Unidecode

Original C# port:
2010, Oleg Ussanov
http://unidecode.codeplex.com/

Maybe this can be included in CUETools in the future?

Re: "Latinize" or "Deaccent" filenames?

Reply #4
Out of curiosity, which filesystems are not capable of handling Unicode filenames? I never had problems with filenames; finding proper fonts to display them all can be tough, tho.
TAPE LOADING ERROR

Re: "Latinize" or "Deaccent" filenames?

Reply #5
Windows 95  :D