Topic: foo_clienc Command Line Encoder and non-ANSI

foo_clienc Command Line Encoder and non-ANSI

clienc seems restricted to passing ASCII-only characters to the commandline.  Does anyone know how to get it to pass Unicode or at least codepage 1252 (aka ISO-8859-1, aka Latin-1) characters in the fields such as %title%, etc.?

+Reardon

Reply #1
Quote
clienc seems restricted to passing ASCII-only characters to the commandline.

Since most console applications and even many GUI applications are not compiled to use Unicode entry points or GetCommandLineW to retrieve Unicode command line arguments, they would only support your local code page anyway.


Quote
Does anyone know how to get it to pass Unicode

It might be possible to make clienc use Unicode conventions for spawning external processes, but there's no guarantee that it will work.

Quote
codepage 1252 (aka ISO-8859-1, aka Latin-1) characters in the fields such as %title%, etc.

Code page 1252 is not Latin-1, but a Windows extension thereof. Latin-1 only covers 0x20-0x7E and 0xA0-0xFF, and maps directly to the same ranges in Unicode. Code page 1252 also assigns characters in the range 0x80-0x9F, which do not map to the same code point values in Unicode. That range varies by Windows code page.
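
To make the distinction concrete, here is a small Python sketch (Python is used only for illustration and has nothing to do with foo_clienc itself). It decodes every byte value with Python's built-in `latin-1` and `cp1252` codecs and reports where they disagree, which should be exactly the 0x80-0x9F range.

```python
# Compare how ISO-8859-1 (Latin-1) and Windows code page 1252 decode bytes.
# Uses only Python's built-in codecs; nothing here is specific to foo_clienc.

def differing_bytes():
    """Return the byte values that the two codecs decode differently."""
    diffs = []
    for value in range(0x100):
        raw = bytes([value])
        latin1 = raw.decode("latin-1")  # every byte maps to U+00xx
        # A few bytes in 0x80-0x9F are unassigned in Python's cp1252 table,
        # so decode with errors="replace" to avoid an exception there.
        cp1252 = raw.decode("cp1252", errors="replace")
        if latin1 != cp1252:
            diffs.append(value)
    return diffs

diffs = differing_bytes()
euro = b"\x80".decode("cp1252")      # the euro sign in code page 1252
o_slash = b"\xd8".decode("cp1252")   # identical to Latin-1 outside 0x80-0x9F

print(f"first differing byte: {min(diffs):#x}, last: {max(diffs):#x}")
print(f"0x80 in cp1252: U+{ord(euro):04X}")
print(f"0xD8 in cp1252: U+{ord(o_slash):04X}")
```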

Actually, everything should be converted to your system codepage, which in this case is 1252, and the command line encoders should be happy with that. If that is not the case, and it is actually converting to the stricter ISO-8859-1 or Latin-1, then you will be missing some characters that many applications will accept. I think...

I'll have to look over the foo_clienc source to be sure...

Reply #2
Quote
Since most console applications and even many GUI applications are not compiled to use Unicode entry points or GetCommandLineW to retrieve Unicode command line arguments, they would only support your local code page anyway.


Using Latin-1 (1252; Microsoft uses the description "Windows Latin 1" for this) or ISO-8859-1 (also known as "Latin-1") is fine, since even SBCS, non-Unicode, non-UTF-aware Windows GUI and Windows command-line apps will default to 1252, unless a cmd window code page is in effect.  Unfortunately, Windows defaults to the old OEM code page 437 in cmd-launched windows.

In the cli_enc script that I wrote, I actually use cmd.exe as the "external program" and then send a series of commands separated by && as the param line.  So I send "chcp 1252 && itunesencode -blahblah" as the param line.
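
Why the `chcp 1252` step can matter is easy to show with a short Python sketch (an illustration only, not part of the script described above): the same byte is a different character depending on which code page interprets it.

```python
# The same byte means different characters under different code pages.
# Illustration with Python's built-in codecs only.

BYTE = b"\xd8"

as_cp1252 = BYTE.decode("cp1252")  # Windows "ANSI" code page for Western Europe
as_cp437 = BYTE.decode("cp437")    # original IBM PC / OEM code page
as_cp850 = BYTE.decode("cp850")    # Western European DOS code page

for name, char in (("cp1252", as_cp1252), ("cp437", as_cp437), ("cp850", as_cp850)):
    print(f"0xD8 as {name}: {char!r} (U+{ord(char):04X})")
```

Only the cp1252 interpretation yields Ø; the two OEM code pages assign other characters to 0xD8.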

But the problem with cli_enc seems worse than botching the codepage translation.  It will not pass ANY character above 0x7F.  It just skips the char entirely.  I have tried this by leaving the cmd window codepage unchanged (437), changed to 850 (original EU Latin DOS set, if I remember), and 1252.

Also, as you know, 1252 is a proper superset of ISO-8859-1 with the hugely useful euro character added, among others.  So defaulting to 1252 for cli_enc seems appropriate.

Btw, I have tried itunesencode by hand and it accepts whatever quoted binary value you send it, including quoted UTF-8 values, without cp translation.

[edited for spelling]

+Reardon

Reply #3
Once again, CP1252 != Latin1. CP1252 and all other code pages are an extension of Latin1.

For the record, foo_clienc does support Unicode execution in Windows NT. (2000/XP/etc.) CMD.exe probably lacks Unicode support, and itunesencode probably does too.

Didn't somebody make a JScript based iTunes encoder and post it to this forum? Maybe you'll have better luck with that instead, assuming that whole system supports it.

Reply #4
From C:\>cmd /?

Code: [Select]
/U      Causes the output of internal commands to a pipe or file to be Unicode

From my own experience, WinNT cmd.exe has no problem with Unicode (make sure you are using a font with the necessary characters so that you can see the characters instead of just '?'); however, many console applications are at fault for ignoring anything beyond basic ASCII or in isolated cases extended ASCII.

Reply #5
My mistake. Although, it would appear the biggest stumbling block in this process is the itunesencode tool, which is probably an ANSI build. Even if it isn't, it might not be possible for it to feed Unicode text to iTunes.

Although, I don't see why feeding metadata to iTunes is of any consequence, as the tagging should be performed by clienc, using the installed MP4 input after the encoding finishes.

Reply #6
Just some observations:

Code: [Select]
D:\util\audio>cmd /U /K D:\util\audio\oggenc-gt3b2.exe -a ☻ -q4 test.wav -o test.ogg
Produces an ogg file with artist "?" sans quotes (☻ = 0x263b)

Code: [Select]
/U /K D:\util\audio\oggenc-gt3b2.exe -a ☻ -q4 %s -o %d
As a conversion string in foobar (with cmd.exe as the encoder), produces an ogg file with artist "-q4" sans quotes (☻ = 0x263b)

Code: [Select]
D:\util\audio>cmd /U /K D:\util\audio\oggenc-gt3b2.exe -a Ø -q4 test.wav -o test.ogg
Produces an ogg file with artist "Ø" sans quotes (Ø = 0xd8)

Code: [Select]
/U /K D:\util\audio\oggenc-gt3b2.exe -a Ø -q4 %s -o %d
As a conversion string in foobar (with cmd.exe as the encoder), produces an ogg file with artist "-q4" sans quotes (Ø = 0xd8)

Code: [Select]
/U /K D:\util\audio\oggenc-gt3b2.exe -a "Ø" -q4 %s -o %d
As a conversion string in foobar (notice the quotes), produces an ogg file with this lovely error message appearing during encoding:
Quote
Unicode translation error 87
Couldn't convert comment to UTF-8, cannot add
The resulting file has no artist tag. (Ø = 0xd8)


Quote
Although, I don't see why feeding metadata to iTunes is of any consequence, as the tagging should be performed by clienc, using the installed MP4 input after the encoding finishes.
That's what I would think too. If I put the special characters into tags within the source file instead of setting them through the console and set the tagging to "Default," the tags are written correctly to the file after encoding no matter what crazy Unicode characters happen to be there. I should think this would apply to itunesencode as well.

Reply #7
Oggenc isn't likely to work with characters outside of the system codepage, unless you happen to roll your own using _tmain, TCHAR for all file paths, TEXT() for all text constants, and appropriate functions for formatting and displaying TCHAR, and build with UNICODE defined. Otherwise, it's just plain ANSI.

Hence why tagging outside of the Default internal input scheme is not really a good idea, unless you don't care about Unicode tags.

If you must apply constant tags to your files, use the Mass Tagger component.

Reply #8
I didn't expect it to be any better than an apples to oranges type of comparison and maybe not a comparison at all. Like I said those were just some observations I made on a quick test case using oggenc because I was too lazy to download and configure itunesencode and its associated external encoder.

I'm not much of a programmer so I haven't delved into the nuances of these programs, not for lack of interest but for lack of understanding.

The outcome of this exercise that I found interesting, however, was the case where 0xd8 (a high ASCII character, yes?) passed as a static tag via the command line was added successfully to the file (artist=Ø), while 0xd8 passed as a static tag via the parameter string of foo_clienc was completely omitted (artist=-q4 -- i.e. the next argument in the command string). This seems to be the exact problem that reardon is experiencing and I am interested in the root of the apparent omission by foo_clienc.

I am also personally curious as to why oggenc whines about UTF-8 conversions when the 0xd8 character is encapsulated in quotes in the foo_clienc parameter string but appears not to see the character without the quotes.

Personally I have no problems with foo_clienc, even on files with unicode characters in their tags. Just set the tagging scheme in foo_clienc to the desired format and everything is properly copied over automagically. Very slick.

Reply #9
It might be a Windows problem, when a process is launched with CreateProcessW, but retrieves its command line with GetCommandLineA.
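
A rough way to picture that scenario, as a Python sketch rather than actual Windows API code: the parent hands over a Unicode command line, and a child that asks for the ANSI version gets it converted to the system code page. The sketch assumes code page 1252 and uses Python's `errors="replace"` handler to stand in for the conversion; the real Windows conversion may substitute best-fit characters instead, so treat the output as an approximation. The helper name `ansi_view` is made up for this example.

```python
# Approximate what an ANSI-only child process would see if its Unicode
# command line were converted to the system code page.
# Assumptions: system code page 1252; Python's "replace" error handler
# stands in for the Windows conversion, which may behave differently.

def ansi_view(command_line: str, codepage: str = "cp1252") -> str:
    """Round-trip a command line through a single-byte code page."""
    return command_line.encode(codepage, errors="replace").decode(codepage)

print(ansi_view("oggenc -a Ø -q4 test.wav"))  # Ø exists in 1252 and survives
print(ansi_view("oggenc -a ☻ -q4 test.wav"))  # ☻ does not and becomes "?"
```

Under these assumptions a character inside the code page passes through intact and one outside it degrades to "?", but nothing is dropped outright, so this alone would not explain an argument disappearing.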

Reply #10
I posted another topic regarding a similar issue, but it remained untouched...maybe it was my wording

Quote
For the record, foo_clienc does support Unicode execution in Windows NT. (2000/XP/etc.) CMD.exe probably lacks Unicode support, and itunesencode probably does too.

Didn't somebody make a JScript based iTunes encoder and post it to this forum? Maybe you'll have better luck with that instead, assuming that whole system supports it.


I could be wrong on this, but I have noticed a few interesting things with how foo_clienc handles certain things...

For instance, I have a CD artist named "bôa" which is stored in a directory structure similar to "bôa-2000-CD-Soundtrack-Duvet". Now, if I pass foo_clienc the command-line parameter "%artist%", it will send whatever program it is calling "ba", removing the "ô" completely. However, "%s" sends the program it is calling "bôa-2000-CD-Soundtrack-Duvet\<some track name>", without removing the "ô" character.

I actually encountered this problem using Otto42's iTunesEncode... at first I thought it was his program, but I can pass the source file "%s" as a tag, and it reads it perfectly, whereas if I pass "%artist%" as a tag, again it removes the "ô" character.

I made a very brief attempt to look at the clienc source... which seems to point to how the tag is actually being read. The source file "%s" is just being directly inserted into wherever it appears on the command-line, however for the tags themselves, clienc will...

look for % -> read until it finds another % (unless it finds %s or %d) -> retrieve the tag -> insert into command-line.  However, it seems that where it is storing the retrieved tag is specified to hold/convert to a specific type of value. Of course, I only spent about 2 minutes scanning over the source, so I could definitely be wrong.
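
As a purely hypothetical sketch of that kind of parser (this is not the clienc source, and the names `FIELD`, `expand`, `tags` and `params` are made up for illustration), the Python below substitutes `%name%` fields into a parameter string. One variant forces each tag value through ASCII and silently drops anything that does not fit, which would reproduce the "bôa" turning into "ba" symptom; the other keeps the value as Unicode.

```python
import re

# Hypothetical illustration only: NOT the actual clienc parser.
# %s and %d are left alone here; only %name% fields are looked up in `tags`.
FIELD = re.compile(r"%(\w+)%")

def expand(params: str, tags: dict, lossy: bool) -> str:
    """Replace %name% fields with tag values.

    With lossy=True each value is squeezed through ASCII and characters
    that do not fit are dropped, mimicking the symptom described above.
    """
    def substitute(match):
        value = tags.get(match.group(1), "")
        if lossy:
            value = value.encode("ascii", errors="ignore").decode("ascii")
        return value
    return FIELD.sub(substitute, params)

tags = {"artist": "bôa", "title": "Duvet"}
params = '--artist "%artist%" --title "%title%" %s %d'

print(expand(params, tags, lossy=True))   # artist comes out as "ba"
print(expand(params, tags, lossy=False))  # artist stays "bôa"
```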

At any rate, whatever the cause it would be more effective just to write a foobar plugin that performs the same task as iTunesEncode without need for the command-line...perhaps when things settle down over the next few weeks, and I install VC++ again, I'll take a whack at it.

-----
Domain

Reply #11
Quote
Once again, CP1252 != Latin1. CP1252 and all other code pages are an extension of Latin1.


Uh, you missed my point.  _Microsoft_ calls CP1252 "Latin 1".  It's not my description, it is theirs.  There is no universal law keeping multiple people from using the phrase "Latin 1".  What you are really trying to say, I presume, is that ISO Latin 1 and Microsoft Latin 1 (CP1252) are different.  The latter is a superset of the former.

Quote
For the record, foo_clienc does support Unicode execution in Windows NT. (2000/XP/etc.) CMD.exe probably lacks Unicode support, and itunesencode probably does too.


Of course cmd.exe supports Unicode.  Try it.  Cut and paste some Unicode chars into a file using Notepad and 'type' the file at the cmd prompt.  If the cmd window font supports them, you will see them.  Note this is pure Unicode, not UTF-8.  In any case, the issue is with itunesencode.

Let me reiterate: if I *manually* use itunesencode from the cmd prompt, and send it Unicode or UTF-8 chars, everything is fine.  The ONLY problem is when using foobar.  Foobar *strips* characters.  It doesn't convert them willy-nilly, it simply disallows any 8-bit characters from the clienc launcher.

Compare it to dbPoweramp cli interface and you'll understand quickly what I am saying.

Quote
Didn't somebody make a JScript based iTunes encoder and post it to this forum? Maybe you'll have better luck with that instead, assuming that whole system supports it.



Yes, using the same interfaces (and I believe written by Otto, same person who wrote itunesencode, bless his heart).

+Reardon

Reply #12
Kode54 spotted some bugs in cli_enc string parsers that caused characters to be removed, this version should have the issue solved.

Reply #13
Quote
Kode54 spotted some bugs in cli_enc string parsers that caused characters to be removed, this version should have the issue solved.

It seems that the new foo_clienc has the same problem as this post (http://www.hydrogenaudio.org/forums/index.php?showtopic=22814&view=findpost&p=224770).

Reply #14
Quote
Kode54 spotted some bugs in cli_enc string parsers that caused characters to be removed, this version should have the issue solved.


Yup.  This works.  Thanks a bunch.  Very happy now.

For anyone interested: using Otto42's itunesencode, I now have a really nice setup for transcoding via iTunes.  Even better, if you set iTunes to NOT organize your music, you can avoid extra copies of files and let Foobar do all the work of putting files into the proper directory.

I don't much like batch files but have become a big fan of using "cmd.exe" as the cli encoder and passing a bunch of '&&' separated commands on the params line.

I have a neat little trick that avoids the copy and steals the file back from itunes without relying on any scripting.

+Reardon

Reply #15
Quote
Quote
Kode54 spotted some bugs in cli_enc string parsers that caused characters to be removed, this version should have the issue solved.

It seems that the new foo_clienc has the same problem as this post (http://www.hydrogenaudio.org/forums/index.php?showtopic=22814&view=findpost&p=224770).


That appears to be a problem with the WavPack command line encoder. There should be no problem inside foo_clienc itself as all strings are processed as UTF-8 internally. Whichever tool you are having problems with must have a bug in its path processing.

Since it appears you are using tools which do not support pipes, try specifying both input and output paths, as relying on the tools to generate their own output paths is prone to error if they use the typical strrchr(path, '\\') method.
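
To illustrate why a byte-wise `strrchr(path, '\\')` can go wrong, here is a Python sketch. It assumes the tool sees the path in code page 932 (Shift-JIS), which is an assumption and not something stated in the thread. In that encoding the second byte of some characters, such as ソ, is 0x5C, the same value as a backslash, so a naive search for the last backslash lands inside the character.

```python
# Why searching for the last '\\' byte is fragile in multibyte code pages.
# Assumption: the tool sees the path encoded in code page 932 (Shift-JIS).

path = "E:\\ソ.flac"
raw = path.encode("cp932")

# Byte-wise search, like strrchr(path, '\\') on the ANSI string.
naive_split = raw.rfind(b"\\")

# Character-wise search on the decoded string finds the real separator.
real_split = path.rfind("\\")

print(raw)  # the second byte of ソ is 0x5C, the backslash value
print("naive directory part:", raw[:naive_split + 1])
print("real directory part: ", path[:real_split + 1])
```

The naive split cuts the file name in the middle of ソ, so any output path built from it would be corrupted. Whether this is what actually happened to the tool in question is not established here.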

Reply #16
Quote
That appears to be a problem with the WavPack command line encoder. There should be no problem inside foo_clienc itself as all strings are processed as UTF-8 internally. Whichever tool you are having problems with must have a bug in its path processing.

Since it appears you are using tools which do not support pipes, try specifying both input and output paths, as relying on the tools to generate their own output paths is prone to error if they use the typical strrchr(path, '\\') method.

I don't understand why 0.3.2 failed but 0.3.1 works in the same situation. 

foo_clienc 0.3.1 and oggenc with/without pipe:
Code: [Select]
INFO (foo_clienc) : CLI encoder: C:\Program Files\Encoder\OGGENC.EXE
INFO (foo_clienc) : Destination file: file://E:\ソ.ogg
INFO (foo_clienc) : Source file: file://E:\ソ.flac
INFO (foo_clienc) : 44100Hz 32bps 2ch
INFO (foo_clienc) : Encoding took 9875 milliseconds, speed 7.22x
INFO (CORE) : attempting to edit file info : file://E:\ソ.ogg
INFO (CORE) : file info update successful on : file://E:\ソ.ogg

foo_clienc 0.3.2 and oggenc with pipe:
Code: [Select]
INFO (foo_clienc) : CLI encoder: C:\Program Files\Encoder\OGGENC.EXE
INFO (foo_clienc) : Destination file: file://E:\ソ.OGG
INFO (foo_clienc) : Source file: file://E:\ソ.flac
INFO (foo_clienc) : 44100Hz 32bps 2ch
ERROR (foo_clienc) : Writing to encoder failed
ERROR (foo_clienc) : Encoding failed
ERROR (foo_diskwriter) : Conversion failed.

foo_clienc 0.3.2 and oggenc without pipe:
Code: [Select]
INFO (foo_clienc) : CLI encoder: C:\Program Files\Encoder\OGGENC.EXE
INFO (foo_clienc) : Destination file: file://E:\ソ.OGG
INFO (foo_clienc) : Source file: file://E:\ソ.flac
INFO (foo_clienc) : 44100Hz 32bps 2ch
ERROR (foo_clienc) : Failed to create temp file
ERROR (foo_diskwriter) : Conversion failed.

Reply #17
Quote
I don't understand why 0.3.2 failed but 0.3.1 works in the same situation.


Try this version which shows full command line sent to encoder, should help figuring what the problem is.

Reply #18
Quote
Quote
I don't understand why 0.3.2 failed but 0.3.1 works in the same situation.

Try this version which shows full command line sent to encoder, should help figuring what the problem is.

Thank you.
It seems to me that foo_clienc can't handle such problematic characters.

with pipe:
Code: [Select]
INFO (foo_clienc) : CLI encoder: C:\Program Files\Encoder\OGGENC.EXE
INFO (foo_clienc) : Destination file: file://E:\ソ.OGG
INFO (foo_clienc) : Source file: file://E:\ソ.flac
INFO (foo_clienc) : 44100Hz 32bps 2ch
INFO (foo_clienc) : Full command line: "C:\Program Files\Encoder\OGGENC.EXE" -q4 - -o "E:\

without pipe:
Code: [Select]
INFO (foo_clienc) : CLI encoder: C:\Program Files\Encoder\OGGENC.EXE
INFO (foo_clienc) : Destination file: file://E:\ソ.OGG
INFO (foo_clienc) : Source file: file://E:\ソ.flac
INFO (foo_clienc) : 44100Hz 32bps 2ch
INFO (foo_clienc) : Full command line: "C:\Program Files\Encoder\OGGENC.EXE" -q4 "E:\

Reply #19
I had some logic bugs in previous version. Version 0.3.3 should work.

Reply #20
Quote
I had some logic bugs in previous version. Version 0.3.3 should work.

Yes, this works. Thank you for fixing the bug and for the Unicode hacks!

Reply #21
How do I give the wv and wvc files (WavPack hybrid) the same file name using foo_clienc?

I read this post and tried this parameter, but the wvc file gets a short file name...

-hb320 - %d -c %_filename_ext%
(Output file name formatting is %_filename%)

Reply #22
Uploaded a new version that only uses the Unicode handling hack when it's needed. This will fix hybrid WavPack compression.