
UTF-8 vs UTF-16 in ID3 Tags?

Heya guys, I just did a decent amount of research on the differences between the two, and I'm still confused as to which is better to use for ID3v2.3/2.4 tags. I have to admit, when I first looked at it my first thought was "hmm, UTF-16 sounds newer, use that", but I really still have no idea.

My main goal is to have tags that can show characters from any language. If I have a CD from Japan, I want the ID3 tags to show those characters without issue.

Can anyone enlighten me on the differences between the two encodings, how they apply to tags, and which one suits what I'd like to use them for?


From what I've read and found, which may very well be wrong, I've come to this (please correct me): UTF-8 can represent every character from any language and is generally more compact, except for Chinese, Japanese, and Korean text, where UTF-16 has a slight size advantage. So if you're ONLY working with those characters you may be better off using UTF-16 over UTF-8, but in my case, and for most people who have just a small handful of things that use those characters, you're better off sticking to UTF-8.
I also read a bunch of stuff, which I didn't fully understand from my quick skimming, stating that UTF-16 isn't formatted as well as UTF-8 and that not as many applications read it as well, or something.


Then other stuff I've read says UTF-8 is older and is mainly used when you need things to work with older applications, and that newer things are straying away from it in favor of UTF-16, as it's more solid and universal in how it handles characters, instead of being built around European/English characters and having other types kind of added on at a higher cost.


Thanks

Reply #1
Quote: "Heya guys, I just did a decent amount of research on the differences between the two, and I'm still confused as to which is better to use for ID3v2.3/2.4 tags."

ID3v2.3 does not support UTF-8, so the decision in that case is trivial.

Quote: "My main goal is to have tags that can show characters from any language. If I have a CD from Japan, I want the ID3 tags to show those characters without issue."

UTF-8 and UTF-16 are equivalent in the sense that they can encode the same set of characters. Also, given the small amount of text typically stored in tags, the size difference between the UTF-8 and UTF-16 encodings can be ignored.
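
To put rough numbers on the size point, here is a small Python sketch. The two titles and the helper name encoded_sizes are just made-up illustrations, not anything from a tagging tool. It only shows that both encodings round-trip the same text and that the byte counts differ by a handful of bytes per field.

```python
# Compare how many bytes a typical tag string needs in UTF-8 and UTF-16.
# The titles below are hypothetical examples, not taken from real files.
titles = ["Yesterday", "さくら"]


def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes); the UTF-16 count includes the 2-byte BOM."""
    return len(text.encode("utf-8")), len(text.encode("utf-16"))


for title in titles:
    utf8_len, utf16_len = encoded_sizes(title)
    # Both encodings decode back to exactly the same string.
    assert title.encode("utf-8").decode("utf-8") == title
    assert title.encode("utf-16").decode("utf-16") == title
    print(title, utf8_len, utf16_len)
```

For the Latin title UTF-8 is smaller, for the Japanese one UTF-16 is slightly smaller, and in both cases the difference is tiny next to the audio data.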

The really important point is what software and hardware you want to use and which encodings they support.

Reply #2
I think that ID3 v2.3 uses UTF-16 and ID3 v2.4 uses UTF-8.

Reply #3
The UTF-8 flag for text fields was added in ID3v2.4. So you can still use both ISO-8859-1 and UTF-16 in ID3v2.4, but you can't use UTF-8 in ID3v2.3 or lower.

But along with UTF-8, ID3v2.4 has some cleanups and improvements over v2.3, so the latest format is indeed the recommended one.
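
As a rough illustration of the flag mentioned above: each text field starts with a single encoding byte, and the set of allowed values is what changed between the two versions. The sketch below uses the byte values from the ID3v2.3 and ID3v2.4 specs, but the table name, the helper text_frame_payload and its behaviour (always picking a Unicode encoding) are my own assumptions for the example. It builds only the field body, not the frame or tag headers.

```python
# Encoding-byte values per ID3v2 minor version, mapped to Python codec names.
# Values follow the ID3v2.3 / ID3v2.4 specs; the table itself is illustrative.
ID3_TEXT_ENCODINGS = {
    3: {0: "latin-1", 1: "utf-16"},                              # ID3v2.3
    4: {0: "latin-1", 1: "utf-16", 2: "utf-16-be", 3: "utf-8"},  # ID3v2.4
}


def text_frame_payload(text, minor_version):
    """Hypothetical helper: return encoding byte + encoded text for a text field.

    Uses UTF-8 where the version allows it (v2.4), otherwise UTF-16 with a BOM.
    Only minor versions 3 and 4 are handled; anything else raises KeyError.
    """
    encodings = ID3_TEXT_ENCODINGS[minor_version]
    marker = 3 if 3 in encodings else 1
    # Python's "utf-16" codec writes a BOM first, which the 0x01 marker requires.
    return bytes([marker]) + text.encode(encodings[marker])


print(text_frame_payload("さくら", 4))
print(text_frame_payload("さくら", 3))
```

A real tagger also writes frame headers, flags and sizes, so treat this purely as a picture of where the encoding choice lives.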

My portable supports only ID3v2.3, so I must use UTF-16 in v2.3 for it. But most modern software already supports v2.4. The most common mistake in ID3 tags is using the ISO-8859-1 marker with text in another 8-bit codepage, both for ID3v1 and ID3v2. This is non-standard and will produce wrong text in standards-compliant software/hardware. But Unicode, either UTF-16 in ID3v2.3 or UTF-8 in ID3v2.4, is correct and standard in both cases. It's your choice what to use.
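
The codepage mistake described above is easy to reproduce. In this Python sketch the artist name, the choice of cp1251 and the function name misread_as_latin1 are assumptions picked for the example. It simulates a compliant reader that trusts the ISO-8859-1 marker while the bytes were actually written in a different 8-bit codepage.

```python
def misread_as_latin1(text, actual_codepage):
    """Simulate a compliant reader: bytes written in actual_codepage,
    but decoded as ISO-8859-1 because that is what the tag claims."""
    return text.encode(actual_codepage).decode("latin-1")


original = "Кино"  # hypothetical Cyrillic artist name
garbled = misread_as_latin1(original, "cp1251")
print(original, "->", garbled)

# Storing the same text as Unicode avoids the guesswork entirely.
assert original.encode("utf-16").decode("utf-16") == original
assert original.encode("utf-8").decode("utf-8") == original
```

The garbled result is exactly what a standards-following player shows; software that "fixes" it is guessing the codepage, which only works on machines set to the same locale.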

Reply #4
Cool, I was really wondering why I've never seen people use UTF-16 in ID3v2.4; it made me think it was worse.

After some tests, I'll be using ID3v2.3, as my new 4 GB Iriver Clix gen2 doesn't support ID3v2.4.
If anyone is interested, though, it supports ID3v2.3 fine, and Japanese characters in the tags are displayed very well too. But only in the tags: put them in the actual file name and the name shows up as crazy characters. That has nothing to do with the tag format anyway, but just make sure the actual file name uses your language's standard characters.