25th October 2003, 02:55   #1
Francis
French Admin
 
Join Date: Nov 2000
Posts: 329
The final word on unicode

Hi guys

I noticed that the topic of unicode support in Winamp 5 has been mentioned.

Just so that the matter is settled (and to be clear, I'm not writing this to debate *about* unicode): there is no unicode support coming in Winamp 5. That's it. End of story.

So some of you may bring this up :



or this :



and say, "to do this, it must be supporting unicode!".

Well no. What you see here are captures I have made of the freeform skinning engine rendering, by itself (as in: not using Win32's DrawText functions), the default charset for an operating system set to Japanese or Hebrew. Showing the correct international strings in the default language selected in the "regional settings" box of Windows is *not* unicode support. What unicode support allows you to do is display these strings even if your Windows default OS language is set to English. Mixing Japanese AND Hebrew in the same program is a feature of unicode. Windows Explorer, for instance, can do it, and so can Internet Explorer, as long as you have the language installed (but not necessarily set as default).
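To make the "Japanese AND Hebrew" point concrete, here is a minimal sketch in Python. It assumes cp932 and cp1255 as stand-ins for the code pages Windows uses when the default language is Japanese or Hebrew; the helper name can_encode is made up for the illustration and has nothing to do with Winamp's code.

```python
# Sketch: a legacy code page covers one language family, Unicode covers both.
# cp932 / cp1255 are assumed stand-ins for the Japanese / Hebrew system code pages.

JAPANESE = "\u65e5\u672c\u8a9e"                # the word "Japanese", in Japanese
HEBREW = "\u05e2\u05d1\u05e8\u05d9\u05ea"      # the word "Hebrew", in Hebrew
MIXED = JAPANESE + " " + HEBREW                # both scripts in one string


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


for encoding in ("cp932", "cp1255", "utf-8", "utf-16-le"):
    # Only the Unicode encodings can hold the mixed string.
    print(encoding, can_encode(MIXED, encoding))
```

Each legacy code page handles its own language, but only the unicode encodings can hold the mixed string, which is the situation described above.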

This means that, as has long been the case in Winamp 2, Winamp 5 can display strings in Japanese or in Hebrew (freeform skins couldn't previously do it, hence my captures, which I have slightly regretted making since), so long as your Windows is set to Japanese or Hebrew as your default language.

The reason for this has to do with the size of character strings. If you use 8-bit character strings to encode text (what Winamp 2 has been doing since the beginning), you must observe an encoding scheme (even if you do not know about it and just take all chars to simply be 0...255). There is no reliable way to detect the encoding scheme from a given encoded string (it's just bytes, and it could be plain ASCII after all), so the operating system needs to know *somehow* which encoding you are in when it wants to print the string. That is how it knows whether a given code is an accented e in French or a 'ia' in Russian (I'm making this correspondence up, btw), and how it can print the correct characters, not merely ASCII. It does this by reading the default language setting in Windows. In 8-bit-string programs, that is done by the classical Win32 DrawText functions: they do the character decoding on their own, and then print the characters with the requested font (usually unicode charset, but not always, but that's another story).

In contrast, when you support unicode internally, each character is stored in wider units (two bytes per unit in the UTF-16 strings Windows uses natively, four bytes in UTF-32), which is enough to represent all the unicode characters. There is therefore no need for per-language character encodings: everything can be in just one native encoding (unicode, in which case the encoding and the charmap 'fuse together' in a sense, or something still 8-bit, like UTF-8), and you can stop thinking about it altogether, because all characters in all languages have a unique number. That's what is sometimes called "proper" unicode support.
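Here is a small Python sketch of that ambiguity. The code pages cp1252, cp1251 and cp1255 are assumed stand-ins for what Windows would pick with a French, Russian or Hebrew default language, and the byte 0xE9 is just a convenient example.

```python
# Sketch: one 8-bit value, three legacy code pages, three different characters.
# The code page names are assumptions standing in for the Windows default language.
import unicodedata

RAW = b"\xe9"  # a single byte, with no indication of which encoding produced it

french = RAW.decode("cp1252")    # Western European code page
russian = RAW.decode("cp1251")   # Cyrillic code page
hebrew = RAW.decode("cp1255")    # Hebrew code page

for label, ch in (("cp1252", french), ("cp1251", russian), ("cp1255", hebrew)):
    # Print the Unicode code point and name, which do not depend on any locale.
    print(f"{label}: U+{ord(ch):04X} {unicodedata.name(ch)}")

# Storage size of one character in the common Unicode encodings.
for encoding in ("utf-8", "utf-16-le", "utf-32-le"):
    print(encoding, len(french.encode(encoding)), "bytes")
```

The byte never changes, only the interpretation does. Once decoded, each character has its own code point (U+00E9, U+0439, U+05D9) that means the same thing on every system.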

Now, to be completely accurate, I must say that the freeform engine has some sort of unicode support, but *in the font rendering engine*. We cannot rely on DrawText to do the character decoding (decoding to unicode according to the default language set in the system), because we do not use DrawText to draw our fonts (we use Freetype, which does not provide the conversion). We therefore have to provide our own conversion of encoded strings to unicode in order to print the Japanese or Hebrew characters (we also, for instance, had to implement our own right-to-left routines). The conversion we provide uses the same functions that DrawText uses, and these use the default language setting to do the decoding. We therefore support Japanese and Hebrew characters only if you have the right language set, and we cannot mix the two. True unicode support could.
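As a rough illustration of that conversion step (not the actual freeform engine code, which is not shown in this post), here is a Python sketch. The function legacy_to_unicode and the system_codepage parameter are hypothetical; they stand in for a decode that is driven by the Windows default language.

```python
# Sketch: decoding an 8-bit string to Unicode using "the system's" code page.
# legacy_to_unicode and system_codepage are hypothetical names for illustration.


def legacy_to_unicode(raw: bytes, system_codepage: str) -> str:
    """Decode raw bytes the way a locale-driven conversion would.

    Bytes that are undefined in the chosen code page become U+FFFD
    instead of raising, so a wrong setting yields garbage, not a crash.
    """
    return raw.decode(system_codepage, errors="replace")


JAPANESE = "\u65e5\u672c\u8a9e"         # a Japanese title
raw_title = JAPANESE.encode("cp932")    # as stored by an 8-bit program

right = legacy_to_unicode(raw_title, "cp932")    # system set to Japanese
wrong = legacy_to_unicode(raw_title, "cp1255")   # system set to Hebrew

print("Japanese system decodes correctly:", right == JAPANESE)
print("Hebrew system decodes correctly:", wrong == JAPANESE)
```

The renderer only ever sees unicode after this step, but the result is correct for one language at a time, because a single setting decides how every string is decoded.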

Summed up, the reason true unicode support cannot be added simply (as in, just recompile, silly!) in the Winamp 2 system is that it would render our plugins incompatible, since they would have to all be recompiled for unicode support as well.
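One way to picture the incompatibility, as a hedged Python sketch: it assumes a plugin that reads NUL-terminated 8-bit strings, and read_c_string is a hypothetical helper that imitates such a reader. It is not a description of the real plugin API.

```python
# Sketch: why code built for 8-bit strings cannot simply be handed wide strings.
# read_c_string is a hypothetical stand-in for a plugin expecting char* input.


def read_c_string(buffer: bytes) -> bytes:
    """Read up to the first NUL byte, like a C routine walking a char* string."""
    end = buffer.find(b"\x00")
    return buffer if end == -1 else buffer[:end]


title = "Winamp"
narrow = title.encode("cp1252") + b"\x00"        # 8-bit, NUL-terminated
wide = title.encode("utf-16-le") + b"\x00\x00"   # 16-bit units, as in a unicode build

print(read_c_string(narrow))  # the whole title
print(read_c_string(wide))    # stops at the zero high byte of the first character
```

A reader built for 8-bit strings stops after the first character of the wide string, so both sides of the interface have to agree on the string type, which is why the plugins would need recompiling too.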

Some thought is being given to what should be done about it. I would certainly be willing, once Winamp 5 is out, to try and make it truly support unicode without breaking everything (ie: still supporting legacy encodings; UTF-8 seems to be the key here), but it would certainly be a lengthy task. Perhaps it is worth it.
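For what "UTF-8 while still supporting legacy encodings" could look like, here is a minimal Python sketch of one common approach: try strict UTF-8 first and fall back to the legacy code page. This is only an assumption about a possible scheme, not anything the team has decided, and decode_text and legacy_codepage are hypothetical names.

```python
# Sketch: accept UTF-8 when the bytes are valid UTF-8, else use the legacy code page.
# decode_text and legacy_codepage are hypothetical; this is not Winamp's design.


def decode_text(raw: bytes, legacy_codepage: str) -> str:
    """Decode raw as UTF-8 if possible, otherwise with the legacy code page."""
    try:
        return raw.decode("utf-8")  # strict: raises on invalid sequences
    except UnicodeDecodeError:
        # Old 8-bit data: undefined bytes become U+FFFD rather than raising.
        return raw.decode(legacy_codepage, errors="replace")


E_ACUTE = "\u00e9"

from_utf8 = decode_text(E_ACUTE.encode("utf-8"), "cp1252")  # new-style data
from_legacy = decode_text(b"\xe9", "cp1252")                # old 8-bit data
from_ascii = decode_text(b"Winamp", "cp1252")               # ASCII is valid in both

print(from_utf8 == E_ACUTE, from_legacy == E_ACUTE, from_ascii)
```

Plain ASCII is untouched because it is already valid UTF-8, which is what makes this kind of fallback attractive for old data. The weak spot is that some legacy byte sequences also happen to be valid UTF-8, so the guess can occasionally be wrong.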

Oh, and let me be clear again: this is not meant to start a debate about unicode. If you want to have one amongst yourselves, that's fine, but I won't be participating. Yes, I have my own view about whether or not it should be undertaken, but there are several of us in the team, and this is work that would impact everybody, so unless we all agree that it can be done safely (ie: without breaking Winamp 2) we probably won't be doing it.

Francis.

Bluemars - Music For The Space Traveller