Old 10th October 2007, 15:52   #1
Pidgeot
Senior Member
 
Join Date: Jan 2002
Location: Denmark
Posts: 136
Unicode issues in Winamp and/or Bento

After installing 5.5, I've discovered a few issues with Unicode due to some Japanese CDs I have - but I'm not sure if the problems are in Winamp or in Bento.

Upon clicking on the album name in the file info box, it's supposed to search for the text in question. It tries to do this, but something goes wrong and the Unicode string is converted to ANSI:

Image showing the bug (125K)

I can briefly see the correct URL in the location bar before the page loads - after that, it changes to a series of question marks.

Additionally, as you can probably tell, the kanji is rendered a little strangely in the edit control used for searching (the Location edit suffers from this issue as well). It seems to render those characters at double their normal width. Everything else, such as notifications, handles these characters quite nicely (granted, the File Info box uses a pretty small font, so it's hard to make out, but that's not a bug).

As shown in the next image, searching also has issues with the ideographic space used in Japanese - everything after that character is left out of the search.

Incorrect truncation of search term (125K)

These issues appear on both XP Pro and Vista Ultimate (both 32-bit), using two different computers. Clean install doesn't make a difference, but then again, there aren't any 3rd-party plugins.

Plug-in list attached to post. If you need more info (or maybe even a sample file), please let me know.
Attached Files
File Type: txt my_plugin_list.txt (4.6 KB, 345 views)
Old 10th October 2007, 16:03   #2
Pidgeot
Senior Member
 
Ah crap, I just realized I posted in the wrong forum. Sorry about that. I'll repost over in Bug reports if need be - just say the word.
Old 10th October 2007, 19:05   #3
DJ Egg
Techorator
Winamp & Shoutcast Team
 
Join Date: Jun 2000
Posts: 35,857
What happens if you try the exact same search string in Internet Explorer search?

eg. http://search.aol.com/aol/search?query=(insert+album+title+here)

Does it work? Or do you also see ????? in there as well?
If the latter, then you probably don't have OS or IE support installed for that language (Winamp UI can display Japanese regardless, if the correct font is set in Prefs).

Though yeah, I guess there's a possibility that the Bento browser (which is just a modified/embedded Internet Explorer window) doesn't support Unicode search strings, and converts them to ANSI. I'm not rightly sure, to be perfectly honest...
Old 10th October 2007, 19:48   #4
Pidgeot
Senior Member
 
I've tracked down the search bug myself, with a little experimenting. It's IE that causes the error, but it's still a bug in Winamp/Bento.

Using Fiddler, I tried to access http://ja.wikipedia.org/wiki/メインページ (Japanese Wikipedia Main Page) and http://ja.wikipedia.org/w/index.php?...¸&action=edit (editing of that page) directly using IE. The former works, but the latter doesn't.

It looks like ShDocVw doesn't convert query strings to UTF-8, unlike the path and filename parts of the URI - likely since there's no way of knowing what character set the client is running; only file names have a more or less standardized character set. Instead, it assumes the system character set when processing the URI. This means the ideographic space is converted to the nearest ANSI equivalent, which is a regular space. However, it doesn't perform URL encoding, meaning the request is cut off at the rogue space.

The ????? shown in the Bento browser appears to come as a result of a redirect issued by the search page it calls (I take it it's due to some logging or something that shows the request was sent from Winamp). Since the string has already been corrupted, the redirect will of course be corrupted as well.
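To illustrate how a string of question marks can come out of an ANSI conversion, here is a rough Python sketch. The album title is made up, and cp1252 is only an assumption standing in for the "system character set" on a Western European Windows install. Python's `errors="replace"` is also not the same as the Windows best-fit conversion (which, per the above, turns the ideographic space into a regular space), so treat this as an approximation of the effect rather than what ShDocVw actually does.

```python
# Hypothetical album title; \u3000 is the ideographic space.
title = "音楽\u3000アルバム"

# Assumption: cp1252 plays the role of the system ANSI code page.
# errors="replace" substitutes '?' for every character the code page
# cannot represent, which is all seven characters here.
ansi = title.encode("cp1252", errors="replace")

print(ansi)  # b'???????'
```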

While you could argue that not URL encoding the converted string is a bug, it's probably because they couldn't make it submit the right request otherwise - if you manually typed it in, you'd have to replace a space with a plus, but if it's already been URL encoded, that plus would be replaced by %2B, resulting in a different request.

In other words, the fix is to convert the search string to the character set expected by the receiving end (UTF-8, in this case), and then URL-encode it before adding it to the URI.
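A minimal sketch of that fix in Python, not Winamp's actual code: `build_search_url` is a hypothetical helper, and the base URL and `query` parameter are taken from the example search link earlier in the thread. `urllib.parse.quote_plus` encodes the text as UTF-8, percent-encodes the bytes, turns a regular space into `+` and a literal plus into `%2B`, so the encoding is applied exactly once to the raw search term.

```python
from urllib.parse import quote_plus


def build_search_url(base, term):
    """Append `term` to `base` as a UTF-8, URL-encoded query value.

    Hypothetical helper for illustration only.
    """
    # quote_plus encodes to UTF-8 first, then percent-encodes the bytes.
    return base + "?query=" + quote_plus(term, encoding="utf-8")


# The ideographic space (U+3000) survives as its UTF-8 bytes E3 80 80
# instead of collapsing into a raw space that cuts the request short.
print(build_search_url("http://search.aol.com/aol/search", "A\u3000B"))
# http://search.aol.com/aol/search?query=A%E3%80%80B
```

The important part is the order: encode the raw term once, then append it. Encoding a string that already contains a `+` for a space is what produces the `%2B` problem described above.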

That just leaves the peculiar font drawing oddity that stretches the kanji characters in the navigation edit controls, then - I've no idea what might cause that to happen, though.
Old 10th October 2007, 23:42   #5
Benski
Ben Allison
Former Winamp Developer
 
Join Date: Jan 2005
Location: Brooklyn, NY
Posts: 1,057
Thanks for the report. I think I know why this is happening. It should be fixed for 5.51.
Old 11th October 2007, 09:29   #6
Pidgeot
Senior Member
 
Since I'm already listing Unicode issues here, the Cover Art Downloader seems to use an ANSI form - at least, the caption doesn't display Unicode characters properly (so there's an ANSI conversion involved somewhere in the process).

I'm not sure if it's just a display issue, or if it affects the actual searching as well, though.