Author Topic: UTF-8 characters in posts  (Read 883 times)


Offline Deus ex Machina

  • Fellow
  • *******
  • Posts: 3030
  • Darwins +23/-3
UTF-8 characters in posts
« on: September 23, 2008, 07:22:20 AM »
As stated in the subject line. While I recognize that it is possible to write 'alpha' or 'pi' instead of using the Greek symbol, or 'GBP' instead of the pound symbol, it's somewhat tedious/frustrating.

Offline Inactive_A

  • Status: Semi-Active
  • Reader
  • ******
  • Posts: 1070
  • Darwins +5/-3
Re: Request: support for non-ASCII characters (ideally Unicode)
« Reply #1 on: September 24, 2008, 12:02:05 AM »
We will look into this.
Meanwhile, you may wish to continue using the existing extended ASCII characters.

alt+0163 = £

Offline Inactive_A

  • Status: Semi-Active
  • Reader
  • ******
  • Posts: 1070
  • Darwins +5/-3
Re: Request: support for non-ASCII characters (ideally Unicode)
« Reply #2 on: September 27, 2008, 04:22:01 PM »
Currently the forum is using UTF-8.
This means your characters should be available to you. However, in some cases the characters chosen may not appear correctly in Internet Explorer unless you use a script that specifies a font that contains those characters.

http://en.wikipedia.org/wiki/UTF-8


Offline Deus ex Machina

  • Fellow
  • *******
  • Posts: 3030
  • Darwins +23/-3
Character set woes - update
« Reply #3 on: November 10, 2008, 05:23:55 AM »
Further to my previous topic, I've noticed that the issues with UTF-8 characters appear to be limited to certain characters, notably those outside the eight-bit range U+0020 .. U+00FF:

POUND SIGN (U+00A3): £
COPYRIGHT SIGN (U+00A9): ©
LATIN SMALL LETTER A WITH MACRON (U+0101): ?
GREEK SMALL LETTER THETA (U+03B8): ?
CYRILLIC SMALL LETTER PE (U+043F): ?
NOT EQUAL TO (U+2260): ?

It doesn't seem to be a font issue, as this problem occurs no matter what font I use - and this is happening on Firefox 2.x and 3.x, on both Mac and PC. Oddly, they all appear correctly in Preview mode, which makes me think it might be a database issue (or possibly a data source configuration issue).
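For anyone who wants to check the pattern, here is a rough Python sketch (not anything taken from the forum software) that tests which of the characters listed above fit in ISO-8859-1 and how many bytes each takes in UTF-8. The helper names `fits_latin1` and `report` are made up for illustration. The point it illustrates is that every character in the list is multi-byte in UTF-8, so the split between the ones that survive and the ones that don't lines up with the U+00FF boundary rather than with UTF-8 byte length.

```python
# Characters from the list above, keyed by their Unicode names.
samples = {
    "POUND SIGN": "\u00a3",
    "COPYRIGHT SIGN": "\u00a9",
    "LATIN SMALL LETTER A WITH MACRON": "\u0101",
    "GREEK SMALL LETTER THETA": "\u03b8",
    "CYRILLIC SMALL LETTER PE": "\u043f",
    "NOT EQUAL TO": "\u2260",
}


def fits_latin1(ch: str) -> bool:
    """Return True if ch can be encoded in ISO-8859-1 (code point <= U+00FF)."""
    try:
        ch.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


def report(chars: dict) -> dict:
    """Map each name to (code point, fits in ISO-8859-1, UTF-8 byte length)."""
    return {
        name: (f"U+{ord(ch):04X}", fits_latin1(ch), len(ch.encode("utf-8")))
        for name, ch in chars.items()
    }


if __name__ == "__main__":
    for name, (cp, ok, nbytes) in report(samples).items():
        print(f"{name} ({cp}): latin-1={'yes' if ok else 'no'}, utf-8 bytes={nbytes}")
```

Only the first two entries fit in ISO-8859-1, which matches the two that display correctly after posting.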

Hopefully this additional information will prove useful. :)
« Last Edit: November 10, 2008, 05:29:11 AM by Deus ex Machina »

Offline Airyaman

  • Fellow
  • *******
  • Posts: 4662
  • Darwins +74/-7
  • Gender: Male
  • Alignment: True Neutral
    • Moving Beyond Faith
Re: Character set woes - update
« Reply #4 on: November 10, 2008, 07:10:14 AM »
If the database is not UTF-8, this could be the issue. It could be ISO-8859.
I've been struggling with racism lately. I recently came to the realization that I tend to dislike people with fake orange skin and stubby fingers.

Offline Hermes

  • Professor
  • ********
  • Posts: 9988
  • Darwins +2/-0
  • 1600 years of oppression ends; Zeus is worshiped.
Re: Character set woes - update
« Reply #5 on: November 10, 2008, 08:16:39 AM »
Unicode test ... flipped text.

This is an example.  !?

¿¡  ???d??x? u? s? s???  <== Worked in preview ... but not when I posted the message!

See: http://www.sevenwires.com/play/UpsideDownLetters.html

Quote
How does this tool get to flip text up side down and backward? The Javascript program converts English letters to unicode characters and symbols that look inverted, to make it look like you've typed upside-down text on the computer. Most of them come from the character sets "Latin Extended" and "International Phonetic Alphabet". Unfortunately there are no upside down numbers and not enough upside down capital letters, so this tool supports lowercase letters only. This page uses the font "Arial Unicode MS" to display the flipped text. You can learn the letter mapping by viewing this page's html source code.
« Last Edit: November 10, 2008, 08:18:18 AM by Hermes »
Smart people believe weird things because they are skilled at defending beliefs they arrived at for non-smart reasons. --Michael Shermer

The history of religion is a long attempt to reconcile old custom with new reason, to find a sound theory for an absurd practice.  --Sir James George Frazer

Offline Agent_007

  • Secret Agent
  • Postgraduate
  • *****
  • Posts: 654
  • Darwins +2/-0
  • Gender: Male
Re: Character set woes - update
« Reply #6 on: November 10, 2008, 08:13:46 PM »
Why do we have problems with standards? We have so many ...

Former Global Moderator Account

Offline Airyaman

  • Fellow
  • *******
  • Posts: 4662
  • Darwins +74/-7
  • Gender: Male
  • Alignment: True Neutral
    • Moving Beyond Faith
UTF-8
« Reply #7 on: November 27, 2008, 07:40:53 AM »
This forum is encoded in UTF-8. But I have an odd feeling the database is ISO-8859-1. Look at this post. In the quote at the bottom of the post, I was able to copy and paste Greek letters from Strong's and it would preview properly. Once posted, the Greek letters were turned into question marks. That will happen when characters are stored in one form of encoding (like ISO-8859-1) and then rendered in a different one (UTF-8). Someone needs to check that the encoding is consistent between the site and the database.
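To show what I mean, here is a small Python sketch of two ways an encoding mismatch can go wrong. This is a guess at the mechanism, not a confirmed diagnosis of this forum's setup, and the function names `lossy_store` and `mojibake` are made up for the example. The first simulates a step that converts text to ISO-8859-1 and substitutes a question mark for anything it cannot represent. The second simulates UTF-8 bytes being read back as ISO-8859-1, which gives accented garbage instead of question marks.

```python
def lossy_store(text: str) -> str:
    """Simulate a storage step that converts text to ISO-8859-1.

    Characters with no ISO-8859-1 equivalent are replaced by '?',
    and the original character cannot be recovered afterwards.
    """
    stored = text.encode("iso-8859-1", errors="replace")
    return stored.decode("iso-8859-1")


def mojibake(text: str) -> str:
    """Simulate UTF-8 bytes being misread as ISO-8859-1.

    No data is lost, but each multi-byte character shows up as
    two or more unrelated Latin-1 characters.
    """
    return text.encode("utf-8").decode("iso-8859-1")


if __name__ == "__main__":
    sample = "\u00a3 \u03b8 \u043f \u2260"  # pound, theta, pe, not-equal
    print("lossy store:", lossy_store(sample))
    print("mojibake:   ", mojibake(sample))
```

What we see on the forum (pound sign intact, Greek turned into question marks) looks like the first case, which would point at a conversion somewhere between the page and the stored data.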

Offline Inactive_1

  • Emergency Room
  • ******
  • Posts: 2242
  • Darwins +10/-2
  • Gender: Male
Re: UTF-8
« Reply #8 on: November 27, 2008, 08:38:16 AM »
The database is in UTF-8, so I will have to research this issue further.

Offline Deus ex Machina

  • Fellow
  • *******
  • Posts: 3030
  • Darwins +23/-3
Re: UTF-8
« Reply #9 on: November 27, 2008, 10:51:22 AM »
Could be the drivers.