Oct 18

Are you working with foreign languages or non-Latin character sets in WordPress?

Are you experiencing this problem:

  • You write a post in a language that uses a non-Latin script, e.g. Chinese, Korean or Japanese. When you click ‘Save’ or ‘Publish’, the text is redisplayed as unreadable garbage characters.

It is likely that your WordPress MySQL database was created with the latin1 character set instead of the UTF-8 character set (utf8 in MySQL), which can correctly represent the scripts of most languages. Many web hosting providers set latin1 as the default for MySQL.
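To see why a character set mismatch produces garbage, here is a minimal Python sketch. It is only an illustration, not what WordPress or MySQL literally runs: the function names are hypothetical, and Python's latin-1 codec stands in for MySQL's latin1. It assumes the text is sent as UTF-8 bytes and then read back as if each byte were one latin1 character, which is one plausible way the garbling arises.

```python
def misread_utf8_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as latin-1.

    Every byte becomes one character, so a multi-byte Korean
    syllable turns into several unrelated symbols.
    """
    return text.encode("utf-8").decode("latin-1")


def fits_in_latin1(text: str) -> bool:
    """Return True if every character exists in latin-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


korean = "안녕하세요"  # sample text, 5 Hangul syllables
garbled = misread_utf8_as_latin1(korean)

# Each Hangul syllable is 3 bytes in UTF-8, so 5 characters become 15.
print(len(korean), len(garbled))

# Western European text fits in latin-1; Korean does not.
print(fits_in_latin1("café"), fits_in_latin1(korean))

# In this simulation the original bytes are intact, so the text can be
# recovered by reversing the two steps.
recovered = garbled.encode("latin-1").decode("utf-8")
print(recovered == korean)
```

This also shows why French or German pages can look fine on the same site while Korean, Japanese or Chinese pages break: most of their characters have no representation in latin1 at all.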

I experienced this problem a while back, when I was setting up a website for a client who’s in the business of providing accommodation to international students and working holidaymakers coming to Australia. He wanted the website to have translated versions of each WordPress page in the 8 most common foreign languages spoken by his clients – French, German, Swedish, Korean, Japanese, Spanish, Portuguese and Chinese. To enable support for multiple languages in WordPress, I used the plugin WPML – The WordPress Multilingual Plugin.

Things were going well until I started adding a Korean page. It looked fine when I pasted it into the visual editor, but as soon as I saved the page and viewed it, the text rendered as unreadable garbage. After some research and a lot of testing and debugging, I eventually solved the problem.