If you recently upgraded to WordPress 2.2+ from an earlier version, you may have found your posts messed up, with lots of Â, † and ™ characters dotted everywhere. This happened to posts on Blogging Tips too. I quickly removed the characters from all of the affected posts, but I didn’t spend any time looking into what caused them.
Last week I helped out Elina from My Lil Venture by upgrading her blog to WordPress 2.2.1. Elina emailed me back to tell me that all of her posts were now filled with Â and † characters. Clearly this problem had something to do with the WordPress upgrade.
Here is an example of the output in one of Elina’s posts
I decided to look into the problem further, as a lot of WordPress users are having problems with this.
What causes these characters to appear after upgrading?
WordPress added two new lines to the wp-config.php file in version 2.2:
define('DB_CHARSET', 'utf8');
define('DB_COLLATE', '');
In brief, DB_CHARSET lets you define the character set used for your blog’s database, and DB_COLLATE lets you set the collation, i.e. the rules used to sort and compare the characters in that set. I don’t want to go into too much detail about these new settings as they will not concern the majority of WordPress users. You can find out more information about these settings here.
So why does this problem arise? Well, if you install a fresh copy of WordPress you will not get this problem. However, if your blog has been upgraded to 2.2+ from an earlier version of WordPress, you will. This is because the WordPress upgrade does not convert your existing data from the old Latin1 character set to the new UTF-8 character set. I’m baffled that WordPress did not say more about this when they released WordPress 2.2, as it’s clearly something that is going to cause problems for a lot of bloggers.
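To make the mismatch concrete, here is a minimal Python sketch of what happens when text written as UTF-8 is read back as if it were Latin1. It is only an illustration, not what WordPress or MySQL runs internally. The function names are made up for this example, and it assumes the Windows-1252 flavour of Latin1 (Python's `cp1252` codec).

```python
def misread_utf8_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then wrongly decode the bytes as Windows-1252.

    Assumption: 'Latin1' here means the Windows-1252 variant.
    """
    return text.encode("utf-8").decode("cp1252")


def repair(garbled: str) -> str:
    """Reverse the misreading: back to the raw bytes, then decode as UTF-8."""
    return garbled.encode("cp1252").decode("utf-8")


original = "Elina\u2019s post"        # \u2019 is the curly apostrophe
garbled = misread_utf8_as_latin1(original)

print(garbled)          # Elinaâ€™s post
print(repair(garbled))  # Elina’s post
```

A single curly apostrophe is stored as three bytes in UTF-8, and each byte shows up as its own Latin1 character, which is why one apostrophe turns into â€™. A non-breaking space becomes Â followed by a space in the same way. Note that a few bytes have no Windows-1252 meaning in Python, so some characters (a closing curly double quote, for example) raise an error in this sketch instead of producing garbage.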
How do you remove these strange characters from your posts?
There are a few ways you can fix this problem.
- Remove the references to DB_CHARSET and DB_COLLATE in the wp-config.php file - If you simply remove the new lines from your wp-config.php file, your posts should be back to normal. In the long term, though, it is probably best to convert your database. Here is a screenshot of the page I referenced before. As you can see, the unwanted characters have disappeared simply by removing the lines from the wp-config.php file.
- Convert your database the hard way - WordPress has a guide to converting your database character set. Unless you have experience with MySQL, I wouldn’t recommend doing this as there is a much better alternative (noted below). If you do choose to convert your database using this step-by-step guide, make sure you back up your database beforehand.
- Download the WordPress UTF-8 Database Converter plugin - g30rg3x has released a WordPress plugin called UTF-8 Database Converter which converts your database and therefore removes all strange characters from your posts. You can download it here. All you need to do is back up your database, upload this plugin and then activate it. When you have activated it and selected the converter in your plugin tab, you will see this screen.
Don’t get too alarmed about this screen. Just double check that you have backed up your blog database so that if anything happens, you’re covered.
Removing or commenting out the DB_CHARSET and DB_COLLATE references in your wp-config.php file is definitely the quickest way to resolve this problem. However, I’ve no doubt that future versions of WordPress will include these new character settings, so it might be worthwhile converting your database using the plugin I mentioned.
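Commenting out the two lines is easy to do by hand in a text editor. If you would rather script it, the following Python sketch shows the idea. The function name and the sample config values are hypothetical, and it assumes the define() calls each sit on their own line, as they do in a standard wp-config.php. Back up the real file before trying anything like this on it.

```python
import re

# Matches a define() line for DB_CHARSET or DB_COLLATE that is not
# already commented out (assumes one define() per line).
CHARSET_DEFINE = re.compile(r"""^\s*define\(\s*['"]DB_(?:CHARSET|COLLATE)['"]""")


def comment_out_charset_defines(config_text: str) -> str:
    """Return config_text with the two charset lines prefixed by '// '."""
    result = []
    for line in config_text.splitlines():
        if CHARSET_DEFINE.match(line):
            result.append("// " + line)
        else:
            result.append(line)
    return "\n".join(result)


# Hypothetical sample standing in for a real wp-config.php.
sample = """<?php
define('DB_NAME', 'wordpress');
define('DB_CHARSET', 'utf8');
define('DB_COLLATE', '');
"""

print(comment_out_charset_defines(sample))
```

Commenting the lines out rather than deleting them makes it easy to restore them later, for example after you have converted the database.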
I hope that this guide will help WordPress users who have had problems with this. If you are unsure about anything please let me know and I will do my best to help.
Thanks,
Kevin

Blogs Do Make Money | July 20th, 2007 at 5:15 am #
Hi Kevin, just wanna let you know I really appreciate all the effort that you put into helping me solve these problems. Thanks a bunch!
Community Building Blog | July 20th, 2007 at 6:20 am #
That’s a really interesting article - I upgraded when the new version was released but didn’t notice these errors. Then again, I try to avoid using apostrophes in my articles so that may explain it.
I agree that it is odd that no mention was made of this by WordPress.
I am having a problem when I edit comments: I have to change all the code that appears in the ‘edit’ box to prevent the HTML from showing. Is anyone else having this issue?
- Martin Reed
Kevin | July 20th, 2007 at 7:01 am #
Blogs do make money - Glad you liked the post
CBB - If you don’t use ‘ a lot your posts should be ok. If you need any help with the edit comment problem please let me know (can’t guarantee I’ll fix it but I’ll do my best)
Rea Maor | July 20th, 2007 at 7:12 am #
Wow, you just solved one of the biggest problems I had…
I actually did most of this manually, but the plug-in conversion works great.
Kevin | July 20th, 2007 at 7:17 am #
I checked dozens of posts yesterday but I’ve just come across one of my older posts which still had one of those characters in it, i.e. after the database was converted. So I’m now not 100% sure how well the converter works. I know removing those lines from the wp-config.php file works perfectly though.
Let me know how you all get on with this
g30rg3_x | July 20th, 2007 at 10:20 am #
Hi Kevin, thanks for referring to my work…
As you say, even though the converter works great, there are some minor bugs that I’m still working on. In the next 2 weeks I will release the next version of the plugin/converter, which will resolve a lot of problems, including MySQL incompatibilities between and 5, and other issues, and which I think will make it a better solution…
So, as you say, even though I spent a lot of time creating and finding the correct solution, the plugin is still not a general solution. I have seen that basic bloggers will not have problems converting, but pro-bloggers with bigger and heavily modified databases are a challenge for me, one that I’m pretty sure I will win in the next version…
Greetings from Mexico
Kevin | July 20th, 2007 at 12:25 pm #
You’re more than welcome g30rg3_x. As far as I know you’re the only one who has brought out anything which resolves this.
Again, I’m surprised how little WordPress mentioned this when they moved to version 2.2.
Let me know if you release an update, I’m sure readers would want to know about it.
Adam Dempsey | July 22nd, 2007 at 5:12 am #
Thanks for that, I’ve had this problem a few times and never knew how to resolve it
Sara | July 22nd, 2007 at 7:48 am #
Good post Kevin
John | July 23rd, 2007 at 2:38 pm #
Thanks for the tip Kevin. I fixed the problem myself by exporting the WordPress database and then editing it with a text editor, using find and replace to replace all occurrences of apostrophes etc, then re-importing the database. But I would have preferred to not go to such lengths. Not particularly impressed that I could find no info in the support forums at WordPress.org at the time.
Michael Fultz | July 30th, 2007 at 11:39 pm #
I’m kind of new to WordPress, so any help is welcome. Thanks!