These are the steps I used to convert a database's character set from latin1 to utf8.
Required Software
* MySQL 4.1 or later
* Unicode Text Editor (e.g. Notepad++)
* phpMyAdmin (Optional)
1. Export the database
I used the mysqldump command to export the database to an SQL file. I am not sure whether the export also works from phpMyAdmin; please let me know if it does. It is very important to set default-character-set to latin1, so that the data is exported as it is stored, without being converted along the way.
mysqldump -h localhost -u [username] -p --default-character-set=latin1 --insert-ignore --skip-set-charset [database] > dump.sql
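Before moving on, it can help to see which charset and collation declarations the dump actually contains, since those are the strings the next step has to replace. The following is a minimal sketch and not part of my original procedure: `list_charset_decls` is a hypothetical helper name, the sample table definitions are made up for illustration, and it assumes a grep that supports -o (GNU and BSD grep do).

```shell
#!/bin/sh
# Hypothetical helper: count each distinct charset/collation declaration in a dump.
# Assumes grep supports -o (print only the matching part), as GNU and BSD grep do.
list_charset_decls() {
    grep -oiE '(CHARSET|COLLATE)[= ][A-Za-z0-9_]+' "$1" | sort | uniq -c
}

# Made-up sample standing in for dump.sql, so the sketch runs without a database.
sample=$(mktemp)
cat > "$sample" <<'EOF'
CREATE TABLE `t1` (
  `name` varchar(50) COLLATE latin1_general_ci NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci;
CREATE TABLE `t2` (
  `id` int(11) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
EOF

# Prints one line per distinct declaration, prefixed with its count.
list_charset_decls "$sample"
rm -f "$sample"
```

On a real export you would call `list_charset_decls dump.sql`. Any declaration it reports that the replacement patterns do not cover (a different collation, for example) needs its own pattern.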
2. Replace SQL statements
This step replaces the charset keywords in the SQL file, changing latin1 to utf8, so that the tables are created with a utf8 character set and collation. If you edit the file by hand, Notepad++ is recommended. If the Chinese characters display correctly in the text editor, you have chosen a suitable editor and the previous step worked. The replacements are as follows:
sed -i 's/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/g' [latin1].sql
sed -i 's/COLLATE=latin1_general_ci/COLLATE=utf8_general_ci/g' [latin1].sql
sed -i 's/COLLATE latin1_general_ci/COLLATE utf8_general_ci/g' [latin1].sql
Now save the result as [utf-8].sql, or use the command below to convert it to UTF-8 (the example assumes the text is Big5-HKSCS). Do NOT save over the original export (file name: dump.sql), because that file is your backup.
iconv -c -f Big5HKSCS -t UTF-8//IGNORE [latin1].sql > [utf-8].sql
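The three replacements in this step can also be applied in a single pass that writes to a new file, which leaves the original export untouched as a backup. This is a sketch under assumptions: `convert_decls` is a hypothetical helper, the sample table is invented, and the collation is assumed to be latin1_general_ci as in the commands above. A dump using another collation would need its own pattern.

```shell
#!/bin/sh
# Hypothetical helper: write a copy of the dump ($1) with utf8 declarations to $2.
# The input file is only read, so it can keep serving as the backup.
convert_decls() {
    sed -e 's/DEFAULT CHARSET=latin1/DEFAULT CHARSET=utf8/g' \
        -e 's/COLLATE=latin1_general_ci/COLLATE=utf8_general_ci/g' \
        -e 's/COLLATE latin1_general_ci/COLLATE utf8_general_ci/g' \
        "$1" > "$2"
}

# Made-up sample standing in for the exported file.
src=$(mktemp)
dst=$(mktemp)
cat > "$src" <<'EOF'
CREATE TABLE `t1` (
  `name` varchar(50) COLLATE latin1_general_ci NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci;
EOF

convert_decls "$src" "$dst"
cat "$dst"

# Rough check only: this also matches the word "latin1" inside row data.
if grep -q 'latin1' "$dst"; then
    echo "latin1 still appears in the output; check whether a pattern is missing" >&2
fi
rm -f "$src" "$dst"
```

Writing to a separate output file instead of using `sed -i` also sidesteps the difference between GNU and BSD sed in how `-i` takes its argument.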
3. Modify the default Collation
New tables created in the future will use the database's default character set, which is still latin1. It is therefore a good idea to change the default character set and collation to utf8. You can do this with either the MySQL monitor or phpMyAdmin; the example below uses the MySQL monitor.
Log in to MySQL:
mysql -u [username] -p
After logging in, enter the following SQL statement:
ALTER DATABASE [database] DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;
4. Import the modified SQL file
I used the MySQL monitor again to import the modified SQL file. Make sure that default-character-set is utf8 this time.
mysql -u [username] -p --default-character-set=utf8 [database] < [utf-8].sql
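Before running the import, it may be worth confirming that the file really is valid UTF-8, because a file that is still in its old encoding would likely import with broken characters. A minimal sketch, assuming an iconv that exits with a non-zero status on invalid input (GNU iconv does); `is_valid_utf8` is a hypothetical helper name and the sample bytes are only for demonstration.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file decodes cleanly as UTF-8.
# Relies on iconv returning a non-zero exit status for invalid input.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

good=$(mktemp)
bad=$(mktemp)
printf 'INSERT INTO t VALUES ("caf\303\251");\n' > "$good"   # e-acute encoded as UTF-8
printf 'INSERT INTO t VALUES ("caf\351");\n' > "$bad"        # same character as one latin1 byte

for f in "$good" "$bad"; do
    if is_valid_utf8 "$f"; then
        echo "$f: valid UTF-8"
    else
        echo "$f: not valid UTF-8, convert it first"
    fi
done
rm -f "$good" "$bad"
```

This only shows that the bytes form valid UTF-8. It cannot tell whether the characters are the ones you expect, so opening the file in a Unicode text editor is still a useful second check.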
5. Change the web application's configuration
The conversion is now essentially complete. If text in your web application does not display correctly, you need to change the application's charset setting to match.