How to convert MySQL database from Latin1 to Utf8

Once client decided to switch database from Latin1 to UTF8. I was working on a Linux platform at the time but you can do the same thing in Windows too. Some people have a site that is using a Mysql database which uses the Latin1 character set and they want to convert their database to UTF8  so some Chinese and French characters show up correctly on the web page. Character set and collation is possible to set at 4 levels on mysql 5.0 version. Server, Database, Table, Column. This means that you can have a table with a character set ‘latin1’, and yet set one (or more) of its columns to have different character sets like ‘utf8’. You can use the following commands to see what character sets are in use in your database. For DB : SHOW VARIABLES LIKE ‘character_set_database’;  OR  SHOW CREATE DATABASE db_name; For TABLE: SHOW CREATE table_name; For COLUMN: SHOW CREATE table_name; Here are the steps for I followed to convert my Latin1 database to a utf8 database.

  • Create a new database that has UTF8 specified as its default character set
  • Execute SQL statement: CREATE DATABASE new_dbname CHARACTER SET utf8 COLLATE collation;
  • Export your database structure AND data by using mysqldump command.
  • On Linux you can do like this.

mysqldump –uusername –ppassword latin1_db > dbbackup.sql (user_name and password are your MYSQL user name and password, latin1_database is the name of your existing database, and backup.sql is the name you want to give to the backup file.)

  • Modify the SQL file you just finished exporting by opening it in any editor and replace “CHARSET=latin1” with “CHARSET=utf8”.
  • When you’re done with replace, search for “latin1” to see if it shows up anywhere. You may find it in collate statements (e.g. COLLATE = latin1_swedish_ci).
  • If this is the case, you can either delete the collate statements, and let the tables default to the collation set at the database level (or server level, if collation hasn’t been specified for the database)
  • Replace the latin1 collation with the specific utf8 collation you want to use, save the file in ANSI format (encoding) with “Save as” and exit.
  • ANSI is a mother character set we can say as what we have basically just done is make sure this file is encoded using Latin1 and all the characters used in the SQL language have the same encoding in both Latin1 and UTF8. The SQL keywords and syntax will be interpreted properly regardless of MYSQL thinks it is looking at Latin1 or UTF8.
  • Import the structure and data into your new UTF8 database which was created in starting.
  • On linux, you can use command like “mysql –uusername –ppassword utf8_db < backup.sql” Now we just need to check data’s integrity to make sure that we are safe to go ahead. You can also use some comparison tool for the same.
  • If everything looks good and worked, your data is now in utf8 format. Use any GUI tool for checking the data and I hope it will be in good shape.
  • Now you can switch your application to new UTF8 DB and check from the application.
  • That’s it I guess, you’re done.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.