Wednesday, 14 November 2018

Fixing double-encoded UTF-8 data in MySQL

Double-encoded UTF-8 text (not to mention triple-, quadruple- and so on) is a fairly common problem when dealing with MySQL. It often happens because the connection to the server defaults to the Latin-1 character set while the client actually sends UTF-8, so the server re-encodes bytes that were already UTF-8. The cause is not relevant once the data is already corrupt, though.
Here is how to fix it in two simple steps, using the mysqldump and mysql commands:
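# Step 1: dump over a latin1 connection, which undoes one layer of encoding.
# Note: there must be no space between -p and the password.
mysqldump -h DB_HOST -u DB_USER -pDB_PASSWORD --opt --quote-names \
    --skip-set-charset --default-character-set=latin1 DB_NAME > DB_NAME-dump.sql

# Step 2: load the dump back over a utf8 connection.
mysql -h DB_HOST -u DB_USER -pDB_PASSWORD \
    --default-character-set=utf8 DB_NAME < DB_NAME-dump.sql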
Of course, you should first replace DB_HOST, DB_USER, DB_PASSWORD and DB_NAME with the values that correspond to your database setup.
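To see why the round trip works, it can help to reproduce the corruption on a single character outside the database. The following is only a sketch, assuming the iconv utility is installed; it uses ISO-8859-1 as a stand-in for MySQL's latin1 (which is really cp1252, but the two agree for the character used here), and the variable names are made up for illustration.

```shell
# The letter "é" as UTF-8 bytes (c3 a9), written with octal escapes
# so the script itself stays plain ASCII.
original=$(printf '\303\251')

# Corruption: the UTF-8 bytes are misread as Latin-1 and encoded to UTF-8 again.
broken=$(printf '%s' "$original" | iconv -f ISO-8859-1 -t UTF-8)

# Repair: converting "back" to Latin-1 restores the original UTF-8 bytes,
# which is what dumping over a latin1 connection does for the whole database.
fixed=$(printf '%s' "$broken" | iconv -f UTF-8 -t ISO-8859-1)

# Show the raw bytes at each stage.
printf '%s' "$original" | od -An -tx1
printf '%s' "$broken"   | od -An -tx1
printf '%s' "$fixed"    | od -An -tx1
```

The second line of output should show four bytes (c3 83 c2 a9) instead of two, which is the double-encoded form; the third should match the first again. For triple-encoded data, the same idea suggests repeating the repair step once per extra layer.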
