-
You can check and compare the sort orders provided by these two collations here:

http://www.collation-charts.org/mysql60/mysql604.utf8_general_ci.european.html
http://www.collation-charts.org/mysql60/mysql604.utf8_unicode_ci.european.html

utf8_general_ci is a very simple collation. It removes all accents, converts the result to upper case, and then uses the code of the resulting "base letter" for comparison. For example, the Latin letters ÀÁÅåāă (and all other forms of the Latin letter "a", with any accent and in any case) all compare as equal to "A".

utf8_unicode_ci uses the Default Unicode Collation Element Table (DUCET). The main differences are:

1. utf8_unicode_ci supports so-called expansions and ligatures. For example, the German letter ß (U+00DF LATIN SMALL LETTER SHARP S) is sorted near "ss", and the letter Œ (U+0152 LATIN CAPITAL LIGATURE OE) is sorted near "OE". utf8_general_ci does not support expansions or ligatures; it sorts all these letters as single characters, sometimes in the wrong order.

2. utf8_unicode_ci is *generally* more accurate for all scripts. For example, in the Cyrillic block, utf8_unicode_ci is fine for Russian, Bulgarian, Belarusian, Macedonian, Serbian, and Ukrainian, while utf8_general_ci is fine only for the Russian and Bulgarian subset of Cyrillic. The extra letters used in Belarusian, Macedonian, Serbian, and Ukrainian are not sorted well.

The disadvantage of utf8_unicode_ci is that it is a little slower than utf8_general_ci. So when you need a better sort order, use utf8_unicode_ci; when performance is your overriding concern, use utf8_general_ci.
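The "remove accents, then upper-case" behaviour described for utf8_general_ci can be approximated outside MySQL. The following is only a rough sketch in Python, assuming Unicode NFD decomposition is a reasonable stand-in for accent removal; the function name `general_ci_like_key` is made up for illustration, and this is not how MySQL implements the collation internally (it uses its own weight tables, and Python's upper-casing treats letters such as ß differently).

```python
import unicodedata


def general_ci_like_key(text: str) -> str:
    """Build a comparison key by stripping accents and upper-casing.

    Hypothetical helper: an approximation of the idea behind
    utf8_general_ci, not MySQL's actual weight table.
    """
    # NFD splits an accented letter into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks (Unicode category "Mn"), keeping base letters.
    base_letters = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    return base_letters.upper()


# The accented variants of "a" mentioned above all collapse to one key.
variants = "ÀÁÅåāă"
keys = {general_ci_like_key(ch) for ch in variants}
print(keys)  # {'A'}
```

Comparing two strings by this key treats "À" and "a" as equal, which is the kind of accent- and case-insensitive comparison described above.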
-
-
mysqldump --add-drop-table -uroot -p NOM_DE_LA_BASE | replace CHARSET=latin1 CHARSET=utf8 | iconv -f latin1 -t utf8 | mysql -uroot -p NOM_DE_LA_BASE
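To make the two middle stages of this pipeline more concrete, here is a small Python sketch of what they do to the dump text: rewrite the table charset declaration, then re-encode the bytes from latin1 to UTF-8. The function name `convert_dump` and the sample dump are invented for illustration, and the sketch assumes the dump really is latin1-encoded; it does not call mysqldump or mysql.

```python
def convert_dump(dump: bytes) -> bytes:
    """Mimic `replace CHARSET=latin1 CHARSET=utf8 | iconv -f latin1 -t utf8`.

    Hypothetical helper operating on an in-memory dump.
    """
    # Interpret the raw bytes as latin1 (every byte maps to one character).
    text = dump.decode("latin-1")
    # Rewrite the charset declared in CREATE TABLE statements.
    text = text.replace("CHARSET=latin1", "CHARSET=utf8")
    # Re-encode so that non-ASCII characters become valid UTF-8 sequences.
    return text.encode("utf-8")


# Made-up sample: 0xE9 is "é" in latin1.
sample = (
    b"CREATE TABLE t (c VARCHAR(10)) DEFAULT CHARSET=latin1;\n"
    b"INSERT INTO t VALUES ('caf\xe9');\n"
)
converted = convert_dump(sample)
print(converted.decode("utf-8"))
```

The order of the two steps does not matter here, because the string being replaced is plain ASCII and is encoded identically in latin1 and UTF-8.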