Upgrading to 1.7 if your database is already utf8

This is very important for those who modified Elgg to use utf8 encoding for the connection to the database!

If you added either of these two commands so that your strings would be stored as utf8 strings:

  1. mysql_set_charset('utf8',...)
  2. mysql_query("SET NAMES 'utf8'");

the upgrade script included with Elgg 1.7 will corrupt your database. Versions of Elgg before 1.7 used the default character encoding that PHP sets for a MySQL connection (latin1). The upgrade script converts all strings in the database from latin1 to utf8. If your strings are already utf8, you need to skip the conversion by removing this file before running the upgrade: engine/schema/upgrades/2009100701.sql
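
To see why the conversion damages strings that are already utf8, here is a minimal Python sketch of the byte-level effect. It is not Elgg's upgrade code: the function name convert_latin1_to_utf8 is hypothetical, and Python's "latin-1" codec stands in for MySQL's latin1, which is actually closer to cp1252.

```python
def convert_latin1_to_utf8(stored: bytes) -> bytes:
    """Treat the stored bytes as latin1 text and re-encode that text as utf8.

    Hypothetical stand-in for a latin1-to-utf8 conversion, not Elgg code.
    """
    return stored.decode("latin-1").encode("utf-8")


# Case 1: the bytes really are latin1, so the conversion is correct.
latin1_bytes = "café".encode("latin-1")
fixed = convert_latin1_to_utf8(latin1_bytes)

# Case 2: the bytes were already utf8, so each multi-byte character is
# encoded a second time and the text is corrupted.
utf8_bytes = "café".encode("utf-8")
broken = convert_latin1_to_utf8(utf8_bytes)

print(fixed.decode("utf-8"))   # café
print(broken.decode("utf-8"))  # cafÃ©
```

Plain ASCII text comes out the same in both cases, so the damage only shows up in accented or non-Latin characters.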

  • Alexander, come on over for some borscht, man! :-)

  • @Alexander - Saying "This won't work" has nothing to do with manners--it's a matter of technical possibility.  We provided a solution for people who had edited the core against recommendation and I have explained why that solution can't be part of core.

    Also, I'm not English.

  • If my database is already utf8 and I have already upgraded to 1.7 and now have corrupted characters, what can I do?

  • Restore the database from your backup and follow the procedure described above! That is, of course, if you made a backup. I got caught by this too, but I had made a backup before the update and quietly fixed it!

  • d'hooooo

    Now I have new members and new content ... and that backup isn't good ... so this is a problem!

    I can't use this procedure. :(

  • Lord, you lost your content, AFAIK.

    And one word to yankee: "RTFM!!!" You can always check the charset settings on any system before performing any potentially dangerous operation on text by using

    SHOW VARIABLES LIKE 'character_set%';

    and examining the returned data. BTW, the server can be configured to be fully (some charset) without any user-side action, so even an unmodified Elgg core works with a <non-latin1> database. Example from my my.ini:

    [mysql]
    ...
    default-character-set=utf8
    ...
    [mysqld]
    ...
    default-character-set=utf8

    Good style of programming isn't tabs and nicely-formatted comments, it is bullet-proof code, in which we can trust.

    The mantra of serious programming: "Everything that can happen, happens. Everything that cannot happen also happens, just not so often."

  • The default character set of a database is different from the encoding used in the connection to the database. When you set your my.ini or your my.cnf to use

    default-character-set=utf8 
    default-collation=utf8_general_ci

    PHP still uses latin1 as the encoding of the connection. That is why Elgg 1.7 adds a call to set the connection encoding to utf8. The only people who should have problems in the upgrade are those who added code to Elgg to set the connection type to utf8. I do not know of a way to tell whether the previous strings were sent to the database in latin1 or utf8.
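
A rough heuristic, not something proposed in this thread, is to look at the raw bytes of a stored string and check whether they form valid utf8. The sketch below assumes you can already get those raw bytes out of the database (not shown), and the function name guess_stored_encoding is hypothetical. It is only a guess: ASCII-only strings look identical in both encodings, and whether it applies to a given Elgg database is an assumption.

```python
def guess_stored_encoding(raw: bytes) -> str:
    """Guess how a byte string was encoded. Heuristic only, not a proof."""
    # Pure ASCII is byte-for-byte identical in latin1 and utf8.
    if all(b < 0x80 for b in raw):
        return "ascii"
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        # Non-ASCII bytes that are not valid utf8 sequences.
        return "probably-latin1"
    # Non-ASCII bytes that happen to form valid utf8 sequences.
    return "probably-utf8"


print(guess_stored_encoding("café".encode("utf-8")))
print(guess_stored_encoding("café".encode("latin-1")))
```

Short latin1 strings can occasionally pass as valid utf8 by accident, so a check like this is best run over many rows rather than trusted for a single value.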

  • Hi,

    My Elgg version is 1.7.1, and 2009100701.sql contains:

    -- Previously was the UTF8 migration that is now in code at 2010033101.
    -- Keeping this file to force an overwrite and to avoid confusion with missing migrations.

    But there is no file named 2010033101.sql.

    There is a file with a similar name: 2010030101.sql

    Is that the one?

    Thanks

  • @Eric - engine/lib/upgrades/2010033101.php