Nextcloud and emojis

I have been following OwnCloud for a very long time, almost since their beginning, continued more recently with the Nextcloud fork. But it’s only during the last couple of days that I finally started playing with it. And as usual when I’m trying something, I find things that don’t work.

I first started to notice an issue with the News application that was not retrieving some news entries from RSS feeds. Suspiciously the missing entries seemed to contain some emojis which reminded me of an issue with MySQL and UnicodeMySQL has terrible default settings and in this case the issue is that the setting utf8 for character encoding does not support the whole UTF-8 encoding. Short version: UTF-8 can be up to 4 bytes long and MySQL stores up to 3 bytes.

In the logs I could find errors like Incorrect string value: '\xF0\x9F\x99\x82', where the value is in fact a smiling emoji: 🙂.

The current Nextcloud version (11) has this issue but they are working on it. They provide a way to have Nextcloud support 4-bytes UTF-8 with MySQL. I ended up with my own solution before finding that page…

There are mainly 5 issues:

  • Ensuring the tables are using the right character encoding (utf8mb4) and collation (utf8mb4_xxx).
  • Ensuring that the application uses utf8mb4.
  • Converting the current tables.
  • Ensuring that the index are big enough (in InnoDB the index length is limited and having longer characters may go beyond that limit (this is solved with specific InnoDB option and table format)).
  • Ensuring that the fields are bug enough (a 255 bytes fields contains less 4-bytes characters than 3-bytes characters). Here the solution is more complicated, each offending field should be found and the have its size increased, which may not be a good idea in some cases.

Here is a way to fix that (very similar to the Nextcloud solution):

  1. Put Nextcloud in maintenance mode:

    $ sudo -u www-data php /var/www/nextcloud/occ maintenance:mode --on
    
  2. Ensure MariaDB / MySQL server has the following settings:

    [mysqld]
    innodb_file_format = barracuda
    innodb_large_prefix = on
    innodb_file_per_table = on
    
  3. Restart MariaDB / MySQL if the configuration has been changed.

  4. Change the database character set and collation:

    ALTER DATABASE nextcloud CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
    
  5. With MariaDB, check that your tables have barracuda as FILE_FORMAT and DYNAMIC or COMPRESSED as ROW_FORMAT:

    SELECT * FROM information_schema.INNODB_SYS_TABLES WHERE NAME LIKE 'nextcloud/%';
    
  6. If the MariaDB tables are not in the right format, I used a more violent way than what is suggested by Nextcloud guys, I dumped and restored the database:

    $ mysqldump --lock-tables -h localhost -u nextcloud -p nextcloud | sed 's/COLLATE utf8_bin/COLLATE utf8mb4_bin/g' | sed 's/COLLATE=utf8_bin/COLLATE=utf8mb4_bin/g' | sed 's/DEFAULT CHARSET=utf8 /DEFAULT CHARSET=utf8mb4 /g' | sed 's/ENGINE=InnoDB/ENGINE=InnoDB ROW_FORMAT=DYNAMIC/g' > nextcloud.sql
    $ mysql -h localhost -u nextcloud -p nextcloud < nextcloud.sql
    
  7. Set Nextcloud to use utf8mb4:

    $ sudo -u www-data php /var/www/nextcloud/occ config:system:set mysql.utf8mb4 --type=boolean --value="true"
    
  8. It’s always a good idea to repair the tables:

    $ mysqlcheck -u root -p --auto-repair --optimize --all-databases
    
  9. Remove maintenance mode

    $ sudo -u www-data php /var/www/nextcloud/occ maintenance:mode --off
    

N.B.: in NextCloud solution they use collation utf8mb4_general_ci which may trigger an other kind of bug like not updating the database if you change only an accent or the case of a character, so I prefer to use utf8mb4_bin which does not have this issue (but it seems that internally Nextcloud uses utf8mb4_bin too).

Unfortunately, whatever the method, there’s one Nextcloud instance I didn’t manage to upgrade because of the field size issue: Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs.

Useful links:

Comments Add one by sending me an email.