Blogger to WordPress

I have been tasked with migrating a few blogs from Blogger to a self-hosted WordPress. So far, my migration experience comes from moving my own blog between different software.

Time to try a new migration combination.

Plan

The plan here is:

  • export what can be exported from Blogger;
  • start a local WordPress;
  • import the data exported from Blogger to the local WordPress;
  • fix issues and repeat the process until the data are good enough;
  • back up the local WordPress database and files;
  • load the backup on a staging WordPress instance;
  • have the blog owner configure WordPress’ theme and fix the posts;
  • migrate the staging blog to production.

Export the Blogger blog

Log in to Blogger with the blog owner’s account. Go to Settings > Manage Blog, then click Back up content.

This exports and downloads an XML file containing much of the blog’s data, such as settings, posts, labels and comments.
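Before importing anything, it can help to check what the export actually contains. Here is a minimal Python sketch, assuming the export is an Atom feed in which each entry carries a category whose scheme is http://schemas.google.com/g/2005#kind and whose term ends in #post, #comment, #settings and so on. That layout is an assumption about the export format, and the embedded sample is made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

ATOM = "{http://www.w3.org/2005/Atom}"
# Assumption: Blogger marks the type of each entry with this category scheme.
KIND_SCHEME = "http://schemas.google.com/g/2005#kind"


def count_entry_kinds(xml_text: str) -> Counter:
    """Count the entries of an Atom export by kind (post, comment, ...)."""
    root = ET.fromstring(xml_text)
    kinds = Counter()
    for entry in root.iter(f"{ATOM}entry"):
        for category in entry.findall(f"{ATOM}category"):
            if category.get("scheme") == KIND_SCHEME:
                # The term looks like ".../kind#post"; keep what follows '#'.
                kinds[category.get("term", "").rsplit("#", 1)[-1]] += 1
    return kinds


# Made-up sample with one post (labelled "travel") and one comment.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <category scheme="http://schemas.google.com/g/2005#kind"
              term="http://schemas.google.com/blogger/2008/kind#post"/>
    <category scheme="http://www.blogger.com/atom/ns#" term="travel"/>
    <content type="html">&lt;p&gt;Hello&lt;/p&gt;</content>
  </entry>
  <entry>
    <category scheme="http://schemas.google.com/g/2005#kind"
              term="http://schemas.google.com/blogger/2008/kind#comment"/>
    <content type="html">Nice post</content>
  </entry>
</feed>"""

kinds = count_entry_kinds(SAMPLE)
```

On a real export, the same function can be fed the content of the downloaded file; comparing the number of posts and comments with what WordPress shows after the import is a cheap sanity check.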

Spin up a local WordPress

First of all, I started a WordPress instance on my computer to try the import. I installed it with Docker, using a simple docker-compose.yml file:

services:

  wordpress:
    image: wordpress:php8.2
    # Quoted so that YAML reads the string "no" and not a boolean
    restart: "no"
    ports:
      # Only reachable from the local machine
      - "127.0.0.1:8080:80"
    environment:
      WORDPRESS_DB_HOST: db
      WORDPRESS_DB_USER: exampleuser
      WORDPRESS_DB_PASSWORD: examplepass
      WORDPRESS_DB_NAME: exampledb
    volumes:
      - wordpress:/var/www/html

  db:
    image: mariadb:lts
    restart: "no"
    ports:
      - "127.0.0.1:3306:3306"
    environment:
      MYSQL_DATABASE: exampledb
      MYSQL_USER: exampleuser
      MYSQL_PASSWORD: examplepass
      # Boolean-looking environment values should be quoted
      MYSQL_RANDOM_ROOT_PASSWORD: "true"
    volumes:
      - mariadb:/var/lib/mysql

volumes:
  wordpress:
  mariadb:

Then start it with docker compose up.

Open http://localhost:8080/wp-admin/ and follow the WordPress installation instructions (set the language, blog title and administrator account).

Import the blog

In WordPress, go to Tools > Import > Blogger and click Install Now. Once it’s done, click Run Importer, select the XML file downloaded from Blogger and upload it. I then assigned all the posts to the admin user and waited.

Once done, the posts and comments are available in WordPress 🎉.

More information about the blogger importer plugin is on wordpress.org.

Issues

Upload size

The WordPress Docker image is… bad. Basically, it is WordPress with its default configuration, running on a server with the default Apache and PHP configuration.

Obviously that’s not what’s going into production; it’s only for testing on my computer, but it still comes with issues. The first one is PHP’s default maximum upload size, which is 2 MB. And of course, one of the blogs I’m migrating has an XML file bigger than that.

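Let’s do a quick modification of the running container. PHP’s post_max_size (8 MB by default) also caps uploads, so I raise it together with upload_max_filesize:

$ docker exec container-wordpress-1 bash -c "printf 'upload_max_filesize = 40M\npost_max_size = 40M\n' > /usr/local/etc/php/conf.d/upload.ini"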

After restarting the container, the new configuration is taken into account and I can upload the file.

Semi-manual cleanup

The XML file has content nodes containing the raw HTML code of each post. I don’t know if it’s because of the way Blogger works or because of the people who wrote the posts, but the HTML content is clearly not clean. There are style sections included in each post, which is not great for consistency and portability.

So I did quite a lot of search and replace to clean some of it, but still, a lot will have to be done manually after the import.
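As an illustration of the kind of search and replace involved, here is a small Python sketch that removes style blocks and inline style attributes from a post’s HTML. It is not the exact set of replacements I ran; the function name and the regular expressions are illustrative, and it assumes the HTML has already been unescaped from the XML content node.

```python
import re

# <style>...</style> blocks, including their content.
STYLE_BLOCK = re.compile(r"<style\b[^>]*>.*?</style\s*>", re.IGNORECASE | re.DOTALL)
# Inline style="..." or style='...' attributes, with the whitespace before them.
STYLE_ATTR = re.compile(r"""\s+style\s*=\s*(?:"[^"]*"|'[^']*')""", re.IGNORECASE)


def strip_styles(html: str) -> str:
    """Remove style blocks and inline style attributes from an HTML fragment."""
    html = STYLE_BLOCK.sub("", html)
    return STYLE_ATTR.sub("", html)


cleaned = strip_styles(
    '<style>p{color:red}</style><p style="margin:0" class="x">Hi</p>'
)
```

Regular expressions are a blunt tool for HTML: the attribute pattern would also match a literal style="..." written in the text of a post. For a one-off cleanup that is reviewed by hand afterwards, that trade-off seemed acceptable.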

Labels

Labels on Blogger posts are converted to Categories on WordPress posts, which seems the obvious choice given that the Labels are stored in category nodes in the XML file.

The issue here is that, on the blogs I’m migrating, the Labels were used like Tags. WordPress having both Categories and Tags, migrating the Labels to Tags would make more sense here.

So before running the import task, let’s do a quick modification of the blogger import plugin to store categories in tags:

$ docker exec container-wordpress-1 bash -c "sed -i 's/wp_create_categories(array_map('\''addslashes'\'', \$this->categories), \$post_id)/wp_add_post_tags(\$post_id, \$this->categories)/' /var/www/html/wp-content/plugins/blogger-importer/blogger-entry.php"

I discovered afterward that in WordPress’ import tools, there’s one called “Categories and Tags Converter”. That’s probably the normal way to fix this issue.

Images

Blogger’s XML file contains text only. So what about all the images?

After the import is done, the website shows the pictures but looking into WordPress’ media library, I can see only a tiny subset of the pictures. In fact, it’s only the pictures of the most recent posts. The older ones still have the pictures fetched from Blogger ☹️.

This one seems to be a genuine bug. On line 118 in blogger-importer.php the process_images method is called, and it imports the pictures for 20 posts. The way it’s written shows that it was meant to be called in a loop, but somehow this part was missed.
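To make the problem concrete, here is a small Python sketch of the pattern. It is not the plugin’s PHP code: fake_process_images is a hypothetical stand-in that handles at most 20 items per call and returns how many it handled, and run_until_done is the loop that I believe was missing.

```python
from typing import Callable

# Hypothetical stand-in for the posts whose images still need importing.
pending_posts = list(range(45))


def fake_process_images(batch_size: int = 20) -> int:
    """Handle at most batch_size pending posts and return how many were handled."""
    batch = pending_posts[:batch_size]
    del pending_posts[:batch_size]
    return len(batch)


def run_until_done(process_batch: Callable[[], int], max_rounds: int = 1000) -> int:
    """Call process_batch repeatedly until it reports that nothing is left."""
    total = 0
    for _ in range(max_rounds):  # upper bound guards against an endless loop
        handled = process_batch()
        if handled == 0:
            break
        total += handled
    return total


imported = run_until_done(fake_process_images)
```

Calling fake_process_images once would leave 25 of the 45 posts untouched, which matches what I observed: only the first batch of posts had their pictures imported.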

So I opened an issue and submitted a pull request to fix it.

Crash

The import process also crashed on me. Some links seem to be invalid and that’s the reason the process crashed.

It’s an old plugin, and its quality matches what I remember of the average WordPress code from when I looked at it years ago. In this specific case the code was copied from somewhere else, and the person who copied it clearly stated that it should not work… and that person was right. Too bad they didn’t think about fixing it.

Hop, another issue opened for that.

Weird line breaks

After importing, the posts had plenty of line breaks in the middle of sentences. This is because WordPress automatically converts line breaks to real HTML ones (<br/>)… which is a completely stupid thing to do for HTML content.

I solved this by “cleaning” the post content during the import. Basically I minified the HTML, thus removing all unnecessary whitespace and line breaks. While doing so I took the opportunity to convert HTTP links to HTTPS.
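The idea can be sketched in a few lines of Python. This is a simplified stand-in, not the code I used during the import: clean_post_html is an illustrative name, the whitespace collapsing would damage preformatted blocks such as pre elements, and the scheme rewrite assumes every linked host is reachable over HTTPS.

```python
import re


def clean_post_html(html: str) -> str:
    """Collapse whitespace so no stray newline remains, and upgrade http:// URLs."""
    # Every run of whitespace, newlines included, becomes a single space,
    # so WordPress has no line break left to turn into <br/>.
    html = re.sub(r"\s+", " ", html)
    # Naive scheme upgrade: rewrites every http:// occurrence, links or not.
    html = html.replace("http://", "https://")
    return html.strip()


cleaned_post = clean_post_html(
    '<p>Hello\n  world,\nsee <a href="http://example.com/">this</a></p>\n'
)
```

Collapsing to a single space rather than removing whitespace entirely keeps the gap between inline elements, so words on either side of a link or a bold tag do not get glued together.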

Back up the local WordPress

Database

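Back up the database by opening a shell in the database container (handy if you don’t have the MariaDB/MySQL tools installed on your computer), then dump the database. On recent MariaDB images the dump tool may only be available as mariadb-dump rather than mysqldump: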

$ docker exec -it container-db-1 /bin/bash
# mysqldump --add-drop-table -u exampleuser -p exampledb > wordpress.sql
# exit
$ docker cp container-db-1:/wordpress.sql .

Files

At the moment I’m only interested in the uploaded files, located in WordPress’ wp-content/uploads folder.

On my system, Docker volumes are in /var/lib/docker/volumes/.

Let’s create an archive file from the uploads folder:

$ sudo tar -cvzf wordpress.tar.gz /var/lib/docker/volumes/container_wordpress/_data/wp-content/uploads

Misc

Once the database is restored in a new environment, you may need to update URLs in the database to make it work:

UPDATE wp_options SET option_value = 'https://blog.example.net' WHERE option_name = 'siteurl';
UPDATE wp_options SET option_value = 'https://blog.example.net' WHERE option_name = 'home';
UPDATE wp_posts SET post_content = REPLACE(post_content, 'http://localhost:8080/', 'https://blog.example.net/');

And that’s it for the “interesting” part of the conversion of a Blogger blog to a WordPress one.
