Blogger to WordPress
I have been tasked with migrating a few blogs from Blogger to a self-hosted WordPress. So far, my experience is limited to migrating my own blog between different software.
Time to try a new migration combination.
Plan
The plan here is:
- export what can be exported from Blogger;
- start a local WordPress;
- import the data exported from Blogger to the local WordPress;
- fix issues and repeat the process until the data are good enough;
- back up the local WordPress database and files;
- load the backup on a staging WordPress instance;
- have the blog owner configure WordPress’ theme and fix the posts;
- migrate the staging blog to production.
Export the Blogger blog
Log in to Blogger with the blog owner’s account.
Go to Settings → Manage Blog, then click on Back up content.
This will export and download an XML file containing much of the blog’s data, such as settings, posts, labels and comments.
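Before importing anything, it can help to check what the export actually contains. Here is a minimal Python sketch, assuming the file is an Atom feed in which each entry carries a category node describing its kind (post, comment, settings) and one category node per label. The two scheme URIs and the function name summarize_export are my assumptions for illustration, so check them against your own file.

```python
import xml.etree.ElementTree as ET
from collections import Counter

ATOM = "{http://www.w3.org/2005/Atom}"
# Assumed scheme URIs for Blogger's Atom export; verify them in your own file.
KIND_SCHEME = "http://schemas.google.com/g/2005#kind"
LABEL_SCHEME = "http://www.blogger.com/atom/ns#"


def summarize_export(xml_data):
    """Count entries per kind (post, comment...) and label usage."""
    root = ET.fromstring(xml_data)
    kinds, labels = Counter(), Counter()
    for entry in root.iter(ATOM + "entry"):
        for category in entry.findall(ATOM + "category"):
            scheme = category.get("scheme")
            term = category.get("term", "")
            if scheme == KIND_SCHEME:
                # The term looks like ".../kind#post"; keep the part after "#".
                kinds[term.rsplit("#", 1)[-1]] += 1
            elif scheme == LABEL_SCHEME:
                labels[term] += 1
    return kinds, labels


# Tiny hand-written sample standing in for a real export.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <category scheme="http://schemas.google.com/g/2005#kind"
              term="http://schemas.google.com/blogger/2008/kind#post"/>
    <category scheme="http://www.blogger.com/atom/ns#" term="travel"/>
    <content type="html">&lt;p&gt;Hello&lt;/p&gt;</content>
  </entry>
  <entry>
    <category scheme="http://schemas.google.com/g/2005#kind"
              term="http://schemas.google.com/blogger/2008/kind#comment"/>
    <content type="html">Nice post</content>
  </entry>
</feed>"""

if __name__ == "__main__":
    kinds, labels = summarize_export(SAMPLE)
    print(dict(kinds), dict(labels))
```

For a real export, read the file as bytes and pass it in, for example summarize_export(open("blog.xml", "rb").read()), where blog.xml is a placeholder name.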
Spin up a local WordPress
First of all, I started a WordPress on my computer to try the import.
I installed it with Docker, using a simple docker-compose.yml file:
services:
  wordpress:
    image: wordpress:php8.2
    restart: "no"
    ports:
      - "127.0.0.1:8080:80"
    environment:
      WORDPRESS_DB_HOST: db
      WORDPRESS_DB_USER: exampleuser
      WORDPRESS_DB_PASSWORD: examplepass
      WORDPRESS_DB_NAME: exampledb
    volumes:
      - wordpress:/var/www/html
  db:
    image: mariadb:lts
    restart: "no"
    ports:
      - "127.0.0.1:3306:3306"
    environment:
      MYSQL_DATABASE: exampledb
      MYSQL_USER: exampleuser
      MYSQL_PASSWORD: examplepass
      MYSQL_RANDOM_ROOT_PASSWORD: "true"
    volumes:
      - mariadb:/var/lib/mysql
volumes:
  wordpress:
  mariadb:
Then start it with docker compose up.
Open http://localhost:8080/wp-admin/ and follow the WordPress installation instructions (set the language, blog title and administrator account).
Import the blog
In WordPress, go to Tools → Import → Blogger and click Install Now.
Once it’s done, click Run Importer, select the XML file downloaded from Blogger and upload it. I then assigned all the posts to the admin user and waited.
Once done, the posts and comments are available in WordPress 🎉.
More information about the Blogger Importer plugin is available on wordpress.org.
Issues
Upload size
The WordPress Docker image is… bad. Basically, it is a WordPress with its default configuration, running on a server with the default Apache and PHP configuration.
Obviously that’s not what’s going to production, it’s only for testing on my computer, but it still comes with issues. The first one is the default file upload size, which is 2 MB. And of course one of the blogs I’m migrating has an XML file bigger than that.
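Let’s do a quick modification of the running container. PHP’s post_max_size, which defaults to 8 MB, also caps uploads, so it is worth raising together with upload_max_filesize:
$ docker exec container-wordpress-1 bash -c "printf 'upload_max_filesize = 40M\npost_max_size = 40M\n' > /usr/local/etc/php/conf.d/upload.ini"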
After restarting the container, the new configuration is taken into account and I can upload the file.
Semi-manual cleanup
The XML file has content nodes containing the raw HTML code of each post.
I don’t know if it’s because of the way Blogger works or because of the people who wrote the posts, but the HTML content is clearly not clean.
There are style sections included in each post, which is not great for consistency and portability.
So I did quite a lot of search and replace to clean some of it, but still, a lot will have to be done manually after the import.
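As an illustration of this kind of search and replace, here is a minimal Python sketch that removes style elements and inline style attributes from the HTML of one post. It is not the exact set of replacements I ran. It assumes the HTML has already been unescaped (for example, the text of a content node once the XML is parsed), and the function name strip_styles is made up for the example. Regular expressions are not a real HTML parser, so treat the result as a first pass to review by hand.

```python
import re

# <style>...</style> elements, including everything inside them.
STYLE_ELEMENT = re.compile(r"<style\b[^>]*>.*?</style\s*>", re.IGNORECASE | re.DOTALL)
# Inline style="..." or style='...' attributes, with the whitespace before them.
STYLE_ATTRIBUTE = re.compile(r"""\s+style\s*=\s*("[^"]*"|'[^']*')""", re.IGNORECASE)


def strip_styles(html):
    """Return the HTML without <style> blocks and inline style attributes."""
    html = STYLE_ELEMENT.sub("", html)
    return STYLE_ATTRIBUTE.sub("", html)


if __name__ == "__main__":
    dirty = (
        "<style>p{color:red}</style>"
        "<p style=\"margin: 0\" class=\"a\">Hi <span style='x'>there</span></p>"
    )
    print(strip_styles(dirty))
```

Note that the attribute pattern would also match text such as style="..." appearing in the visible content of a post, which is one more reason to check the output.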
Labels
Labels on Blogger’s posts are converted to Categories on WordPress’ posts.
This seems the obvious choice, given that the Labels are in a category node in the XML file.
The issue here is that, on the blogs I’m migrating, the Labels were used like Tags.
Since WordPress has both Categories and Tags, migrating the Labels to Tags would make more sense here.
So before running the import task, let’s do a quick modification of the blogger import plugin to store categories in tags:
$ docker exec container-wordpress-1 bash -c "sed -i 's/wp_create_categories(array_map('\''addslashes'\'', \$this->categories), \$post_id)/wp_add_post_tags(\$post_id, \$this->categories)/' /var/www/html/wp-content/plugins/blogger-importer/blogger-entry.php"
I discovered afterward that in WordPress’ import tools, there’s one called “Categories and Tags Converter”. That’s probably the normal way to fix this issue.
Images
Blogger’s XML file contains text only. So what about all the images?
After the import is done, the website shows the pictures but looking into WordPress’ media library, I can see only a tiny subset of the pictures. In fact, it’s only the pictures of the most recent posts. The older ones still have the pictures fetched from Blogger ☹️.
This one seems to be a genuine bug.
On line 118 in blogger-importer.php, the process_images method is called, and it imports the pictures for 20 posts.
The way it’s written shows that it was meant to be called in a loop, but somehow this part was missed.
So I opened an issue and a pull request to fix it.
Crash
The import process also crashed on me, apparently because some links are invalid.
It’s an old plugin, and its quality matches what I remember of the average WordPress code when I looked at it years ago. In this specific case the code was copied from somewhere else and the person clearly stated that it should not work… and that person was right. Too bad they didn’t think about fixing it.
Hop, another issue opened for that.
Weird line breaks
After importing, the posts had plenty of line breaks in the middle of sentences.
This is because WordPress automatically converts line breaks to real HTML ones (<br/>)… which is a completely stupid thing to do for HTML content.
I solved this by “cleaning” the posts content during the import. Basically I minified the HTML, thus removing all unnecessary white spaces and line breaks. While doing so I took the opportunity to convert HTTP links to HTTPS.
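To show the idea, here is a minimal Python sketch of such a cleaning step. It is a standalone illustration, not the code I ran during the import. The function name clean_post is made up, and the sketch makes two assumptions: whitespace inside pre blocks must be kept as-is, and every site linked over HTTP also answers over HTTPS.

```python
import re

# Assumption: <pre> blocks rely on their whitespace, so they are left untouched.
PRE_BLOCK = re.compile(r"(<pre\b.*?</pre\s*>)", re.IGNORECASE | re.DOTALL)
WHITESPACE = re.compile(r"\s+")
# Only rewrite URLs used in href/src attributes, not every "http://" in the text.
HTTP_URL = re.compile(r"""((?:href|src)\s*=\s*["'])http://""", re.IGNORECASE)


def clean_post(html):
    """Collapse whitespace runs outside <pre> and switch links to HTTPS."""
    # With one capturing group, split() returns the text outside <pre> at
    # even indexes and the <pre> blocks themselves at odd indexes.
    parts = PRE_BLOCK.split(html)
    for i in range(0, len(parts), 2):
        parts[i] = WHITESPACE.sub(" ", parts[i])
    cleaned = "".join(parts)
    return HTTP_URL.sub(r"\1https://", cleaned).strip()


if __name__ == "__main__":
    post = (
        '<p>Hello\n   world, see <a href="http://example.net/a">this\nlink</a></p>\n'
        "<pre>line 1\n  line 2</pre>\n"
    )
    print(clean_post(post))
```

Collapsing each run of whitespace into a single space, instead of deleting it, keeps the spacing between inline elements while still removing the line breaks that WordPress would turn into br tags.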
Back up the local WordPress
Database
Back up the database by connecting to the container (if you don’t have MariaDB/MySQL tools installed on your computer) then dump the database:
$ docker exec -it container-db-1 /bin/bash
# mysqldump --add-drop-table -u exampleuser -p exampledb > wordpress.sql
# exit
$ docker cp container-db-1:/wordpress.sql .
Files
At the moment I’m only interested in the uploaded files, located in WordPress’ wp-content/uploads folder.
On my system, Docker volumes are in /var/lib/docker/volumes/.
Let’s create an archive file from the uploads folder:
$ sudo tar -cvzf wordpress.tar.gz /var/lib/docker/volumes/container_wordpress/_data/wp-content/uploads
Misc
Once the database is restored in a new environment, you may need to update URLs in the database to make it work:
UPDATE wp_options SET option_value = 'https://blog.example.net' WHERE option_name = 'siteurl';
UPDATE wp_options SET option_value = 'https://blog.example.net' WHERE option_name = 'home';
UPDATE wp_posts SET post_content = REPLACE(post_content, 'http://localhost:8080/', 'https://blog.example.net/');
And that’s it for the “interesting” part of the conversion of a Blogger blog to a WordPress one.