Import IntenseDebate comments into WordPress
After migrating a blog from Blogger to WordPress, the next step was to import the comments. On the original blog, the comments were not stored in Blogger but in a third-party system called IntenseDebate.
On the IntenseDebate website, there’s an option to export all the comments for a given blog as an XML file.
The XML file looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<output>
<blogpost>
<url>http%3A%2F%2Fblogname.blogspot.com%2F2010%2F01%2Fmy-blog-post-title.html</url>
<title>http://blogname.blogspot.com/2010/01/my-blog-post-title.html</title>
<guid>http://blogname.blogspot.com/2010/01/my-blog-post-title.html</guid>
<comments>
<comment id='123456789' parentid='0'>
<isAnon>1</isAnon>
<name><![CDATA[Commenter&#039;s Name]]></name>
<email>commenter@example.com</email>
<url></url>
<ip>12.34.56.78</ip>
<text><![CDATA[My blog's link <a href="http:\/\/example.net\/2010\/01\/yolo.html?utm_source=rss&amp;utm_medium=rss" target="_blank"> with some things $amp; others. ]]></text>
<date>2010-01-20 10:11:12</date>
<gmt>2010-01-20 10:11:12</gmt>
<score>0</score>
</comment>
</comments>
</blogpost>
</output>
There used to be WordPress plugins to import those files, but they haven’t been maintained for more than 10 years… Shouldn’t be too difficult to come up with a little script to get the job done.
Looks like some cleanup, unescaping and decoding of content and URL is required. But there are 2 important things to find out:
- how to match a blogpost node from the XML file with a post in WordPress;
- how to keep the comments threaded.
The second point is easy: each comment has an id attribute that we will use as the identifier in WordPress, and a parentid attribute pointing to the parent comment.
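A minimal sketch of that threading idea, written in Python for illustration only (it is not the script used here). It assumes that parentid='0' marks a top-level comment, which matches WordPress's convention for the comment_parent column of wp_comments. The second comment in the sample is hypothetical, added to show a reply.

```python
import xml.etree.ElementTree as ET

# Trimmed-down export; the second comment is a made-up reply to the first.
SAMPLE = """<output>
<blogpost>
<comments>
<comment id='123456789' parentid='0'><name>A</name></comment>
<comment id='123456790' parentid='123456789'><name>B</name></comment>
</comments>
</blogpost>
</output>"""

def threaded_rows(xml_text):
    """Yield (comment_id, parent_id) pairs read from the id/parentid attributes."""
    root = ET.fromstring(xml_text)
    for comment in root.iter("comment"):
        yield int(comment.get("id")), int(comment.get("parentid"))

rows = list(threaded_rows(SAMPLE))
for comment_id, parent_id in rows:
    print(comment_id, parent_id)
```

Reusing the IntenseDebate id as the WordPress comment ID means the parentid values can be stored as-is, with no remapping step.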
For the first point, after digging a bit in the WordPress database, it appears that when the Blogger XML file was imported, a blogger_permalink entry was created for each post in the wp_postmeta table.
That permalink, once prefixed with https:// and the domain name, is what we find (URL-encoded) in the url tag of the blogpost node.
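The matching step can be sketched as follows, in Python for illustration only (not the script used here). It assumes that blogger_permalink is stored as a meta_key whose meta_value is the URL path with its leading slash, and that the tables use the default wp_ prefix; the function name and the query are hypothetical.

```python
from urllib.parse import unquote, urlsplit

def blogger_permalink(encoded_url):
    """Decode the percent-encoded <url> value and keep only its path."""
    return urlsplit(unquote(encoded_url)).path

# Hypothetical lookup query; %s is the placeholder style of common
# Python MySQL drivers, to be bound to the path computed above.
LOOKUP_SQL = (
    "SELECT post_id FROM wp_postmeta "
    "WHERE meta_key = 'blogger_permalink' AND meta_value = %s"
)

path = blogger_permalink(
    "http%3A%2F%2Fblogname.blogspot.com%2F2010%2F01%2Fmy-blog-post-title.html"
)
print(path)  # -> /2010/01/my-blog-post-title.html
```

Comparing on the path alone sidesteps any difference in scheme or domain between the export and the database.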
Here is what I used:
[PHP script listing not preserved in this copy]
Some explanations:
- Lines 3 to 10: a function to retrieve WordPress post id from a blogpost URL.
- Lines 12 to 20: a function to clean content. The content was not consistent, so it required several unescaping methods, and some comments still had to be fixed manually.
- Line 23: open database connection.
- Lines 26 and 27: create prepared statements for the 2 SQL queries that will be executed several times. The first one is to find the blog post ID, cf. function getPostId. The second one is to insert a comment in the database.
- Line 28: looping over each blog post from the XML file.
- Line 29: find the post id.
- Line 36: looping over each comment of the current blog post from the XML file.
- Lines 37 to 46: extract and parse the different fields needed to create a comment entry.
- Lines 48 and 49: bind the fields to the prepared statement to insert the comment in the database then execute it.
- Line 54: update the comments count on each blog post.
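To give an idea of what the cleaning step deals with, here is a minimal Python sketch based only on the escaping visible in the sample export above. It is an illustration under those assumptions, not the PHP function from the script, and the helper names are hypothetical. Names carry HTML entities such as &#039;, while comment bodies carry JSON-style escaped slashes.

```python
import html

def clean_name(raw):
    """Turn HTML entities in a commenter name back into plain characters."""
    return html.unescape(raw).strip()

def clean_text(raw):
    """Undo escaped slashes in a comment body.

    Entities such as &amp; are left alone here because the body is
    stored as HTML, where they are still valid.
    """
    return raw.replace("\\/", "/").strip()

print(clean_name("Commenter&#039;s Name"))
print(clean_text('<a href="http:\\/\\/example.net\\/2010\\/01\\/yolo.html">'))
```

Oddities such as the stray $amp; in the sample are not handled by either helper; those are the kind of cases that had to be fixed by hand.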
This script and the comments XML file are then uploaded to the container running WordPress. The script can be executed with the PHP CLI, or by calling it through the web interface if it is exposed (not recommended).
Note: raw HTML is saved in the comment. Make sure it does not contain anything unwanted or anything that could be a security issue.
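One way to review the export before importing it is to flag any comment whose HTML goes beyond a small allow-list. The Python sketch below is only an illustration: the allowed tags and attributes are assumptions to adapt, and it reports problems without rewriting anything. Inside WordPress itself, the wp_kses family of functions would be the natural tool.

```python
from html.parser import HTMLParser

# Assumed allow-lists; adjust them to what the comments legitimately use.
ALLOWED_TAGS = {"a", "b", "i", "em", "strong", "br", "p", "blockquote"}
ALLOWED_ATTRS = {"href", "title", "target"}

class Auditor(HTMLParser):
    """Collect tags and attributes that fall outside the allow-lists."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            self.problems.append(f"tag <{tag}>")
        for name, value in attrs:
            if name not in ALLOWED_ATTRS:
                self.problems.append(f"attribute {name} on <{tag}>")
            elif name == "href" and (value or "").strip().lower().startswith("javascript:"):
                self.problems.append("javascript: URL")

def audit(comment_html):
    """Return a list of findings; an empty list means nothing was flagged."""
    auditor = Auditor()
    auditor.feed(comment_html)
    auditor.close()
    return auditor.problems

print(audit('<a href="http://example.net/" target="_blank">ok</a>'))
print(audit('<b onclick="x()">hi</b><script>alert(1)</script>'))
```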