Import IntenseDebate in Wordpress

After migrating a blog from blogger to WordPress, the next step was to import the comments. On the original blog, the comments where not inside blogger but using a third party system called IntenseDebate.

On IntenseDebate website, there’s an option to export all the comments for a given blog as an XML file.

The XML file looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<output>

<blogpost>
<url>http%3A%2F%2Fblogname.blogspot.com%2F2010%2F01%2Fmy-blog-post-title.html</url>
<title>http://blogname.blogspot.com/2010/01/my-blog-post-title.html</title>
<guid>http://blogname.blogspot.com/2010/01/my-blog-post-title.html</guid>
<comments>
	<comment id='123456789' parentid='0'>
		<isAnon>1</isAnon>
		<name><![CDATA[Commenter&amp;#039;s Name]]></name>
		<email>commenter@example.com</email>
		<url></url>
		<ip>12.34.56.78</ip>
		<text><![CDATA[My blog&#039;s link <a href="http:\/\/example.net\/2010\/01\/yolo.html?utm_source=rss&amp;amp;utm_medium=rss" target="_blank"> with some things $amp; others.  ]]></text>
		<date>2010-01-20 10:11:12</date>
		<gmt>2010-01-20 10:11:12</gmt>
		<score>0</score>
	</comment>
</comments>
</blogpost>
</output>

There used to be WordPress plugins to import those files, but it hasn’t been maintained for more than 10 years… Shouldn’t be too difficult to come up with a little script to get the job done.

Looks like some cleanup, unescaping and decoding of content and URL is required. But there are 2 important things to find out:

The second point is easy, each comment has an id attribute that we will use as identifier in WordPress, and each comment has a parentid attribute pointing to the parent comment.

For the first point, after looking a bit in WordPress database, it appears that while importing the blogger’s XML file, a blogger_permalink entry has been created for each post in wp_postmeta table. That permalink, once prefixed with https:// and domain name, is what we find in the url tag of the blogpost node.

Here is what I used:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
<?php

function getPostId($xml_post, $ps) {
  $url = substr(rawurldecode($xml_post->url), strlen('https://www.blogname.com'));
  mysqli_stmt_bind_param($ps, 's', $url);
  mysqli_stmt_execute($ps);
  mysqli_stmt_bind_result($ps, $post_id);
  mysqli_stmt_fetch($ps);
  return $post_id;
}

function decode($text) {
  $text = stripslashes($text);
  $text = html_entity_decode($text, ENT_QUOTES | ENT_XML1 | ENT_SUBSTITUTE, 'UTF-8');
  $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5 | ENT_SUBSTITUTE, 'UTF-8');
  $from = ['&rsquo;'];
  $to = ['’'];
  $text = str_replace($from, $to, $text);
  return trim($text);
}

mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);
$sql_conn = mysqli_connect('localhost', 'username', 'password', 'dbname') or die('Connection failed: ' . mysqli_connect_error());
$xml = simplexml_load_file('intensedebate-backup.xml') or die('Error reading XML file.');

$ps1 = mysqli_prepare($sql_conn, "SELECT post_id FROM wp_postmeta WHERE meta_key='blogger_permalink' AND meta_value=?");
$ps2 = mysqli_prepare($sql_conn, "INSERT INTO wp_comments (comment_ID, comment_post_ID, comment_author, comment_author_email, comment_author_url, comment_author_IP, comment_date, comment_date_gmt, comment_content, comment_parent, user_id) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)");
foreach($xml->blogpost as $post) {
  $post_id = getPostId($post, $ps1);
  mysqli_stmt_reset($ps1);
  if (!$post_id) {
    print('No post found for ' . $post->title . PHP_EOL);
    continue;
  }
  print('Processing comments for ' . $post->title . PHP_EOL);
  foreach($post->comments->comment as $comment) {
    $id = $comment->attributes()->id;
    $parent_id = $comment->attributes()->parentid;
    $ip = $comment->ip;
    $email = strtolower(trim($comment->email));
    $user_id = $email == 'livgaunt@gmail.com' ? 1 : 0;
    $author = $user_id ? 'Liv' : decode($comment->name);
    $author_url = $user_id ? 'https://www.blogname.com/author/liv' : $comment->url;
    $date = $comment->date;
    $date_gmt = $comment->gmt;
    $content = decode($comment->text);
    print('- ' . join(' | ', [$id, $parent_id, $user_id, $ip, $author, $email, $author_url, $date, $date_gmt]) . PHP_EOL);
    mysqli_stmt_bind_param($ps2, 'iisssssssii', $id, $post_id, $author, $email, $author_url, $ip, $date, $date_gmt, $content, $parent_id, $user_id);
    mysqli_stmt_execute($ps2);
    mysqli_stmt_reset($ps2);
  }
}

$result = mysqli_query($sql_conn, "UPDATE wp_posts p SET comment_count = (SELECT COUNT(*) FROM wp_comments c WHERE c.comment_post_ID = p.ID) WHERE p.post_type = 'post'");
if ($result === FALSE) {
  print('Failed to update comments count on posts.' . PHP_EOL);
}

mysqli_close($sql_conn);

Some explanations:

This script and the comments XML file are then uploaded to the container running WordPress. The script can be executed with PHP CLI or by calling it through the web interface if exposed (not recommended).

Note: raw HTML is saved in the comment. Make sure it does not contain something you don’t want or could be a security issue.

Comments Add one by sending me an email.