Migrating your blog from any BlogML based platform to WordPress

|  Posted: September 12, 2012  |  Categories: General Technical
Tags: General

 

BlogEngine.NET to WordPress

First, I need to apologise to my readers for a couple of things. One, you might have noticed a lot of updates with old posts in your feed readers recently; that’s mainly due to the migration process. Two, I had to delete hundreds of comments that had been left on the blog over the past 7 years. The main reason for simply trashing all the comments is the amount of spam among them. When I exported the blog as a BlogML file, the file was around 35 MB, mainly because of over 60,000 comments from spammers worldwide. It was not practical to go through all of them and filter out the good ones, and I also didn’t want to carry those spam comments over to my new platform. Clearing the comments brought the file size down to a mere 2 MB.

Why WordPress?

Technical folks like me tend to choose the platform we are familiar with (in my case Microsoft-based, ASP.NET) rather than the platform that’s most mature. I moved to BlogEngine.NET some 4 years ago from BlogSpot, mainly because it was one of the familiar ASP.NET-based blogging platforms available at the time. It gives you some level of comfort to work in a known environment like IIS, virtual directories, etc. But honestly, I never changed a single line of code apart from the occasional HTML theme file changes. Moving to WordPress is a whole new world with PHP and MySQL. But with my 2 weeks of experience, I’m already feeling comfortable with the rich plugin ecosystem and have managed to do everything I wanted to do. I even hired a designer and got a really cool custom theme within 4 days.

Unfortunately, the migration process was not very straightforward, and none of the articles I found through search helped me achieve what I wanted. Every step resulted in me writing some small utility, so I combined all of them into a simple console application (BlogML.Helper.exe) with command line parameters to help people going through this process.

At a high level, we are going to convert the BlogML file into a WXR file (WordPress eXtended RSS, the WordPress import file format; the tool and the rest of this post spell it “WRX”) and import it into WordPress. I took a portion of the source code from this open source blog migrator project.

Warning: I’m providing this source code and process AS-IS without any warranty :-). You can download the complete BlogEngine.NET to WordPress migration tool source code.

Do you want to keep the comments?

The very first step is to log in to your blog admin console and export your content as BlogML. In BlogEngine.NET, go to Settings > Import/Export.

Once downloaded, you need to decide whether to keep the existing comments or not. As I explained earlier, if you have used BlogEngine.NET there is a high chance your blog is flooded with thousands of spam comments. This is mainly due to a link-building loophole in the platform that spammers exploited. This is your chance to rectify it.

From the following screenshot you can see there are a few options available with the tool.

blogml.helper.exe

  • RemoveComments
  • ExportToWRX
  • QATarget
  • QASource
  • NewWRXWithOnlyFailedPosts

The very first option is to specify /Action:RemoveComments /BlogMLFile:<Your exported file name> as shown below

remove comments from blogml files

The results will show the name of each post and the number of comments removed. The tool also updates the supplied file.
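
For readers who want to see roughly what this step amounts to, here is a minimal Python sketch. It is not the tool's actual code: it assumes each BlogML post is a <post id="..."> element holding a <comments> element with <comment> children, and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def remove_comments(root: ET.Element) -> dict:
    """Empty the <comments> element of every <post>.

    Returns {post id: number of comments removed}. The element and
    attribute names are assumptions about the BlogML layout.
    """
    removed = {}
    for post in root.iter():
        if local_name(post.tag) != "post":
            continue
        for child in post:
            if local_name(child.tag) != "comments":
                continue
            comments = list(child)
            for comment in comments:
                child.remove(comment)
            removed[post.get("id", "?")] = len(comments)
    return removed
```

To try it on a real export, parse the file with ET.parse, call remove_comments on the root element, and write the result to a new file so the original export stays untouched.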

Correct the categories manually

There is a slight variation in the way categories are handled between BlogML and WRX, so open the BlogML file in your XML editor and manually replace the categories with a simple “find and replace”. You need to convert each GUID-based category id to a text-based one. Example: in the following case, convert “9bdfceee-7814-4d4d-b77d-deaf893e402e” to biztalk, “1550e058-44fa-42c3-a666-5826f4c50874” to biztalk-azure, etc.

blog categories
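
If you have many categories, the same find and replace can be scripted. The sketch below is only an illustration: the two GUIDs are the ones from the example above, and the function name and the sample XML snippet in the usage note are made up.

```python
# Map each GUID-based category id to the text slug you want in WordPress.
# Extend this with the ids found in your own BlogML file.
CATEGORY_SLUGS = {
    "9bdfceee-7814-4d4d-b77d-deaf893e402e": "biztalk",
    "1550e058-44fa-42c3-a666-5826f4c50874": "biztalk-azure",
}


def replace_category_ids(xml_text: str, slugs: dict) -> str:
    """Plain text find and replace of every GUID with its slug."""
    for guid, slug in slugs.items():
        xml_text = xml_text.replace(guid, slug)
    return xml_text
```

For example, replace_category_ids('<category ref="9bdfceee-7814-4d4d-b77d-deaf893e402e" />', CATEGORY_SLUGS) gives '<category ref="biztalk" />'. Because it is a plain text replace, it changes the GUID wherever it appears in the file, exactly like doing it by hand in an editor.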

You can also use this opportunity to consolidate your categories. For example, if a category has only 1 or 2 posts, move those posts to a different category and delete the near-empty one. In my opinion, a blog should not have more than 10 categories. For fine-grained categorisation you can use tags.

Upload all the images and files to WordPress

In the case of BlogEngine.NET, all your uploaded blog images reside under the folder blog\App_Data\Files. Copy them across and upload them to the WordPress upload folder blog\wp-content\uploads\files.

Correct all the internal links in the blog posts

It’s much easier to correct all your internal links at this stage, since you are dealing with one single XML file. Some of the links you need to worry about

Spend as much time as you need on this step to make sure you correct all the above links. It is a really time-consuming process depending on the size of your blog, but it’s much quicker to do it here. If your blog is really big, look into updating the source code to automate this task.
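
Before editing anything, it can help to get an inventory of the links that still point at the old blog. The following Python sketch is one possible way to do that, not part of the tool. It assumes links appear as href or src attributes in the post content (possibly HTML-escaped), and the function name is made up.

```python
import html
import re
from collections import Counter

# Matches href="..." or src='...' and captures the quoted URL.
LINK_ATTR = re.compile(r"""(?:href|src)\s*=\s*(["'])(.*?)\1""", re.IGNORECASE)


def find_old_links(blogml_text: str, old_base: str) -> Counter:
    """Count every href/src URL that contains old_base.

    The text is HTML-unescaped first so that links stored as
    &lt;a href=&quot;...&quot;&gt; are found as well.
    """
    text = html.unescape(blogml_text)
    return Counter(
        url for _quote, url in LINK_ATTR.findall(text) if old_base in url
    )
```

Calling find_old_links(open("BlogML.xml", encoding="utf-8").read(), "blogs.digitaldeposit.net/saravana") returns each old URL with the number of times it occurs, which gives you a checklist to work through in your editor.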

Convert BlogML to WRX format

At this stage you should have a healthy BlogML file with all the links corrected and all the categories consolidated and corrected. Simply run the tool with the following command

BlogML.Helper.exe /Action:ExportToWRX /BlogMLFile:BlogML.xml /SourceUrl:blogs.digitaldeposit.net/saravana /TargetUrl:blogs.biztalk360.com

The above command should create 4 files as shown below

blogml to wrx format

BlogML.WRX.xml is the WordPress eXtended RSS file, ready to import.

BlogML.WRX.TargetQA.txt contains all the URLs on the new blog, for testing.

BlogML.WRX.SourceQA.txt contains all the current URLs, to check for 301 redirects.

BlogML.WRX.Redirect.txt contains the redirect statements for each post.

Import WRX file in WordPress

Log in to your WordPress admin page and select “Tools > Import” in the left-hand navigation. The Import page lists the supported import options; pick “WordPress” from the list. If this is your first visit, WordPress will ask you to install and activate the importer plugin. Once it’s ready, it will show you the Import page as shown below

importing wordpress

Browse to your BlogML.WRX.xml file and click “Upload file and import”. Once the file is uploaded, WordPress asks which user the imported posts should be assigned to; you can choose an existing user or create a new one, then click import.

Dealing with Import issues

In my case the import always resulted in only a partial import of posts. From the BlogML.WRX.TargetQA.txt file generated earlier (look at the line count), you will know the total number of posts present in your blog. If that doesn’t match the number of posts imported, you need to follow these steps to get all of them in. This is a bit of a time-consuming process. Run the following command

BlogML.Helper.exe /Action:QATarget /QATargetFile:BlogML.WRX.TargetQA.txt

import BlogML.WRX.TargetQA.txt file

The tool will check each URL (your blog posts) and record the status as either “OK” or “Protocol Error”. At the end of the execution it produces a file called BlogML.WRX.TargetQA.Report.txt. The content will look like this

blog urls
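
The idea behind this check is simple enough to sketch in a few lines of Python. This is an illustration of the technique, not the tool's implementation: it assumes one URL per line, treats anything other than a final HTTP 200 as a failure, and borrows the two status labels from the report. The function names are made up.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 for a post that was not imported


def check_urls(urls: Iterable[str],
               fetch: Callable[[str], int] = http_status) -> list:
    """Return a list of (url, "OK" or "Protocol Error"), skipping blank lines."""
    results = []
    for url in urls:
        url = url.strip()
        if not url:
            continue
        try:
            ok = fetch(url) == 200
        except (OSError, ValueError):
            # Network failures, timeouts and malformed URLs count as failures.
            ok = False
        results.append((url, "OK" if ok else "Protocol Error"))
    return results
```

Passing an open file works because iterating a file yields its lines: check_urls(open("BlogML.WRX.TargetQA.txt")). The fetch parameter exists only so the check can be tried out without touching the network.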

Now run the following command in the tool

BlogML.Helper.exe /Action:NewWRXWithOnlyFailedPosts /WRXFile:BlogML.WRX.xml /QAReportFile:BlogML.WRX.TargetQA.Report.txt

which will produce a new WRX file called BlogML.WRX.OnlyFailed.xml containing only the posts that resulted in “Protocol Error” (in other words, the posts that didn’t import properly).
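
Conceptually, this step filters the items of the import file down to the failed ones. A rough Python sketch of that filtering follows. It rests on assumptions: that each post is an <item> under <rss><channel> with a <link> child holding its new URL, and that you already have the set of failed URLs. It does not read the tool's report format, and the function name is made up.

```python
import xml.etree.ElementTree as ET


def keep_only_failed(root: ET.Element, failed_urls: set) -> int:
    """Drop every <item> whose <link> is not in failed_urls.

    root is the <rss> element of the import file. Returns the
    number of items kept.
    """
    channel = root.find("channel")
    kept = 0
    # Copy the list first because items are removed while looping.
    for item in list(channel.findall("item")):
        link = (item.findtext("link") or "").strip()
        if link in failed_urls:
            kept += 1
        else:
            channel.remove(item)
    return kept
```

Everything else in the channel (title, categories and so on) is left alone, so the trimmed file can be imported the same way as the full one.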

Now log in to WordPress admin and follow the import steps using the new file BlogML.WRX.OnlyFailed.xml. You may need to repeat these steps a few times until all the posts are imported.

Don’t forget the 301 Redirects

It’s very important to redirect your original blog links to the new ones. Otherwise you risk losing all the Google rankings you have acquired over the years. I’m not going to explain how to do it in this article. But if you remember, the tool creates a BlogML.WRX.Redirect.txt file automatically to assist you in setting up the redirects.

Author: Saravana Kumar

Saravana Kumar is the Founder and CTO of BizTalk360, an enterprise software that acts as an all-in-one solution for better administration, operation, support and monitoring of Microsoft BizTalk Server environments.

  • Clay Miller

    Thank you for the blog post. I have migrated my blog posts from BlogEngine to WordPress, but I just can’t get the 301 redirects to work. I have run your program so I have the old and new URLs in one file, but I can’t find anywhere how to do the 301 redirects. I’m on a Windows server using IIS6. My hosting company can’t help either.

    So were you able to get the 301 redirects to work?

    Any help would be greatly appreciated.

    Thank you,

    Clay Miller

  • rts

    This was awesome. Worked perfectly. Thanks!

  • MichaelFreidgeim

    Thank you for the post and source code. I’ve done some changes to the code and want to publish them. Do you mind if I will create a project on github? Or do you want to create a github project yourself and I will do a fork from it and then submit patch to your project?

  • MichaelFreidgeim

    I’ve published the source code with my fixes/enhancements at https://github.com/MNF/BlogML-to-WRX

    • Thanks, sorry couldn’t reply back earlier. I’m fine you deposited the code in a proper place.
