Database Merging Made Easy: Something We’ve Been Working On for the Past Year

#

Mergebot is the new top secret product we’ve been working on for the better part of a year. It addresses the database merging problem that most of you have encountered. But before we get into the solution, let’s first explore the problem.

Hang on, what about Data Hawk!?

TL;DR – Data Hawk is now Mergebot.

You may have heard us mention “Data Hawk” as the thing we’ve been working on. You may have even seen our landing page at datahawk.io. Unfortunately I didn’t do a thorough enough search for other companies using “Data Hawk” (or something similar) and it turns out there are a number of companies that might have a problem with us using Data Hawk. So we went with the lawyers on this and abandoned the name and branding to avoid trademark issues. After a lengthy debate, we renamed the product to Mergebot.

Now, back to the main event…

The Database Merging Problem

Ok, so you’ve just landed an exciting new client: Springfield Box Factory (boxes!!!). They need some development done to the site. They have an existing WordPress site, so you get a copy of the files and database, set up the site locally and start coding.

A couple of weeks later, you’re ready to deploy your update but realize that while you were making changes to your database locally, the client was also updating the live database and visitors to the site were also adding data. If you just overwrite the live database with your local database, you’ll lose all the changes to the live site made in the past 2 weeks. Dun dun dun!

dun dun dun

Magic Merge

When we’re faced with this situation, we wish we had a magical merge tool that would just mash the local and live databases together perfectly and everything would be hunky dory. We desperately search Google for this solution, but find nothing, bupkis.

If such a tool did exist, how would it work? First, it would need more than just your local and live databases. It would also need a copy of the database from 2 weeks ago (when you first set up the site locally). Then it could determine the differences between your local database then and now. Maybe it would generate a bunch of queries that you could run on production to apply those changes. Seems simple enough.

Unfortunately it’s too simple. When you apply the changes to the live site, the client’s changes get overwritten. What if you updated the title of a page from “Box Sizes” to “Box Dimensions” and the client also updated it on the live site to “Sizes”. The script would overwrite the client’s change. Not a big deal in the case of a page title, but it doesn’t take a creative genius to imagine major problems overwriting client changes could cause.

7395256948_075103a347_o

A magic merge tool would also need to determine the differences between the database from 2 weeks ago and the current live database, then determine if any of those changes conflict with any changes you made locally. If it does find conflicts, it needs to present a UI to resolve those conflicts. A UI where you can look at the changes and tell the tool to use your local change, leave the live database as is, or change it to something completely different.

Things have gotten pretty complicated, but it still sounds pretty good. A heck of a lot better than doing things manually. But wait! Because the live site’s database is changing (because it’s live) while we’re running this tool, we need to take the live site offline while we run the magic merge tool, resolve conflicts, and test. Most clients don’t like it when their site is offline (especially if it generates all their revenue). Understandable. So the pressure is on to get the site back up. At the same time, you really don’t want to be rushing through merging and testing. Being pulled in two directions makes for an awful, stressful situation. This is developer hell.

Workarounds

Most developers who have encountered this database merging problem come up with ways to avoid it. The one I’ve heard most often (and what I used to do when I was freelancing) is to take notes as you change things locally. Then when it comes time to deploy, go through the notes and perform those same changes on the live site.

If you’re keen to automate deployment, you can also script the changes with an SQL script or a PHP script. This usually involves quite a bit of extra work and it can be tricky to figure out what exactly you’re supposed to script. For example, let’s say you need to change a plugin setting. To script that, you’ll have to capture the queries that are run as a result of saving the setting or look at the code to see what it does and try replicate it.

Although scripting does take significant extra effort, there are a couple of huge benefits:

  1. You can test deployment before deploying to the live site
  2. You can keep your local database fresh with the latest live data

That is, you can replace your local database with the live database and run your script to get all your changes back. You can do that as often as you like and could even automate it to run overnight so you have a fresh site to work on every day.

One thing scripting doesn’t help with is overwriting changes on the live site. That is, any updates in the script just updates the data on the live site without checking if it was changed.

Our Solution: Mergebot

Mergebot combines the benefits of scripting and the (fictional) magic merge tool without any of the negatives.

You install the Mergebot plugin on your local and live sites. Then as you make changes locally your changes are sent to our cloud app (mergebot.com).

Later, you can safely refresh your local database by replacing it with the live database then running Mergebot to apply your changes. Not only have you refreshed your local database, you’ve also just tested a database deployment locally. You can continue developing and refresh your local database regularly, testing your deployment regularly.

If some data changed in the live database that you’ve also changed locally, Mergebot will detect it when applying changes and present a nice UI asking you to resolve the conflict. It will also save this resolution for the next time you refresh your local database or when it comes time to deploy to the live site. That is, you won’t ever have to resolve the same conflict twice.

When it’s time to deploy to the live site, put the site in maintenance mode, deploy your code changes, then execute Mergebot to deploy your database changes. Since you already executed Mergebot locally (probably a few times), resolved the conflicts, and tested, you’ll have confidence that it will all work.

Mergebot handles all the complicated translation of IDs between databases and even detects IDs in shortcodes and meta values.

Although not required, we will recommend using WP Migrate DB Pro with Mergebot as they’ve been integrated together. For example, Mergebot ignores any WP Migrate DB Pro queries, like saving settings and any temporary migration data. And when you pull your live database down and replace your local database, WP Migrate DB Pro will preserve your Mergebot settings.

Calling All Early Adopters!

We’re very happy with the progress we’ve made in the past year tackling this exceptionally challenging problem and are really looking forward to getting Mergebot into the hands of our customers. We’re currently fixing a few bugs we’ve identified in our testing but are pretty much ready to onboard our first group of beta testers.

If you consider yourself an early adopter, simply sign up for the beta and we’ll send you an invite very soon. If you’re like me and prefer to wait until all the bugs are worked out, just sign up for news and we’ll let you know once Mergebot is out of beta.

About the Author

Brad Touesnard

As founder of Delicious Brains Inc., Brad wears many hats; from coding and design, to marketing and partnerships. Before starting Delicious Brains, Brad was a busy freelance web developer, specializing in front-end development.