Database Merging Made Easy: Something We’ve Been Working On for the Past Year

Mergebot is the new top secret product we’ve been working on for the better part of a year. It addresses the database merging problem that most of you have encountered. But before we get into the solution, let’s first explore the problem.

Hang on, what about Data Hawk!?

TL;DR – Data Hawk is now Mergebot.

You may have heard us mention “Data Hawk” as the thing we’ve been working on. You may have even seen our landing page at datahawk.io. Unfortunately I didn’t do a thorough enough search for other companies using “Data Hawk” (or something similar) and it turns out there are a number of companies that might have a problem with us using Data Hawk. So we went with the lawyers on this and abandoned the name and branding to avoid trademark issues. After a lengthy debate, we renamed the product to Mergebot.

Now, back to the main event…

The Database Merging Problem

Ok, so you’ve just landed an exciting new client: Springfield Box Factory (boxes!!!). They need some development done to the site. They have an existing WordPress site, so you get a copy of the files and database, set up the site locally and start coding.

A couple of weeks later, you’re ready to deploy your update but realize that while you were making changes to your database locally, the client was also updating the live database and visitors to the site were also adding data. If you just overwrite the live database with your local database, you’ll lose all the changes to the live site made in the past 2 weeks. Dun dun dun!

dun dun dun

Magic Merge

When we’re faced with this situation, we wish we had a magical merge tool that would just mash the local and live databases together perfectly and everything would be hunky dory. We desperately search Google for this solution, but find nothing, bupkis.

If such a tool did exist, how would it work? First, it would need more than just your local and live databases. It would also need a copy of the database from 2 weeks ago (when you first set up the site locally). Then it could determine the differences between your local database then and now. Maybe it would generate a bunch of queries that you could run on production to apply those changes. Seems simple enough.

Unfortunately it’s too simple. When you apply the changes to the live site, the client’s changes get overwritten. What if you updated the title of a page from “Box Sizes” to “Box Dimensions” and the client also updated it on the live site to “Sizes”. The script would overwrite the client’s change. Not a big deal in the case of a page title, but it doesn’t take a creative genius to imagine major problems overwriting client changes could cause.

7395256948_075103a347_o

A magic merge tool would also need to determine the differences between the database from 2 weeks ago and the current live database, then determine if any of those changes conflict with any changes you made locally. If it does find conflicts, it needs to present a UI to resolve those conflicts. A UI where you can look at the changes and tell the tool to use your local change, leave the live database as is, or change it to something completely different.

Things have gotten pretty complicated, but it still sounds pretty good. A heck of a lot better than doing things manually. But wait! Because the live site’s database is changing (because it’s live) while we’re running this tool, we need to take the live site offline while we run the magic merge tool, resolve conflicts, and test. Most clients don’t like it when their site is offline (especially if it generates all their revenue). Understandable. So the pressure is on to get the site back up. At the same time, you really don’t want to be rushing through merging and testing. Being pulled in two directions makes for an awful, stressful situation. This is developer hell.

Workarounds

Most developers who have encountered this database merging problem come up with ways to avoid it. The one I’ve heard most often (and what I used to do when I was freelancing) is to take notes as you change things locally. Then when it comes time to deploy, go through the notes and perform those same changes on the live site.

If you’re keen to automate deployment, you can also script the changes with an SQL script or a PHP script. This usually involves quite a bit of extra work and it can be tricky to figure out what exactly you’re supposed to script. For example, let’s say you need to change a plugin setting. To script that, you’ll have to capture the queries that are run as a result of saving the setting or look at the code to see what it does and try replicate it.

Although scripting does take significant extra effort, there are a couple of huge benefits:

  1. You can test deployment before deploying to the live site
  2. You can keep your local database fresh with the latest live data

That is, you can replace your local database with the live database and run your script to get all your changes back. You can do that as often as you like and could even automate it to run overnight so you have a fresh site to work on every day.

One thing scripting doesn’t help with is overwriting changes on the live site. That is, any updates in the script just updates the data on the live site without checking if it was changed.

Our Solution: Mergebot

Mergebot combines the benefits of scripting and the (fictional) magic merge tool without any of the negatives.

You install the Mergebot plugin on your local and live sites. Then as you make changes locally your changes are sent to our cloud app (mergebot.com).

Later, you can safely refresh your local database by replacing it with the live database then running Mergebot to apply your changes. Not only have you refreshed your local database, you’ve also just tested a database deployment locally. You can continue developing and refresh your local database regularly, testing your deployment regularly.

If some data changed in the live database that you’ve also changed locally, Mergebot will detect it when applying changes and present a nice UI asking you to resolve the conflict. It will also save this resolution for the next time you refresh your local database or when it comes time to deploy to the live site. That is, you won’t ever have to resolve the same conflict twice.

When it’s time to deploy to the live site, put the site in maintenance mode, deploy your code changes, then execute Mergebot to deploy your database changes. Since you already executed Mergebot locally (probably a few times), resolved the conflicts, and tested, you’ll have confidence that it will all work.

Mergebot handles all the complicated translation of IDs between databases and even detects IDs in shortcodes and meta values.

Although not required, we will recommend using WP Migrate DB Pro with Mergebot as they’ve been integrated together. For example, Mergebot ignores any WP Migrate DB Pro queries, like saving settings and any temporary migration data. And when you pull your live database down and replace your local database, WP Migrate DB Pro will preserve your Mergebot settings.

Calling All Early Adopters!

We’re very happy with the progress we’ve made in the past year tackling this exceptionally challenging problem and are really looking forward to getting Mergebot into the hands of our customers. We’re currently fixing a few bugs we’ve identified in our testing but are pretty much ready to onboard our first group of beta testers.

If you consider yourself an early adopter, simply sign up for the beta and we’ll send you an invite very soon. If you’re like me and prefer to wait until all the bugs are worked out, just sign up for news and we’ll let you know once Mergebot is out of beta.

About the Author

Brad Touesnard

As founder of Delicious Brains Inc., Brad wears many hats; from coding and design, to marketing and partnerships. Before starting Delicious Brains, Brad was a busy freelance web developer, specializing in front-end development.

  • Take my money now!

  • That sounds absolutely brilliant!
    I have wanted something like this for ages. Is there any way you can make it work with Staging sites? I use Flywheel now and it would be a brilliant addition to online staging sites as we are constantly publishing new articles, comments etc on live site so standard staging sites don’t work for us. I have never used a local development system though so would need some training on that. Dale.

    • Robert Gordon

      Curious about this too, being a WP Engine customer and using their staging environment.

    • Yep, it will work with staging sites, but we’re only going to support 2 sites at a time to start. See comment http://disq.us/p/1b0756w

  • Blue Fern Solutions

    Yes! Thank you. Can’t wait πŸ˜‰

  • What about the situation, when live site is example.com, local development site is example.dev, and staging site is staging.example.com? Will there be a tool to automate those domain specific changes in DB as well?
    So I will be able to get merged updates from live, push DB updates to staging from dev, and from staging to prod and vice versa?

    • For the first release we’re limiting it to 1 dev site and 1 live site to keep the scope sane, but the plan is to allow any number of “dev” sites which could be staging sites, or even other dev sites run by other developers on your team. This will allow developers to collaborate on a deployment and to catch any conflicts in the data they each change.

      • Robert Gordon

        Does the “dev” site HAVE to be local? Or can it be something that is worked on live on a staging domain and then pushed to production?

  • And there is also a bunch of other questions, like:
    1) who will own and where do you host all the data that users send to your server
    2) how is it secured
    3) is it encrypted and how if “yes” and why not if “no”
    etc.

    Would be nice to get some time in future a post about all that as well πŸ™‚

    • You will definitely own your data. We haven’t decided if we’ll encrypt the data yet, but we’ll be hosting it on AWS which is PCI DSS Certified. We’ll also probably add the option to completely remove your data from our servers once it’s deployed to the live site.

  • Kevin W. Hoffman

    I’m really looking forward to your team solving this problem. Could you shed some light on the cloud component? What type of data is stored in the cloud?

    • Any time you run a query locally, we capture it and send it to our cloud app.

  • John McKenzie

    So, that’ll be the fastest I’ve ever signed up to a beta πŸ˜‰ what fabulous news, avoids multiple export files, I’m really looking forward to a version where it can be used across staging and development environments also. Awesome.

  • joe aliferis

    This sounds amazing. How about this for a scenario – could Mergebot handle this ?

    Client has an ecommerce site in cubecart and wants a new site in WordPress/woocommerce

    so you setup a new WP install and migrate the shop data from cubecart to WP/Woocommerce

    during development on the new Woo site you modify some of the simple products to become variable products

    before the new site goes live you need to re0import all the shop data from the live site again so the new site is up to date with prods, customers, orders, etc when you make the switch

    but this would normally reset all the prods you had upgraded to variable prods – making them all simple prods again

    Can mergebot help with this ?

    cheers

    joe

    • I would write a script that migrates the cubecart data to WP/WooCommerce and then use Mergebot to record all the other data changes you make locally (like configuring WooCommerce settings). Then at launch, deploy code, run your migration script, then run a Mergebot deploy.

      • joe aliferis

        Hi Brad

        that script you mention already exists and I have run it

        its not intelligent enough to only import changes – it just drops the woo data and re-imports it

        so is the answer yes, mergebot could help manage the changes to products ?

        Cheers

        joe

  • mbm

    Somehow Delicious Brains knows exactly what plugins I need and has been working on them top secretly for a year.

    I have a client who I was developing a new homepage and some other small sections for, and put it up for review on my development server. It took them like 3 months to get feedback together, and in those three months they’ve changed dozens of areas of their site… So I was looking at the prospect of just having to redo most of it.

    In other words — Take my money.

  • I had to deal with exactly this issue last week and it wasn’t pleasant. Every migration I did was nerve wrecking.

    • Hey Baadier, I ran into you at WCCT last year, was surprised to see you here too. Hope you’re doing well!

      • Thanks Kurt, so random to find you here πŸ™‚

        Are you coming to WordCamp again?

        • I’d like to, but it really depends on how much work I can get done this week… it’s 50/50. You?

          • Still debating whether I should or not.

            I’ve been to the last few in Cape Town and the days are fun but it’s 2 days and I’ve been crazy busy over the last few months.

            I dont recall if they make the talks available afterwards or not but would love to watch them.

            Just not sure if I can wing the time or not.

  • Robert Gordon

    I’ve been looking for a solution to this issue for many years for various different projects. One such current project is an ever-evolving membership site (using Membermouse) where we are constantly making changes, upgrades etc. We use WP Engine and pulling down to staging and then deploying back to production isn’t really an option because new members and new data is being collected (when a member logs in, or where they visit etc) during that time. I’ve tried a few nightmarish merging techniques in the past by excluding the Membermouse database tables but they are so intertwined into the rest of the database that it’s nearly impossible to create such a workflow. In addition to this I’ve tried Versionpress (I’m sure you’ve heard of them) to which even after their beta it didn’t work that great and required the use of WP-CLI and that’s not something I can use on WP Engine. Anyway this solution looks promising and I look forward to being a beta test to see if this helps solve our merging issues! πŸ™‚

  • Andrew

    I’m curious if this will ever be provided as a self-hosted option? This could be extremely helpful for my clients, but there’s a definite no-go if we need to export the data externally.

    • We’re currently thinking that a self-hosted option will be more of an Enterprise level solution. Kind of like https://www.youtube.com/watch?v=3oJCoPd9wh0

      • Andrew

        https://m.youtube.com/watch?v=iOtf2vVScOc

        By saying you may or may not encrypt the data you’re not giving us much of a choice. And yeah, I do work in enterprise hosting, so the glove fits, haha. This isn’t really pivoting like in Silicon Valley though. Feels more like self-hosted would be moving away from your inevitable subscription model which I totally get. Just doesn’t sound like something we can use though the features sound sweet.

  • Josh McKee

    I will eat all these delicious brains. This is a tool I keep thinking of writing myself, but I am so glad somebody else did all that hard work. Looking forward to the beta!

  • coccoinomane

    Mergebot seems exactly what we need at the web agency where I work! Will it support synching DB changes for all plugins? I am thinking of WooCommerce, ACF, Gravity Forms…

    On a side note, Borek Bernard wrote a nice article on how Mergebot will differ from VersionPress: https://blog.versionpress.net/2016/08/mergebot-early-thoughts

    • We plan to support the most popular plugins out-of-the box (like the ones you mention) and allow you to configure Mergebot to work with any plugin.

  • Mic

    @bradt66:disqus Could there be a preview video for the article coming up?

    • Could be. πŸ˜‰

      • @bradt66:disqus
        So now as you have a beta, is it possible to have some video demonstration? I don’t want to jump into work with it and all its hassle (and take seat of others that may need it more), but I do want to see it in action somehow πŸ™‚

  • Brian Hogg

    1) So there would there be no option to not send queries off to your server and store them on an instance we control?

    2) How will this handle things like serialized data (ie. postmeta tor options that plugins add, etc)? Or even multiple options/meta changes and merging some but not others leading to inconsistencies that breaks the site?

    (You weren’t kidding that this is complex)

    • 1) Nope.
      2) We handle this with schemas defined in JSON. Here’s one for WP core:


      {
      "relationships": {
      "options": [
      {
      "option_name": "%category_children",
      "option_value": "terms",
      "serialized": {
      "key": "terms:term_id",
      "val": {
      "key": "ignore",
      "val": "terms:term_id"
      }
      }
      }
      ]
      }
      }

      You’ll be able to define your own schemas for any plugins we don’t support out-of-the-box and any custom theme / plugin data that you create yourself.

  • cruiseback

    This litterally could not have arrived at a better time! I have been banging my head against the table for the last 3 days trying to find out how I am going to push a Woocommerce based staging copy – I made and have worked on since 6 months ago – back on to the live site without loosing orders or customer reviews while also not loosing all my shiny new pages, posts, updated products etc.

    I hope mergebot can help me out in this regard! Thanks for this happening now.

    • If it can’t, you can move using SQL and WP-API for WooCommerce data. It’s extremely un-pretty, and it will make you shake your fist at the API as some read-only fields force you to also handle using manual SQL; but you do as much as you can in WooCommerce, then remit using SQL.

  • Mergebot has the potential to be the greatest plugin in WordPress history. Can’t wait for the release.

  • Yep via HTTPS.

  • dklebedev

    VersionPress vs mergebot. I’m getting me popcorns and bills.

  • Awesome !! This was the biggest problem while doing makeovers for existing sites and later merging things manually & through https://wordpress.org/plugins/wp-cfm/ etc. Looking forward for the beta.

  • Arjan Snaterse

    The solution seems pretty cool and neat, but how about the cloud storage? Somewhere it feels not so funny to store all my data external with a third party.

    How about safety/privacy? And how long do you store all the data, and to what level can we control the data which is stored in your cloud?

    Or will there be options to save it on your own custom locations?

  • coccoinomane

    Hi! Will Mergebot support Multisite WordPress Installations from day one?

    • It doesn’t support it right now, but I am working on it currently! πŸ™‚

  • Anton VS

    Take my money.

  • guy

    Hey guys, looks interesting.. Can you offer an indication on prices?

    I already use WP Migrate DB Pro, which is very useful, but I’m not a “power-user”.. spending another $xxx for an addon might not be an option, but it might be if the pricing is right.

  • chrismccoy

    will this be along the same lines of a tool like ramp from crowd favourite?

  • TBH I still am unsure how much manual work is involved reading this, and statements like “magic merge tool without any of the negatives.” worry the pants off me as most database systems have not cracked that lofty goal yet.

    Having recently completed a migration of a disaster-base of WP + WPML + 99 other plugins to multiple WP Multisites (hyperbolic but probably not far off); I’m aware of some of the challenges with nested dependencies of posts and specifically custom-post-types with meta and relations to other posts.

    I’m especially aware of limitations of existing import export tools and merge-conflict tools as well, but I do recognise it’s not a failure on product offerings but the broad-scope migrations can take.

    I Too would have loved a tool to click-and-go and merge the posts, meta, custom tables etc; but I suspect trying it one-size fits all, would require a lot of effort and disappointment as there is such a broad scope of plugins. In the end we opted for a scripted solution with SQL fixes that required both databases to be installed on a shared machine.

    I Hope it’s awesome, if I get another WP migration or project needing, I’ll try it out for sure. I have to be honest; I’m expecting to be disappointed because too much is out of control of such a system.

  • Stu Fuller

    Absolutely hate the idea of being tied to the availability and accessibility (read, ongoing service costs) of connecting to your server.

    I should be able to run my own server, with my own license, for the version I purchase, forever. If I choose to upgrade at a later time, fine, but I should not be forced into a subscription to keep using a product I already purchased.

    Reminds me of the “new and improved” smush-pro that forces you to rely on their servers which requires constant renewal payments.

    No thanks.

  • I need this like … yesterday. Why is this not in my life right this very second??

  • If you achieve this, I will throw money at you guys. Literally throw it. Given, you don’t need to worry because unless I’m throwing silver dollars it won’t hurt. And I might throw enough of it that you could make a pillow. In which case you not only don’t need to be afraid but you can actually look forward to a great nights sleep.

    I’d pay per-site license, and I service a couple dozen sites a year. But a developer license so I can use one license key and purchase additional sites on that would be greatly appreciated. Also, allowing three domains (one for localhost / staging / production) somehow would be ideal.

    • Liz Lockard

      Thanks Spencer! We’d appreciate lighter currency thrown at us πŸ™‚

      Your feedback is awesome.

      If you’re not already signed up for news on Mergebot, do that so you can be the first to know when we go public – https://mergebot.com/

      • Thanks Liz. So I signed up for, and paid for, the Beta and was redirected to install instructions but didn’t get a download link. Where do I go for support?

        • Hi Spencer, you can send a support request inside the app (click on the top right avatar > ‘Email Us’). Alternatively, you should have received an invite to our Slack channel where you can get help/give feedback.

          The installation guide should give you all you need to get started. The plugin can be installed from the WordPress plugin repo. If you would like a 1-on-1 walkthrough of Mergebot then let us know πŸ™‚