Building Custom WP-CLI Commands for Massive Data Migrations

By Mike Davey, Senior Editor

Does this sound familiar? You need to update a custom field for 50,000 products or refactor user metadata across a massive membership site. You write a clean PHP script, drop it into a template or a hook, and hit refresh.

Thirty seconds later, you’re staring at a 504 Gateway Timeout.

The browser is a terrible place for data migrations. Nginx/Apache request timeouts and PHP-FPM execution limits exist because web servers are designed to serve pages, not to process massive datasets. For this kind of work, you need to step outside the browser and into the command line.

Why Your Scripts Are Crashing

The problem comes down to both timeouts and available memory. When you use get_posts() with 'posts_per_page' => -1, WordPress® attempts to load every single matching object into RAM at once. On a large site, this is a one-way ticket to a “Memory Exhausted” error.

Even if you process posts in batches, WordPress’s internal object cache grows with every record you touch. If you don’t clear it, your memory usage will climb steadily until the process dies.

The Solution: PHP Generators

Instead of loading a massive array into memory, we use PHP generators. A generator lets you iterate over data without building the full array in RAM first: by using the yield keyword, the function provides one item at a time and pauses its execution until the next item is requested. Combined with a manual cache flush, this keeps RAM usage roughly flat whether you process 10 posts or 1,000,000.
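To see the difference in miniature, here is a standalone sketch (plain PHP, no WordPress involved; the function names are made up for illustration) contrasting an array-building approach with a generator:

```php
<?php
// Builds the entire list in RAM before returning it.
function all_ids_array( int $max ): array {
    $ids = [];
    for ( $i = 1; $i <= $max; $i++ ) {
        $ids[] = $i;
    }
    return $ids;
}

// Yields one ID at a time; nothing is accumulated in memory.
function all_ids_generator( int $max ): Generator {
    for ( $i = 1; $i <= $max; $i++ ) {
        yield $i; // Pauses here until the caller requests the next value
    }
}

$sum = 0;
foreach ( all_ids_generator( 1000000 ) as $id ) {
    $sum += $id;
}
// Peak memory stays flat no matter how large $max is;
// the array version would allocate a million-element array first.
```

Swap the for loop for a chunked database query and you have the pattern the migration command below relies on.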

Setting Up WP_CLI::add_command

WP-CLI is the standard for WordPress command-line interaction. By using WP_CLI::add_command(), we can register a custom tool that bypasses the web server entirely.

Building the Batching Loop

To keep things efficient, we create a generator to fetch IDs from the database, then loop through it, processing each item one by one. By clearing the per-post cache after each item and performing a global cache flush at regular intervals, we keep the server’s RAM usage essentially flat from start to finish.

if ( defined( 'WP_CLI' ) && WP_CLI ) {
    WP_CLI::add_command( 'db-migrate', 'DB_Migration_Command' );
}

class DB_Migration_Command {

    /**
     * Updates product metadata in bulk.
     *
     * ## OPTIONS
     *
     * [--batch-size=<number>]
     * : How many posts to process before performing a global cache flush.
     * ---
     * default: 500
     * ---
     *
     * [--dry-run]
     * : Run the script without saving changes to the database.
     */
    public function products( $args, $assoc_args ) {
        global $wpdb;
        $dry_run    = isset( $assoc_args['dry-run'] );
        // Ensure batch size is at least 1 to avoid division by zero errors
        $batch_size = max( 1, (int) $assoc_args['batch-size'] );

        // 1. Get total count for the progress bar feedback
        $total_count = (int) $wpdb->get_var( "SELECT COUNT(ID) FROM {$wpdb->posts} WHERE post_type = 'product' AND post_status = 'publish'" );
        $progress    = \WP_CLI\Utils\make_progress_bar( 'Updating Products', $total_count );

        // 2. The Safety Net: Start a SQL Transaction
        $wpdb->query( 'START TRANSACTION' );

        try {
            foreach ( $this->get_all_product_ids() as $index => $post_id ) {

                // Validate data before processing to prevent partial corruption
                if ( ! $this->validate_product( $post_id ) ) {
                    WP_CLI::warning( "Skipping Post ID: $post_id - Invalid status." );
                    continue; 
                }

                if ( ! $dry_run ) {
                    // Core function ensures hooks fire for search/caching plugins
                    update_post_meta( $post_id, '_new_meta_key', 'updated_value' );
                }

                $progress->tick();

                // 3. Memory Management: The "Scalpel"
                // Clear the internal cache for the post we just finished to keep RAM low
                clean_post_cache( $post_id ); 

                // 4. Memory Management: The "Sledgehammer"
                // Periodically clear the global object cache for massive datasets
                if ( 0 === ( $index + 1 ) % $batch_size ) {
                    wp_cache_flush(); 
                }
            }

            $progress->finish();

            // 5. Finalize or Rollback
            if ( ! $dry_run ) {
                $wpdb->query( 'COMMIT' );
                WP_CLI::success( 'Migration complete and saved to database!' );
            } else {
                $wpdb->query( 'ROLLBACK' );
                WP_CLI::log( 'Dry run complete. No changes were saved.' );
            }

        } catch ( Exception $e ) {
            // Undo everything if a critical error occurs
            $wpdb->query( 'ROLLBACK' );
            WP_CLI::error( 'Critical Error: ' . $e->getMessage() );
        }
    }

    /**
     * A generator that yields post IDs one at a time.
     *
     * IDs are fetched in fixed-size chunks, so neither the full ID list nor
     * any post objects are ever held in memory at once. This replaces
     * get_posts() with 'posts_per_page' => -1, which can exhaust memory.
     */
    private function get_all_product_ids() {
        global $wpdb;
        $offset = 0;
        $limit  = 1000;

        do {
            $ids = $wpdb->get_col( $wpdb->prepare(
                "SELECT ID FROM {$wpdb->posts} WHERE post_type = 'product' AND post_status = 'publish' ORDER BY ID LIMIT %d OFFSET %d",
                $limit,
                $offset
            ) );

            foreach ( $ids as $id ) {
                yield (int) $id; // Execution pauses here until the caller requests the next ID
            }

            $offset += $limit;
        } while ( count( $ids ) === $limit );
    }

    /**
     * Simple validation stub to ensure data integrity.
     */
    private function validate_product( $post_id ) {
        return get_post_status( $post_id ) === 'publish';
    }
}
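Assuming the file above is loaded by an active plugin or mu-plugin, the command is invoked like any other WP-CLI command:

```shell
# Preview the migration without writing anything
wp db-migrate products --dry-run

# Run for real, flushing the global object cache every 1,000 posts
wp db-migrate products --batch-size=1000
```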

Using WP_CLI\Utils\make_progress_bar() transforms a silent, terrifying script into a transparent process. It gives you real-time feedback and an estimated time of completion.

Managing the Object Cache

While wp_cache_flush() is the most straightforward way to keep that cache growth in check, it can be a “nuclear option” on production sites using Redis or Memcached, because it empties the persistent cache for every visitor on the site, not just for your script.

Our script uses a hybrid approach. We use clean_post_cache() to surgically remove the memory used by a single post immediately after processing it. Then, we use wp_cache_flush() just once per batch to ensure the internal PHP memory remains lean without constantly hammering your persistent cache server.

Defensive Feedback & Transactions

When things go wrong, you want standard output. Use WP_CLI::success() for positive results and WP_CLI::warning() to notify the operator of minor data issues without killing the script.

Most importantly, the code uses SQL transactions (START TRANSACTION, COMMIT, and ROLLBACK). This gives you atomicity, the “all or nothing” principle: either the entire migration succeeds or the database remains untouched. If a critical error occurs mid-process, the database reverts to its original state. Note that this requires a transactional storage engine such as InnoDB (the default in modern MySQL); MyISAM tables silently ignore transactions. This is your safety net, ensuring you never have to deal with the nightmare of a production site stuck in a messy, “half-migrated” state.

Failure Resilience: Transactions and Logging

Crossing your fingers and hoping isn’t a good strategy when processing 50,000 records. If your script crashes halfway through, you are left with inconsistent data. By wrapping your loop in a SQL transaction, you ensure that changes are only finalized if the entire batch completes successfully. If a critical error occurs, the database can roll back to its original state.

For minor issues, like a single product missing a required field, don’t kill the script with WP_CLI::error(). Instead, use WP_CLI::warning() to alert the operator and log the offending ID to a text file for manual review later. This keeps the migration moving while ensuring no record is left behind.
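A minimal sketch of that logging pattern, assuming a hypothetical helper name and log path (neither is part of WP-CLI):

```php
<?php
// Hypothetical helper: append the offending post ID to a plain-text log
// so the operator can review skipped records after the run. The function
// name, log path, and line format are illustrative only.
function log_skipped_id( int $post_id, string $reason, string $log_file = 'migration-skipped.log' ): void {
    $line = sprintf( "[%s] Post %d skipped: %s\n", date( 'c' ), $post_id, $reason );
    file_put_contents( $log_file, $line, FILE_APPEND | LOCK_EX );
}
```

Inside the migration loop, you would call this right after WP_CLI::warning(), passing the same $post_id and a short reason string.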

The Workflow: Safety First with WP Migrate

Even with a --dry-run flag, running a migration script for the first time is stressful. This is where WP Migrate becomes an essential part of your toolbox.

  1. Pull to Local/Staging: Use WP Migrate to pull your production database to a local environment.
  2. Test the Command: Run your custom CLI command locally. Watch the progress bar and check for errors.
  3. Verify the Data: Inspect a few records to ensure update_post_meta() did exactly what you expected.
  4. Push or Deploy: Once verified, you can either run the command on production via SSH or push your local database back to staging for a final QA check.

The “Panic Button” Snapshot

Even with local testing and dry runs, never touch a production database without a quick backup. Right before you hit “Enter” on your custom migration command, run a manual export of your database via WP-CLI:

wp db export pre-migration-backup.sql

This means that even if the server crashes, you still have a snapshot on disk ready for an immediate wp db import. It’s a quick step that can save you hours of stress.

Conclusion

Massive data migrations don’t have to be a gamble. By moving your logic into a custom WP-CLI command and implementing a batching strategy, you bypass the limits of the web server and gain total control over your server’s resources.

Just hoping a script finishes won’t get you very far. Instead, build tools that provide feedback, manage memory, and offer a safe path to completion.

About the Author

Mike Davey, Senior Editor

Mike is an editor and writer based in Hamilton, Ontario, with an extensive background in business-to-business communications and marketing. His hobbies include reading, writing, and wrangling his four children.