Uploading images to Media Library fails with Memory Exhausted

A few things

  • Use media_handle_sideload so that WordPress moves the files to the right location, validates them, and creates the attachment post for you. None of this manual stuff is needed.
  • Don’t run this once and expect it to do everything. You’ll just hit the same memory problem further into the import, and if you give it infinite memory you’ll hit the execution time limit instead, where the script simply runs out of time.
  • Process 5 files at a time, and run it repeatedly until nothing is left to process.
  • Use a WP CLI command; don’t bootstrap WordPress and call it from the GUI. Call it from cron directly and skip the business of pinging a URL. CLI commands get unlimited time to do their job, and they can’t be called from the browser, so the GET variable with the key becomes completely unnecessary.
  • Escape, escape, escape. You’re echoing out these arrays and values on the assumption that what they contain is safe, but what if I snuck a script tag in there? "Unable to delete file <script>...</script> after import." Escaping is the security step that makes the most difference, yet it is the least used.
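
As a rough illustration of the escaping point, the sketch below builds that failure notice with esc_html(), WordPress’s escaping function for text placed inside HTML. The helper name mbimport_delete_failure_notice is made up for this example and is not part of the original script.

```php
// Hypothetical helper (not from the original script): builds the notice shown
// when a file could not be deleted after import.
function mbimport_delete_failure_notice( $file ) {
    // Escape the complete message, file name included, before wrapping it in markup.
    $message = sprintf( 'Unable to delete file %s after import.', $file );
    return '<p>' . esc_html( $message ) . '</p>';
}
```

With a file name such as <script>alert(1)</script>.jpg, the browser should then display the tag as text rather than run it.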

A Prototype WP CLI command

Here is a simple WP CLI command that should do the trick. I’ve not tested it, but all the important parts are there. I trust you’re not a complete novice when it comes to PHP and can tighten any loose screws or fix minor mistakes; no additional API knowledge is necessary.

You’ll want to only include it when in a WP CLI context, for example:

if ( defined( 'WP_CLI' ) && WP_CLI ) {
    require_once dirname( __FILE__ ) . '/inc/class-plugin-cli-command.php';
}

Modify accordingly. Dumping the command in the functions.php of a theme and expecting it all to work will cause errors, as WP CLI classes are only loaded on the command line, never when handling a browser request.

Usage:

wp mbimport run

Class:

<?php
/**
 * Implements image importer command.
 */
class MBronner_Import_Images extends WP_CLI_Command {

    /**
     * Runs the import script and imports several images
     *
     * ## EXAMPLES
     *
     *     wp mbimport run
     *
     * @when after_wp_load
     */
    public function run( $args, $assoc_args ) {
        // The sideload helpers live in wp-admin and are not loaded by default
        if ( ! function_exists( 'media_handle_sideload' ) ) {
            require_once ABSPATH . 'wp-admin/includes/image.php';
            require_once ABSPATH . 'wp-admin/includes/file.php';
            require_once ABSPATH . 'wp-admin/includes/media.php';
        }

        // Set the directory (ABSPATH already ends with a slash)
        $dir = ABSPATH . 'wpse/';
        // Define the file type
        $images = glob( $dir . '*.jpg' );
        if ( empty( $images ) ) {
            WP_CLI::success( 'no images to import' );
            return;
        }
        // Run a loop and transfer files to the media library
        $count = 0;
        foreach ( $images as $image ) {
            $file_array = array(
                'name'     => basename( $image ),
                'tmp_name' => $image,
            );

            // On success the file is moved out of $dir, so the next run
            // picks up where this one stopped
            $id = media_handle_sideload( $file_array, 0 );
            if ( is_wp_error( $id ) ) {
                // WP_CLI::error() prints the message and halts the script
                WP_CLI::error( 'failed to sideload ' . $image );
            }

            // only do 5 at a time, don't worry, we can run this
            // several times till they're all done
            $count++;
            if ( $count === 5 ) {
                break;
            }
        }
        WP_CLI::success( 'import ran' );
    }
}

WP_CLI::add_command( 'mbimport', 'MBronner_Import_Images' );

Call repeatedly from a real cron job. If you can’t, then either use WP Cron, or have a hook on admin_init that checks for a GET variable. Use the code inside the run command with some modifications.
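
If WP Cron is the route you take, the scheduling could look roughly like this. It is an untested sketch: the hook name mbimport_cron_run, the schedule key mbimport_five_minutes and the function mbimport_process_batch are all names invented here, and mbimport_process_batch stands in for the body of run() with the WP_CLI calls removed. Keep in mind that WP Cron only fires when the site receives a request, so it is less dependable than a real cron job.

```php
// Register a 5 minute interval; WordPress has no built-in schedule this short.
function mbimport_add_schedule( $schedules ) {
    $schedules['mbimport_five_minutes'] = array(
        'interval' => 5 * MINUTE_IN_SECONDS,
        'display'  => 'Every 5 minutes',
    );
    return $schedules;
}
add_filter( 'cron_schedules', 'mbimport_add_schedule' );

// Queue the recurring event once; wp_next_scheduled() returns false
// when the hook is not scheduled yet.
function mbimport_schedule_event() {
    if ( ! wp_next_scheduled( 'mbimport_cron_run' ) ) {
        wp_schedule_event( time(), 'mbimport_five_minutes', 'mbimport_cron_run' );
    }
}
add_action( 'init', 'mbimport_schedule_event' );

// Placeholder (assumption): the body of run() goes here, minus the WP_CLI calls.
function mbimport_process_batch() {
}
add_action( 'mbimport_cron_run', 'mbimport_process_batch' );
```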

When WP CLI Isn’t An Option

Using a standalone PHP file that bootstraps WP is a security risk and a great target for attackers who want to exhaust your server resources (or trigger duplication issues by hitting the URL multiple times at once).

Hook into a normal WordPress request instead, for example:

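// example.com/?mbimport=true&key=abc
add_action( 'init', function() {
    // Bail unless this request explicitly asks for an import
    if ( ! isset( $_GET['mbimport'] ) || $_GET['mbimport'] !== 'true' ) {
        return;
    }
    // Bail unless the caller supplied the stored key
    if ( ! isset( $_GET['key'] ) || $_GET['key'] !== get_option( 'key thing' ) ) {
        return;
    }
    // the code from the run function in the CLI command, but with the WP_CLI::success bits swapped out
    // ...
    exit;
} );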
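
Since the URL can be hit several times at once, one way to reduce overlapping runs is a simple lock around the import. The sketch below uses a transient for that. The function names and the mbimport_lock key are invented for illustration, and the check-then-set is not atomic, so treat it as damage limitation rather than a guarantee.

```php
// Hypothetical lock helpers (names invented for this example).
// Returns true if this request may run the import, false if another run holds the lock.
function mbimport_acquire_lock( $ttl = 300 ) {
    if ( false !== get_transient( 'mbimport_lock' ) ) {
        return false;
    }
    // The expiry means a crashed run cannot block imports forever.
    set_transient( 'mbimport_lock', time(), $ttl );
    return true;
}

// Call once the batch has finished so the next run can start.
function mbimport_release_lock() {
    delete_transient( 'mbimport_lock' );
}
```

Inside the init callback you would return early when mbimport_acquire_lock() comes back false, and call mbimport_release_lock() just before the exit.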

Repeated Calling

It might be that your external service can’t call this repeatedly. To which I say:

  • Don’t rely on the external service, have your own server call it regardless, even if there’s no work to do
  • A standard WP Cron task would also work
  • Run it every 5 minutes
  • Make the task call itself if there’s still stuff to do, using a non-blocking request. This way it’ll keep spawning new instances until it’s finished, e.g.

            if ( $count === 5 ) {
                wp_remote_get( home_url( '?mbimport=true&key=abc' ), [ 'blocking' => false ] );
                exit;
            }
    

GUI?

If you want a progress meter for a UI in the dashboard, just count how many jpeg files are left in the folder to import. If you need to configure it, then build a UI and save the settings in options, then pull from options in the CLI script.
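
A progress figure along those lines might be computed like this. Both helpers are hypothetical names for this sketch, and the total is assumed to have been saved somewhere (an option, say) when the import started.

```php
// Hypothetical helper: counts the .jpg files still waiting in the import folder.
function mbimport_count_remaining( $dir ) {
    $images = glob( rtrim( $dir, '/' ) . '/*.jpg' );
    // glob() returns false on error, so fall back to zero.
    return is_array( $images ) ? count( $images ) : 0;
}

// Hypothetical helper: percentage done, given the number of files the import started with.
function mbimport_progress_percent( $total, $remaining ) {
    if ( $total <= 0 ) {
        return 100; // nothing to import counts as finished
    }
    return (int) round( ( $total - $remaining ) / $total * 100 );
}
```

This works because each sideloaded file is moved out of the folder, so the remaining count shrinks with every batch.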

Have You Considered Using the REST API?

Sidestep the entire process and add the files via the REST API. You can send an authenticated POST request to example.com/wp-json/wp/v2/media to upload the jpegs. No code on your site is necessary.

https://stackoverflow.com/questions/37432114/wp-rest-api-upload-image
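
For a feel of what that request involves, here is an untested sketch of a client-side upload in plain PHP with the curl extension. The helper names are invented, and it assumes the site accepts application passwords over HTTP Basic auth (available in WordPress 5.6 and later); swap in whatever authentication your site actually uses.

```php
// Hypothetical helper: headers the media endpoint needs for a raw file upload.
function mbimport_rest_upload_headers( $path ) {
    return array(
        'Content-Type: image/jpeg',
        'Content-Disposition: attachment; filename="' . basename( $path ) . '"',
    );
}

// Hypothetical helper: uploads one JPEG and returns the decoded attachment, or null on failure.
function mbimport_rest_upload( $site_url, $user, $app_password, $path ) {
    $ch = curl_init( rtrim( $site_url, '/' ) . '/wp-json/wp/v2/media' );
    curl_setopt_array( $ch, array(
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => file_get_contents( $path ), // raw file bytes as the body
        CURLOPT_HTTPHEADER     => mbimport_rest_upload_headers( $path ),
        CURLOPT_USERPWD        => $user . ':' . $app_password, // HTTP Basic auth (assumption)
        CURLOPT_RETURNTRANSFER => true,
    ) );
    $body   = curl_exec( $ch );
    $status = curl_getinfo( $ch, CURLINFO_HTTP_CODE );
    // The endpoint answers 201 Created with the new attachment as JSON.
    return 201 === $status ? json_decode( $body, true ) : null;
}
```

This runs on whatever machine holds the images, not on the WordPress site, and the same batching advice applies: upload a few files per run rather than everything at once.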