Importing existing S3 media into WP

In theory you’d think it could work like that.

In reality, it doesn’t.

There’s a simple reason no plugin exists that would do this. Whenever you import media into WordPress, it creates thumbnail files of various sizes (even for videos).

In order to create those thumbnails, it has to download (you guessed it) the entire media file. For videos, you may be able to get away with downloading just enough to get the first frame and using that for the thumbnails, but that would take some custom coding to stop the download after a certain number of bytes.
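
As a rough illustration of "stop the download after a certain number of bytes", here is a minimal sketch that asks the server for only the first chunk of a file using an HTTP Range header. It is written in Python purely to show the idea (a WordPress plugin would do the equivalent in PHP), and the function names, URL, and byte count are made up for the example. Whether a given prefix is enough to decode a first frame depends on the video container, so treat the size as a guess.

```python
import urllib.request


def build_range_request(url, max_bytes):
    """Build a GET request asking for only the first max_bytes bytes."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    # HTTP byte ranges are inclusive, so bytes=0-1023 covers 1024 bytes.
    return urllib.request.Request(
        url, headers={"Range": f"bytes=0-{max_bytes - 1}"}
    )


def fetch_prefix(url, max_bytes, timeout=30):
    """Download at most max_bytes bytes from the start of url."""
    request = build_range_request(url, max_bytes)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # A server that ignores Range answers 200 with the whole body,
        # so cap the read on our side as well.
        return response.read(max_bytes)
```

Something like fetch_prefix("https://example.com/video.mp4", 512 * 1024) would then pull roughly the first half megabyte instead of the whole file.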

So, in the end, it’s just not practical, considering you could download the files in bulk much more easily than with some process that did them one at a time.

What is practical is something that would download one file at a time, insert it into WordPress (not directly into the database, but by calling the appropriate WP functions), upload all the associated thumbnails for that file to S3, and then delete the downloaded file and generated thumbnails.
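
To make that flow concrete, here is a minimal sketch of the per-file loop body, again in Python just to show the shape of it. The three callables (download, import_into_wp, upload) are hypothetical stand-ins for "fetch from S3", "call the WP import functions", and "push a generated file to S3". The sketch also assumes the generated thumbnails land in the same scratch directory, which is not where WordPress puts them by default.

```python
import tempfile
from pathlib import Path


def process_one(key, download, import_into_wp, upload):
    """Handle a single S3 object: download, import, upload thumbnails, clean up.

    All three callables are hypothetical and supplied by the caller:
      download(key, dest_path)   writes the object to dest_path
      import_into_wp(path)       returns a list of generated thumbnail paths
      upload(path)               sends one generated file to S3
    """
    # The scratch directory and everything in it is deleted when the
    # with-block exits, whether it returns normally or raises.
    with tempfile.TemporaryDirectory() as workdir:
        local = Path(workdir) / Path(key).name
        download(key, local)
        generated = import_into_wp(local)
        for thumb in generated:
            upload(thumb)
        return [Path(p).name for p in generated]
```

The cleanup here covers ordinary errors. It does not help if the whole process is killed from outside partway through a file.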

As it stands, no plugin does that.

A prohibitive factor with this method is that the download, thumbnailing, upload, and deletion would all have to happen within a single web call (i.e. in your browser you hit a button that calls WordPress and fires off all this work). If the server is set to disconnect after so many seconds of inactivity, or PHP is set to only run scripts for a set length of time, and processing a single file runs past those timeouts, the process gets aborted. You’d have leftover files on your system, and you’d be asking the developer for help, thinking it was their plugin when really it’s just a bad way to do things.


Addendum: since you’re asking about mp3 data, here’s what you’d have to do. Some of it may be not-so-good news.

Large brush strokes:

  1. Find a plugin that does importing and see if you can copy/modify that process to suit your needs.
  2. Find a plugin that does the s3:// url switch and see if you can hijack it for your purposes.

Step 1:

Check out the add-from-server plugin. class.add-from-server.php has the function handle_import_file(). If you read through that, you’ll see it actually pulls metadata from the mp3 (the call to wp_read_audio_metadata), so it expects the file to be local. I think you have two choices: (1) you don’t really need the metadata, so you hack up the code to just skip that, or (2) you have the metadata in some local form, and instead of pulling it from the mp3 you pull it from your local source.

Step 2:

I use tcs3 and I’ve had to hack it. Long term, I think I need to just copy it into my own plugin so an auto-update doesn’t accidentally overwrite my changes. Bottom line, the plugin doesn’t work as is and I had to edit the code just to get it to work.

The lesson I took from all of this is that it doesn’t look like you can just edit the tables, change every http://example.com/ to s3://mybucket/, and have it work.

WP uses a filter (in this case, wp_get_attachment_url) to build the ‘correct’ url, by calling everyone who has registered with that filter, passing them the local URL and letting them modify the URL.

In order for me to get this to work correctly, I had to have tcs3’s code check whether its ‘is_on_s3’ flag was set, and if it was, change the URL; otherwise, leave it alone. No joke, that’s not how their code works. Before I made that change, it was changing all the URLs to S3 whether it had actually uploaded the file or not.
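
The check itself is small. Here is a sketch of the logic, modeled in Python for illustration only; in a real plugin this would be a PHP callback registered on the wp_get_attachment_url filter. This is not tcs3’s actual code. The ‘is_on_s3’ flag name is the one mentioned above, while the function name, the metadata dict, and the two base-URL parameters are assumptions made up for the example.

```python
def filter_attachment_url(local_url, attachment_meta, site_base, bucket_base):
    """Rewrite local_url to point at S3 only if the file was really uploaded.

    attachment_meta is a hypothetical dict of per-attachment flags.
    site_base and bucket_base are the URL prefixes to swap, for example
    "http://example.com/" and "s3://mybucket/".
    """
    # No flag (or a false flag) means the file never made it to S3,
    # so hand back the URL untouched.
    if not attachment_meta.get("is_on_s3"):
        return local_url
    # Leave anything that does not live under the site prefix alone.
    if not local_url.startswith(site_base):
        return local_url
    return bucket_base + local_url[len(site_base):]
```

With the flag set, http://example.com/wp-content/uploads/a.mp3 becomes s3://mybucket/wp-content/uploads/a.mp3; without it, the local URL comes back unchanged.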

My recommendation is for you to make your own plugin, copying Add From Server as the starting point and (a) modifying the use of metadata and (b) providing your own wp_get_attachment_url filter that can tell if the URL is one of your S3 files and change the URL appropriately.