Heroku Memory Error with PHP and reading large file from S3

I'm using the AWS SDK for PHP 2.3.2 to try to pull down a large file (~4 GB) from S3 using its stream wrapper, which should allow me to use fopen()/fwrite() to write the file to disk rather than buffering it into memory.

Here is the reference:

http://docs.aws.amazon.com/aws-sdk-php-2/guide/latest/service-s3.html#downloading-data

Here is my code:

// Assumes `use Aws\S3\S3Client;` at the top of the file.
public function download()
{
    $client = S3Client::factory(array(
        'key'    => getenv('S3_KEY'),
        'secret' => getenv('S3_SECRET'),
    ));

    $bucket = getenv('S3_BUCKET');
    $client->registerStreamWrapper();

    try {
        error_log("calling download");
        // Open a read-only stream to the object on S3
        if ($stream = fopen('s3://' . $bucket . '/tmp/' . $this->getOwner()->filename, 'r')) {
            // Open the local destination file for writing
            if (($fp = @fopen($this->getOwner()->path . '/' . $this->getOwner()->filename, 'w')) !== false) {
                // Copy the object to disk in 1024-byte chunks
                while (!feof($stream)) {
                    fwrite($fp, fread($stream, 1024));
                }
                fclose($fp);
            }
            // Be sure to close the stream resource when you're done with it
            fclose($stream);
        }
    } catch (\Exception $e) {
        error_log($e->getMessage());
    }
}

The file downloads but I continually get error messages from Heroku:

2013-08-22T19:57:59.537740+00:00 heroku[run.9336]: Process running mem=515M(100.6%)
2013-08-22T19:57:59.537972+00:00 heroku[run.9336]: Error R14 (Memory quota exceeded)

This leads me to believe it is still buffering into memory somehow. I tried to profile it with https://github.com/arnaud-lb/php-memory-profiler, but that segfaulted.
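A cruder alternative to a full profiler is logging PHP's own allocation from inside the copy loop with the built-in memory_get_usage() / memory_get_peak_usage(); if those numbers stay flat while the file grows, the R14 isn't coming from the PHP heap. A minimal sketch (the 10 MB logging interval is arbitrary):

$i = 0;
while (!feof($stream)) {
    fwrite($fp, fread($stream, 1024));
    // Log PHP's real allocated memory every ~10 MB copied (10240 chunks of 1024 bytes)
    if (++$i % 10240 === 0) {
        error_log(sprintf(
            'copied %d MB, mem=%.1f MB, peak=%.1f MB',
            ($i * 1024) / 1048576,
            memory_get_usage(true) / 1048576,
            memory_get_peak_usage(true) / 1048576
        ));
    }
}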

I also tried downloading the file with cURL using the CURLOPT_FILE option to write directly to disk, and I'm still running out of memory. The odd thing is that according to top my PHP process is using 223 MB of memory, so not even half of the allowed 512 MB.
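For reference, a minimal sketch of that cURL approach, with CURLOPT_FILE handing cURL an open file handle so the response body is written straight to disk instead of being returned as a string (the URL and local path below are placeholders, not the actual values):

// Stream an HTTP response directly to disk with cURL.
$url = 'https://test.s3.amazonaws.com/file.zip';
$fp  = fopen('/tmp/file.zip', 'w');

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_FILE, $fp);            // write the body to this handle, not to memory
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects if any
curl_exec($ch);

curl_close($ch);
fclose($fp);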

Anybody have any ideas? I'm running this from the PHP 5.4.17 CLI to test.

Answers


Have you already tried a 2X dyno? Those have 1 GB of memory.

You could also try downloading the file by executing a curl command from PHP. It's not the cleanest way, but it will be much faster, more reliable, and memory-friendly.

exec("curl -O http://test.s3.amazonaws.com/file.zip", $output);

This example is for a public URL. If you don't want to make your S3 files public, you can always create a signed URL and use that in combination with the curl command.
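A rough sketch of that, assuming the 2.x SDK's S3Client::getObjectUrl() to produce the time-limited signed URL (the object key, destination path, and 10-minute expiry are example values):

// Generate a signed URL and hand the actual download off to the curl binary,
// so PHP never holds the file contents in memory.
$client = S3Client::factory(array(
    'key'    => getenv('S3_KEY'),
    'secret' => getenv('S3_SECRET'),
));

$signedUrl = $client->getObjectUrl(getenv('S3_BUCKET'), 'tmp/file.zip', '+10 minutes');

exec('curl -o ' . escapeshellarg('/tmp/file.zip') . ' ' . escapeshellarg($signedUrl), $output, $status);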

