Django (or WSGI): chain stdout from a subprocess to the response

I am writing a web service in Django to handle image/video streams, but most of the work is done by external programs. For instance:

  1. The client requests /1.jpg?size=300x200
  2. Python code in Django (or another WSGI app) parses 300x200
  3. Python calls convert (part of ImageMagick) through the subprocess module, passing 300x200 as a parameter
  4. convert reads 1.jpg from local disk and resizes it accordingly
  5. convert writes the result to a temp file
  6. Django builds an HttpResponse() and reads the whole temp file as the body

As you can see, the round trip of writing a temp file and then reading it back is inefficient. I need a generic way to handle external programs like this: not only convert, but also others such as cjpeg, ffmpeg, or even proprietary binaries.

I want to implement it this way:

  1. Python gets the stdout fd of the convert child process
  2. Python chains it to the WSGI socket fd for output

I've done my homework: Google says this kind of zero-copy transfer can be done with the splice() system call, but it's not available in Python. So how do I maximize performance in Python for this kind of scenario?

  1. Call splice() using ctypes?
  2. Hack something together with memoryview() or buffer()?
  3. The subprocess's stdout has readinto(); could this be used somehow?
  4. How can we get the socket fd number in an arbitrary WSGI app?

I am kind of a newbie at this, so any suggestion is appreciated. Thanks!

Answers


If the goal is to increase performance, you ought to examine the bottlenecks on a case-by-case basis, rather than taking a "one solution fits all" approach.

For the convert case, assuming the images aren't insanely large, the bottleneck will most likely be spawning a subprocess for each request.

I'd suggest avoiding both the subprocess and the temporary file, and doing the whole thing in the Django process using PIL, with something like this:

import os
from PIL import Image
from django.http import HttpResponse

IMAGE_ROOT = '/path/to/images'

# A Django view which returns a resized image
# Example parameters: image_filename='1.jpg', width=300, height=200
def resized_image_view(request, image_filename, width, height):
    full_path = os.path.join(IMAGE_ROOT, image_filename)
    # URL captures arrive as strings unless the URLconf converts them
    size = (int(width), int(height))
    source_image = Image.open(full_path)
    resized_image = source_image.resize(size)
    response = HttpResponse(content_type='image/jpeg')
    resized_image.save(response, 'JPEG')
    return response

You should be able to get results identical to ImageMagick's by using the correct scaling algorithm. In general, that is ANTIALIAS (called LANCZOS in current Pillow releases, which removed the old name) when the rescaled image is less than 50% of the size of the original, and BICUBIC in all other cases.
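The filter rule above can be sketched as a small helper. This is only an illustration: the names choose_filter_name and resize_with_suitable_filter are hypothetical, LANCZOS is used as the current Pillow name for ANTIALIAS, and "less than 50% of the size" is assumed to refer to linear dimensions rather than pixel count.

```python
def choose_filter_name(src_size, dst_size):
    """Return the name of a Pillow resampling filter for this resize."""
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    # Assumption: the 50% threshold is judged on width/height, not on area.
    scale = max(dst_w / src_w, dst_h / src_h)
    return 'LANCZOS' if scale < 0.5 else 'BICUBIC'


def resize_with_suitable_filter(source_image, size):
    """Resize a PIL image, picking the filter with the rule above."""
    # Imported lazily so the pure helper above has no dependencies.
    from PIL import Image
    name = choose_filter_name(source_image.size, size)
    return source_image.resize(size, getattr(Image, name))
```

In the view, this would stand in for the plain source_image.resize((width, height)) call.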

For videos, if you're returning a transcoded video stream, the bottleneck will likely be either CPU time or network bandwidth.


I found that WSGI can actually take a file object, such as the pipe connected to a subprocess, as the iterable response.

Example WSGI app:

import subprocess

def image_app(environ, start_response):
    # 'Connection' is a hop-by-hop header, which WSGI apps must not set
    start_response('200 OK', [('Content-Type', 'image/jpeg')])
    proc = subprocess.Popen([
        'convert',
        '1.jpg',
        '-thumbnail', '200x150',
        '-',  # write the result to stdout
    # stdin is unused; stderr is discarded so an unread pipe cannot fill up
    ], stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    return proc.stdout

This hands the subprocess's stdout pipe to the server, which sends it as the HTTP response body.
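One caveat with returning the pipe object directly: iterating a file object yields "lines" split on newline bytes, which gives irregular chunk sizes for binary data, and nothing ever waits on the child process. A possible variant, sketched here with a hypothetical helper stream_stdout and an arbitrary 64 KiB chunk size, wraps the pipe in a generator that reads fixed-size chunks and reaps the child when the response ends:

```python
import subprocess
import sys

CHUNK_SIZE = 64 * 1024  # assumed value; tune for your workload


def stream_stdout(cmd, chunk_size=CHUNK_SIZE):
    """Yield the stdout of cmd in chunks, then reap the child process."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL)
    try:
        while True:
            chunk = proc.stdout.read(chunk_size)
            if not chunk:  # empty bytes means the child closed its stdout
                break
            yield chunk
    finally:
        # Runs on normal completion and when the server closes the
        # iterable early (for example, after a client disconnect).
        proc.stdout.close()
        proc.wait()


def image_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'image/jpeg')])
    # The subprocess is only started once the server begins iterating.
    return stream_stdout(['convert', '1.jpg', '-thumbnail', '200x150', '-'])
```

This still copies the data through Python, so it is not zero-copy, but memory use stays bounded by the chunk size and no temp file is involved. The same generator should work with other tools that can write to stdout.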

