Do form parameter names need to be encoded when doing a POST?
Quick version: Do the names of parameters of "forms" being sent using the standard multipart/form-data encoding need to be encoded?
Longer version: The upload form on 1fichier.com (a service to upload large files) uses the following to specify the file parameter to upload:
<input type="file" name="file[]" size="50" title="Select the files to upload" />
The name of the parameter is file[] (notice the brackets).
Using LiveHTTPHeaders, I can see that Firefox sends the parameter name as-is (i.e. with literal brackets) when submitting the form. However, in a program I'm writing in Python, I'm using the poster module to upload files with the standard multipart/form-data encoding. If I enter the parameter name with the brackets, it gets sent percent-encoded, as file%5B%5D.
Internally, poster encodes the names of the parameters using this function:
    def encode_and_quote(data):
        """If ``data`` is unicode, return urllib.quote_plus(data.encode("utf-8"))
        otherwise return urllib.quote_plus(data)"""
        if data is None:
            return None

        if isinstance(data, unicode):
            data = data.encode("utf-8")
        return urllib.quote_plus(data)
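To make the effect concrete, here is a minimal Python 3 sketch of the same logic. It is not poster's actual code: it assumes Python 3, where `quote_plus` lives in `urllib.parse` and `str` takes the place of `unicode`. It only shows what percent-quoting does to a bracketed name such as `file[]`.

```python
from urllib.parse import quote_plus  # Python 3 home of urllib.quote_plus


def encode_and_quote(data):
    """Python 3 sketch of the helper above: UTF-8 encode text, then percent-quote."""
    if data is None:
        return None
    if isinstance(data, str):
        data = data.encode("utf-8")
    return quote_plus(data)


# The brackets are replaced by their percent-encoded forms.
print(encode_and_quote("file[]"))  # file%5B%5D
```

So any name that passes through this helper ends up with %5B%5D on the wire instead of the literal brackets.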
The urllib.quote_plus documentation says that this is only "required for quoting HTML form values when building up a query string to go into a URL". But here we're doing a POST, so the form values don't go into the URL.
So, do they still need to be encoded, or is it an error of poster to be doing this?
So if your POST request is encoded as multipart/form-data (which poster is doing), then no, parameter names don't need to be encoded this way. I suggest filing a bug with the author (ahem...), he might be willing to fix it in a future release ;)
A workaround is to set your MultipartParam's name attribute directly, e.g.
p.name = 'file[]'
Although in essence this question has been answered, I'm including some more details on how to dig through those RFCs.
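RFC 2388 section 3 states that a Content-Disposition header is required. Non-ASCII field names should be encoded using RFC 2047, even though that looks like a conflict. RFC 2183 section 2 describes the format of this Content-Disposition header. The name fits the general parameter rule of that grammar, but that rule references RFC 2045 for the definition of a parameter value, which is either a token or a quoted-string. Since brackets are not allowed in a token, a name containing them has to be written as a quoted-string: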
Content-Disposition: form-data; name="file[]" (correct)
Content-Disposition: form-data; name=file[] (invalid)
Content-Disposition: form-data; name="file%5B%5D" (wrong name)
Content-Disposition: form-data; name=file%5B%5D (wrong name)
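The distinction between these forms can be checked mechanically. The following is only a sketch, assuming the token definition from RFC 2045 section 5.1; `is_token` and `form_data_disposition` are hypothetical helper names made up for illustration, not part of poster or the standard library.

```python
import re

# RFC 2045 section 5.1: a token is one or more printable ASCII characters,
# excluding space and the "tspecials" ()<>@,;:\"/[]?=
_TOKEN = re.compile(r"[!#$%&'*+\-.0-9A-Z^_`a-z{|}~]+")


def is_token(value):
    """Return True if value may appear unquoted as a parameter value."""
    return _TOKEN.fullmatch(value) is not None


def form_data_disposition(name):
    """Hypothetical helper: always emit the name as a quoted-string."""
    # Inside a quoted-string, backslash and double quote must be escaped.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return 'Content-Disposition: form-data; name="{}"'.format(escaped)


print(is_token("file[]"))               # False: brackets are tspecials
print(is_token("file%5B%5D"))           # True: valid syntax, but a different name
print(form_data_disposition("file[]"))
```

Note that the percent-encoded variant is syntactically fine as a token, which is why it is listed as "wrong name" rather than "invalid": the server simply sees a field with a different name.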
One more note for non-ASCII file names: the current HTML 5 specification draft requires not encoding them in a 7-bit safe manner, but instead transferring them in the encoding used throughout the request. A question about non-ASCII field names is what brought me to look at this question of yours today.