Python Requests library redirect new url
I've been looking through the Python Requests documentation but I cannot see any functionality for what I am trying to achieve.
In my script I am setting allow_redirects=True.
If the page has been redirected to something else, I would like to know what the new URL is.
For example, if the start URL was: www.google.com/redirect
And the final URL is www.google.co.uk/redirected
How do I get that URL?
You are looking for the request history.
The response.history attribute is a list of responses that led to the final URL, which can be found in response.url.
    import requests

    response = requests.get(someurl)
    if response.history:
        print("Request was redirected")
        for resp in response.history:
            print(resp.status_code, resp.url)
        print("Final destination:")
        print(response.status_code, response.url)
    else:
        print("Request was not redirected")
    >>> import requests
    >>> response = requests.get('http://httpbin.org/redirect/3')
    >>> response.history
    [<Response [302]>, <Response [302]>, <Response [302]>]
    >>> for resp in response.history:
    ...     print(resp.status_code, resp.url)
    ...
    302 http://httpbin.org/redirect/3
    302 http://httpbin.org/redirect/2
    302 http://httpbin.org/redirect/1
    >>> print(response.status_code, response.url)
    200 http://httpbin.org/get
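If you need this check in more than one place, the loop above can be wrapped in a small helper. This is only a sketch: `redirect_chain` and `was_redirected` are made-up names, not part of Requests, and they rely on nothing beyond the `history`, `status_code` and `url` attributes described above.

```python
from typing import List, Tuple


def redirect_chain(response) -> List[Tuple[int, str]]:
    """Return (status_code, url) for every hop, ending with the final response.

    `response` is expected to look like a requests.Response; only its
    `history`, `status_code` and `url` attributes are used.
    """
    hops = [(resp.status_code, resp.url) for resp in response.history]
    hops.append((response.status_code, response.url))
    return hops


def was_redirected(response) -> bool:
    """True if at least one redirect was followed to reach `response`."""
    return bool(response.history)
```

With Requests you would call it as `redirect_chain(requests.get(someurl))`; the last tuple holds the final status code and URL.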
The documentation has this blurb: http://docs.python-requests.org/en/latest/user/quickstart/#redirection-and-history
    import requests

    r = requests.get('http://www.github.com')
    r.url  # returns https://www.github.com instead of the http page you asked for
This is answering a slightly different question, but since I got stuck on this myself, I hope it might be useful for someone else.
If you use allow_redirects=False and want the target of the first redirect, rather than following the whole chain, then r.url won't work: it still holds the URL you requested. Instead, the redirect location is in the "Location" header of the 302 response:
    import requests

    r = requests.get('http://github.com/', allow_redirects=False)
    r.status_code  # 302
    r.url  # http://github.com, not https.
    r.headers['Location']  # https://github.com/ -- the redirect destination
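One thing to watch for with this approach: the Location header may hold a relative URL, so it can be worth resolving it against the URL you requested. A minimal sketch, assuming a response-like object with `url` and a dict-like `headers`; `redirect_target` is a hypothetical helper name, not a Requests function.

```python
from typing import Optional
from urllib.parse import urljoin


def redirect_target(response) -> Optional[str]:
    """Return the absolute URL a redirect response points to, or None.

    Assumes `response` has `url` and a dict-like `headers`, as a
    requests.Response fetched with allow_redirects=False does.
    """
    location = response.headers.get("Location")
    if location is None:
        return None  # no Location header, so nothing to follow
    # urljoin leaves absolute URLs untouched and resolves relative ones
    return urljoin(response.url, location)
```

You would call it as `redirect_target(requests.get(url, allow_redirects=False))`.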
I think requests.head is safer to call than requests.get when handling URL redirects. Check the GitHub issue here:
    import requests

    r = requests.head(url, allow_redirects=True)
    print(r.url)
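Some servers do not answer HEAD properly, so a fallback to GET can be useful. The sketch below is an assumption on my part rather than anything from the Requests docs: `resolve_final_url` is a hypothetical helper, `session` is anything with `head` and `get` methods (a `requests.Session()`, or the `requests` module itself), and treating 405 as "HEAD not supported" is just one possible heuristic.

```python
def resolve_final_url(url, session, timeout=10):
    """Follow redirects and return the final URL, preferring HEAD over GET."""
    response = session.head(url, allow_redirects=True, timeout=timeout)
    if response.status_code == 405:  # assumption: the server rejects HEAD
        # stream=True defers downloading the body; only the URL is needed
        response = session.get(
            url, allow_redirects=True, timeout=timeout, stream=True
        )
        response.close()
    return response.url
```

For example, `resolve_final_url(url, requests)` or `resolve_final_url(url, requests.Session())`.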