-
Notifications
You must be signed in to change notification settings - Fork 8
added redirect support #8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
inglesp
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @rkrp -- this is a great start. Well done in particular on working out how to get the test server to return redirects.
I've left one comment which I'd like you to address if you can.
| rsp = session.get(url, allow_redirects=follow_redirects) | ||
|
|
||
| if rsp.status_code // 100 == 3: | ||
| continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should still yield a response, even if it is a redirect. For instance, a user of the library might want to know about all requests that are redirected. We should also extract any URLs from the Location header of a 3xx response.
Please could you:
- remove these two lines, so that we continue to yield all responses;
- add a check in this section so that if the response is a 3xx, we try to extract a URL from the Location header.
Does that make sense?
Once you've done that, you should also add yourself to AUTHORS.txt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a check in this section so that if the response is a 3xx, we try to extract a URL from the Location header.
@inglesp By doing this, wouldn't we essentially be following redirects? I am confused.
If we are proceeding this manner, I think it would be a better approach to scrap the body of the 3xx HTTP response also, in addition to adding the url from location header.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this a bit more,follow_redirects is not the right name for the new argument, since it suggests that 3xx responses shouldn't be followed at all.
Can you think of a better name for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@inglesp How about pause_at_redirects or scrap_redirects?
Any thoughts on this?
I think it would be a better approach to scrap the body of the 3xx HTTP response also, in addition to adding the url from location header.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @rkrp. I'm so sorry it's taken me so long to reply to you.
What about auto_follow_redirects?
And yes, we should also scrape the body of the 3xx response.
Added feature whether to follow redirects or not.