Skip to content

Conversation

@rkrp
Copy link

@rkrp rkrp commented Oct 12, 2016

Added feature whether to follow redirects or not.

Copy link
Owner

@inglesp inglesp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @rkrp -- this is a great start. Well done in particular on working out how to get the test server to return redirects.

I've left one comment which I'd like you to address if you can.

rsp = session.get(url, allow_redirects=follow_redirects)

if rsp.status_code // 100 == 3:
continue
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should still yield a response, even if it is a redirect. For instance, a user of the library might want to know about all requests that are redirected. We should also extract any URLs from the Location header of a 3xx response.

Please could you:

  • remove these two lines, so that we continue to yield all responses;
  • add a check in this section so that if the response is a 3xx, we try to extract a URL from the Location header.

Does that make sense?

Once you've done that, you should also add yourself to AUTHORS.txt

Copy link
Author

@rkrp rkrp Oct 13, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a check in this section so that if the response is a 3xx, we try to extract a URL from the Location header.

@inglesp By doing this, wouldn't we essentially be following redirects? I am confused.

If we are proceeding this manner, I think it would be a better approach to scrap the body of the 3xx HTTP response also, in addition to adding the url from location header.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @rkrp. Sorry for the slow reply -- have been AFK for a few days.

I understand your confusion, and I think it's because the wording of #5 is unclear. I'm sorry about that!

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this a bit more,follow_redirects is not the right name for the new argument, since it suggests that 3xx responses shouldn't be followed at all.

Can you think of a better name for it?

Copy link
Author

@rkrp rkrp Oct 21, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@inglesp How about pause_at_redirects or scrap_redirects?

Any thoughts on this?

I think it would be a better approach to scrap the body of the 3xx HTTP response also, in addition to adding the url from location header.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @rkrp. I'm so sorry it's taken me so long to reply to you.

What about auto_follow_redirects?

And yes, we should also scrape the body of the 3xx response.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants