
requests.get really slow when stream=True #2015

Closed
cournape opened this issue Apr 23, 2014 · 2 comments

Comments

@cournape

I noticed that using stream=True is really slow in some cases. Code that shows the issue:

import requests


url = "https://api.enthought.com/eggs/rh5-64/numpy-1.8.0-1.egg"
target = "numpy-1.8.0-1.egg"

use_streaming = True

if use_streaming:
    resp = requests.get(url, stream=True)
else:
    resp = requests.get(url)

resp.raise_for_status()

with open(target, "wb") as fp:  # avoid shadowing the `target` filename
    if use_streaming:
        for chunk in resp.iter_content():
            fp.write(chunk)
    else:
        fp.write(resp.content)

With use_streaming=True, it takes around 40 sec, but only 2 sec when it is False. Running the script under strace shows that the chunk size appears to be 1 byte:

...
recvfrom(4, "\345", 1, 0, NULL, NULL)   = 1 # I see many lines like this

It looks like the chunk size is ridiculously small for some reason?

I am using requests 2.2.1 with Python 2.7 on Debian.

@cournape
Author

Hm, if I had read the documentation correctly, I would have seen that the default chunk size is 1 byte... so never mind.

Is there a rationale for such a small size?

@Lukasa
Member

Lukasa commented Apr 23, 2014

As you've spotted, iter_content()'s default chunk size is 1, ensuring that it returns as rapidly as possible. This favours responsiveness over throughput (because we endure a large number of syscalls).

Whether this is a good idea or not is unclear. We've got a giant issue that covers this (see #844), and that issue has not been decided emphatically. I'm inclined to increase the size, but wary about the risk of breaking things.

Note that the bug here is not to do with streaming: if you use r.content in either mode it'll be fast. It's simply the use of iter_content at its default chunk size that causes problems.
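The cost described above can be illustrated without any network at all: each 1-byte chunk triggers a separate read call, mirroring the thousands of recvfrom(…, 1, …) lines in the strace output. The CountingStream helper and the byte counts below are purely illustrative, not part of requests.

```python
import io


class CountingStream(io.BytesIO):
    """A BytesIO that counts how many read() calls are made on it."""

    def __init__(self, data):
        super().__init__(data)
        self.reads = 0

    def read(self, size=-1):
        self.reads += 1
        return super().read(size)


def drain(stream, chunk_size):
    """Read the stream to exhaustion in fixed-size chunks, like iter_content()."""
    while stream.read(chunk_size):
        pass
    return stream.reads


payload = b"x" * 65536  # 64 KiB of data

one_byte = drain(CountingStream(payload), 1)       # 65537 read calls
big_chunks = drain(CountingStream(payload), 8192)  # 9 read calls

print(one_byte, big_chunks)
```

In practice the workaround is simply to pass a larger value explicitly, e.g. resp.iter_content(chunk_size=8192), which collapses tens of thousands of tiny reads into a handful of large ones.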

Anyway, the central issue is in #844, so I'll close this to centralise there.

@Lukasa Lukasa closed this as completed Apr 23, 2014
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 9, 2021