Read timeout with requests on a paid account : Forums : PythonAnywhere

Read timeout with requests on a paid account

I am hosting a Telegram bot on PythonAnywhere, using a paid account, and it works like a charm. The problem comes when I try to perform a daily scrape of stats.nba.com to gather player statistics for the bot. I get this error: HTTPSConnectionPool(host='stats.nba.com', port=443): Read timed out. (read timeout=5)

I also tried raising the timeout to one minute, but it still fails. The request line is this one: r = requests.get(url=URL, timeout=5, headers={'Referer': 'https://www.nba.com/', 'User-Agent': USERAGENT})

The same line runs fine on my PC, on Ubuntu and on Windows. What am I missing to perform the GET on PythonAnywhere?

arrakjs | 2 posts | March 20, 2021, 1:54 p.m. | permalink

It's possible that the site you're trying to access has some kind of protection in place to prevent scraping, based on the IP address of incoming requests -- they're spotting that the requests from PythonAnywhere are coming from an IP address associated with a cloud computing environment (as opposed to one owned by a residential ISP) and are just ignoring them.

Unfortunately there's not really anything we can do about that from our side; you'd have to get in touch with them to ask them to lift the block (which I appreciate might not be easy with a large site like nba.com).
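
For anyone who wants to check which phase of the request is failing, here is a minimal sketch. It assumes the same headers as the request line in the original post; the `probe` helper and the `URL` and `USERAGENT` values are placeholders made up for illustration, not taken from the poster's code. requests accepts a (connect, read) tuple for `timeout`, so the two phases can time out separately and raise different exceptions.

```python
import requests

# Placeholder values (assumptions): the original post does not show them.
URL = "https://stats.nba.com/"
USERAGENT = "Mozilla/5.0"


def probe(url, get=requests.get, connect_timeout=5, read_timeout=60):
    """Return a short label describing how the request ended."""
    headers = {"Referer": "https://www.nba.com/", "User-Agent": USERAGENT}
    try:
        # (connect, read) tuple: wait up to connect_timeout seconds to open
        # the connection, then up to read_timeout seconds for a response.
        r = get(url, timeout=(connect_timeout, read_timeout), headers=headers)
    except requests.exceptions.ConnectTimeout:
        return "connect timeout"
    except requests.exceptions.ReadTimeout:
        return "read timeout"
    except requests.exceptions.RequestException as exc:
        # Anything else requests can raise (proxy errors, DNS failures, ...).
        return "other error: " + type(exc).__name__
    return "HTTP " + str(r.status_code)


# Example call (performs a real network request):
# print(probe(URL))
```

If this reports a read timeout on PythonAnywhere but an HTTP status from a home machine, that is consistent with the IP-based filtering described above, though it does not prove it.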

giles | 222 posts | PythonAnywhere staff | March 20, 2021, 5:58 p.m. | permalink

Bad to know, but at least I have an explanation now, thanks.

arrakjs | 2 posts | March 20, 2021, 11:35 p.m. | permalink

FWIW, I am seeing a similar issue (same API)... so there's a good chance NBA is limiting requests from your website, which is a shame. Oh well.

friendofmario | 2 posts | Feb. 13, 2024, 5:10 p.m. | permalink

@friendofmario you're using a free account, so you would have restricted internet access; check if the site you're trying to reach is on our allow-list.

pafk | 167 posts | PythonAnywhere staff | Feb. 14, 2024, 8:01 a.m. | permalink

@pafk, stats.nba.com seems to be on the list, so does that mean even free accounts can connect to it? I have not succeeded even once.

friendofmario | 2 posts | March 2, 2024, 11:55 p.m. | permalink

Yes, that's right -- if they're blocking the IP addresses of cloud computing environments, then there's nothing we can do about it.

giles | 222 posts | PythonAnywhere staff | March 3, 2024, 3:52 p.m. | permalink