instaloader: Error 429 even with proxy
I know the usual causes of the 429 error, but I can't understand why I still get a 429 after changing proxies.
I sent too many requests through my proxies, so they got blocked (I hope only for 24 hours). Then I bought new proxies and… I still can't do anything.
How is it possible that I get a 429 on the first request after changing both the proxy and the user agent? Please help me understand rate control and these request problems, because I can't tune it for large data sets.
from datetime import datetime
from itertools import dropwhile, takewhile
import os
from time import sleep

import instaloader
from instaloader.exceptions import ProfileNotExistsException, TooManyRequestsException, ConnectionException, LoginRequiredException

os.environ['https_proxy'] = PROXY  # PROXY holds the proxy URL (defined elsewhere)
agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36 Edge/18.17763'
SINCE = datetime(2020, 5, 31)  # newest post date to include
UNTIL = datetime(2020, 5, 20)  # oldest post date to include
PROFILES = ...  # long list of profiles

L = instaloader.Instaloader(user_agent=agent,
                            max_connection_attempts=3,
                            dirname_pattern='backup/posts/1',
                            filename_pattern='{date_utc} {mediaid} {target}_post',
                            download_pictures=False,
                            download_videos=False,
                            download_video_thumbnails=False,
                            download_geotags=False,
                            download_comments=False,
                            compress_json=False,
                            post_metadata_txt_pattern="")

for nr, PROFILE in enumerate(PROFILES):
    print('{}: {}'.format(nr, PROFILE))
    while True:
        try:
            posts = instaloader.Profile.from_username(L.context, PROFILE).get_posts()
            # Posts arrive newest first: skip those newer than SINCE, stop at UNTIL.
            for post in takewhile(lambda p: p.date > UNTIL, dropwhile(lambda p: p.date > SINCE, posts)):
                print(post.date)
                L.download_post(post, target=PROFILE)
            break
        except ProfileNotExistsException:
            print('{} - not exist'.format(PROFILE))
            break
        except (TooManyRequestsException, ConnectionException):
            # Wait 15 minutes, then retry the same profile.
            print('TooManyRequestsException, ConnectionException')
            sleep(900)
        except LoginRequiredException:
            print('{} -LoginRequiredException'.format(instaloader.Profile.from_username(L.context, PROFILE).is_private))
            break
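For comparison, the fixed `sleep(900)` inside an endless `while True` could be replaced by a bounded retry with growing waits. The following is only a minimal sketch using the standard library: `retry_with_backoff` and `RateLimited` are hypothetical names (not part of Instaloader), and `RateLimited` merely stands in for `TooManyRequestsException`.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical stand-in for instaloader's TooManyRequestsException."""


def retry_with_backoff(fetch, max_attempts=5, base_delay=60.0, max_delay=3600.0,
                       sleep=time.sleep):
    """Call fetch() until it succeeds or max_attempts is reached.

    The wait doubles after every rate-limit error, is capped at max_delay,
    and gets up to 10% random jitter so parallel workers do not retry in step.
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # give up instead of looping forever
            delay = min(base_delay * 2 ** attempt, max_delay)
            sleep(delay + random.uniform(0, delay * 0.1))
```

Waiting only helps when the block is a genuine rate limit. If the proxy's IP range itself is being redirected to the login page, as suggested further down in this thread, longer waits are unlikely to change the result.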
I keep changing the proxy, the user agent, even the entire environment, and I still get:
HTTP redirect from https://www.instagram.com/PROFILE/ to https://www.instagram.com/accounts/login/
The request will be retried in 666 seconds, at 00:16.
JSON Query to PROFILE: 429 Too Many Requests: redirected to login [retrying; skip with ^C]
HTTP error code 429 was returned because too many queries occured in
the last time. Please do not use Instagram in your browser or run
multiple instances of Instaloader in parallel.
GraphQL requests:
  other
    last 10 minutes: 1 requests
    last 11 minutes: 1 requests
    last 15 minutes: 1 requests
    last 20 minutes: 36 requests
    last 30 minutes: 235 requests
    last 60 minutes: 571 requests
  472f257a40c653c64c666ce877d59d2b
    last 10 minutes: 0 requests
    last 11 minutes: 0 requests
    last 15 minutes: 0 requests
    last 20 minutes: 0 requests
    last 30 minutes: 8 requests
    last 60 minutes: 17 requests
About this issue
- State: closed
- Created 4 years ago
- Reactions: 2
- Comments: 23 (7 by maintainers)
I can confirm the error message "JSON Query to api/v1/media/…/info/: 429 Too Many Requests Unable to fetch high quality image version of Post … ", and I may add a few observations.
The funny thing is: JDownloader 2, a universal downloader application, has absolutely no trouble downloading the same bunch of Instagram posts in a snap (up to about 50 tested in a row, without any issues), while Instaloader gets stuck and apparently temporarily blocked after only a few minutes or files. It seems the two tools take different approaches to accessing the IG servers…
My Setups:
One after another, I’ve been running Instaloader 4.4.4 in a VirtualBox with Debian 10 and in another VirtualBox with Xubuntu 20. On both systems, the exact same errors occur.
Host for the VirtualBox is Windows 10 (Version 10.0.18363.900), and it connects through a 100 Mbit/s connection either directly or through a “Private Internet Access” VPN with varying IP addresses.
JDownloader 2 is running as an application on the host (Windows 10), using the same internet connection. When Instaloader is already stuck again, JDownloader 2 will typically still work on the same VPN IP.
Also, Instaloader and JDownloader 2 both use the same Instagram account to authenticate themselves. While Instaloader is already blocked, JDownloader 2 can still use the same account.
I should also mention that the two virtual Linux boxes are working nicely for other downloads. No connection issues there.
(As a side note, the Instagram module in JDownloader 2 seemed to be broken a few days ago and is now working normally again. I didn’t check their forum, but it’s very well possible that there was an additional issue with IG’s servers. Instaloader, however, has had the trouble with 429 since well before that happened.)
I’d love to use Instaloader for periodic automatic downloads on Linux, but in its current state it’s unusable for me. Hoping for a fix… Thanks.
@aandergr much love my friend ❤️
I didn’t know that this problem was tackled in version 4.4.5; I was still using 4.4.4. I updated and made various adaptations to make sure my application works, and now it does 😃
Thank you very much man 😃
The problem discussed here might be fixed with Version 4.4.5rc1, which includes a change of how post and profile metadata is accessed (#727).
To upgrade:
Thanks for sharing your observations, @midnightbash.
The error message you mention indicates that Instaloader has trouble accessing the iPhone endpoint, which we use to obtain higher-quality versions of the pictures since Version 4.4.4. It might be worth trying out version 4.4.3 to see if the problem disappears when loading lower-quality pictures.
We recently noticed that Instaloader uses fewer requests per sliding window than what currently seems to be possible. Version 4.5 will most probably use a higher request rate, and make it easier for the user to control rate behavior (#716).
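The idea of counting requests per sliding window, which is also what the per-window figures in the error output above report, can be illustrated with a small standard-library sketch. This is not Instaloader's implementation: the class name `SlidingWindowLimiter`, its methods, and any limit values passed to it are assumptions made for illustration only.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests within any window_seconds-long window."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self.timestamps = deque()  # times of past requests, oldest first

    def _drop_expired(self, now):
        # Forget requests that have slid out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()

    def count(self):
        """Number of requests made within the current window."""
        self._drop_expired(self.clock())
        return len(self.timestamps)

    def wait_time(self):
        """Seconds to wait before the next request is allowed (0 if allowed now)."""
        now = self.clock()
        self._drop_expired(now)
        if len(self.timestamps) < self.max_requests:
            return 0.0
        # The oldest request must leave the window before a new one fits.
        return self.timestamps[0] + self.window_seconds - now

    def record(self):
        """Register that a request is being made now."""
        self.timestamps.append(self.clock())
```

A caller would sleep for `wait_time()` seconds and then call `record()` immediately before each request. The `clock` parameter exists only so the behavior can be tested without real waiting.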
As discussed in #652, Instagram may apply a special restriction policy to cloud and VPN services, blocking anonymous requests that come from them. The same restrictions might apply to proxy services, which would explain the behavior observed here.
This seems to be true as well, but I don’t want to declare it as confirmed until I know more. All my servers running Bibliogram from Vultr have been blocked from accessing user profile pages (i.e. hxxps://www.instagram.com/username and hxxps://www.instagram.com/username?__a=1) since somewhere around June 1st. This includes a new server that I just created that presumably does not have any recent outgoing requests from its IP. However, this block is NOT status code 429, it’s status code 302 with a redirect to the login page.
/graphql requests from my servers do not seem to be restricted. They’re subject to the usual limits.
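The distinction made here, between a 302 redirect to the login page and a real 429, can be checked explicitly in client code. Below is a minimal sketch that assumes only the HTTP status code and the `Location` header are available; the function name `classify_response` and its return labels are hypothetical.

```python
from urllib.parse import urlparse


def classify_response(status, location=None):
    """Label a response by how the request appears to have been restricted.

    Returns 'rate_limited' for 429, 'login_redirect' for a 3xx redirect whose
    target path starts with /accounts/login, 'ok' for 2xx, and 'other' for
    everything else.
    """
    if status == 429:
        return 'rate_limited'
    if 300 <= status < 400 and location:
        if urlparse(location).path.startswith('/accounts/login'):
            return 'login_redirect'
    if 200 <= status < 300:
        return 'ok'
    return 'other'
```

Keeping the two cases apart matters for the reaction: a `rate_limited` result suggests waiting, while a `login_redirect` on a fresh IP, as described above, points to a restriction that waiting alone may not lift.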