google-play-scraper: [BUG] reviews_all doesn't download all reviews of an app with large amount of reviews
Library version 1.2.6
Describe the bug
I cannot download all the reviews of an app with a large number of reviews. The number of downloaded reviews is always a multiple of 199.
Code
from google_play_scraper import reviews_all

result = reviews_all("com.google.android.apps.fitness")
print(len(result))
# prints 995
Expected behavior
reviews_all should download all the reviews of the app, which should number at least 20k.
Additional context
No
About this issue
- State: open
- Created 4 months ago
- Comments: 22
This mod did not work for me either. I tried a different approach that worked for me:
In reviews.py:
Here’s my code (I fix the number of reviews I need and break out of the loop once that number has been reached):
And in reviews.py I added the mod from my original comment.
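The code from this comment did not survive in the thread, so here is only a minimal sketch of the idea it describes: page through results yourself, stop once a fixed target is reached, and tolerate the occasional empty page. `collect_reviews`, `fetch_page`, and `max_empty_pages` are hypothetical names, not part of the library; `fetch_page(token)` is assumed to return an `(items, next_token)` pair.

```python
def collect_reviews(fetch_page, target, max_empty_pages=3):
    """Collect up to `target` items by following continuation tokens.

    `fetch_page(token)` is a hypothetical callable that returns
    `(items, next_token)`; pass `None` as the token for the first page.
    """
    collected = []
    token = None
    empty_pages = 0
    while len(collected) < target:
        items, next_token = fetch_page(token)
        if not items:
            # An empty page may be a transient failure: retry the same token,
            # but give up after several empty pages in a row.
            empty_pages += 1
            if empty_pages >= max_empty_pages:
                break
            continue
        empty_pages = 0
        collected.extend(items)
        if next_token is None:
            # No further pages are available.
            break
        token = next_token
    # The last page may overshoot the target, so trim the result.
    return collected[:target]
```

With google-play-scraper, `fetch_page` could probably be a thin wrapper around `reviews(app_id, count=200, continuation_token=token)`, which returns a `(result, continuation_token)` pair; treat that wiring as an assumption to check against the version you have installed.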
This is probably a dupe of #208.
The error seems to be the Play service intermittently returning an error inside a 200 success response, which then fails to parse as the JSON the library expects. It seems to contain this ....store.error.PlayDataError message.
The error seems to happen frequently but not reliably. When scraping in chunks of 200 reviews, basically every request has a decent chance of crashing, which usually results in 200-1000 total reviews scraped before it craps out.
Currently, the library swallows this exception silently and quits. Handling this error lets the scraping continue as normal.
We monkey-patched around it like this and seem to have gotten back to workable scraping:
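The patch itself is not preserved in the thread. As a rough sketch of the general approach (not the actual patch and not the library's internals), the error can be treated as retryable: if a response body does not parse into the expected shape, request the same page again instead of quitting. `TransientPageError`, `parse_page`, and `fetch_with_retry` are hypothetical names, and the "JSON list of reviews" payload shape is an assumption made for illustration.

```python
import json
import time


class TransientPageError(Exception):
    """Raised when a page body cannot be parsed into the expected structure."""


def parse_page(body):
    """Hypothetical parser: assumes a valid page is a JSON list of reviews."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError as exc:
        raise TransientPageError("body is not valid JSON") from exc
    if not isinstance(data, list):
        # e.g. an error object delivered with a 200 status code
        raise TransientPageError("unexpected payload shape")
    return data


def fetch_with_retry(fetch_body, max_retries=5, delay=1.0):
    """Call `fetch_body()` until its result parses, up to `max_retries` attempts."""
    last_error = None
    for _ in range(max_retries):
        try:
            return parse_page(fetch_body())
        except TransientPageError as exc:
            # Surface the failure via the final exception instead of swallowing it.
            last_error = exc
            time.sleep(delay)
    raise RuntimeError(f"gave up after {max_retries} attempts") from last_error
```

Raising after the last attempt, rather than returning an empty result, makes a truncated scrape visible to the caller instead of silently ending it.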
Me too, and I found that the output number is always a multiple of 199. It seems that Google Play randomly blocks retrieval of the next page of reviews.
I'm seeing the same issue even when I set the number of reviews (25000 in my case). I'm only getting back about 500, and the output number changes each time I run it.
I tried your code, and it worked for me running on Colab.
Thanks @adilosa and @paulolacombe, your posts worked for me 😃
Hey @Shivam-170103, you need to use the code lines that @adilosa provided to replace the corresponding ones in the reviews.py file in your environment. Let me know if that helps, as I am not that familiar with Google Colab.
Still not able to get more than a few hundred reviews.