query-server: Some search engines (e.g. Parsijoo) crash the app by taking too long to parse.

I’m submitting a …

  • bug report
  • feature request

Expected behavior:

The app should stop parsing if it takes too long.
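One way to get this behavior is to put a wall-clock deadline around the scraping call. The sketch below is only an illustration under stated assumptions, not query-server's actual code: `run_with_deadline` and `fake_slow_scraper` are hypothetical names, and only the Python standard library is used.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_with_deadline(func, seconds, *args):
    """Run func(*args) in a worker thread; return None if it exceeds the deadline."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func, *args)
    try:
        return future.result(timeout=seconds)
    except FutureTimeout:
        # The caller stops waiting here. The worker thread itself is not
        # killed (Python cannot do that); it finishes in the background.
        return None
    finally:
        # Do not block on the worker when shutting the pool down.
        pool.shutdown(wait=False)


def fake_slow_scraper(query):
    """Hypothetical stand-in for a scraper that takes too long."""
    time.sleep(0.3)
    return [query]
```

With this sketch, `run_with_deadline(fake_slow_scraper, 0.05, "chelsea")` gives up and returns `None`, so the request handler can answer with an error instead of hanging. If the slow part is the HTTP request itself, a timeout on the HTTP client call is the simpler fix.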

Steps to reproduce: go to https://query-server.herokuapp.com/, search for "chelsea" on Google or Parsijoo, and select the News option. The other search engines work fine.

[Screenshot: 2018-02-02 10:50:33]

I am working on it

About this issue

  • Original URL
  • State: open
  • Created 6 years ago
  • Comments: 22 (22 by maintainers)

Most upvoted comments

Yes, you can send a PR to deal with these undesirable cases.

Yeah, sure @rupav, take your time. I will investigate from my side as well. I’m not claiming the issue, just investigating.

I think this keeps the server in an infinite request-response loop until the application finally crashes. Please make this a priority, @vaibhavsingh97. It’s hampering the clients that use this server.

I’m not sure about Parsijoo, but try printing the response content from Google. They may be redirecting to a captcha page, or they may have identified these as automated requests and blocked access from your IP. Try printing response.text to see what content is returned, and check specifically whether the body is HTML. Let me know what works.
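The check suggested above could be automated along these lines. This is a rough heuristic sketch, not a definitive detector: `looks_blocked` is a hypothetical helper, and the marker strings and status codes are assumptions that should be adjusted after inspecting the real response body.

```python
def looks_blocked(status_code, final_url, body):
    """Guess whether a search engine served a block or captcha page."""
    # Assumption: 429 (Too Many Requests) and 503 indicate rate limiting.
    if status_code in (429, 503):
        return True
    text = body.lower()
    # Assumed markers; confirm them against an actual blocked response.
    markers = ("captcha", "unusual traffic")
    if any(marker in text for marker in markers):
        return True
    # Assumption: a redirect target mentioning "captcha" is a challenge page.
    return "captcha" in final_url.lower()
```

With the requests library, the three arguments would come from `response.status_code`, `response.url` (the URL after redirects), and `response.text`.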

Sometimes some search engine URLs redirect you to their captcha page. This is why it keeps on loading forever.
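To keep a captcha redirect from turning into an endless loop, the number of attempts can be capped. The following is a minimal sketch under assumptions: `fetch_with_attempt_limit` and `SearchUnavailable` are hypothetical names, and `fetch` and `is_blocked` are callables supplied by the caller.

```python
class SearchUnavailable(Exception):
    """Raised when every attempt came back blocked."""


def fetch_with_attempt_limit(fetch, is_blocked, max_attempts=3):
    """Call fetch() up to max_attempts times; return the first usable page."""
    for _ in range(max_attempts):
        page = fetch()
        if not is_blocked(page):
            return page
    # Give up with an explicit error instead of retrying forever.
    raise SearchUnavailable("blocked after %d attempts" % max_attempts)
```

The caller can catch `SearchUnavailable` and return an error response to the client rather than leaving the page loading.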

@rupav Also, Google News should be tracked in a separate issue, so please remove the mention of Google News from the issue title to avoid confusion.

Google News search has not been implemented yet. For Parsijoo, go to http://khabar.parsijoo.ir/search?q=chelsea. There aren’t any results for "chelsea" in News search.