scrapy-splash: Proxy connection is being refused

The error below suggests that my proxy connection is being refused. The proxy was tested with curl and it is in fact working. It requires no credentials, which is why the username and password fields were omitted in set_proxy. What else could be the reason for this connection being refused?

–ERROR RenderErrorInfo(type='Network', code=99, text='Proxy connection refused', "message": "Lua error: [string "…"]:10: network99", "type": "LUA_ERROR", "error": "network99"},

–TESTING PROXY curl --proxy 127.0.0.1:24000 "http://lumtest.com/myip.json"

–SPIDER CODE

import scrapy
from scrapy_splash import SplashRequest

script = """
function main(splash)
  splash:on_request(function(request)
    request:set_proxy{
        host = "127.0.0.1",
        port = 24000,
    }
  end)

  assert(splash:go{
    splash.args.url,
    headers=splash.args.headers,
    http_method=splash.args.http_method,
    body=splash.args.body,
    })
  assert(splash:wait(0.5))

  local entries = splash:history()
  local last_response = entries[#entries].response
  return {
    url = splash:url(),
    headers = last_response.headers,
    http_status = last_response.status,
    html = splash:html(),
  }
end
"""
class TestlumSpider(scrapy.Spider):
    name = "testlum"
    allowed_domains = ["amazon.ca"]

    def start_requests(self):
        # Send the page through Splash's 'execute' endpoint so the Lua script runs.
        url = "https://www.amazon.ca/dp/1482703270"
        yield SplashRequest(url, self.parse, endpoint='execute',
                            args={'lua_source': script})

    def parse(self, response):
        pass
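
The curl test above runs on the host machine, while Splash is commonly run inside a Docker container, so the two may not see the same 127.0.0.1. A minimal sketch of a plain TCP reachability check that can be run from wherever Splash actually runs is shown here. The function name `proxy_reachable` is hypothetical, and the check only confirms that something accepts connections on the port, not that it behaves as a proxy.

```python
import socket


def proxy_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened.

    This is only a connectivity check: it does not verify that the
    listener is really an HTTP proxy.
    """
    try:
        # create_connection resolves the name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


# Example (assumed values taken from the question):
#   proxy_reachable("127.0.0.1", 24000)
#   proxy_reachable("host.docker.internal", 24000)
```

If the first call succeeds on the host but fails inside the Splash container, the refusal is most likely a matter of which machine "127.0.0.1" points at rather than a problem with the proxy itself.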

About this issue

  • State: open
  • Created 7 years ago
  • Comments: 15 (5 by maintainers)

Most upvoted comments

Hi, everyone. I struggled with this same issue for the last couple of hours. The problem is accessing localhost from inside a Docker container: there, 127.0.0.1 refers to the container itself, not to the host machine where the proxy is listening.

According to the Docker docs:

“The host has a changing IP address (or none if you have no network access). From 18.03 onwards our recommendation is to connect to the special DNS name host.docker.internal, which resolves to the internal IP address used by the host. The gateway is also reachable as gateway.docker.internal.”
https://docs.docker.com/docker-for-windows/networking/#per-container-ip-addressing-is-not-possible

So the solution in this case is to change the host name, from

function main(splash)
  splash:on_request(function(request)
    request:set_proxy{
        host = "127.0.0.1",
        port = 24000,
    }
  end)

to

function main(splash)
  splash:on_request(function(request)
    request:set_proxy{
        host = "host.docker.internal",
        port = 24000,
    }
  end)
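
One way to avoid hard-coding the host name in the Lua source is to pass it in through the request arguments, since values in `args` are exposed to the script as `splash.args`. The following is a sketch under that assumption; the names `PROXY_SCRIPT`, `build_splash_args`, `proxy_host` and `proxy_port` are hypothetical and not part of scrapy-splash.

```python
# Lua script that reads the proxy location from splash.args instead of
# hard-coding it. proxy_host / proxy_port are hypothetical argument names.
PROXY_SCRIPT = """
function main(splash)
  local proxy_host = splash.args.proxy_host
  local proxy_port = splash.args.proxy_port

  splash:on_request(function(request)
    request:set_proxy{
        host = proxy_host,
        port = proxy_port,
    }
  end)

  assert(splash:go(splash.args.url))
  assert(splash:wait(0.5))
  return {html = splash:html()}
end
"""


def build_splash_args(proxy_host="host.docker.internal", proxy_port=24000):
    """Build the `args` dict for a SplashRequest using the script above."""
    return {
        "lua_source": PROXY_SCRIPT,
        "proxy_host": proxy_host,
        "proxy_port": proxy_port,
    }


# Intended use inside the spider (requires scrapy_splash):
#   yield SplashRequest(url, self.parse, endpoint='execute',
#                       args=build_splash_args())
```

With this layout the same spider can be pointed at a different host per environment. On Linux, `host.docker.internal` may not resolve by default; starting the container with `--add-host=host.docker.internal:host-gateway` is one commonly used way to provide it, and the proxy itself has to listen on an interface the container can reach.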