selenium: [🐛 Bug]: Timeout exception thrown randomly with RemoteWebDriver
What happened?
When executing tests at scale there are some random failures with read timeouts (seems to wait for 3 mins). We are observing this primarily when clicking an element but have also observed it during navigate to page. Seems to be a reoccurrence of https://github.com/SeleniumHQ/selenium/issues/9528
We are using dockerized grid v4.10 with the hub/node model and we confirmed that the docker process starts with jdk-http-client
java -Dwebdriver.http.factory=jdk-http-client -jar /opt/selenium/selenium-server.jar --ext /opt/selenium/selenium-http-jdk-client.jar hub --session-request-timeout 3600 --session-retry-interval 15 --relax-checks true --bind-host false --config /opt/selenium/config.toml
How can we reproduce the issue?
Unfortunately similar to https://github.com/SeleniumHQ/selenium/issues/9528 we are unable to find a consistent repro for this. Seems flaky and rerunning the test passes but we are still seeing this happen daily.
Relevant log output
We use an internal framework to run selenium tests and the DEBUG logs indicate that a 500 was returned during the click event. We check to see if the element is enabled and clickable (which passes) but the subsequent click seems to fail.
DEBUG 2023-08-15 19:24:02.928 selenium.webdriver.remote.remote_connection:remote_connection.py:303 -> POST http://selenium-four-hub-nomad.service.eng.consul:4444/session/ab89759c17aac47ed60f6d6b9a8b6882/element {"using": "xpath", "value": "//*[@class='extension-name'][text()='JSON']"}
DEBUG 2023-08-15 19:24:02.943 urllib3.connectionpool:connectionpool.py:473 -> http://selenium-four-hub-nomad.service.eng.consul:4444 "POST /session/ab89759c17aac47ed60f6d6b9a8b6882/element HTTP/1.1" 200 95
DEBUG 2023-08-15 19:24:02.943 selenium.webdriver.remote.remote_connection:remote_connection.py:319 -> Remote response: status=200 | data={"value":{"element-6066-11e4-a52e-4f735466cecf":"3AFE27AB794812F985C270183415B5A4_element_21"}} | headers=HTTPHeaderDict({'content-length': '95', 'content-type': 'application/json; charset=utf-8', 'cache-control': 'no-cache'})
DEBUG 2023-08-15 19:24:02.944 selenium.webdriver.remote.remote_connection:remote_connection.py:346 -> Finished Request
DEBUG 2023-08-15 19:24:02.944 selenium.webdriver.remote.remote_connection:remote_connection.py:303 -> GET http://selenium-four-hub-nomad.service.eng.consul:4444/session/ab89759c17aac47ed60f6d6b9a8b6882/element/3AFE27AB794812F985C270183415B5A4_element_21/enabled {"id": "3AFE27AB794812F985C270183415B5A4_element_21"}
DEBUG 2023-08-15 19:24:02.961 urllib3.connectionpool:connectionpool.py:473 -> http://selenium-four-hub-nomad.service.eng.consul:4444 "GET /session/ab89759c17aac47ed60f6d6b9a8b6882/element/3AFE27AB794812F985C270183415B5A4_element_21/enabled HTTP/1.1" 200 14
DEBUG 2023-08-15 19:24:02.962 selenium.webdriver.remote.remote_connection:remote_connection.py:319 -> Remote response: status=200 | data={"value":true} | headers=HTTPHeaderDict({'content-length': '14', 'content-type': 'application/json; charset=utf-8', 'cache-control': 'no-cache'})
DEBUG 2023-08-15 19:24:02.962 selenium.webdriver.remote.remote_connection:remote_connection.py:346 -> Finished Request
DEBUG 2023-08-15 19:24:02.963 selenium.webdriver.remote.remote_connection:remote_connection.py:303 -> POST http://selenium-four-hub-nomad.service.eng.consul:4444/session/ab89759c17aac47ed60f6d6b9a8b6882/element {"using": "xpath", "value": "//*[@class='extension-name'][text()='JSON']"}
DEBUG 2023-08-15 19:24:02.978 urllib3.connectionpool:connectionpool.py:473 -> http://selenium-four-hub-nomad.service.eng.consul:4444 "POST /session/ab89759c17aac47ed60f6d6b9a8b6882/element HTTP/1.1" 200 95
DEBUG 2023-08-15 19:24:02.979 selenium.webdriver.remote.remote_connection:remote_connection.py:319 -> Remote response: status=200 | data={"value":{"element-6066-11e4-a52e-4f735466cecf":"3AFE27AB794812F985C270183415B5A4_element_21"}} | headers=HTTPHeaderDict({'content-length': '95', 'content-type': 'application/json; charset=utf-8', 'cache-control': 'no-cache'})
DEBUG 2023-08-15 19:24:02.979 selenium.webdriver.remote.remote_connection:remote_connection.py:346 -> Finished Request
DEBUG 2023-08-15 19:24:02.980 selenium.webdriver.remote.remote_connection:remote_connection.py:303 -> POST http://selenium-four-hub-nomad.service.eng.consul:4444/session/ab89759c17aac47ed60f6d6b9a8b6882/element/3AFE27AB794812F985C270183415B5A4_element_21/click {"id": "3AFE27AB794812F985C270183415B5A4_element_21"}
DEBUG 2023-08-15 19:27:03.006 urllib3.connectionpool:connectionpool.py:473 -> http://selenium-four-hub-nomad.service.eng.consul:4444 "POST /session/ab89759c17aac47ed60f6d6b9a8b6882/element/3AFE27AB794812F985C270183415B5A4_element_21/click HTTP/1.1" 500 3517
DEBUG 2023-08-15 19:27:03.007 selenium.webdriver.remote.remote_connection:remote_connection.py:319 -> Remote response: status=500 | data={
"value": {
"error": "script timeout",
"message": "Unable to execute request for an existing session: java.util.concurrent.TimeoutException\nBuild info: version: '4.10.0', revision: 'c14d967899'\nSystem info: os.name: 'Linux', os.arch: 'amd64', os.version: '4.18.0-477.13.1.el8_8.x86_64', java.version: '11.0.19'\nDriver info: driver.version: unknown",
"stacktrace": "java.lang.RuntimeException: Unable to execute request for an existing session: java.util.concurrent.TimeoutException\nBuild info: version: '4.10.0', revision: 'c14d967899'\nSystem info: os.name: 'Linux', os.arch: 'amd64', os.version: '4.18.0-477.13.1.el8_8.x86_64', java.version: '11.0.19'\nDriver info: driver.version: unknown\n\tat org.openqa.selenium.grid.router.HandleSession.execute(HandleSession.java:160)\n\tat org.openqa.selenium.remote.http.Route$PredicatedRoute.handle(Route.java:384)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:69)\n\tat org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:347)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:69)\n\tat org.openqa.selenium.grid.router.Router.execute(Router.java:87)\n\tat org.openqa.selenium.grid.web.EnsureSpecCompliantResponseHeaders.lambda$apply$0(EnsureSpecCompliantResponseHeaders.java:34)\n\tat org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:63)\n\tat org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:347)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:69)\n\tat org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:347)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:69)\n\tat org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:347)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:69)\n\tat org.openqa.selenium.remote.http.Route$CombinedRoute.handle(Route.java:347)\n\tat org.openqa.selenium.remote.http.Route.execute(Route.java:69)\n\tat org.openqa.selenium.remote.AddWebDriverSpecHeaders.lambda$apply$0(AddWebDriverSpecHeaders.java:35)\n\tat org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)\n\tat org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:63)\n\tat org.openqa.selenium.remote.ErrorFilter.lambda$apply$0(ErrorFilter.java:44)\n\tat org.openqa.selenium.remote.http.Filter$1.execute(Filter.java:63)\n\tat org.openqa.selenium.netty.server.SeleniumHandler.lambda$channelRead0$0(SeleniumHandler.java:44)\n\tat java.base\u002fjava.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)\n\tat java.base\u002fjava.util.concurrent.FutureTask.run(FutureTask.java:264)\n\tat java.base\u002fjava.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat java.base\u002fjava.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base\u002fjava.lang.Thread.run(Thread.java:829)\nCaused by: java.util.concurrent.TimeoutException\n\tat java.base\u002fjava.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)\n\tat java.base\u002fjava.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)\n\tat org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute(JdkHttpClient.java:377)\n\tat org.openqa.selenium.remote.tracing.TracedHttpClient.execute(TracedHttpClient.java:54)\n\tat org.openqa.selenium.grid.web.ReverseProxyHandler.execute(ReverseProxyHandler.java:91)\n\tat org.openqa.selenium.grid.router.HandleSession.execute(HandleSession.java:125)\n\t... 26 more\n"
}
} | headers=HTTPHeaderDict({'content-length': '3517', 'Cache-Control': 'none', 'Content-Type': 'application/json; charset=utf-8'})
DEBUG 2023-08-15 19:27:03.008 selenium.webdriver.remote.remote_connection:remote_connection.py:346 -> Finished Request
### Operating System
Linux
### Selenium version
4.10
### What are the browser(s) and version(s) where you see this issue?
Chrome v.114.0.5735.133
### What are the browser driver(s) and version(s) where you see this issue?
v.114.0.5735.133
### Are you using Selenium Grid?
4.10
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 40 (29 by maintainers)
Commits related to this issue
- [grid] close HttpClients after the session is gone #12558 — committed to SeleniumHQ/selenium by joerg1985 9 months ago
- Revert "[grid] close HttpClients after the session is gone #12558" This reverts commit e8c77b8e08f74905a049805922b57d8a299e2338. — committed to SeleniumHQ/selenium by joerg1985 9 months ago
- [grid] ensure we do not close HttpClients waiting for responses — committed to SeleniumHQ/selenium by joerg1985 10 months ago
- [grid] keep HttpClient alive until unused #12558 — committed to SeleniumHQ/selenium by joerg1985 8 months ago
- [grid] close HttpClients after the session is gone #12558 — committed to aguspe/selenium by joerg1985 9 months ago
- Revert "[grid] close HttpClients after the session is gone #12558" This reverts commit e8c77b8e08f74905a049805922b57d8a299e2338. — committed to aguspe/selenium by joerg1985 9 months ago
- [grid] keep HttpClient alive until unused #12558 — committed to SeleniumHQ/selenium by joerg1985 8 months ago
- [grid] keep HttpClient alive until unused #12558 (#12978) * fixed merge conflict * [grid] use a URI as key to the map * [java] make the json parsing exception text more helpful * [java] fix... — committed to SeleniumHQ/selenium by joerg1985 8 months ago
@pujagani i think i have some code to provoke this, i will create a new issue for the click retry on timeout.
closing based on 5f3f7d096a75700e055effbe13cbc22117014e2c, e8c77b8e08f74905a049805922b57d8a299e2338 & #12838
Yes, I think so. So now that it is narrowed down, I am going to try to fix our code tomorrow.
I also observed that if the ErrorFilter is added then the response contains the exception details. So the error is not swallowed.
@titusfortner i just pushed the timeout less version in https://github.com/SeleniumHQ/selenium/commit/e8c77b8e08f74905a049805922b57d8a299e2338. So #12368 must not care about this timeout any more.
@pujagani here is some code to provoke this, the fallback (aka null) is taken as the operation itself failing.
Yes, thanks for the hint. I have a change prepared to completly remove this timeout. We can check every minute and if it has not been used for some time, if not check the session is alive or not. If the session is alive let the client alive, if not close it.
But lets see if this issue is resolved with the increased timeout first.
Thank you for the update. Due to the traffic we run on the grid I’ll try capture FINE level logs during the weekend and attach them here.