prefect: Occasional http2 connection errors (KeyError)
First check
- I added a descriptive title to this issue.
- I used the GitHub search to find a similar issue and didn’t find it.
- I searched the Prefect documentation for this issue.
- I checked that this issue is related to Prefect and not one of its dependencies.
Bug summary
Occasionally, flows crash with a connection-related exception that seems to originate from h2. So far this has only been observed in longer flow runs (>2h), and it does not seem to be related to any specific workload.
Possibly related to https://github.com/PrefectHQ/prefect/issues/7442, https://github.com/PrefectHQ/prefect/pull/9429
Reproduction
Let enough flows run for long enough.
Error
Crash detected! Execution was interrupted by an unexpected exception: KeyError: 789
prefect.flow_runs
Crash details:
Traceback (most recent call last):
File "/home/ray/anaconda3/lib/python3.8/site-packages/prefect/_internal/concurrency/calls.py", line 293, in aresult
return await asyncio.wrap_future(self.future)
asyncio.exceptions.CancelledError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/ray/anaconda3/lib/python3.8/contextlib.py", line 189, in __aexit__
await self.gen.athrow(typ, value, traceback)
File "/home/ray/anaconda3/lib/python3.8/site-packages/prefect/task_runners.py", line 187, in start
yield self
File "/home/ray/anaconda3/lib/python3.8/site-packages/prefect/engine.py", line 539, in begin_flow_run
terminal_or_paused_state = await orchestrate_flow_run(
File "/home/ray/anaconda3/lib/python3.8/site-packages/prefect/engine.py", line 849, in orchestrate_flow_run
result = await flow_call.aresult()
File "/home/ray/anaconda3/lib/python3.8/site-packages/prefect/_internal/concurrency/calls.py", line 295, in aresult
raise CancelledError() from exc
prefect._internal.concurrency.cancellation.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ray/anaconda3/lib/python3.8/site-packages/prefect/engine.py", line 2221, in report_flow_run_crashes
yield
File "/home/ray/anaconda3/lib/python3.8/contextlib.py", line 662, in __aexit__
cb_suppress = await cb(*exc_details)
File "/home/ray/anaconda3/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 597, in __aexit__
raise exceptions[0]
File "/home/ray/anaconda3/lib/python3.8/site-packages/prefect/engine.py", line 1597, in create_task_run_then_submit
task_run = await create_task_run(
File "/home/ray/anaconda3/lib/python3.8/site-packages/prefect/engine.py", line 1642, in create_task_run
task_run = await flow_run_context.client.create_task_run(
File "/home/ray/anaconda3/lib/python3.8/site-packages/prefect/client/orchestration.py", line 1986, in create_task_run
response = await self._client.post(
File "/home/ray/anaconda3/lib/python3.8/site-packages/httpx/_client.py", line 1877, in post
return await self.request(
File "/home/ray/anaconda3/lib/python3.8/site-packages/httpx/_client.py", line 1559, in request
return await self.send(request, auth=auth, follow_redirects=follow_redirects)
File "/home/ray/anaconda3/lib/python3.8/site-packages/prefect/client/base.py", line 282, in send
response = await self._send_with_retry(
File "/home/ray/anaconda3/lib/python3.8/site-packages/prefect/client/base.py", line 216, in _send_with_retry
response = await request()
File "/home/ray/anaconda3/lib/python3.8/site-packages/httpx/_client.py", line 1646, in send
response = await self._send_handling_auth(
File "/home/ray/anaconda3/lib/python3.8/site-packages/httpx/_client.py", line 1674, in _send_handling_auth
response = await self._send_handling_redirects(
File "/home/ray/anaconda3/lib/python3.8/site-packages/httpx/_client.py", line 1711, in _send_handling_redirects
response = await self._send_single_request(request)
File "/home/ray/anaconda3/lib/python3.8/site-packages/httpx/_client.py", line 1748, in _send_single_request
response = await transport.handle_async_request(request)
File "/home/ray/anaconda3/lib/python3.8/site-packages/httpx/_transports/default.py", line 371, in handle_async_request
resp = await self._pool.handle_async_request(req)
File "/home/ray/anaconda3/lib/python3.8/site-packages/httpcore/_async/connection_pool.py", line 268, in handle_async_request
raise exc
File "/home/ray/anaconda3/lib/python3.8/site-packages/httpcore/_async/connection_pool.py", line 251, in handle_async_request
response = await connection.handle_async_request(request)
File "/home/ray/anaconda3/lib/python3.8/site-packages/httpcore/_async/connection.py", line 103, in handle_async_request
return await self._connection.handle_async_request(request)
File "/home/ray/anaconda3/lib/python3.8/site-packages/httpcore/_async/http2.py", line 185, in handle_async_request
raise exc
File "/home/ray/anaconda3/lib/python3.8/site-packages/httpcore/_async/http2.py", line 144, in handle_async_request
await self._send_request_body(request=request, stream_id=stream_id)
File "/home/ray/anaconda3/lib/python3.8/site-packages/httpcore/_async/http2.py", line 261, in _send_request_body
await self._send_end_stream(request, stream_id)
File "/home/ray/anaconda3/lib/python3.8/site-packages/httpcore/_async/http2.py", line 280, in _send_end_stream
self._h2_state.end_stream(stream_id)
File "/home/ray/anaconda3/lib/python3.8/site-packages/h2/connection.py", line 883, in end_stream
frames = self.streams[stream_id].end_stream()
KeyError: 789
Versions
Version: 2.14.11
API version: 0.8.4
Python version: 3.8.15
Git commit: e6d7d76d
Built: Thu, Dec 14, 2023 5:45 PM
OS/Arch: linux/x86_64
Server type: cloud
Additional context
The stream_id from the final KeyError is different for each crash.
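To illustrate the shape of the final frame in the traceback (a dict lookup by stream ID that fails), here is a minimal toy sketch. It is not h2's implementation: `ToyConnection`, `ToyStream`, and `drop_stream` are hypothetical names, and the assumption that the stream was removed from the dict before `end_stream` was called is only a guess at the mechanism. Why the stream would be missing in a real run is not known from this report.

```python
class ToyStream:
    """Hypothetical stand-in for a per-stream state object."""

    def __init__(self, stream_id):
        self.stream_id = stream_id
        self.closed = False

    def end_stream(self):
        self.closed = True
        return [f"END_STREAM frame for stream {self.stream_id}"]


class ToyConnection:
    """Hypothetical connection that tracks open streams in a dict."""

    def __init__(self):
        self.streams = {}
        self.next_stream_id = 1

    def open_stream(self):
        stream_id = self.next_stream_id
        # Client-initiated HTTP/2 stream IDs are odd, so step by 2.
        self.next_stream_id += 2
        self.streams[stream_id] = ToyStream(stream_id)
        return stream_id

    def drop_stream(self, stream_id):
        # Assumption: something (a reset, a cleanup) forgets the stream
        # while the caller still intends to finish sending on it.
        self.streams.pop(stream_id, None)

    def end_stream(self, stream_id):
        # Same lookup shape as the last line of the traceback.
        return self.streams[stream_id].end_stream()


def demo():
    conn = ToyConnection()
    first = conn.open_stream()
    second = conn.open_stream()
    conn.drop_stream(second)  # stream vanishes before the body is finished
    conn.end_stream(first)    # a stream that is still tracked works fine
    try:
        conn.end_stream(second)
    except KeyError as exc:
        return exc.args[0]    # the missing stream ID, like the 789 above
    return None
```

In this toy model the key in the `KeyError` is just the ID of whichever stream went missing, which would be consistent with the ID being different for each crash.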
About this issue
- Original URL
- State: open
- Created 5 months ago
- Comments: 16 (7 by maintainers)
Hey, it’s really difficult to say what is going on without a reproduction.
Can you try pinning h2 < 4.0.0 and see if that helps? It looks like they released 4.0.0 2 weeks ago and it lines up with the timeline for your errors.
(Edit: clicked on the wrong h2 repo 🤦‍♂️)