grpc: [Aio] Async IO manager deadlock with current API
The current gRPC Python API deadlocks when used together with the AsyncIO manager. This problem needs to be solved before merging the first PR.
Reproduction script:
```python
import grpc
import asyncio
from grpc.experimental import aio


async def main():
    print('===Testing Aio Channel===')
    async with aio.insecure_channel('localhost:50051') as channel:
        print(await channel.unary_unary(b'')(b''))
    print('===Finish Aio Channel===')

    print('===Testing Normal Channel===')
    with grpc.insecure_channel('localhost:50051') as channel:
        print(channel.unary_unary(b'')(b''))  # <- deadlock
    print('===Finish Normal Channel===')


aio.init_grpc_aio()
asyncio.get_event_loop().run_until_complete(main())
```
Log:
```
I0814 16:23:13.264381376 220688 call.cc:1927] grpc_call_start_batch(call=0x564af3f8b4a0, ops=0x564af3f69a50, nops=6, tag=0x7f0903e0dc50, reserved=(nil))
I0814 16:23:13.264387822 220688 call.cc:1526] ops[0]: SEND_INITIAL_METADATA(nil)
I0814 16:23:13.264392291 220688 call.cc:1526] ops[1]: SEND_MESSAGE ptr=0x564af3dcbc20
I0814 16:23:13.264396790 220688 call.cc:1526] ops[2]: SEND_CLOSE_FROM_CLIENT
I0814 16:23:13.264401459 220688 call.cc:1526] ops[3]: RECV_INITIAL_METADATA ptr=0x7f0903e1ee88
I0814 16:23:13.264406097 220688 call.cc:1526] ops[4]: RECV_MESSAGE ptr=0x7f0903e58f20
I0814 16:23:13.264411516 220688 call.cc:1526] ops[5]: RECV_STATUS_ON_CLIENT metadata=0x7f0905300bf8 status=0x7f0905300c10 details=0x7f0905300c18
D0814 16:23:13.264423396 220688 dns_resolver.cc:242] Start resolving.
I0814 16:23:13.264452976 220688 completion_queue.cc:963] grpc_completion_queue_next(cq=0x564af3f17fd0, deadline=gpr_timespec { tv_sec: 1565824993, tv_nsec: 464450963, clock_type: 1 }, reserved=(nil))
I0814 16:23:13.464982891 220688 completion_queue.cc:963] grpc_completion_queue_next(cq=0x564af3f17fd0, deadline=gpr_timespec { tv_sec: 1565824993, tv_nsec: 664976288, clock_type: 1 }, reserved=(nil))
I0814 16:23:13.665982102 220688 completion_queue.cc:963] grpc_completion_queue_next(cq=0x564af3f17fd0, deadline=gpr_timespec { tv_sec: 1565824993, tv_nsec: 865976002, clock_type: 1 }, reserved=(nil))
I0814 16:23:13.866982076 220688 completion_queue.cc:963] grpc_completion_queue_next(cq=0x564af3f17fd0, deadline=gpr_timespec { tv_sec: 1565824994, tv_nsec: 66976175, clock_type: 1 }, reserved=(nil))
```
@pfreixes +1 to the implicit `grpc_init_aio()` call, as long as we have very clear error messages when a failure occurs as a result of this call. I don’t think it particularly matters how we accomplish it in terms of implementation. The use case that we’re enabling here is coexisting synchronous gRPC and asyncio gRPC in the same process. People need this in order to incrementally migrate from their existing sync stack to asyncio.

As for testing, it would be great if we could figure out a way to run all of the existing synchronous tests with something async added into the test process. Then we can be 100% sure that the two stacks are compatible. You might consider adding a new test runner for this (a rough sketch of such a mixed test follows).
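Such a mixed test could look roughly like the sketch below, assuming a unittest-based runner and a server already listening on localhost:50051. The test class name, target address, and placeholder call are illustrative and are not part of the existing suite.

```python
# Hypothetical sketch: exercise a synchronous channel while the Aio machinery
# and an asyncio event loop are alive in the same test process.
import asyncio
import unittest

import grpc
from grpc.experimental import aio

_TARGET = 'localhost:50051'  # illustrative address


class MixedStackTest(unittest.TestCase):

    def test_sync_channel_with_aio_initialized(self):
        async def _touch_aio_channel():
            # Creating (and closing) an aio channel brings up the Aio
            # machinery inside this process.
            async with aio.insecure_channel(_TARGET):
                pass

        aio.init_grpc_aio()
        asyncio.get_event_loop().run_until_complete(_touch_aio_channel())

        # An existing synchronous test body would run here; with the current
        # API this is where the deadlock shows up.
        with grpc.insecure_channel(_TARGET) as channel:
            grpc.channel_ready_future(channel)  # placeholder for a real RPC


if __name__ == '__main__':
    unittest.main()
```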
I’m planning to start coding the solution for supporting the synchronous stack on top of the Aio package today, to circumvent the “deadlock” issue. The solution would basically be inspired by the POC I made for having the IO loop running in a separate thread [1] (a rough sketch of that idea follows the references below). Just to have everyone on the same page, the proposed solution would allow us to:

- use the synchronous stack on top of an already initialized Aio environment.

The solution wouldn’t provide support for:

- using the Aio package on an already initialized synchronous environment.

I’m still considering the implications of having a synchronous server running on top of an already initialized Aio environment, and most importantly whether it makes sense. Initially, I would be tempted not to try to support this use case.

Also, I’m considering removing the need to initialize the Aio package explicitly by calling the `grpc_init_aio` function, instead doing this behind the scenes when either an Aio channel or an Aio server is created. For unsupported combinations, an `Exception` would be raised. This would theoretically also help to close these issues [2] [3].

/cc @gnossen @lidizheng

[1] https://github.com/Skyscanner/grpc/commit/996636ac853acd2ce60afbb8843181321a9915b4
[2] https://github.com/grpc/grpc/issues/20806
[3] https://github.com/grpc/grpc/issues/19985
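For context, the separate-thread idea referenced in [1] could look roughly like the sketch below. The `_BackgroundLoop` helper and its method names are purely illustrative and are not the actual POC code.

```python
# Illustrative sketch of the "IO loop in a separate thread" idea: the Aio
# machinery owns an asyncio loop running in a daemon thread, and synchronous
# callers are bridged to it with run_coroutine_threadsafe().
import asyncio
import threading


class _BackgroundLoop:
    """Owns an asyncio event loop running in a dedicated daemon thread."""

    def __init__(self):
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(
            target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def run(self, coro):
        """Block the calling (synchronous) thread until the coroutine finishes."""
        return asyncio.run_coroutine_threadsafe(coro, self._loop).result()


# Hypothetical usage from a synchronous wrapper:
#   loop = _BackgroundLoop()
#   response = loop.run(aio_channel.unary_unary('/svc/Method')(request))
```

With this shape, the synchronous wrapper only blocks on a future while the loop in the background thread drives the actual Aio call.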
For option 1, this is on the roadmap of the C-Core team. It is possible that they will provide an SWE to do it. I agree it is non-trivial, but it will benefit all wrapper languages, particularly the migration cases.

The `RuntimeError` solution should serve as a fallback plan. For now, I agree it is better to throw an exception; after all, an exception with a reason is much nicer than a deadlock. A minimal sketch of that fallback is included below.
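The sketch assumes a hypothetical module-level guard: the flag names and the `ensure_aio_initialized` helper are invented for illustration, and the real gating would live in the Cython layer.

```python
# Hypothetical illustration of the RuntimeError fallback: the Aio package is
# initialized implicitly on first use, and unsupported combinations fail with
# a descriptive exception instead of deadlocking the process.
_AIO_INITIALIZED = False
_SYNC_STACK_IN_USE = False  # the synchronous channel/server path would set this


def ensure_aio_initialized():
    """Would be called from the Aio channel/server constructors."""
    global _AIO_INITIALIZED
    if _SYNC_STACK_IN_USE:
        raise RuntimeError(
            'grpc.experimental.aio cannot be initialized after the '
            'synchronous gRPC stack has already been used in this process.')
    if not _AIO_INITIALIZED:
        # The real implementation would call into the Cython layer here
        # (the equivalent of init_grpc_aio()).
        _AIO_INITIALIZED = True
```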