dapr: Data corruption in actor/service invocation under high rps
Expected Behavior
Responses from the actor HTTP endpoints are serialized correctly (at least 99.9% of the time).
Actual Behavior
A proxy call to the actor implementation fails intermittently: the response body is a JSON string with extra trailing bytes, as if it had been corrupted in transport. Below are two different occurrences:
Occurrence #1:
# /home/xanrin/dapr-python-sdk/dapr/actor/client/proxy.py (line 71)
b'[[0.056063172301151286,0.43107083116247324,-0.21845425248263262,-1.2186224010316111],1.0,true,{}]lse,{}]'
# /home/xanrin/dapr-python-sdk/dapr/serializers/json.py (line 49)
b'[[0.056063172301151286,0.43107083116247324,-0.21845425248263262,-1.2186224010316111],1.0,true,{}]lse,{}]'
# What I expect:
b'[[0.056063172301151286,0.43107083116247324,-0.21845425248263262,-1.2186224010316111],1.0,true,{}]'
Occurrence #2:
# /home/xanrin/dapr-python-sdk/dapr/actor/client/proxy.py (line 71)
b'[[0.02180521361883352,0.9327880131478761,-0.04958054991416051,-1.4610110113961423],1.0,false,{}]]'
# /home/xanrin/dapr-python-sdk/dapr/serializers/json.py (line 49)
b'[[0.02180521361883352,0.9327880131478761,-0.04958054991416051,-1.4610110113961423],1.0,false,{}]]'
# What I expect (the actual output has one ] too many):
b'[[0.02180521361883352,0.9327880131478761,-0.04958054991416051,-1.4610110113961423],1.0,false,{}]'
Stacktrace
== APP == Process ForkServerProcess-4:
== APP == Traceback (most recent call last):
== APP == File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
== APP == self.run()
== APP == File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run
== APP == self._target(*self._args, **self._kwargs)
== APP == File "/usr/local/lib/python3.7/dist-packages/stable_baselines/common/vec_env/subproc_vec_env.py", line 18, in _worker
== APP == observation, reward, done, info = env.step(data)
== APP == File "/mnt/e/Projects/roadwork-rl/src/Lib/python/roadwork/roadwork/client/client_dapr.py", line 91, in step
== APP == obs, reward, done, info = asyncio.get_event_loop().run_until_complete(self.proxy.SimStep({ 'action': action }))
== APP == File "/usr/local/lib/python3.7/dist-packages/nest_asyncio.py", line 59, in run_until_complete
== APP == return f.result()
== APP == File "/usr/lib/python3.7/asyncio/futures.py", line 181, in result
== APP == raise self._exception
== APP == File "/usr/lib/python3.7/asyncio/tasks.py", line 249, in __step
== APP == result = coro.send(None)
== APP == File "/home/xanrin/dapr-python-sdk/dapr/actor/client/proxy.py", line 71, in __call__
== APP == return self._message_serializer.deserialize(rtnval, self._attr_call_type['return_types'])
== APP == Actor sending: b'[[0.056063172301151286,0.43107083116247324,-0.21845425248263262,-1.2186224010316111],1.0,true,{}]lse,{}]'
== APP == Decoding: b'[[0.056063172301151286,0.43107083116247324,-0.21845425248263262,-1.2186224010316111],1.0,true,{}]lse,{}]'
== APP == File "/home/xanrin/dapr-python-sdk/dapr/serializers/json.py", line 49, in deserialize
== APP == Actor sending: b'[[0.056063172301151286,0.43107083116247324,-0.21845425248263262,-1.2186224010316111],1.0,true,{}]lse,{}]'
== APP == Decoding: b'[[0.056063172301151286,0.43107083116247324,-0.21845425248263262,-1.2186224010316111],1.0,true,{}]lse,{}]'
== APP == obj = json.loads(data, cls=DaprJSONDecoder)
== APP == File "/usr/lib/python3.7/json/__init__.py", line 361, in loads
== APP == return cls(**kw).decode(s)
== APP == File "/usr/lib/python3.7/json/decoder.py", line 340, in decode
== APP == raise JSONDecodeError("Extra data", s, end)
== APP == json.decoder.JSONDecodeError: Extra data: line 1 column 96 (char 95)
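To separate the valid JSON value from the stray bytes when triaging a payload like the ones above, something along these lines may help. It is only a diagnostic sketch using the standard-library `json` module; `split_trailing` is a hypothetical helper name, not part of the Dapr Python SDK, and the payload is copied from occurrence #1.

```python
import json

# Payload copied verbatim from occurrence #1 in this report.
corrupted = (
    b'[[0.056063172301151286,0.43107083116247324,'
    b'-0.21845425248263262,-1.2186224010316111],1.0,true,{}]lse,{}]'
)


def split_trailing(data: bytes):
    """Hypothetical helper: return (parsed_value, trailing_text).

    raw_decode parses the first complete JSON value and reports where it
    ended, so everything after that index is the unexpected extra data.
    """
    text = data.decode("utf-8")
    value, end = json.JSONDecoder().raw_decode(text)
    return value, text[end:]


# json.loads rejects the payload in the same way as in the traceback.
try:
    json.loads(corrupted)
    error_message = None
except json.JSONDecodeError as exc:
    error_message = exc.msg

value, trailing = split_trailing(corrupted)
print(error_message)   # the decoder's complaint
print(value)           # the first complete JSON value
print(repr(trailing))  # the stray bytes that followed it
```

Logging the trailing part next to the parsed value makes it easier to see whether the extra bytes look like the tail of a different, longer response.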
Steps to Reproduce the Problem
This happens when running an application that uses a ThreadPool and issues more than an estimated 1000 requests per second.
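A small harness of the following shape could be used to look for corrupted responses under concurrent load. This is a sketch under assumptions, not the reporter's actual code: `find_corrupt_responses` and `fake_invoke` are hypothetical names, and `fake_invoke` merely simulates an occasional bad payload so the example runs without a Dapr sidecar. In a real reproduction, `invoke` would wrap the actor proxy call.

```python
import json
from multiprocessing.pool import ThreadPool
from typing import Callable, List, Tuple


def find_corrupt_responses(
    invoke: Callable[[int], bytes],
    total_calls: int,
    workers: int = 8,
) -> List[Tuple[int, bytes]]:
    """Call `invoke` from a thread pool and collect responses that are not valid JSON."""

    def check(call_index: int):
        raw = invoke(call_index)
        try:
            json.loads(raw)
        except ValueError:  # JSONDecodeError and UnicodeDecodeError are both ValueErrors
            return (call_index, raw)
        return None

    with ThreadPool(workers) as pool:
        results = pool.map(check, range(total_calls))
    return [r for r in results if r is not None]


def fake_invoke(call_index: int) -> bytes:
    """Hypothetical stand-in for the actor call; every 100th reply gets a stray ']'."""
    good = b'[[0.1,0.2,0.3,0.4],1.0,false,{}]'
    return good + b']' if call_index % 100 == 0 else good


bad = find_corrupt_responses(fake_invoke, total_calls=1000)
print(f"{len(bad)} corrupt responses out of 1000")
for call_index, raw in bad[:3]:
    print(call_index, raw)
```

Keeping the raw bytes of each failing call, rather than only the exception, is what makes it possible to compare the corrupted payload with the expected one.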
About this issue
- State: closed
- Created 4 years ago
- Comments: 45 (33 by maintainers)
Awesome, thanks! It worked!! 😃 I ran 8 processes @ ~42 req / s resulting in 100k steps in 294s without a crash!
@XavierGeerinck we merged the fix. You can use this edge version of dapr.
Selfhost mode:
Kubernetes with helm: