dapr: Actor type doesn't fail over when app crashes but sidecar remains healthy

In what area(s)?

/area runtime /area operator /area placement /area docs /area test-and-release

What version of Dapr?

0.4.x

Expected Behavior

The placement service should be aware of the health of the application hosting a given actor type, not only of the sidecar instance. Currently, if the app crashes while another app is up and registered for the same actor type, calls to actor instances (new or existing IDs) fail. In other words, the sidecar should also report the health of the app, so that the placement service can trigger rebalancing.
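To illustrate the kind of app-level check meant here, a minimal sketch follows. It assumes that a plain TCP connect to the app port is an acceptable liveness signal. `AppHealthProbe` and `isAppReachable` are hypothetical names, not part of Dapr or the Java SDK, and how the sidecar would forward the result to the placement service is left out.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper (not a Dapr API): reports whether the app port accepts
// TCP connections, independently of whether the sidecar process is alive.
final class AppHealthProbe {
    private AppHealthProbe() {}

    static boolean isAppReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // Fails with an IOException on "connection refused" or on timeout.
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // The app is down or unreachable, even if the sidecar is still healthy.
            return false;
        }
    }

    public static void main(String[] args) {
        // Port 3000 is the app port used in the repro steps of this issue.
        System.out.println("app reachable: " + isAppReachable("127.0.0.1", 3000, 500));
    }
}
```

In the scenario of this issue, such a probe would return false for 127.0.0.1:3000 once the first service is killed, which is the signal the sidecar currently does not pass on.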

Client should successfully invoke actor methods:

== APP == Actor 764eddc2-9155-426e-b406-0da1eb48cefa got a reply: 2020-02-19 13:37:44.801

== APP == Actor 764eddc2-9155-426e-b406-0da1eb48cefa got a reply: 2020-02-19 13:37:45.018

== APP == Actor f955f205-0955-4f96-a021-6c6ce0e21421 got a reply: 2020-02-19 13:37:45.075

== APP == Actor 55f0f686-f125-4fe6-85d4-2a7bd37fb44e got a reply: 2020-02-19 13:37:45.242

== APP == Actor f955f205-0955-4f96-a021-6c6ce0e21421 got a reply: 2020-02-19 13:37:45.884

== APP == Actor 764eddc2-9155-426e-b406-0da1eb48cefa got a reply: 2020-02-19 13:37:45.940

== APP == Actor 55f0f686-f125-4fe6-85d4-2a7bd37fb44e got a reply: 2020-02-19 13:37:46.163

== APP == Actor f955f205-0955-4f96-a021-6c6ce0e21421 got a reply: 2020-02-19 13:37:46.422

== APP == Actor 764eddc2-9155-426e-b406-0da1eb48cefa got a reply: 2020-02-19 13:37:46.760

== APP == Actor f955f205-0955-4f96-a021-6c6ce0e21421 got a reply: 2020-02-19 13:37:46.788

Actual Behavior

Error in client:

== APP == io.dapr.exceptions.DaprException: ERR_ACTOR_INVOKE_METHOD: rpc error: code = Unknown desc = error activating actor type DemoActor with id 60dad782-a802-4b39-acff-0d5487471756: dial tcp4 127.0.0.1:3000: connectex: No connection could be made because the target machine actively refused it.

== APP ==       at io.dapr.client.DaprHttp.lambda$invokeApi$3(DaprHttp.java:201)

== APP ==       at reactor.core.publisher.MonoCallable.subscribe(MonoCallable.java:56)

== APP ==       at reactor.core.publisher.Mono.subscribe(Mono.java:4105)

== APP ==       at reactor.core.publisher.Mono.block(Mono.java:1662)

== APP ==       at io.dapr.examples.actors.http.DemoActorClient.callActorForever(DemoActorClient.java:61)

== APP ==       at io.dapr.examples.actors.http.DemoActorClient.lambda$main$0(DemoActorClient.java:42)

== APP ==       at java.base/java.lang.Thread.run(Thread.java:834)

== APP ==       Suppressed: java.lang.Exception: #block terminated with an error

== APP ==               at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:93)

== APP ==               at reactor.core.publisher.Mono.block(Mono.java:1663)

== APP ==               ... 3 more

Steps to Reproduce the Problem

Check out the Dapr Java SDK repository. Maven and a Java JDK must be installed.

mvn clean install
dapr run --app-id demoactorservice --app-port 3000 -- java -jar examples/target/dapr-java-sdk-examples-exec.jar io.dapr.examples.actors.http.DemoActorService -p 3000

Wait for service to start. Open a new terminal (terminal #2), then:

dapr run --app-id demoactorclient -- java -jar examples/target/dapr-java-sdk-examples-exec.jar io.dapr.examples.actors.http.DemoActorClient

Open a third terminal, then:

pskill java.exe
dapr run --app-id demoactorservice2 --app-port 3001 -- java -jar examples/target/dapr-java-sdk-examples-exec.jar io.dapr.examples.actors.http.DemoActorService -p 3001

Go back to terminal #2 and rerun:

dapr run --app-id demoactorclient -- java -jar examples/target/dapr-java-sdk-examples-exec.jar io.dapr.examples.actors.http.DemoActorClient

The client now reports errors:

== APP == io.dapr.exceptions.DaprException: ERR_ACTOR_INVOKE_METHOD: rpc error: code = Unknown desc = error activating actor type DemoActor with id 60dad782-a802-4b39-acff-0d5487471756: dial tcp4 127.0.0.1:3000: connectex: No connection could be made because the target machine actively refused it.

== APP ==       at io.dapr.client.DaprHttp.lambda$invokeApi$3(DaprHttp.java:201)

== APP ==       at reactor.core.publisher.MonoCallable.subscribe(MonoCallable.java:56)

== APP ==       at reactor.core.publisher.Mono.subscribe(Mono.java:4105)

== APP ==       at reactor.core.publisher.Mono.block(Mono.java:1662)

== APP ==       at io.dapr.examples.actors.http.DemoActorClient.callActorForever(DemoActorClient.java:61)

== APP ==       at io.dapr.examples.actors.http.DemoActorClient.lambda$main$0(DemoActorClient.java:42)

== APP ==       at java.base/java.lang.Thread.run(Thread.java:834)

== APP ==       Suppressed: java.lang.Exception: #block terminated with an error

== APP ==               at reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:93)

== APP ==               at reactor.core.publisher.Mono.block(Mono.java:1663)

== APP ==               ... 3 more

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 35 (35 by maintainers)

Most upvoted comments

I was also thinking about making the Dapr CLI exit when the app exits. If we add a mode where the Dapr CLI can run without an app, we can add this feature to the CLI as well.

This only happens locally, not in a clustered environment like k8s. This is because when running locally, the two processes register as the same node, and no rebalancing happens.
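The effect described above can be sketched with a toy model. This is not Dapr's placement code: `ToyPlacementTable` is a hypothetical class, and it assumes that membership is keyed by node identity and that the first address registered for a node is kept. Under those assumptions, a second process on the same machine changes nothing, so no rebalancing is triggered.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only (assumption, not Dapr's implementation): hosts for one
// actor type, keyed by node identity.
final class ToyPlacementTable {
    private final Map<String, String> addressByNode = new LinkedHashMap<>();

    // Returns true when membership changed, which is what would trigger rebalancing.
    boolean register(String nodeId, String address) {
        return addressByNode.putIfAbsent(nodeId, address) == null;
    }

    // Picks a host for the actor ID by hashing over the current members.
    String resolve(String actorId) {
        if (addressByNode.isEmpty()) {
            throw new IllegalStateException("no hosts registered");
        }
        List<String> addresses = new ArrayList<>(addressByNode.values());
        return addresses.get(Math.floorMod(actorId.hashCode(), addresses.size()));
    }

    public static void main(String[] args) {
        ToyPlacementTable local = new ToyPlacementTable();
        local.register("localhost", "127.0.0.1:3000");
        // Same node identity: membership is unchanged, so the old address stays.
        boolean changed = local.register("localhost", "127.0.0.1:3001");
        System.out.println("membership changed: " + changed);
        System.out.println("actor resolves to: " + local.resolve("some-actor-id"));
    }
}
```

In a cluster, each instance would register under a distinct node identity, so the second registration changes membership and actors can be redistributed.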

I guess we can close this issue.