tokio: deadlock in tokio-1.0

Version

│ │ ├── tokio v1.0.2
│ │ │ └── tokio-macros v1.0.0
│ │ ├── tokio-util v0.6.1
│ │ │ ├── tokio v1.0.2 (*)
│ │ │ └── tokio-stream v0.1.2
│ │ │ └── tokio v1.0.2 (*)
│ ├── tokio v1.0.2 (*)
├── tokio v1.0.2 (*)
├── tokio-compat-03 v0.0.0
├── tokio-stream v0.1.2 (*)
├── tokio-util v0.6.1 (*)

Platform

linux 4.9.0-8-amd64 #1 SMP Debian 4.9.144-3 (2019-02-02) x86_64 GNU/Linux

Description

My program pulls data from a redis server and sends it to another one. There are many futures: some pull data from the source, some push to the target, and some print stats to stdout periodically (every 5 seconds). While pulling and pushing data, the program also updates some metrics managed by prometheus-0.1.0. The program worked well for months under tokio-0.3.0, but after I upgraded to tokio-1.0, a deadlock occurs. When it is blocked, the program no longer prints stats, meaning that some of the futures are not being polled by the tokio runtime (multi-threaded mode).

I executed perf top, and here is the output:

[image: perf top output]

And here is what it showed when I selected the first line and then opened its annotation:

[image: perf annotation of the selected line]

These statistics barely change over a long period of time.

So what happened in my process? What’s wrong with the atomic operations?

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 23 (14 by maintainers)

Most upvoted comments

Thank you @Flakebi. The coop system consuming permits without doing any work certainly sounds like a problem that can cause a deadlock. Thanks for pointing it out!

I have opened a new issue for tracking this specifically, a link to which you can find right above this reply.

@tancehao Can you check whether your deadlock is caused by the same issue by replacing Budget::initial with Budget::unconstrained in Tokio? You can replace your Tokio dependency with a custom modified one using a patch section in your Cargo.toml.
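For readers following along, here is a minimal std-only sketch of why swapping the constrained budget for an unconstrained one is a useful diagnostic. It is a toy model, not Tokio's actual code: the Budget type only borrows the two constructor names mentioned above, the starting value of 128 is an assumption for this sketch, and take_permit and attempts_allowed are hypothetical helpers.

```rust
/// Toy model of a cooperative-scheduling budget (hypothetical, not Tokio's code).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Budget(Option<u8>);

impl Budget {
    /// Constrained budget; 128 is an assumed starting value for this sketch.
    const fn initial() -> Budget {
        Budget(Some(128))
    }

    /// No limit: a permit is never refused.
    const fn unconstrained() -> Budget {
        Budget(None)
    }

    /// Try to take one permit. Returns false once a constrained budget is
    /// exhausted, which stands in for a resource returning `Poll::Pending`.
    fn take_permit(&mut self) -> bool {
        match &mut self.0 {
            None => true,
            Some(0) => false,
            Some(n) => {
                *n -= 1;
                true
            }
        }
    }
}

/// Make `attempts` poll attempts within a single task poll. Every attempt
/// consumes a permit, whether or not it does useful work. Returns how many
/// attempts were allowed to proceed.
fn attempts_allowed(mut budget: Budget, attempts: u32) -> u32 {
    let mut allowed = 0;
    for _ in 0..attempts {
        if budget.take_permit() {
            allowed += 1;
        }
    }
    allowed
}

fn main() {
    // Constrained: only the first 128 attempts proceed, the rest are refused.
    println!("initial: {}", attempts_allowed(Budget::initial(), 1000));
    // Unconstrained: nothing is ever refused.
    println!("unconstrained: {}", attempts_allowed(Budget::unconstrained(), 1000));
}
```

If the hang disappears with the unconstrained variant, the refused permits are the likely culprit; if it persists, the cause is probably elsewhere.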

@taiki-e I tested your patch and it works. Also looks quite elegant to me 😃

It’s the other way around.

Ah, right, I knew I missed something. Thanks for the explanation!

It seems like the Pending never gets to the executor, it’s stuck at 0. I’ll keep debugging.
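A rough way to picture this observation, as a std-only sketch under stated assumptions: the budget is only refilled when Pending travels back to the executor, so code that swallows Pending and retries inside the same poll keeps seeing a budget of 0. Resource, deliver_with_yield, deliver_without_yield, and the value 128 are all hypothetical names and numbers for illustration; this is not Tokio's executor.

```rust
use std::task::Poll;

/// Assumed per-poll budget for this sketch.
const INITIAL_BUDGET: u32 = 128;

/// Hypothetical budget-aware resource that always has data available.
struct Resource {
    budget: u32,
    delivered: u32,
}

impl Resource {
    fn new() -> Self {
        Resource { budget: INITIAL_BUDGET, delivered: 0 }
    }

    /// Hands out one item per call, but reports `Pending` once the budget
    /// is exhausted, even though data is still available.
    fn poll_item(&mut self) -> Poll<()> {
        if self.budget == 0 {
            return Poll::Pending;
        }
        self.budget -= 1;
        self.delivered += 1;
        Poll::Ready(())
    }
}

/// `Pending` reaches the executor, which refills the budget before the next
/// task poll. Returns (items delivered, number of task polls).
fn deliver_with_yield(wanted: u32) -> (u32, u32) {
    let mut res = Resource::new();
    let mut polls = 1;
    while res.delivered < wanted {
        if res.poll_item().is_pending() {
            res.budget = INITIAL_BUDGET; // executor starts a fresh poll
            polls += 1;
        }
    }
    (res.delivered, polls)
}

/// `Pending` is swallowed and the resource is retried within the same poll,
/// so the budget is never refilled. `max_spins` bounds the loop so that the
/// sketch terminates; the real situation would spin or hang instead.
fn deliver_without_yield(wanted: u32, max_spins: u32) -> u32 {
    let mut res = Resource::new();
    let mut spins = 0;
    while res.delivered < wanted && spins < max_spins {
        if res.poll_item().is_pending() {
            spins += 1;
        }
    }
    res.delivered
}

fn main() {
    println!("with yield: {:?}", deliver_with_yield(300));
    println!("without yield: {}", deliver_without_yield(300, 10_000));
}
```

In this model the yielding version finishes all 300 items in three polls, while the non-yielding version stops making progress after the first 128 items no matter how long it retries.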