ponyc: Crash in a scheduler exhaustion example

Hi,

I was writing an example of how monopolizing the scheduler threads with tight loops could lead to standstill. Expected behavior: some output, then full CPU load, but no output. However, the program seems to crash pretty soon (in release mode). The same program does not crash in debug mode:

use "time"
use "collections"

class Tick is TimerNotify
  let _env : Env
  var _counter: U64 = 0

  new iso create(env: Env) =>
    _env = env

  fun ref _next(): String =>
    _counter = _counter + 1
    _counter.string()

  fun ref apply(timer: Timer, count: U64): Bool =>
    _env.out.print("Tick "+_next())
    true

actor Loader
  var _counter: U64 = 0
  let _env : Env

  new create(env: Env) =>
    _env = env

  be spin() =>
    _spin()

  fun ref _spin() =>
    while _counter >=  0 do
      _counter = _counter + 1
    end
    _env.out.print(_counter.string())


class Load is TimerNotify
  let _env : Env
  let _loaders: List[Loader] = List[Loader]
  var _counter: U64 = 0

  new iso create(env: Env) =>
    _env = env

  fun ref _next() =>
    _counter = _counter + 1
    _env.out.print("Started task "+_counter.string())
    let l = Loader(_env)
    _loaders.push(l)
    l.spin()

  fun ref apply(timer: Timer, count: U64): Bool =>
    _next()
    true


actor Main
  new create(env: Env) =>
    let timers = Timers
    let timer = Timer(Tick(env), 0, 1_000_000_000 /*1s*/)
    timers(consume timer)
    env.out.print("Started the ticker")
    let load_timer = Timer(Load(env), 0, 2_000_000_000 /*2s*/)
    timers(consume load_timer)

on Windows: ponyc && test.exe → crash, but ponyc --debug && test.exe → no crash

0.21.0-acd811b [release]
compiled with: llvm 3.9.1 -- msvc-15-x64

same on OSX:

0.21.0 [release]
compiled with: llvm 3.9.1 -- Apple LLVM version 9.0.0 (clang-900.0.39.2)

For a cross-check, 0.20 crashes too:

0.20.0-0b2a2d2 [release]
compiled with: llvm 3.9.1 -- msvc-14-x64

About this issue

  • Original URL
  • State: closed
  • Created 7 years ago
  • Comments: 16 (11 by maintainers)

Most upvoted comments

This probably got fixed when we upgraded to LLVM 12, which allowed to have infinite loops at the IR level.

I am no longer able to reproduce this as of ponyc 0.46.0.

FYI, as of 0.37.0 this is still a problem with optimized builds.

However, the error is a little different.

~/pony-scratch/2441 ➜ lldb -- ./2441
(lldb) target create "./2441"
Current executable set to '/home/sean/pony-scratch/2441/2441' (x86_64).
(lldb) run
Process 6653 launched: '/home/sean/pony-scratch/2441/2441' (x86_64)
Started the ticker
Tick 1
Started task 1
Process 6653 stopped
* thread #3, name = '2441', stop reason = signal SIGILL: illegal instruction operand
    frame #0: 0x0000000000403191 2441`Loader_Dispatch + 81
2441`Loader_Dispatch:
->  0x403191 <+81>: ud2
    0x403193:       nopw   %cs:(%rax,%rax)
    0x40319d:       nopl   (%rax)
2441`Loader_Trace:
    0x4031a0 <+0>:  movq   0x108(%rsi), %rsi
(lldb) bt
* thread #3, name = '2441', stop reason = signal SIGILL: illegal instruction operand
  * frame #0: 0x0000000000403191 2441`Loader_Dispatch + 81
    frame #1: 0x0000000000409ad6 2441`ponyint_actor_run + 214
    frame #2: 0x000000000041255f 2441`run_thread + 2287
    frame #3: 0x00007ffff7f9b609 libpthread.so.0`start_thread + 217
    frame #4: 0x00007ffff7d46103 libc.so.6`__clone + 67

As to why LLVM is turning that “jmp to self” into ud2, I’m not sure.

Many parts of the LLVM optimiser treat infinite loops with no side-effects as causing undefined behaviour, based on the C++ standard. The LLVM IR itself doesn’t define infinite loops very well. This is a recurrent topic on the LLVM mailing list. See for example here and here.

with debug runtime but release build. looks like a gc bug.

(lldb) run Process 8042 launched: ‘./issue-2441’ (x86_64) Started the ticker Tick 1 Started task 1 Process 8042 stopped

  • thread #2: tid = 0x60e437, 0x0000000100000e7b issue-2441Env_Trace + 11, stop reason = EXC_BAD_ACCESS (code=1, address=0x10) frame #0: 0x0000000100000e7b issue-2441Env_Trace + 11 issue-2441`Env_Trace: -> 0x100000e7b <+11>: movq 0x10(%rbx), %rsi 0x100000e7f <+15>: leaq 0x283ea(%rip), %rdx 0x100000e86 <+22>: movl $0x2, %ecx 0x100000e8b <+27>: callq 0x100015c10 ; pony_traceknown at trace.c:109

(lldb) bt warning: could not load any Objective-C class information. This will significantly reduce the quality of type information available.

  • thread #2: tid = 0x60e437, 0x0000000100000e7b issue-2441`Env_Trace + 11, stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
    • frame #0: 0x0000000100000e7b issue-2441Env_Trace + 11 frame #1: 0x0000000100013a2e issue-2441ponyint_gc_handlestack(ctx=0x00000001097ff7c8) + 94 at gc.c:619 frame #2: 0x0000000100015999 issue-2441pony_recv_done(ctx=0x00000001097ff7c8) + 89 at trace.c:65 frame #3: 0x0000000100001433 issue-2441Loader_Dispatch + 51 frame #4: 0x00000001000094fb issue-2441handle_message(ctx=0x00000001097ff7c8, actor=0x00000001298f4400, msg=0x00000001298f4200) + 635 at actor.c:164 frame #5: 0x0000000100008fbe issue-2441ponyint_actor_run(ctx=0x00000001097ff7c8, actor=0x00000001298f4400, batch=100) + 174 at actor.c:226 frame #6: 0x000000010001d4c5 issue-2441run(sched=0x00000001097ff780) + 261 at scheduler.c:527 frame #7: 0x000000010001cc19 issue-2441run_thread(arg=0x00000001097ff780) + 57 at scheduler.c:568 frame #8: 0x00007fff9db1899d libsystem_pthread.dylib_pthread_body + 131 frame #9: 0x00007fff9db1891a libsystem_pthread.dylib_pthread_start + 168 frame #10: 0x00007fff9db16351 libsystem_pthread.dylib`thread_start + 13