godot: Memory allocator crashes or spews infamous _set_color error
Godot version: v3.2.1.stable.mono.official (64 bit)
OS/device including version: Windows 10 Home N (10.0.18362 Build 18362)
Issue description: The underlying custom allocator in Godot is having trouble with high-frequency allocations. I believe this is due to a race condition in core/map.h. Given a high frequency of allocs/deallocs (bounded by a cap, see reproduction project), the allocator fails and the process either crashes or spews this infamous error:
ERROR: _set_color: Condition “p_node == _data._nil && p_color == RED” is true. At: ./core/map.h:154
I say infamous because there are a few similar bug reports about this elusive error, all of which involve a multi-threaded physics server. In the reproduction project linked below, you’ll find my replication of this bug without any multi-threaded physics server.
Related & not duplicate: https://github.com/godotengine/godot/issues/8630 https://github.com/godotengine/godot/issues/6512
Steps to reproduce:
1: (Recommended) Disable vsync to show greater discrepancy among idle_frame, _Process, and _PhysicsProcess. If vsync is enabled, the bug will take longer to reproduce.
2: Create a queue of thousands of object references, managed by Godot’s custom allocator
3: Repeatedly dequeue old references and enqueue new references using any of Godot’s life-cycle functions (fastest reproduction is in the idle frame, vsync disabled)
4: If you are using the idle frame (with vsync disabled), reproducing should only take a few seconds. View the logger to witness a crash or spewing of the infamous error. If you are using _Process or _PhysicsProcess, it may take a couple minutes, but the crash/spew always occurs.
5: (Optional): Build from source, set a breakpoint on map.h:154 and run the project. Assuming there is no crash, wait for the errors to spew. Once the spewing starts, attach the debugger and analyze the params. Look for the condition p_node == _data._nil && p_color == RED, and check that it is truthy. If it is truthy, repeatedly step over, and you should see the “Finalizer” thread is stuck in an infinite loop as it attempts to balance the RB-Tree in map.h.
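For readers unfamiliar with this class of bug, here is a minimal C++ sketch of the suspected pattern: two threads (a "main" role and a "Finalizer" role) churning a shared tree-backed map. This is not Godot code. `SharedRegistry` and `churn` are hypothetical names, and `std::map` (typically implemented as a red-black tree) stands in for Godot's `Map`. The sketch shows the guarded version, where every mutation takes a scoped lock, on the assumption that unsynchronized access is what corrupts the tree.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a shared, tree-backed container.
// Every access goes through a scoped lock on the same mutex.
struct SharedRegistry {
    std::map<int, int> entries;
    std::mutex mutex;

    void add(int key) {
        std::lock_guard<std::mutex> lock(mutex); // released when the scope ends
        entries[key] = key;
    }

    void remove(int key) {
        std::lock_guard<std::mutex> lock(mutex);
        entries.erase(key);
    }

    std::size_t size() {
        std::lock_guard<std::mutex> lock(mutex);
        return entries.size();
    }
};

// Churn the registry from two threads and return the final entry count.
std::size_t churn(int iterations) {
    SharedRegistry registry;

    // "Main" role: inserts and erases even keys.
    std::thread producer([&registry, iterations] {
        for (int i = 0; i < iterations; ++i) {
            registry.add(2 * i);
            registry.remove(2 * i);
        }
    });

    // "Finalizer" role: inserts and erases odd keys concurrently.
    std::thread finalizer([&registry, iterations] {
        for (int i = 0; i < iterations; ++i) {
            registry.add(2 * i + 1);
            registry.remove(2 * i + 1);
        }
    });

    producer.join();
    finalizer.join();
    return registry.size();
}
```

Calling `churn(10000)` from a `main` function leaves the registry empty, since each thread erases every key it inserted. Removing the `std::lock_guard` lines turns this into a data race on the tree, which is undefined behavior in C++ and can surface as crashes or a corrupted tree during rebalancing.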
Minimal reproduction project: reproduction-project.zip
About this issue
- State: closed
- Created 4 years ago
- Reactions: 7
- Comments: 20 (10 by maintainers)
I just tried adding a scoped mutex on language_bind_mutex around the offending line in csharp_script.cpp and it appears that the issue no longer shows up in either the repro. project here or in my personal projects. Not sure if it’ll have additional side effects but I’ll submit a PR for it once I’ve tested it on master.
I am using gdscript in my project (not C#) and some custom native C++ extensions and I’m encountering this same error. After reading this post I suspect it’s due to the [back end] C++ extensions instantiating several hundred Godot objects that are passed to the gdscript [front end], which it uses to allow the player to configure the game. The Godot objects are all derived from Reference. They’re just simple data. This error seems to have appeared after I added the new back end logic that generates several hundred Godot objects.
I’m also using an old version of Godot: git commit 978d71b8393ec425830ba48253dca4f32484edd1 from May 16 2019. I’m aware of the reference counting bug in that version and have worked around it by delegating all object creation to gdscript helper methods, so I don’t think the C++ back end is mishandling references in a way that would trigger some other error and ultimately cause the ‘ERROR: _set_color: Condition “p_node == _data._nil && p_color == RED” is true.’ error.
My primary reason for this post is to provide a data point proving that this issue is not exclusive to C#. I believe the C# bindings and my custom bindings have similar behavior. Perhaps other language bindings do/would cause this error as well.
Are we getting any fixes for this in the 3.2 stable version? I regret using C# for production because of this.
The reason I was encountering this error frequently was due to creating and matching a RegEx ~180 times per second, which is obviously unnecessary. This same code was running in GDScript for a long time and never had an issue, but it can hardly run a few minutes in mono without encountering this issue. I am very certain this issue is mono exclusive.
Also using C#, also experiencing this problem. I had actually written my project in GDScript initially and did not encounter this issue at all. Shortly after converting to C# and making a small handful of changes, I’m running into this problem. It’s certainly possible that I changed something that caused this, but it seems more likely to me that it’s just not an issue with GDScript.