cython: [BUG] Large files compile very slowly in the C compiler
Describe the bug
The C compiler takes a very long time on the C code generated from large files like ExprNodes.py and Nodes.py. This shows up in the CI, where the `-all` test jobs repeatedly time out (although partly this is because we can’t compile them in parallel on Python 2.7).
One possible indicator is the warning from gcc:
note: variable tracking size limit exceeded with ‘-fvar-tracking-assignments’, retrying without
(although that could potentially be toggled independently or the limit increased)
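For anyone who wants to experiment with that toggle, here is a minimal sketch of a helper that builds the extra compiler arguments. The helper name and its parameters are made up for illustration. The two GCC options themselves (`-fno-var-tracking-assignments` and `--param max-vartrack-size=N`, where 0 means unlimited) are documented GCC options, but whether either one actually shortens compile time for these modules is untested here.

```python
def var_tracking_flags(disable_assignments=True, max_vartrack_size=None):
    """Build extra GCC arguments that affect variable tracking.

    Hypothetical helper, not part of Cython or setuptools.

    disable_assignments: skip the -fvar-tracking-assignments pass entirely,
        so gcc never hits the size limit and never has to retry.
    max_vartrack_size: if given, change the limit instead
        (GCC documents 0 as "unlimited").
    """
    flags = []
    if disable_assignments:
        flags.append("-fno-var-tracking-assignments")
    if max_vartrack_size is not None:
        # Two-argument form of --param, accepted by old and new GCC versions.
        flags.extend(["--param", "max-vartrack-size=%d" % max_vartrack_size])
    return flags
```

The returned list could then be passed as `extra_compile_args` to a setuptools `Extension`, assuming the build uses gcc.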
Obviously it’s inevitable that a big Cython file will generate a big C file and that a big C file will take a while to compile, but potentially we could do better.
There are a few gcc flags for profiling compilation time (https://stackoverflow.com/questions/13559818/profiling-the-c-compilation-process), and they suggest that the module init function (where all the module-level user code goes) is the main culprit (unsurprisingly).
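As a rough sketch of how such a profiling run could be set up: the function below only assembles a gcc command line. The function name and the file name in the usage note are illustrative assumptions. `-ftime-report` (per-pass timing summary) and `-Q` (print each function name as it is compiled) are real gcc options; the exact flags used for the measurements in this issue are not stated, so treat this as one possible setup.

```python
import sysconfig


def timing_command(c_file, compiler="gcc", per_function=False):
    """Return a gcc command that compiles c_file and reports where time goes.

    Hypothetical helper for illustration only.
    """
    cmd = [
        compiler,
        "-c", c_file,
        "-o", "/dev/null",                         # discard the object file
        "-I" + sysconfig.get_paths()["include"],   # location of Python.h
        "-ftime-report",                           # per-pass timing summary
    ]
    if per_function:
        cmd.append("-Q")  # also print each function name as it is compiled
    return cmd
```

Running the result with `subprocess.run(timing_command("ExprNodes.c"))` prints the report to stderr, assuming the generated C file is in the current directory.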
Environment (please complete the following information):
- Linux, CI, most obviously in Python 2.7
Additional context
I tried a couple of approaches to fix the problem.
- First, I created small sub-scopes within the module init function: https://github.com/da-woods/cython/tree/morelocaltemps. This didn’t achieve any speedup or get rid of the warning, but parts of the change may be worth using for other reasons (https://github.com/da-woods/cython/commit/c9014165baaf56bf3af0e08971f7e3eda46b428e#commitcomment-57239451).
- Second, I tried to split each stat at module level into a separate function (https://github.com/cython/cython/pull/4386). This gave appreciable speed-ups for large modules. However, the PR was very much intended as a proof of concept, with little attention to code quality…
I think a variant of the second approach is probably worthwhile. My current thought is that we shouldn’t do it on a “per-stat” basis but instead give each class creation a separate function (that’s easy to do for cdef classes, slightly harder for regular classes). That would likely give the appropriate granularity and keep things grouped in logical units.
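To make the "one function per logical unit" idea concrete, here is a toy sketch that emits C text in that shape. It is not Cython's code generator and not the code from the PR. The function names it generates (`__pyx_init_unit_*`, `module_exec`) are invented for illustration, and error handling is reduced to a bare return code.

```python
def emit_module_init(units):
    """Emit C source where each unit gets its own small function.

    units: list of (name, [c_statement, ...]) pairs, one per logical unit
    (for example, everything needed to create one class).
    Toy illustration only; all generated names are hypothetical.
    """
    lines = []
    calls = []
    for name, stats in units:
        func = "__pyx_init_unit_%s" % name
        # One small function per unit keeps each function body short,
        # instead of one huge module init function.
        lines.append("static int %s(void) {" % func)
        lines.extend("    " + stat for stat in stats)
        lines.append("    return 0;")
        lines.append("}")
        lines.append("")
        calls.append("    if (%s() < 0) return -1;" % func)
    # The module init function is reduced to a sequence of calls.
    lines.append("static int module_exec(void) {")
    lines.extend(calls)
    lines.append("    return 0;")
    lines.append("}")
    return "\n".join(lines)
```

Calling `print(emit_module_init([("Node", ["create_node_type();"]), ("ExprNode", ["create_exprnode_type();"])]))` shows the resulting layout: two small unit functions followed by an init function that just calls them in order.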
Improvements made to mitigate this in Cython 3.1
More efficient string constant storage:
- https://github.com/cython/cython/commit/f39526df12fb33db6eed318e37deb8d5dfe2a3ba
- https://github.com/cython/cython/commit/368a750952f97042f3e8728f18c350f80413c84d (fixed in https://github.com/cython/cython/commit/f5f83fbf803d1b3ba88f71d12bb48306815efd3c)
- https://github.com/cython/cython/commit/0d5af7b68d062b766cb59ee1b76ad342d004f8dc
Shorter code generation:
- https://github.com/cython/cython/commit/e6c621a91265d94a6c0dad9975e85b46c493706d
- https://github.com/cython/cython/commit/6526ecf4acfa829c1ec50a5d11db4cef18386e60
- https://github.com/cython/cython/commit/904741890210d681102780dbd9f41bdf1ae561d2
- https://github.com/cython/cython/commit/12241b84055b1226223577a7f32a8a2c0bff47ee
More efficient code object creation that no longer permanently stores tuples in the global module state:
About this issue
- Original URL
- State: open
- Created 3 years ago
- Comments: 39 (35 by maintainers)
Commits related to this issue
- Refactor stringtab generation to reduce code size and compile time (GH-6018) The stringtab can be a global constant rather than a function local variable that's largely initialized at runtime. Sto... — committed to cython/cython by da-woods 4 months ago
- Initialize code objects with loop over table Hopefully helps with https://github.com/cython/cython/issues/4425. It reduces binary object size by ~0.5% (module-dependent of course). Code objects can... — committed to da-woods/cython by da-woods 4 months ago
- Initialize code objects with loop over table (#6028) Hopefully helps with https://github.com/cython/cython/issues/4425. It reduces binary object size by ~0.5% (module-dependent of course). Code... — committed to cython/cython by da-woods 4 months ago
- Avoid unpacking method calls at top-level class scope (GH-6054) More marginal size reductions. See https://github.com/cython/cython/issues/4425 — committed to cython/cython by da-woods 4 months ago
Linking to the improvements that we made so far for 3.1: see the “More efficient string constant storage” and “Shorter code generation” commit lists above.
EDIT: Overall, from what I tried, these changes can reduce the extension module size by 4-10%.