prql: Stack overflow on Windows
See https://github.com/PRQL/prql/pull/2700#issuecomment-1575206792 and #2713 and https://github.com/PRQL/prql/issues/2781#issuecomment-1595272822
I tried a few simple examples with the latest prqlc.exe built from main (8f9b77802da49971b94ebeb4baee7ccc725675bf), and I suspect that this has nothing to do with `count` and everything to do with the number of `filter`, `take`, and `sort` transforms.
**Working**

```prql
from employees
filter gross_cost > 0
group {title} (
  aggregate {
    ct = count s"*",
  }
)
sort ct
filter ct > 200
```

```prql
from employees
filter gross_cost > 0
take 20
sort ct
filter ct > 200
sort ct
take 20
```

```prql
from employees
filter gross_cost > 0
take 20
sort ct
filter ct > 200
sort ct
take 20
filter ct > 200
sort ct
filter ct > 200
sort ct
```
**Not working**

```prql
from employees
filter gross_cost > 0
group {title} (
  aggregate {
    ct = count s"*",
  }
)
sort ct
filter ct > 200
take 20
sort ct
filter ct > 200
take 20
sort ct
filter ct > 200
take 20
sort ct
filter ct > 200
take 20
```

```prql
from employees
filter gross_cost > 0
sort ct
filter ct > 200
take 20
sort ct
filter ct > 200
take 20
sort ct
filter ct > 200
take 20
sort ct
filter ct > 200
take 20
```

```prql
from employees
filter gross_cost > 0
take 20
sort ct
filter ct > 200
sort ct
take 20
filter ct > 200
sort ct
filter ct > 200
sort ct
filter ct > 200
```
_Originally posted by @eitsupi in https://github.com/PRQL/prql/issues/2717#issuecomment-1596099365_
About this issue
- Original URL
- State: open
- Created a year ago
- Reactions: 1
- Comments: 30 (30 by maintainers)
It depends on exactly when the stack is filled, so it will vary with every new variable in a function that requires stack memory.
So the stack memory usage of our compiler grows linearly with the call depth of the PRQL program, which is not ideal but would be hard to change.
What we can change is how fast it grows with call depth. Debugging a little, it looks like we do have a bunch of variables on the stack that could be on the heap. Will optimize a little.
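As a rough illustration of this point (a standalone sketch, not prqlc's actual code), the per-frame stack cost of a recursive function can be estimated by comparing the address of a local variable across recursion levels; every local kept on the stack inflates every frame, which is why moving large values to the heap stretches the depth you can reach before overflowing:

```rust
use std::hint::black_box;

// Hypothetical sketch: estimate how much stack each recursive call consumes
// by recording the address of a local at every level. `black_box` and
// `#[inline(never)]` discourage the optimizer from collapsing frames.
#[inline(never)]
fn deepest_local_addr(depth: usize) -> usize {
    let marker = black_box(0u8); // lives in this frame
    let addr = &marker as *const u8 as usize;
    if depth == 0 {
        addr
    } else {
        // `min` keeps the recursive call non-tail, so each level keeps its
        // own frame (the stack grows downward on mainstream platforms).
        addr.min(deepest_local_addr(depth - 1))
    }
}

fn main() {
    let shallow = deepest_local_addr(0);
    let deep = deepest_local_addr(100);
    // 100 extra frames can only push the deepest local further down.
    assert!(deep <= shallow);
    println!("~{} bytes per frame", shallow.saturating_sub(deep) / 100);
}
```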
I did some quick diagnostics, using this diff from the current main (similar to #2908, but it uses an unstable feature), and then running with:
We can then look at how deep the stack traces are with:
I then set some bound at which to log the whole stack trace (currently 120 frames deep), and then looked at one:
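The same kind of diagnostic can be approximated in stable Rust with `std::backtrace` (a sketch under that assumption, not the actual diff, which relied on an unstable feature): count frames on each call, and dump the full trace once a threshold is crossed.

```rust
use std::backtrace::Backtrace;

// Approximate call depth by counting the numbered frame lines in a
// force-captured backtrace. Crude, but enough to spot runaway recursion.
fn stack_depth() -> usize {
    Backtrace::force_capture()
        .to_string()
        .lines()
        .filter(|l| l.trim_start().starts_with(|c: char| c.is_ascii_digit()))
        .count()
}

// Log the full trace whenever we exceed `threshold` frames
// (the comment above used a bound of 120).
fn check_depth(threshold: usize) {
    if stack_depth() > threshold {
        eprintln!("deep stack:\n{}", Backtrace::force_capture());
    }
}

#[inline(never)]
fn recurse(n: usize) -> usize {
    check_depth(120);
    if n == 0 {
        stack_depth()
    } else {
        // `max` keeps the call non-tail so frames actually accumulate.
        recurse(n - 1).max(stack_depth())
    }
}

fn main() {
    let shallow = stack_depth();
    let deep = recurse(30);
    assert!(deep >= shallow);
    println!("shallow ~{shallow} frames, deep ~{deep} frames");
}
```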
It looks like the resolver goes very deep. Is that because it structures
`foo | bar | baz` as `baz(bar(foo))`, such that a long query is a very deeply nested query, rather than working like an interpreter, evaluating
`r0 = bar(foo)` and then `baz(r0)`? If so, that's an elegant approach, but maybe it scales badly with program size?
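Independent of how prqlc actually resolves pipelines, the two shapes described above can be contrasted on a toy pipeline (the `Step` type here is hypothetical, not PRQL's AST):

```rust
// Toy pipeline steps standing in for transforms like `filter` and `sort`.
enum Step {
    Add(i64),
    Mul(i64),
}

fn apply(step: &Step, v: i64) -> i64 {
    match step {
        Step::Add(n) => v + n,
        Step::Mul(n) => v * n,
    }
}

// Nested shape: `foo | bar | baz` treated as baz(bar(foo)).
// Stack depth grows with pipeline length, so a long query can overflow.
fn eval_nested(steps: &[Step], input: i64) -> i64 {
    match steps.split_last() {
        None => input,
        Some((last, rest)) => apply(last, eval_nested(rest, input)),
    }
}

// Interpreter shape: r0 = bar(foo), then baz(r0).
// Constant stack depth regardless of pipeline length.
fn eval_interpreted(steps: &[Step], input: i64) -> i64 {
    steps.iter().fold(input, |acc, s| apply(s, acc))
}

fn main() {
    let steps = vec![Step::Add(1), Step::Mul(3), Step::Add(2)];
    assert_eq!(eval_nested(&steps, 0), 5); // ((0 + 1) * 3) + 2
    assert_eq!(eval_interpreted(&steps, 0), 5);
}
```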
I need to sign off for a bit, and probably won’t be doing much coding in the next couple of days. I can look at this more after that though!
Great find @eitsupi! That seems likely. So I guess one of our structs is huge (and this isn't Fortran!).
So it’s about the size of objects on the stack, rather than having a very long call chain. I’ll have a look at what this might be (unless @aljazerzen sees this and has a good idea what this is likely to be).
And then probably we can just put that thing in a `Box`; the performance hit would be trivial.

Sure.
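The effect of boxing a large field can be seen directly with `std::mem::size_of` (the types here are hypothetical, just to show the mechanics):

```rust
use std::mem::size_of;

// A large payload carried inline makes every value of the enum (and every
// stack frame that holds one, e.g. in a recursive resolver) large.
enum ExprInline {
    Literal(i64),
    Big([u64; 32]), // 256 bytes inline
}

// Boxing the payload moves it to the heap; the enum shrinks to a tag plus
// a pointer-sized field, at the cost of one allocation per `Big` value.
enum ExprBoxed {
    Literal(i64),
    Big(Box<[u64; 32]>),
}

fn main() {
    println!("inline: {} bytes", size_of::<ExprInline>());
    println!("boxed:  {} bytes", size_of::<ExprBoxed>());
    assert!(size_of::<ExprBoxed>() < size_of::<ExprInline>());
}
```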
log.txt
Also does not work.