itertools: Audit for missing `fold` implementations.
Implementing fold for our iterators allows us to provide more performant internal iteration. We provide it in some cases, but not all. I’d like us to take three steps towards resolving this:
- Take an inventory of whether our adapters provide
fold. - Implement
foldfor missing adapters. - (Potentially!) Implement a clippy lint to prevent regressions.
Happy to provide mentorship for any of these items.
About this issue
- Original URL
- State: open
- Created 9 months ago
- Reactions: 4
- Comments: 27 (24 by maintainers)
I’m interested to contribute to this crate and I digged a bit into the source code. I think one potential candidate to override
foldisCombinations.https://github.com/rust-itertools/itertools/blob/master/src/combinations.rs#L100
What do you guys think ?
I can create a PR in coming few days to implement this.
@kinto-b We won’t specialize it if it’s slower (but that alone is a valuable information). About combinations, the slowest thing is heap-allocate the generated items.
foldby nature generates them all (to give them tof) so we can’t win anything on that side. The only thingCombinations::foldcould do is simpler code after the initialization phase while the defaultfoldrepeatedly doingnexthave “init code” lying around without use after a few steps, basically what you did but even it was a perf win, I did not expect much out of it since the heap-allocation is the bottleneck here and nothing else really impact performance (or at least I think so). I kinda wonder if mutate a combination IN-PLACE and give clones would be faster but it’s a lot of work for a probable small win (if any):while !self.increment_indices_and_comb(&mut comb) { acc = f(acc, comb.clone()); }or something like that.I will eventually work again on configure
clippy::missing_trait_methodsand (if accepted) we will be able to mark that somefoldare allowed to not be specialized and ensures all otherfoldare.@Philippe-Cholet Hi there, on
Combinations::fold, once #914 is merged, I think something like this should work as the core of an implementation:If you think that looks reasonable I’ll put through a PR.
Edit: I tried this out and, although it works, it is markedly slower than the unspecialized implementation
@phimuemue Oh… I find embarrassing I was aware about
--message-format jsonfor “check” (never used) but not for “clippy”.After some tinkering on jq playground, I think I have a new git alias (it’s not git-related but useful anyway):
then
Output for `git missing-trait-methods Iterator size_hint`
I’m looking to give this issue a try. Started with Combinations and I wonder how does folding on that one would look like? with
LazyBufferand such. I can also use some mentoring/walkthrough of the crate if someone is willing to show me around.Yes, the idea is to add lints to clippy itself to warn us about missing
foldspecializations. For any adaptors where we explicitly didn’t want to specializefold, we would need to write#[allow(clippy::missing_iterator_fold)]. This would allow us to audit for missing clippy lints as simple as runningcargo clippy, and to provide an automated warning if we ever tried to add a new adaptor without a fold implementation.@Philippe-Cholet I think you’re the most knowledgeable there. Could you mentor/review this?
Once we do this, we should test all of them via
test_specializations.Whoops, yes, of course. I’ll modify the issue accordingly.
Note that the vast majority of the time, you should be overriding
fold, notfor_each.The default
for_eachjust callsfoldwith a trivial()accumulator, which is easily optimized out (since in the default Rust calling convention, it’s not even passed).Thus the only reason to override
for_eachin addition tofoldis if you can optimize something by knowing that the accumulation is stateless, but I can’t off-hand recall any case where I’ve seen that to be the case.