guava: Document, test, and null-annotate for `toImmutableMap`'s behavior when `mergeFunction` returns `null`
API(s)
`com.google.common.collect.ImmutableMap.toImmutableMap(Function<? super T, ? extends K>, Function<? super T, ? extends V>, BinaryOperator<V>)`
How do you want it to be improved?
It’s not entirely clear whether mergeFunction
is permitted to return null
or not.
According to type annotations it’s not (T
extends @Nullable Object
, but K
and V
don’t). Current implementation however (com.google.common.collect.CollectCollectors.toImmutableMap(Function, Function, BinaryOperator)
complies with Map.merge(K, V, BiFunction)
's description, that is, it removes value from temporal LinkedHashMap
if the merge result is null
.
(BTW note that Map.merge
takes BiFunction
instead of BinaryOperator
; in case of Guava, having “just one” V
forces both parameters and return type to be annotated together).
Why do we need it to be improved?
Please either explicitly allow nulls or change implementation to throw NPE.
Example
ImmutableMap<Integer, Integer> map = Stream.of(0, 0).collect(ImmutableMap.toImmutableMap(e -> e, e -> e, (a, b) -> null));
Current Behavior
This results in empty map
and no exceptions, which may, or may not, be fine.
Desired Behavior
Maybe such code should lead to NPE, although allowing nulls appears to be less surprising for people assuming that merging should behave exactly as described in Map.merge
or Collectors.toMap(Function, Function, BinaryOperator)
.
Concrete Use Cases
I’m unable to provide concrete use cases. I was implementing #6822 for my own purposes and it struck me that I can’t really choose if I should ban nulls entirely or not. I was willing to align with ImmutableMap.toImmutableMap(Function, Function, BinaryOperator)
but found that what I’m looking for is unspecified.
Checklist
-
I agree to follow the code of conduct.
-
I have read and understood the contribution guidelines.
-
I have read and understood Guava’s philosophy, and I strongly believe that this proposal aligns with it.
About this issue
- Original URL
- State: open
- Created 8 months ago
- Comments: 16 (16 by maintainers)
Commits related to this issue
- Document and test how `ImmutableMap.toImmutableMap` behaves when the `mergeFunction` returns `null`. (The test is Google-internal for now because we're in the process of reshuffling our `Collector` t... — committed to google/guava by cpovirk 8 months ago
- Document and test how `ImmutableMap.toImmutableMap` behaves when the `mergeFunction` returns `null`. (The test is Google-internal for now because we're in the process of reshuffling our `Collector` t... — committed to google/guava by cpovirk 8 months ago
- Document and test how `ImmutableMap.toImmutableMap` behaves when the `mergeFunction` returns `null`. (The test is Google-internal for now because we're in the process of reshuffling our `Collector` t... — committed to google/guava by cpovirk 8 months ago
If nulls are used to eliminate duplicates then yes, it is going to be error prone. However, I can imagine some exotic use cases, involving
toImmutableTable
but alsotoImmutableRangeMap
, which we all need desperately 😉, where you may want to use nulls as a way to avoid storing neutral elements in some collection / table / map (as being uninteresting or for whatever other reason). Merging-x
andx
could then result in null instead of0
, symmetric difference of two identical sets could result in null instead of empty set and so on. I know that this can be accomplished with post-processing (e.g. streaming, filtering out “zeroes”, recollecting), and that in general it would be less error-prone method than the one involving nulls, but we are where we are.I would love to see consistency between
toImmutableTable
andtoTable
. Firstly, it may be useful to some. Furthermore, consistency may be “less important”, but it is still important, I hope 😃 After all, it may be the only quality that that led me - and consequently you - to this issue.Thanks, I’m glad to hear that
BiFunction
is at least an option. You have found a clever way to show how much more complex that makes the signature, though 😉 (I do seriously think that the complexity may explain the difference betweenCollectors.toMap
, which has complex signature that we don’t want to become yet more complex (so it usesBinaryOperator
), andMap.merge
, which is relatively straightforward (so it usesBiFunction
). But I’m speculating.I wonder if the sweet spot might be
BiFunction<V, V, @Nullable V>
: We don’t do the PECS wildcards that explode the length of the type (the way thatBiFunction<? super V, ? super V, ? extends @Nullable V>
would), but we have enough distinct types that we can put@Nullable
where we want it.Still, this matters relatively rarely, so it’s not necessarily worth going more complex than
BinaryOperator
.And I can give some data about how rarely! I ran some tests overnight, and I have a couple different things to report:
@NonNull
from my earlier post is complex enough that it causes problems…null
from themergeFunction
oftoImmutableMap
. (I don’t think we have any tests of our own, though… 😃) Most (maybe all?) of them fit into the category of “If we see any duplicates, then drop the key altogether,” which I will note does not actually work if there might be 3, 5, 7, etc. copies of a key, IIUC 😃 So the functionality gets used, which is all the more reason to document and test it.So I think my plan now is:
BinaryOperator
vs.BiFunction
if we follow up on https://github.com/google/guava/issues/6822