go: go/types, types2: missing types for unions

Pointed out in slack by @dominikh, we’re not recording types for union (T1 | T2) or approximation (~T) elements in interfaces, or inlined in constraints. This is inconsistent and may be the first time we don’t record a type for a binary or unary expression.

We should consider changing this behavior to record the relevant union type. It is not clear what to do in the case of A | B | C: this expression consists of two binary expressions, but we only build one union in the type checker.

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Reactions: 2
  • Comments: 27 (22 by maintainers)

Commits related to this issue

Most upvoted comments

Let’s not block the beta on this. It affects almost no users directly. Yes, it affects tool authors, but the whole point of the beta is to get the code into the hands of both users and tools authors, so that they can give us feedback. This probably isn’t the only problem they will run into.

We do expect the release to be about a month late overall, and that is okay. This is quite possibly the most significant release since Go 1. It will take time to get right, and we will give it that time. It may be that a second beta will be needed, and that would be okay too. But the most important thing we can do right now is package up what we have - which is really very good!, just not perfect - and give it a wider audience.

I’m sorry, but I’m feeling rather dissatisfied with this solution. It ignores most of the points I made in https://github.com/golang/go/issues/50093#issuecomment-991441289 in particular with regard to backwards compatibility for old tools. With the current fix, existing tools will still encounter BinaryExprs without types, and will still need to add special checks handling type parameters even when they don’t have to care about them.

the API changes that we have made were communicated and agreed upond in the August time frame

That is, without a usable implementation until much later, which means the changes were difficult to evaluate. I feel like the tooling ecosystem wasn’t given the opportunity to test these changes and provide their feedback. Few of us would’ve been able to update our tools while battling with implementation bugs in the type checker and compiler. Now seems to be like a good time to test things. Furthermore, looking over #47781, it doesn’t look like much attention was paid to this particular issue in the first place, and the proposal saw very little discussion overall.

The way I see it, we’re stuck with an unclean API because “it’s too late to change the API”, but by the time useful feedback could be provided, it was already too late.

I’m sorry, but I’m feeling rather dissatisfied with this solution.

No I’m sorry. We didn’t mean for closing this to be dismissive of the underlying issue; rather, we thought that it was a clear bug that we’re not recording a type for the top-level binary expr, and it was perhaps unfortunate but OK that we didn’t choose to use a new node type for unions. (and for the record, I argued that it was OK, not @griesemer). Based on the response, let’s reopen and reconsider. We certainly don’t want to “spend the next ten years explaining this problem over and over”, as @seebs put it, just because we didn’t want to take a step back here. (though I’m not yet convinced that that prediction is accurate). If it is significantly better to change the API now, I agree we should do it, painful though it may be.

Let me also offer a blanket apology for the timing. We’ve been rather focused on updating go/types and x/tools, and I wasn’t expecting there to be fundamental problems with the go/ast proposal, which at the time seemed uncontroversial, probably for the same reasons as @mvdan said “sounds good”.

With that said, let’s try to keep this issue about making the best decision right now, given everything we know, rather than figuring out what happened. I don’t mean to deflect blame: once generics lands, I plan to write up a postmortem for myself, and I’d be happy to share it and accept input from others.

As I see it, we need to decide the following, very soon:

  • Does this need to block the beta?
  • Do we need to change the current representation of unions in the AST?

(I put those in perhaps the reverse order as one would expect, because the singular most pressing issue is whether to block the beta).

Once those have been decided, we need to address the following, pretty soon:

  • What is the ideal representation for unions in the AST?
  • What updates are needed for go/types, and other downstream tools.

In my opinion, this shouldn’t block the beta. My rationale for this is that a decision here might break some tools, but that is less of a problem than delaying the critical exposure we get from cutting the beta.

As for whether a change is needed, I’m not yet sure. My argument against changing the current representation essentially comes down to the fact that we already parse the value expression A | B | C as a binary expr, and there may be unexpected consequences to having the same textual expression parse differently depending on context. What would parser.ParseExpr do? What consequences could this have on tools or editors in the future, particularly if we ever allow A | B outside of constraints?

How much would it mitigate the problem if we recorded types for every binary expr in A | B | C? We didn’t do this mostly because types.Union memoizes its type set, so doing this is potentially non-linear in memory. We could actually move that memoization to a map during type checking, in which case recording a type for the A | B in A | B | C would not use significant memory. This would solve the problem of BinaryExprs without a type, but of course wouldn’t change the fact that a BinaryExpr can now be a type expr. However, there is already precedent (with StarExpr and Ident) for expression types that may be a value or a type. Could it be the case that there will be an adjustment period for BinaryExpr and UnaryExpr, but once that period is over things will be fine?

To add my 2c: I don’t think any external tool making heavy use of go/ast and go/types has fully adapted to the changes for generics. The first one looks to be Staticcheck, and Dominik has already explained how the changes are leaning towards the “breaks my code in many places” end of the spectrum.

I think there might have been a slight miscommunication when we discussed the go/ast and go/types API proposals. At least speaking for myself, all I could say is “sounds good” - it was impossible to tell whether my tools would break or not until the APIs were actually fully implemented. So I would say we had a hopeful plan back in August, but we’ve only started testing it for real in the past few weeks.

And, as much as it is fairly late to re-evaluate, I also think we have a small time window to make it happen. Beta1 isn’t out yet, and it is looking like 1.18 will be late by a full month if not longer. I’d very much rather see a couple more last-minute API changes for the sake of breaking fewer tools once beta1/rc1/1.18 come out. It seems to me like we wouldn’t even be that late, given that the changes for go/types finished landing just a week ago.

This seems really pretty unfortunate. If they’re not binary expressions, and can’t be used in code that previously worked on all binary expressions, this appears to me to violate the compatibility promise, at least in spirit – it makes code that doesn’t otherwise need to know or care about generics start panicing doing something that was previously totally safe.

My very limited experiments with AST walking have usually not involved any kind of fancy secondary processing at a higher level to make sure that I’m blocking descent into trees that are labeled as binary expressions but actually secretly internally they are not binary expressions and their sub-expressions won’t have types. I don’t think they should have to.

Ideally, I think, if these aren’t really binary expressions, they shouldn’t look like them in the syntax tree. If they are binary expressions, they should have types exactly the way other binary expressions would.

This feels like one of those things where we’re going to spend the next ten years explaining this problem over and over, but won’t be able to fix it because it was in a release.

It seems to me that it’s going to be less work to recognize that you’re walking the internals of an interface and that binary expressions need to be handled differently. That is, rather than generalizing all code, it seems simpler to specialize this one case. This would be what you’d do if there was a new union node, after all.

I disagree. I think that handling a new node type requires less work than adding special handling of an existing node type.

  • If unions were a new node type, then a lot of existing uses of go/ast would continue to work as-is. For example, most analyzes don’t care about all kinds of nodes, but only a subset. An analysis that flags x + 0 only cares about binary expressions, for example.
  • Code that cares about all node types is usually set up to fail in an easily recognizable way when encountering a new node type, for example by using an exhaustive type switch (one that panics in its default branch). In contrast, code that doesn’t expect BinaryExprs to represent unions will panic in unexpected ways.
  • Handling the context of a node is cumbersome in go/ast. You either have to maintain your own stack, or use the helpers in astutil. Existing code that never had to consider the context is doing neither, and is just looking at a node at a time using ast.Inspect. In contrast, handling a new node type is usually just another case in a type switch.
  • Unions can show up in at least two places, because syntactically the interface can be implicit. This makes it more intricate to handle unions. It may also require further changes in the future should unions be allowed in more places.
  • My guess is that a lot more code will want to look at actual binary expressions than at unions. And virtually all code looking at binary expressions will have to check the context.

It’s quite possible that I’m overlooking all of the complications that a new node type would cause, but in the context of Staticcheck, I cannot think of any case where I would prefer the reuse of BinaryExpr over a new node type.

To be clear, I’m not at all concerned with the beta, only the final release, and I agree with Russ.

As far as changing the AST representation is concerned: I don’t have a clear favourite. I’d be equally fine with introducing a new node type, not treating type lists as binary expressions, or properly treating them as binary expressions and recording type information for all the subexpressions. I think both approaches have about the same amount of backwards compatibility for existing tools, and both approaches allow easily differentiating between type lists and “real” binary expressions. It is only the current mixture of “we’re representing it as a BinaryExpr, but not treating it as one” that’s causing issues.

I don’t think that it’d be a big problem for BinaryExpr referring to types. Most code should be able to handle that transparently.

I’d be equally fine with introducing a new node type, not treating type lists as binary expressions, or properly treating them as binary expressions and recording type information for all the subexpressions.

Unless someone feels strongly that we should use a new node type, I think it makes more sense to just treat them as proper binary expressions. Same rationale as I mentioned above: parsing A | B | C differently based on context seems problematic, if not now then in a good percentage of potential futures.

I believe the two CLs above achieve the desired result, by recording types for all subexpressions of the union, including the UnaryExpr tilde-terms. For the latter, I record a Union type with a single term, which is consistent with what we would do for the ~A constraint expression.

I think we can merge these once they’re reviewed, even if we’re not done discussing here. They are mostly-safe changes that just add additional information to types.Info.