TypeScript: Literal type narrowing regression

TypeScript Version: typescript@2.1.0-dev.20160913

Code

type FAILURE = "FAILURE";
const FAILURE = "FAILURE";

type Result<T> = T | FAILURE;

function doWork<T>(): Result<T> {
  return FAILURE;
}

function isSuccess<T>(result: Result<T>): result is T {
  return !isFailure(result);
}

function isFailure<T>(result: Result<T>): result is FAILURE {
  return result === FAILURE;
}

function increment(x: number): number {
  return x + 1;
}

let result = doWork<number>();

if (isSuccess(result)) {
  increment(result);
//          ~~~~~~
//          Argument of type 'string | number' is not assignable to parameter of type 'number'.
//            Type 'string' is not assignable to type 'number'.
}

Expected behavior:

result (before the narrowing) should be "FAILURE" | number
result (after the narrowing) should be number

Actual behavior:

result (before the narrowing) is string | number
result (after the narrowing) is string | number
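
A possible workaround, sketched under the assumption that an explicit annotation on the variable pins its declared type regardless of the widening rules. The names Outcome, FAILED, parseCount, succeeded and bump are hypothetical stand-ins for the ones in the report, not code from it:

```typescript
// Hypothetical stand-ins for Result/FAILURE from the report.
type FAILED = "FAILED";
const FAILED: FAILED = "FAILED";
type Outcome<T> = T | FAILED;

// Returns the parsed number, or the FAILED marker when the input is not numeric.
function parseCount(input: string): Outcome<number> {
  const n = Number(input);
  return isNaN(n) ? FAILED : n;
}

// User-defined type guard: true when the outcome is not the failure marker.
function succeeded<T>(outcome: Outcome<T>): outcome is T {
  return outcome !== FAILED;
}

function bump(x: number): number {
  return x + 1;
}

// The annotation makes the declared type exactly Outcome<number>,
// so the guard can narrow it to number.
let parsed: Outcome<number> = parseCount("41");
const bumped = succeeded(parsed) ? bump(parsed) : -1;
```

With the annotation in place, the guarded call to bump type checks because the variable's type is `number | "FAILED"` rather than `string | number`.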

About this issue

State: closed
Created 8 years ago
Comments: 24 (20 by maintainers)

Most upvoted comments

@wallverb I’ll think about the suggestion of showing widening (or some other indicator) in the hints. In general I’m not crazy about showing syntax that isn’t supported in the language, and I don’t think the distinction merits elevation to an actual type modifier.

ahejlsberg on Sep 24, 2016

@wallverb Yeah, I specifically tried to capture as many subtleties as possible in a short example, and it definitely adds another layer to the rules. I think the intuition is reasonably simple though–as long as an explicit type annotation hasn’t been encountered for a particular literal in an expression, that literal type will widen to string when inferred for a mutable location.
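
A small sketch of that rule as it behaves in current TypeScript releases (the dev build in this report may differ). The SameType helper is a common community idiom for compile-time type equality, and all variable names are made up for illustration:

```typescript
// Compile-time equality check (community idiom, not part of this thread).
type SameType<A, B> =
  (<T>() => T extends A ? 1 : 2) extends (<T>() => T extends B ? 1 : 2) ? true : false;

const fixedGreeting = "hello";        // immutable location: type "hello"
let looseGreeting = "hello";          // mutable location: widened to string
const annotated: "hello" = "hello";   // explicit annotation on the literal
let keptGreeting = annotated;         // annotation was seen: stays "hello"

// Each line only compiles if the two types are identical.
const checkFixed: SameType<typeof fixedGreeting, "hello"> = true;
const checkLoose: SameType<typeof looseGreeting, string> = true;
const checkKept: SameType<typeof keptGreeting, "hello"> = true;
```

The last check is the interesting one: the same literal ends up with different types in a let depending on whether an annotation was encountered along the way.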

ahejlsberg on Sep 24, 2016

Oh, one last thing. A very simple way to understand the confusion Godfrey had initially was that these two results seem incoherent and surprising:

let x: Event = foo();

// works differently from

let x = foo(); // where foo() is explicitly defined as returning Event

wycats on Sep 21, 2016

My TLDR:

I think it’s fatally problematic for const and let to produce different types for the same rhs expression (especially in the only-assigned-at-declaration let case).
I think we’re over-rotating on concerns about let mutation
I think allowing the semantics of let mutation of literal subtypes to diverge from other kinds of subtypes is creating more problems than it solves.

Onward:

@mhegazy I’m quite a bit confused about the way you’re applying the stated principles to this problem. Among other things, I’m pretty concerned about so many special rules applied to literal subtypes that are not applied to other subtypes. I’ll get to that in a bit.

A literal is substitutable with its symbolic name, i.e. replacing func("foo") with const foo = "foo"; func(foo); should behave the same

In order to make this work, you needed to define “its symbolic name” as “a name declared as a const binding”.

In particular, this doesn’t work:

type Event = "onmouseover" | "onmouseout";
function onmouseover(): Event { return "onmouseover"; }
function onmouseout(): Event { return "onmouseout"; }
function register(event: Event, callback: DOMCallback) { window.addEventListener(event, callback); }

let event = onmouseover();
register(event, e => { console.log(e); });

In this case, event is a symbolic name that we have declared as returning an Event (which is an abstraction over the precise literals we’re talking about). We have other functions that also take Events, but because our “symbolic name” is technically mutable (even though we haven’t mutated it), this code fails to type check.

The fact that const event = onmouseover() would be the only change needed to make the same code type check is incredibly unintuitive.
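
For comparison, here is a DOM-free sketch of the same shape. PointerEventName, overEvent and registerEvent are hypothetical names; the assumption being illustrated is that a union written in a return type annotation is not widened, which is how current TypeScript releases behave as far as I know, so both declarations are accepted:

```typescript
// Hypothetical stand-ins for Event/onmouseover/register, without the DOM.
type PointerEventName = "onmouseover" | "onmouseout";

function overEvent(): PointerEventName {
  return "onmouseover";
}

const registered: PointerEventName[] = [];

function registerEvent(name: PointerEventName): void {
  registered.push(name);
}

// Both bindings take their type from the declared return type.
let viaLet = overEvent();
const viaConst = overEvent();

registerEvent(viaLet);   // this is the call that failed in the dev build
registerEvent(viaConst);
```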

I agree with and accept the constraint about symbolic names, but I think a TypeScript user’s plausible intuition doesn’t differentiate between const and assigned-only-at-initialization let. Try to look at the above example with fresh eyes and see if it feels intuitive to you.

Types are always treated the same, regardless of whether they are inferred or explicitly listed. This has been a principle that TS has operated under, and there are a lot of assumptions that are built on this, e.g. declaration file emit does not differentiate between inferred and explicit type annotations.

Seems reasonable, but perhaps not a hard constraint, given how tricky this problem is getting.

In particular, one approach that gets at the heart of the problem is to have .d.ts generators always emit the wider type if one is not specified. Because we have an opportunity to decide what should happen when we generate a .d.ts, and because we all agree that the inferred return type of a function shouldn’t change based on usage, I don’t think there’s any problem with assuming that inferred types are always wider, while explicitly declared types are as-declared.

Identifying a declared type should always be possible, and once identified it should not be changed. To clarify the terminology, a declared type is either the explicit type annotation on a declaration, or the type of the initialization expression if one is not present, otherwise any. This is what makes var x = 1 be a number.

I think this is too loose of a description of the semantic questions. Identified by who, in what context? Do you mean “when hovering over a variable in the IDE”? Or do you mean “once a function has a return type, it shouldn’t change”?

In particular, I strongly agree with the decision that the TS team made to avoid cross-function inference. So I agree that identifying the type of a function should always be possible, and once identified should not be changed unless the function body has changed.

I also agree that the type of a declared variable should never change from one type to an unrelated one. However, it is not at all obvious to me that those sources of agreement imply that automatic widening within a single function body should be disallowed. That question depends a lot on expected intuitions, as well as how effective we can make our error messages at guiding people in the right direction.

One final intuition is that mutable locations (e.g. let, array element, or property declaration) are not literal type locations in general, mainly as literal types are not very useful in these conditions. e.g. var x = 0; assuming that the type of x is 0 is not very useful, and most likely the user intent is to have x be a number.

I simply disagree that this is an “intuition”. I would accept that it might be “a heuristic we can teach people”, but I strongly disagree that this behavior is something that somebody would intuit without instruction and memorization.

I also disagree that literal types are not very useful in these conditions, especially when type aliases are used:

type Event = "onmouseover" | "onmouseout";
function onmouseover(): Event { return "onmouseover"; }
function onmouseout(): Event { return "onmouseout"; }
function register(event: Event, callback: DOMCallback) { window.addEventListener(event, callback); }

let event = onmouseover();

if (someCond) {
  event = onmouseout();
}

register(event, e => { console.log(e); });

In this case, the literal is pretty useful and intuitive, and it’s surprising that event becomes string simply because I used let.

With all that said, I’d like to take another tack at expressing my concern about giving literals special semantics.

Here’s an example that’s pretty similar conceptually to the problem:

function text(text: string, inline = false) {
  let el = document.createElement('div');

  if (inline) {
    el = document.createElement('span');
  }

  el.innerText = text;
}

As in the string literal situation:

a human author might very well expect document.createElement to return an Element, just as a human might expect a string to be string
HTMLDivElement is a subtype of Element, just as "onmouseover" is a subtype of string
a human author would find the error here very confusing ([ts] Type 'HTMLSpanElement' is not assignable to type 'HTMLDivElement'. Property 'align' is missing in type 'HTMLSpanElement'.) just as a human author might find the string error confusing.
a user might well want precisely an HTMLDivElement (if they were trying to use align 😉) just as a user might well want precisely an "onmouseover" (if calling a function that expects it).

A great part of the reason that this doesn’t matter as much as expected in practice is that mutation to a different subtype of the same supertype is relatively rare. In the absence of mutation, passing the subtype to a function that expects the supertype works as expected.
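
To make the analogy concrete without the DOM, here is a minimal sketch using made-up classes (BaseNode, DivNode, SpanNode and render are hypothetical). It shows the usual fix for the non-literal case, which is to annotate the let with the supertype:

```typescript
// Hypothetical stand-ins for Element, HTMLDivElement and HTMLSpanElement.
class BaseNode {
  text = "";
}

class DivNode extends BaseNode {
  align = "left";
}

class SpanNode extends BaseNode {}

function render(text: string, inline = false): BaseNode {
  // Without the annotation, `node` is inferred as DivNode and the
  // reassignment below is rejected because SpanNode lacks `align`.
  let node: BaseNode = new DivNode();

  if (inline) {
    node = new SpanNode();
  }

  node.text = text;
  return node;
}
```

Nobody asks for `let` to widen DivNode to BaseNode automatically here, which is the asymmetry with string literals that the comment is pointing at.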

Finally, I think it’s important to acknowledge that the primary scenario driving the decision to treat const and let differently is this:

let x = "hello";

if (someCond) {
  x = "goodbye";
}

First, some examples to illustrate why it’s the driving scenario:

function print(s: string) {
  console.log(s);
}

function hello(h: "hello") {
  helloworld(h, "world");
}

let x = "hello"; // let's assume we make x: "hello"
print(x); // type checks, because "hello" is a subtype of string
hello(x); // type checks, because "hello" is "hello"

However:

let x = "one";

if (someCond) {
  x = "two"; // we think it's problematic/unintuitive for this to be a type error
}

I think this is definitely a concern worth thinking about, because it’s absolutely true that most people wouldn’t expect the narrower type in this scenario.

However, I think solving it by differentiating between let and const is an overly broad hammer to attack this problem with.

Remember that the same problem, with the same intuitive hurdles, exists for other kinds of types:

let x = document.createElement('div');

if (someCond) {
  x = document.createElement('span'); // we aren't as concerned about making this a type error
  // [ts] Type 'HTMLSpanElement' is not assignable to type 'HTMLDivElement'.
  //       Property 'align' is missing in type 'HTMLSpanElement'.
}

We could use the same logic about mutable locations to argue that let x should produce a wide type (maybe the most general subtype of Object) and that users should write const x if they want the narrow type, but we don’t (and I don’t think we should either).

The most natural way to address the various constraints is to respect the types that users wrote down and use wider types if no narrower type is ever specified.

This addresses any scenario where an explicit type was declared for a literal anywhere:

function onmouseover(): Event { return "onmouseover"; }

let x = onmouseover(); // x: Event because that's what the function said

We can now address the exact literal case much more narrowly, with one of the following solutions:

Option 1. const and “only-assigned-at-initialization let to a literal” uses the narrower type, while actually-mutated let uses the wider type. The user can use a type ascription on the original let to get a narrower type if they’re mutating the let.

Option 2. all const and let to a string literal use the narrower type, and if the user mutates a let variable to a different subtype of the same primitive type, we instruct them to add : string to the original declaration (I can see why this might seem annoying, but it’s consistent with how we handle Element, not so common, and a good error would go a long way).

Option 3. use internal-to-function-only inference to give mutated let variables a union type of all of the string literals that the variable is assigned to.

Important: any type produced by an abstraction gets the explicit return type specified by the function. function() { return "foo"; } has the inferred return type string.
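
A quick way to check that claim against current TypeScript releases, as a sketch: EqualTypes is a community idiom for compile-time type equality, and Snack, plain, snack and relay are hypothetical names.

```typescript
// Compile-time equality check (community idiom, not part of this thread).
type EqualTypes<A, B> =
  (<T>() => T extends A ? 1 : 2) extends (<T>() => T extends B ? 1 : 2) ? true : false;

type Snack = "Hamburger" | "Hot Dog";

// No annotation: the inferred return type is the widened type, string.
function plain() {
  return "foo";
}

// Explicit annotation: the return type is exactly Snack.
function snack(): Snack {
  return "Hamburger";
}

// No annotation, but the returned expression already has type Snack.
function relay() {
  return snack();
}

// Each line only compiles if the two types are identical.
const plainIsString: EqualTypes<ReturnType<typeof plain>, string> = true;
const relayIsSnack: EqualTypes<ReturnType<typeof relay>, Snack> = true;
```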

I want to flesh out Option 3 because it has some non-obvious properties:

// inferred return type: `string`
function hello() {
  return "string";
}

// inferred return type: `string`
function world() {
  return hello();
}

type Food = "Hamburger" | "Hot Dog" | "Salad";
function burger(): Food {  // return type Food
  return "Hamburger";
}

function hotdog(): Food { // return type Food
  return "Hot Dog";
}

// inferred return type: Food, from burger()
// note: inferred return types are always fixed, but they are also inductive
function hamburger() {
  return burger();
}

function eat(food: Food) {

}

let food = burger(); // inferred type: Food, supplied by burger();
eat(food); // type checks

function main() {
  let food = burger(); // inferred type: Food

  if (Math.random() > 0.5) {
    food = hotdog(); // type checks, food: Food, and hotdog(): Food
  }

  let name = "Yehuda"; // inferred type: "Yehuda" | "YEHUDA";

  if (Math.random() > 0.5) {
    name = "YEHUDA";
  }

  // inferred type: "TypeScript" | Food
  let language = "TypeScript";

  if (Math.random() > 0.5) {
    language = burger(); 
  }

  // inferred type: string
  let author = "Anders";

  if (Math.random() > 0.5) {
    author = hello(); // reminder -> hello(): string
  }
}

The interesting things about these semantics are:

Determining the type of a mutable let can always be accomplished by unioning the type of all direct assignments to the mutable variable
A particular function always has a fixed return type, which is either the specified named type, the wider type, or inductively determined from the functions it calls.
In many common cases, it will be possible to determine the type of a mutable variable (when initially assigned to a subtype of a primitive) purely syntactically (if the string literal "foo" and "bar" are the only assignments to a mutable variable, its type is "foo" | "bar"). But in all cases, it is possible to determine the type of such a variable with a single hop to a function or slot with a fixed type.
The behavior of const is as desired (const x = "foo" -> const x: "foo" = "foo"), but it produces a shallower cliff to let.
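
TypeScript does not infer this union for an initialized let (as far as I know, `let language = "TypeScript"` is typed as string), so the closest approximation today is to write the union by hand. A sketch with hypothetical names (Dish, dishBurger, pickLanguage):

```typescript
type Dish = "Hamburger" | "Hot Dog" | "Salad";

function dishBurger(): Dish {
  return "Hamburger";
}

function pickLanguage(flip: boolean): "TypeScript" | Dish {
  // Hand-written stand-in for the union Option 3 would compute from
  // the two assignments: the literal "TypeScript" and dishBurger(): Dish.
  let language: "TypeScript" | Dish = "TypeScript";

  if (flip) {
    language = dishBurger();
  }

  return language;
}
```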

There is also an alternative variant of these semantics that might be even easier to implement and satisfies more constraints than the current design (but still has some pitfalls):

When the initial right-hand-side of a let variable is a string literal, its type is the union of all of the right-hand-sides in the same function, iff all of the rhs’es are string literals. Otherwise, the type of a let variable is the type of its rhs.

Applying the same inductive rules to this variant allows explicit types to be respected, and avoids a semantic difference between let and const (because const x = "foo" will trivially satisfy the string literal restriction). The main caveat is that let x = "Hamburger"; x = hotdog(); would produce x: string, but that’s a rather minor caveat compared to the much larger cliffs we’re facing today.
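
As a toy model of this variant (not compiler code), the rule can be written as a function over a simplified description of each right-hand side. The Rhs shape and inferLetType are invented for illustration, and types are represented as plain strings:

```typescript
// A right-hand side is either a string literal or an expression
// whose type is already fixed (e.g. a call to an annotated function).
type Rhs =
  | { kind: "literal"; value: string }
  | { kind: "typed"; type: string };

// Returns a textual rendering of the type the variant would assign.
function inferLetType(initial: Rhs, reassignments: Rhs[]): string {
  // Initial rhs is not a string literal: the variable takes the rhs type.
  if (initial.kind === "typed") return initial.type;

  const literals: string[] = [initial.value];
  for (const rhs of reassignments) {
    // Any non-literal assignment falls back to the wide type.
    if (rhs.kind !== "literal") return "string";
    if (literals.indexOf(rhs.value) === -1) literals.push(rhs.value);
  }

  // All right-hand sides are string literals: union them.
  return literals.map(v => JSON.stringify(v)).join(" | ");
}
```

With no reassignments, a literal initializer keeps its literal type, which matches const. The `let x = "Hamburger"; x = hotdog();` caveat shows up as the early return of "string".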

I apologize for how long this reply has gotten. I didn’t intend to write such a big wall of text initially, and that may account for some incoherence between the beginning and ending of the comment. Please ask for clarifications or further fleshing out if something is confusing.

wycats on Sep 21, 2016

Some principles/intuitions that were involved in this change:

A literal is substitutable with its symbolic name, i.e. replacing func("foo") with const foo = "foo"; func(foo); should behave the same
Types are always treated the same, regardless of whether they are inferred or explicitly listed. This has been a principle that TS has operated under, and there are a lot of assumptions that are built on this, e.g. declaration file emit does not differentiate between inferred and explicit type annotations.
Identifying a declared type should always be possible, and once identified it should not be changed. To clarify the terminology, a declared type is either the explicit type annotation on a declaration, or the type of the initialization expression if one is not present, otherwise any. This is what makes var x = 1 be a number.
One final intuition is that mutable locations (e.g. let, array element, or property declaration) are not literal type locations in general, mainly as literal types are not very useful in these conditions. e.g. var x = 0; assuming that the type of x is 0 is not very useful, and most likely the user intent is to have x be a number.

With all of these in perspective, I believe the position we are in now is much better than where we were. There are some breaking changes that this entails, but overall the system is better positioned.

mhegazy on Sep 20, 2016