HVM: Incorrect result on some inputs
HVM produces incorrect results on some inputs, one of which is:
(Main arg) = ((λa (a (λb ((λc (λd (b (c d)))) a)))) (λe (e e)))
On this input, HVM produces the result λx1 λx2 ((x1 x2) (x1 x2)), while the correct result is λx1 λx2 ((x1 x1) (x2 x2)) (as verified by https://crypto.stanford.edu/~blynn/lambda/, whose input syntax is obtained by adding a dot after each lambda's variable: ((λa. (a (λb. ((λc. (λd. (b (c d)))) a)))) (λe. (e e)))).
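The claimed correct result can be double-checked with a tiny λ-calculus normalizer. This is a minimal Python sketch (my own, not part of HVM): capture-avoiding substitution with fresh renaming, full normalization, and printing with binders renamed x1, x2, … in order of appearance.

```python
import itertools

# AST: ('var', name) | ('lam', name, body) | ('app', fun, arg)
fresh = itertools.count()

def subst(t, x, v):
    """Capture-avoiding substitution t[x := v]; binders are renamed to fresh names."""
    if t[0] == 'var':
        return v if t[1] == x else t
    if t[0] == 'app':
        return ('app', subst(t[1], x, v), subst(t[2], x, v))
    _, y, body = t
    y2 = f'_{next(fresh)}'
    return ('lam', y2, subst(subst(body, y, ('var', y2)), x, v))

def normalize(t):
    """Fully normalize a term (terminates: this example is strongly normalizing)."""
    if t[0] == 'var':
        return t
    if t[0] == 'lam':
        return ('lam', t[1], normalize(t[2]))
    f = normalize(t[1])
    if f[0] == 'lam':
        return normalize(subst(f[2], f[1], t[2]))
    return ('app', f, normalize(t[2]))

def show(t):
    """Pretty-print, renaming binders x1, x2, ... in order of appearance."""
    counter = [0]
    def go(t, env):
        if t[0] == 'var':
            return env[t[1]]
        if t[0] == 'lam':
            counter[0] += 1
            n = f'x{counter[0]}'
            return f'λ{n} ' + go(t[2], {**env, t[1]: n})
        return '(' + go(t[1], env) + ' ' + go(t[2], env) + ')'
    return go(t, {})

lam = lambda x, b: ('lam', x, b)
app = lambda f, a: ('app', f, a)
var = lambda x: ('var', x)

# ((λa (a (λb ((λc (λd (b (c d)))) a)))) (λe (e e)))
term = app(
    lam('a', app(var('a'),
        lam('b', app(lam('c', lam('d', app(var('b'), app(var('c'), var('d'))))),
                     var('a'))))),
    lam('e', app(var('e'), var('e'))))

print(show(normalize(term)))  # λx1 λx2 ((x1 x1) (x2 x2))
```

This agrees with the λ-calculus result reported above, not with HVM's output.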
Besides, on other inputs such as:
((λa λb ((b a) λc λd (λe (d a) a)) λf (f f)) λz z)
HVM produces the result λx1 (x1 λx2 (x2 _)) which is not even well-formed.
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 43 (18 by maintainers)
@Ekdohibs
`map (map f) l` is completely fine! As is basically anything you’ll come up with in practice. I think this is the main reason people get scared: they don’t realize what the limitation really entails!

As to silently failing, that’s just like every other undefined behavior, not unlike adding a number and a string, or reading arrays out of bounds. HVM is an untyped compile target, not meant to be a user-facing language. Obviously languages that compile to it will prevent these programs during type checking.
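For concreteness, here is the nested-map idiom under ordinary semantics, sketched in Python (the names `f` and `l` are illustrative):

```python
f = lambda n: n + 1              # the function being mapped
l = [[1, 2], [3, 4]]             # a list of lists
# Equivalent of Haskell's `map (map f) l`: apply f inside every inner list.
nested = list(map(lambda row: list(map(f, row)), l))
print(nested)  # [[2, 3], [4, 5]]
```

Nothing here copies itself, which is why this pattern is unaffected by the limitation.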
This limitation should probably be noted in the README. You probably also want to add more information on the limitation to make clear under what situations this is (im)possible. I assume this limitation is impossible to hit if the type system precludes applying a function to itself?
Victor gave an example of what isn’t allowed on the Telegram channel:
He said:
Ah, I see. Sadly, I’m not sure how that could be implemented. After the initial compilation from λ-calculus to interaction nets, the graph will go through intermediate states that don’t even correspond to λ-terms. It is not obvious to me how we could identify exactly when a divergence occurs. But I’m not sure if that’s not possible either; perhaps there is some easy way to do so. I’ll reply here again if I have any additional insight.
The issue is that HVM is fundamentally unusable for most functional programming languages unless this is fixed at its root. Supporting the full λ-calculus is not a necessity, but when you go from e.g. GHC Core to HVM, there must be a way of detecting whether a given GHC Core term can be translated to HVM. It is possible to make a naïve “type checker” for this, but it’s very hard to make one that supports a language expressive enough for common Haskell idioms, AFAIK. If you cannot guarantee that, then it’s not useful in real-world applications. Why would you build a program that maybe works, when hitting the limit is possible with trivial and standard Haskell code?
@Ekdohibs think about it like this:
- Every duplicator on the source (`dup x y = term`) has a unique label.
- A duplicator can’t copy a term that has a duplicator with the same label as itself.
For example:
The last line is undefined behavior, because it has a dup with a `#1` label attempting to copy a function which has a dup with a `#1` label. Notice that the first level of self-copying is fine. It is when clones clone clones that you get a problem.

This can only happen when a function copies itself, and then the copy also copies itself. Emphasis on the “self”. Functions copying other functions is fine. Repeated self-copying is the only way, because, otherwise, it would be impossible for two identical labels to collide, since every `dup` has a globally unique label.

In general, any program that isn’t allowed is similar to `(λx(x x) λx(x x))` in shape. For example, the first program you posted is just a larger version of it. And the second one expands to it after a few steps.
It is not hard to come up with a static checker for this requirement. For example, the old Formality-Core had one, but it was too restrictive, because it stratified programs in levels, so that a `dup` on level `X` could only copy terms on level `X+1`, avoiding this problem. A much better solution, I believe, would be to just track a function’s lifecycle and make sure it isn’t cloning itself, followed by its clones also cloning themselves.

The thing is, this is not undefined behavior.
Note that HVM is not a Lambda Calculus runtime, but a Symmetric Interaction Calculus runtime. It is a coincidence that 99% of the functional programs and λ-terms you can write in practice can be compiled to a lightweight interaction combinator, but that’s just it - a happy coincidence, that allows us to have a functional-looking syntax.
That said, there is no UB in HVM. It will always give you the correct answer, in relation to Symmetric Interaction Calculus semantics, which, in some cases, differ from λ-Calculus semantics. It is a common misconception that HVM will “fail” when it attempts to reduce a λ-term that is deemed “incompatible”, but it will not. It will just give a different answer, since the semantics of interaction combinators differ from λ-Calculus semantics regarding the behavior of non-linear variables.
That’s why it wouldn’t make sense to halt execution even if it was possible to detect such divergent cases (and it isn’t). Not everyone comes from a λ-calculus point of view. It is perfectly valid to use HVM as just a raw interaction net runtime.
The code I had in mind was something like:
Notice how the `Monad m` in the definition of `f` has the side-effect that the `fmap` is passed as an argument once typeclasses are lowered, so that the function `f` makes two clones and applies one to the other.

With the following code I get incorrect results (as expected):
The result given by HVM is `(Cons (Cons 0 (Cons 3 (Nil))) (Cons (Cons 0 (Cons 3 (Nil))) (Nil)))`, while `(Cons (Cons 0 (Cons 1 (Nil))) (Cons (Cons 2 (Cons 3 (Nil))) (Nil)))` is expected, and is what we get if we replace an instance of Map by an identical copy.

However, I (personally) find this code to be very reasonable: no self-application, only the classic “Map” function is used, and the most unusual part is arguably the list of functions (which is not even that unusual).
No, because the Y-combinator just creates infinite copies of the function it receives, but it never actually copies itself, nor does the function it receives copy itself. But applying the Y-combinator to itself is undefined behavior.
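To illustrate why this copying pattern is benign, here is the strict-language variant of the Y-combinator (the Z combinator) in Python — the combinator replicates the function it receives, but that function never copies itself:

```python
# Z combinator: strict variant of Y, usable in an eager language like Python.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# Recursion without self-reference: the factorial body never mentions itself,
# it only uses the `rec` handle the combinator supplies.
fact = Z(lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1))
print(fact(5))  # 120
```

The replication happens inside the combinator's `x(x)` self-applications, not inside `f`, which is the distinction being made above.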
I understand your point that it might be annoying to demand such a checker from any language interested in compiling to HVM, though. Perhaps we could provide a checker as part of HVM, somehow. I’ll think about that.
This seems to be violating the “clones can’t clone their own clones” limitation, explained here. HVM is just a (much) faster functional runtime based on the abstract algorithm, not a full λ-calculator. Attempting to make a clone clone its own clone is considered undefined behavior. I stress that this only happens in highly artificial examples (like these!).
Edit: I now consider that HVM should be seen as an Interaction Net runtime that happens to be suitable as a functional back-end, as long as compatibility is ensured by a type-checker or something equivalent. But there is no such thing as UB, as these reductions are correct w.r.t. INet semantics. See below for more info.
@VictorTaelin What do you think about this?:
(It builds upon this earlier paper: https://arxiv.org/abs/1701.04691)
This solution only requires a (concurrent) hashmap. It’s implemented here: https://github.com/codedot/lambda, with an online demo here: https://codedot.github.io/lambda/. E.g., if you enter `(g: g g) (s: z: s (s z))`, you get the right result `v1, v2: v1 (v1 (v1 (v1 v2)))`. (And if you enter @Ekdohibs’s original term as `((a: (a (b: ((c: (d: (b (c d)))) a)))) (e: (e e)))`, you get the expected result `v1, v2: v1 v1 (v2 v2)`.) In the benchmarks table, it shows that it reaches the result in fewer interactions than other approaches. Although it fails to reduce the term `(f: f (f (x: x))) (i: (f: f (x: x) (f (x: x))) (x: (h, u: h (h u)) (y: x (i y))))` for some reason.

Right, so I think you got hung up on @Kesanov’s use of “undefined behavior”. I think they meant that terms with divergent semantics could be defined as UB from the point of view of a lambda-calculus-based functional language implementation that would compile to HVM. I.e., the compiler would guarantee lambda-calculus semantics within some boundaries, and outside of these boundaries all bets would be off, which is what would allow it to optimize the code (by compiling to HVM).
Now, @Kesanov was not suggesting that HVM crash on divergent programs, but just that it would be useful (for developers of lambda-calculus compilers) to have a debug mode to detect these divergent cases, just like address sanitizers and other instrumentations can detect when a program goes outside of its semantically-specified subset. Do you still think this would be impossible to implement?
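As a quick sanity check of the `(g: g g) (s: z: s (s z))` example above, using Python closures: Church numeral 2 applied to itself is exponentiation, yielding numeral 4, which matches the reported result `v1, v2: v1 (v1 (v1 (v1 v2)))`:

```python
two = lambda s: lambda z: s(s(z))   # Church numeral 2
four = two(two)                     # self-application: 2^2 = 4
# Decode the Church numeral by counting applications of a successor.
print(four(lambda n: n + 1)(0))     # 4
```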
I think it would be useful to have a “debug” mode, that would throw a descriptive runtime error and halt the execution when this undefined behaviour is encountered.
This might make HVM a great compilation target even for untyped languages. It would also simplify debugging for compiler writers or curious tinkerers who just want to play with HVM.
In the Y-combinator case, it is neither the function it receives nor the Y-combinator which copies itself, but the `(\r. f (r r))` subterm. If we expand the body of the Y-combinator to `(\r. dup#0 s t = r; f (s t)) (\r. dup#1 s t = r; f (s t))`, after a reduction it becomes `(\r. dup#1 s t = r; f (s t)) (\r. dup#1 s t = r; f (s t))`, and the `dup#1` needs to copy a `dup#1`.
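The collision in that expansion can be made mechanical. Below is a toy Python check over terms with explicit labeled dups — the representation and the rule ("no dup may copy a subterm containing its own label") are my own sketch of the condition discussed in this thread, not HVM's actual checker:

```python
# Terms: ('var', x) | ('lam', x, body) | ('app', f, a) | ('dup', label, val, body)
def labels(t):
    """All dup labels occurring anywhere inside a term."""
    if t[0] == 'dup':
        return {t[1]} | labels(t[2]) | labels(t[3])
    if t[0] == 'lam':
        return labels(t[2])
    if t[0] == 'app':
        return labels(t[1]) | labels(t[2])
    return set()

def violates(t):
    """True if some dup#L would copy a subterm that itself contains a dup#L."""
    if t[0] == 'dup':
        return t[1] in labels(t[2]) or violates(t[2]) or violates(t[3])
    if t[0] == 'lam':
        return violates(t[2])
    if t[0] == 'app':
        return violates(t[1]) or violates(t[2])
    return False

body = ('app', ('var', 'f'), ('app', ('var', 's'), ('var', 't')))
arm = lambda lbl: ('lam', 'r', ('dup', lbl, ('var', 'r'), body))

# Before reduction: dup#0 copies a term containing only dup#1 -- fine.
before = ('dup', 0, arm(1), body)
# After reduction: dup#1 must copy a term containing dup#1 -- collision.
after = ('dup', 1, arm(1), body)
print(violates(before), violates(after))  # False True
```

This mirrors the expansion above: distinct labels are fine before the reduction, and the collision appears exactly when `dup#1` receives a term that contains another `dup#1`.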