go-ipld-prime: API design question: Should Node.AsFoo methods return error?

The ipld.Node interface currently has a bunch of methods for unboxing its content to a primitive golang type:

type Node interface {
    // {... elided ...}

    IsNull() bool
    AsBool() (bool, error)
    AsInt() (int, error)
    AsFloat() (float64, error)
    AsString() (string, error)
    AsBytes() ([]byte, error)
    AsLink() (Link, error)

    // {... elided ...}
}

These methods return errors.

The question is: should they?

the utility of errors

Returning errors is generally considered better than panicing, as a rule of thumb and a default stance in the golang community at large.

Returning errors is also often faster, when you’re doing something where speed matters. Returning an error is just moving some memory around, and later facing a conditional jump instruction: this is about as cheap as it gets (especially if you return an error value, rather than allocating a new value with any specific information). By contrast, panicing and recovering is a pretty complex thing.

Panicking also (note: in the current versions of the go compiler, and to the best of my knowledge) prevents functions from being inlined. This can be a pretty significant obstruction to optimization. And these unboxing functions we have here are certainly small, nearly-trivial functions that we’d like to have inlined!

… but a bunch of that is dubiously applicable

We’re returning new error values that include helpful contextual information in almost all cases. Which means we’re allocating in the error path. Which means it’s already not fast: you’d much rather not hit the error, if you’re concerned about the speed. In practice, for these methods, that means: if you care about speed at all, you’re absolutely going to want to check the Kind() method and switch on that, rather than blindly calling these AsFoo() methods and checking errors.

Worrying about inlining is vacuous. We’re talking about an interface. There’s already no inlining.

So that shoots down most of the concrete reasons to prefer error returns, at least as far as I can think of. That leaves us with just the vague “rule of thumb” reason intact.

downsides of returning errors

Returning errors from these methods makes them significantly more irritating to use in many common situations.

You can’t chain calls easily when a function has multiple returns.

You can’t use the result as an argument to other functions, either. (Mostly. There’s some special cases; see next heading.)

Even if you’re not worried about ergonomics of function composition, even checking these error values often just plain feels silly in practice in a lot of the code I’ve written. Very frequently, I statically know the error is impossible, either because I know for sure what shape of data I’m handling, or because I’ve done a n.Kind() switch just a moment ago anyway. So, if I’m writing answer, _ := node.AsFoo() in the majority of cases… is having this error return really worth it?

other mitigation techniques

‘must’

The must package seems to do well here.

E.g.:

package must

func String(x string, e error) string {
    // panic if e is not nil, return x otherwise; you get the idea
}

… and this can then be used as:

AnyOtherFunctionTakingAString(must.String(someNode.AsString()))`

The must pattern can get icky if you have to use it recursively (this is why it’s Not Great when it comes to handling a series of “Lookup” operations), but since all the “AsFoo” methods are leafs of the call tree of handling some IPLD data, it tends to work out fine here.

We don’t have must.String and must.Int methods at present, but it’d take all of about five minutes to whip them off.

‘fluent’

The fluent package also already does away with error returns.

It’s clearly okay for this package to do it, even if it’s unclear for the core ipld.Node interface, because it’s opt-in.

It’s got downsides though (namely, it incurs an allocation for boxing into the fluent interface; see the Background section in issue https://github.com/ipld/go-ipld-prime/issues/42 for some other discussion about that). A mitigation to an ergonomic issue that’s got unavoidable performance costs is not a very good mitigation.

related APIs

Several other methods already don’t return errors for wrong kind, such as Length(). But this is a little different from the AsFoo() methods, because Length can make a clear out-of-bound return (a negative number).

The MapIterator() and ListIterator() methods don’t return error, but they did used to return a dummy iterator which would return an error on first use. I recently concluded this was also silly, and elected to change these methods to simply returning nil if they’re used on the wrong kind. But this is a little different from the AsFoo methods, because they return interface types, so we can use nil as a clear out-of-bound sentinel value.

The LookupString() and LookupIndex() methods (and etc methods that traverse a piece of recursive data and return another Node) do return error. But this is a little different from the AsFoo methods, because there’s more than one error they can return (wrong-kind vs key-not-found vs not-a-struct-field, etc), and those errors can’t always be avoided by probing another method (by contrast, the AsFoo method errors can always be avoided by probing Kind first). These methods also have more (and different, Because Recursive) mitigations in the form of the fluent, traversal, proposed `cursor, and other packages.

what’s the balance?

Are returning errors from these methods worth it?

Can we make sufficiently user-friendly higher level interfaces that we don’t care if the core API is a little rough?

What’s the balance?

Feedback welcome.

About this issue

Original URL
State: closed
Created 4 years ago
Comments: 15 (7 by maintainers)

Most upvoted comments

Error values and fluent APIs aside, I still strongly believe that these “as T” methods should simply mirror the reflect.Value API.

mvdan on Sep 5, 2020

the amount of extra code needed to deal with getting those type names into the error text is pretty significant

I’m not quite visualizing that problem. Perhaps you can show me some examples.

I opened a new issue for this, and loaded it up with some examples: https://github.com/ipld/go-ipld-prime/issues/73

(It might not be a solvable issue. And we might not “need” to “solve” it; it’s just lament about boilerplate. I just wanted to mention it as something that’s within ~about one jump of relevance when we discuss these error returns.)

warpfork on Sep 2, 2020

Do we have ideas on how we could make this feel nicer?

If you want chaining and error reporting, the only nice way that I know of (besides panics directly) is to “chain” together a query, working out the result at the end. For example:

// Lookup gets chained, and Do stops at the first operation that errors, returning that error.
bar, err := node.Lookup("foo").Lookup("bar").Do()

the amount of extra code needed to deal with getting those type names into the error text is pretty significant

I’m not quite visualizing that problem. Perhaps you can show me some examples.

mvdan on Aug 27, 2020