nom: IResult VS Result: making Incomplete part of errors?
There have been a lot of demands that I change nom’s basic type from IResult to std::result::Result. IResult has the following definition:
pub enum IResult<I,O,E=u32> {
Done(I,O),
Error(Err<I,E>),
Incomplete(Needed)
}
This was originally inspired from attoparsec’s IResult in which the Partial branch contained a closure to be called when more data is available. For various reasons, I was not able to make the closure idea work (note that Rust was very far from 1.0 at the time), so I chose to show how much data was needed to ask the user to parse again.
I open this issue to study what a change from IResult to Result would entail. I make no promise to do that change, and I will not put the issue to a vote. I will however take into account the responses I get.
The proposal is to make the Incomplete branch part of the Error branch, which would allow employing Result. I do not know yet what the end type would look like. I see two possibilities for the Err type: containing either a Needed or the other error branches, or flattening Needed at the same level.
So, to detail the arguments:
pro:
- can reuse all of the
Resultmethods and the code relying on it - less surprising for developers
- nom users can easily ignore all of the
Incompleteusage, parsers are easier to write - it might make the macros simpler, because in pattern matching, most combinators would handle 2 cases instead of 4:
DoneandError. In some combinators, there will be only three of them:Done,ErrororIncomplete(Needed::Unknown),Incomplete::Needed::Size(sz), because the calculation of needed data must still happen - a lot of nom parsers are stuck on an old version and likely will never update (because it does the job as is), so less code to rewrite, maybe?
- a lot of people find
Incompleteconfusing, since they work on complete data (like a file completely read in memory) - I could add in there a “non backtrackable error” that can contain the same thing as a normal error, but would make everything return instead of testing other branches
- it might make compilation faster (compilation time is the reason for the nom fork in syn)
con:
- it means a nearly complete rewrite of nom. It’s not necessarily an issue, I’m willing to make the time for it if needed
- there are still a lot of nom users relying on
Incomplete, and this is a big breaking change for them - even parsers that do not use
Incompletewould need to be updated to useResult - it might make some combinators confusing. All of the backtracking combinators like
alt!currently return onIncompleteinstead of testing the next branch. So, instead ofalt!andalt_complete!, do aalt!andalt_incomplete!? - nom was designed with streaming in mind, I am worried hiding
Incompletewill hide this benefit of the library
also, I am not sure about the timeline here. I am doing nom 2.0 very soon and it introduces some breaking changes, but I’m worried this change might be too big and drive users away. On the other hand, a 3.0 would likely happen far in the future, and there would be even more code (in nom and in code relying on nom) to update.
so, what do people think?
About this issue
- Original URL
- State: closed
- Created 8 years ago
- Reactions: 5
- Comments: 26 (11 by maintainers)
To be honest, I’d actually suggest that
futures-rshas a good type that would be worth stealing, here:Rather than putting
NotReadyin the error (for similar reasons toIResult), it puts it in theOk. In addition, it allows seamless interoperation with the (growing) futures ecosystem, and trivial implementations ofFuture.A user could simply do an
and_thenon whatever future-based IO they are doing to get their input, and call theirnom-based parser inside.It’s worth noting that
std::io::ErrorKindhas some variants that represent recoverable kinds of errors, likeWouldBlockandTimedOut. So such a thing isn’t inconsistent with the rest of rust.Also I don’t see how hiding
Incompletebehind an error kind would make the streaming story that much harder, besides being mentioned in the documentation, I’d imagine most people would use the built-in nom Producer/Consumer construction (IE: They’d never have to deal with theIncompleteerror kind directly).another interesting thing, while you got me worrying about struct size, I have a way to make
simple_errorsandverbose_errorscompatible, by using the same enum insimple_errors, but with only one element. But things get more interesting: I can add more context insimple_errors(adding the current slice) without growing the structure, and make the verbose version smaller than in nom 3.0.Basically, instead of:
I would have:
and in verbose mode:
That way, the
verbose_errorsfeature would be additive, and as you can see in this playground, it’s smalller than in nom 3:FYI this is now done, with an additional element in the
Errstructure:Failure(ErrorKind<E>)to prevent a parser from trying other branches and make combinators likemany*fail instead of returning a valuemmmh, so, a few things here:
IResultvsResult, but mostly for the calculations of how much data is needed. Right now I’m not sure we always need to know how much data we need when a parser returnsIncomplete, especially with network protocols. MaybeNeededcould in fact hold the input slice and a needed number of bytes, so we only need to calculate once (offset from base slice + needed bytes) and move the calculation in a separate function. That would reduce the code size greatlyNeededshould be, I’m torn between both arguments, but I lean a bit more on putting it inOk:Err: it’s the common approach, but mainly because of historical reasons. C functions returning integer fell into the “0 is success, everything else an error” pattern that caused so many bugs. We have better types in Rust now. The ability to ignoreIncompleteis great though.Ok:Incompleteis not an error. It’s a very common case. I think that pattern matching onOk(Done(i,o))and puttingOk(Incomplete(_))andErrin the same case (maybe with a helper function?) would not be that hard.From my point of view, it would remove a difficulty from nom usage. Now I have no idea what will be the result but it seems very interesting!