embedded-hal: [RFC] Revamp `Error` types

As demonstrated by #181 and other issues there’s a real world usability issue with the Error type of the Result of operations being an associated type. An associated type means that a driver or user of an HAL impl is required to adapt to the custom Error type; while possible for applications this makes universal drivers impossible.

In my estimation the main reason for this was probably caused by embedded-hal predating the improvements to Rust type and specifically Error handling. Now we do have better possibilities and with the upcoming big 1.0 breakage it should be possible to also revamp the Error types without too much grief.

Currently our trait definition e.g. for I2C looks as follows:

pub trait Read {
    /// Error type
    type Error;
    ...
    fn try_read(&mut self, address: u8, buffer: &mut [u8]) -> Result<(), Self::Error>;
}

My proposal would be to instead provide a specific Error type such as:

#[non_exhaustive]
pub enum Error {
    /// An unspecific bus error occured
    BusError,
    /// The arbitration was lost, e.g. electrical problems with the clock signal
    ArbitrationLoss,
    /// A bus operation received a NACK, e.g. due to the addressed device not being online
    NoAcknowledge,
    /// The peripheral buffer was overrun
    Overrun,
    /// The peripheral buffer ran out of data
    Underrun,
}

and then redefine all traits as:

pub trait Read {
    fn try_read(&mut self, address: u8, buffer: &mut [u8]) -> Result<(), Error>;
}

The #[non_exhaustive] flag (available from Rust 1.40.0, cf. https://blog.rust-lang.org/2019/12/19/Rust-1.40.0.html) allows us to easily extend the reasons in the feature without being a breaking change because it requires implementations which care about the actual error to also have a wildcard match branch, which is a good idea anyway.

By having the Error types defined in embeded-hal we do some advantages:

  • Drivers can now universally check the error kind reported by a HAL implementation (cf. #181 where a chip might return a NACK due to currently not being able to handle more requests, yet the driver cannot make use of this information and retry later) and act accordingly
  • More uniform and exhaustive documentation of available error conditions next to the API itself
  • Better guidelines for HAL authors which cases might be work distinguishing
  • Improved portability of embedded applications between different hardware

Dowsides:

  • More breakage right now
  • MRSV 1.40

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Reactions: 13
  • Comments: 123 (96 by maintainers)

Commits related to this issue

Most upvoted comments

A few hours ago we were discussing the issue again and the solution proposed in https://github.com/rust-embedded/embedded-hal/issues/229#issuecomment-648403992 came out very much in favour.

I haven’t realised it at the time but after some careful explaining by @eldruin of ideas behind this approach, I did some exploration work to figure out whether there’s any cognitive or programming overhead or other stumbling blocks with this approach and I’m happy to report that I couldn’t.

Much to my surprise and delight it is not even necessary to .into() the Err in case one is matching against the T, so there’s virtually no downside in comparison to fixed error types for the typical case while still allowing to go wild with custom errors for special cases.

So 👍 from me.

Err, you certainly have a way of twisting words in your favour. 🙃

Please refrain from personal attacks.

Let’s resume the different propositions.

The Error to rule them all

Each embedded hal trait define its Error, with no choice for the hal implementation. That’s the original @therealprof proposition.

#[derive(Debug, Copy, Clone, Eq, PartialEq, Ord, PartialOrd, Hash)]
#[non_exhaustive]
pub enum I2cError {
    Nack,
    Overrun,
    Other,
}
pub trait Read {
    fn read(&mut self) -> Result<(), I2cError>;
}

Advantages:

  • driver implementation can handle the Nack case if they want
  • simpler interface: no parametric error type

Disadvantages:

  • I2cError must propose every imaginable cases
  • If a hal implementation want a new error, it must modify the embedded hal Error
  • Error can’t be as precise as wanted. For example, a linux driver might want to put the file path of the device, and embedded hal can’t handle PathBuf.
  • final user might have a degraded error message as the error is filtered by the error type of embedded hal

Convertible Error

That’s the current proposition. Instead of forcing the Error, some bounds are imposed on the Error to the hal implementors, allowing drivers to convert the error to something fixed.

#[derive(Debug, Copy, Clone, Eq, PartialEq, Ord, PartialOrd, Hash)]
#[non_exhaustive]
pub enum I2cError {
    Nack,
    Overrun,
    Underrun,
    Other,
}
pub trait Read {
    type Error: Into<I2cError> + Clone + core::fmt::Debug;
    fn read(&mut self) -> Result<(), Self::Error>;
}

Advantages:

  • driver implementation can handle the Nack case if they want
  • hal implementation can use a custom error to be as precise as needed, including having std types on them.
  • hal implementation can use directly the I2cError as the error if that’s enough

Drawbacks:

  • more complicated than a fixed type
  • The into trait, consuming the error, force to add the clone bound to the error to be able to get the I2cError without losing the original error
  • The fact that this is a conversion will suggest that the driver can erase the original error type, exposing only the I2cError in its interface.
  • If the driver erase the error type, the final user will not have the improved error.

The last drawback afraid me in inconsistent driver implementation: some driver returning Read::Error, some driver returning Read::I2cError, some driver returning sometimes Read::Error, sometimes Read::I2cError. I suppose that the recommendation for drivers is to returns the original Read::Error to avoid lost of information, at using I2cError internally to handle some kind of errors.

An Error bound on the trait with a kind method

This is my proposition. The first drawback it tries to fix is the Clone bound: in reality, the embedded hal error is just a plain enum, that can be copy, no reason to eat the original error to do the conversion, just a &Read::Error should be enough, and we can remove the Clone bound.

Then, the desired feature (be able to handle the Nack error in a driver) remind me the handling of the special error “broken pipe”, that might be traited as an EOF in some case (think of an implementation of grep for example). In std, broken pipe in an io::ErrorKind. Thus, my idea is to mimick this interface, providing an ErrorKind for each trait providing the common errors that driver implementation might want to manage, and not every possible error.

The ErrorKind also implements its Error trait, allowing to use the ErrorKind directly as an error if this is enough as in the proposition 2.

This proposition allows everything possible with proposition 2, without a Clone bound, and with an interface expliciting that the error must not be lost by the Driver implementation.

So we have:

pub trait I2cError: core::fmt::Debug {
    fn kind(&self) -> I2cErrorKind;
}

#[derive(Debug, Copy, Clone, Eq, PartialEq, Ord, PartialOrd, Hash)]
#[non_exhaustive]
pub enum I2cErrorKind {
    Nack,
    Overrun,
    Underrun,
    Other,
}

impl I2cError for I2cErrorKind {
    fn kind(&self) -> I2cErrorKind {
        *self
    }
}
pub trait Read {
    type Error: I2cError;
    fn read(&mut self) -> Result<(), Self::Error>;
}

Advantages:

  • driver implementation can handle the Nack case if they want
  • hal implementation can use a custom error to be as precise as needed, including having std types on them.
  • hal implementation can use directly the I2cError as the error if that’s enough

Drawbacks:

  • more complicated than a fixed type

Conclusion

I think proposition 3 is better than proposition 2. Everything that can be done with proposition 2 can be done with proposition 3, and with a more ergonomic interface. Proposition 1, as more opinionated and simpler, proposes a different alternative.

hmm the type Error: core::fmt::Debug + Into<I2cError>; idea seems pretty interesting to me, though does seem that it would require Clone which is not always viable for errors (especially on linux), and how should inner types not-mappable to flat types be handled?

it seems like TryFrom<&Self::Error> for I2cError would be pretty were it achievable?

  • None communicates the case where the inner error is not representable using the flat type
  • HAL impls that don’t have other methods should be able to use the I2cError types directly, others can define still define their own
  • Drivers can choose to map into generic errors for handling without consuming the error / while still bubbling the underlying error to the application if desired.

The only real issue I have is that these approaches might promote error erasure in drivers, if you have to .into() to get the I2cError type the odds of a driver author returning the generic I2cError instead of the underlying one would seem high. This is not an concern with the type Error: I2cError bounds approach, but, could probably be mitigated in documentation, and this way match works.

I’m also interested in learning why we need suddenly need this extra amount of details for an error case:

we have always had this facility, and i at least have used it heavily in my radio work (thought the applications using this are not public 😕). even where some of these cases are not handled in the driver, you have the expressiveness to manage errors at the application, to attempt reconnections at different levels of the stack, and even just to let the developer/user know what actually happened. I would be personally v frustrating to have to connect a debugger to work out whether it’s USB or SPI or networking that failed and returned me a BusError, or as mentioned above, to have to fork drivers to add out-of-band mechanisms for propagating the actual underlying errors.

Please stop claiming this proposal would encourage anyone to fork a library to use it. The whole proposal is about turning implementation specific errors into e-h defined general errors to make it more universal and easier to use.

i completely understand this perspective (and have had to do this a couple of times for some cursed C/FFI work), it’s a real pain to have to add underlying error context to driver objects and, nearly impossible to do so in a way that is safe with multithreading, async, or even multiple calls in a no_std context.

I don’t think a MSRV of 1.41 would be a blocker if it provides a major benefit for years to come

It’s not something to be done without consideration and deliberation, given the impact on everyone else in the ecosystem. But I too would be okay with a bump in MSRV, especially packages with another pile of breaking changes.

  • is there a difference between a recoverable-error and a non-recoverable error?

As far as I am concerned it’s up to the application to decide what is recoverable and what is not.

is a NACK actually an error, or, normal part of bus interaction?

Both. Originally it was supposed to signal that a device does not exist on the bus but nowadays is sometimes is used to signal that the device is busy.

could this normal-failure be represented in a clearer way on the interface?

You do have a strange interpretation of “clearer”… 😅

should retry etc. (which is part of the justification for non-NACK errors) be driver or application level (and is that a decision we can/should make here)

I think both. E.g. if you have a driver talking over some interface and it is waiting for a certain stream of data to arrive and the HAL signals an overrun or a jammed line, the driver might be able to retry the last communication in order to fix the issue. If you don’t have a driver but your application is handling the data directly you similarly might want to be able to handle such situations (which is currently not possible in a generic way).

totally agree that we are missing some API detail, but am still not swayed by the exhaustive (yet non-exhaustive) enum approach. Aside from the other complaints it seems like introducing a bunch of application detail into an otherwise abstract crate

I absolutely disagree. Having a common set of error types for general use is one the achievements of the latest iterations of e.g. libstd. I think it is fundamental for portable and composable software to have a common set of error kinds to rely on and that’s also the reason why custom approaches like failure are now deprecated.

and I don’t know how we would choose to accept or not-accept enum variants.

Easy: if it makes sense, is not covered by an existing error kind we’ll accept it. There should be a very low bar to including new error kinds and since additions are non-breaking I would not worry about that at all.

I am one of the persons that thinks that forwarding underlying error information must be possible. That is possible without ImplError. ImplError is only about standardisation of underlying error causes. That is a second level of standardisation.

I believe error types would be more flexible without attempting to represent every possible hardware failure situation inside each error type. This information can be transported separately with the approach from proposal 3.

ImplError started as a later addition to forward underlying errors. @therealprof wrote:

That is correct, I missed that in the initial proposal and added it after a ton of discussion and requests. So?

So, standardisation of underlying errors was not the most pressing need in the community.

We already have proprietary methods on HAL impls, e.g. to construct them or initialise or make them change something internally (like switching modes on GPIO pins).

That’s true for HALs, yes. But if you have a device driver crate that has a driver.measure(&mut i2c) API, then that driver will not be able to return HAL-specific information because it does not know about the HAL.

This would mean either the driver would need to be modified so it knows HAL-specific APIs, or the HAL would need some kind of “return the last error that happened” method. Which is horrible and doesn’t work if a driver chains multiple calls to the HAL’s APIs. It can also break completely in an async context.

The alternative being a totally implementation-defined errno-like mechanism to get the rest of the information whenever an error pops up, which is bolted manually onto other libraries. I am sorry but to me that sounds pretty horrible.

I think there are two cases that we need to consider: A generic driver (which will not be able to deal with any platform-specific or project-specific error) and a hardware-specific application.

A generic driver will be very happy about unified errors because it allows it to handle certain cases like NACKs. I think nobody disagrees with the idea that there should be a list of predefined error variants.

However, an application like the vacuum robot may want to handle different kinds of bus errors differently.

If the e-h trait does not allow nesting errors (and thus the error information is not available by using the e-h API), then the driver will need to be modified with a custom non-e-h-compatible interface that returns the needed error information. For example:

// Pseudocode

// This is the hardware-specific API with more error information
impl Driver {
  fn write_with_error_details(&mut self, bytes: &[u8]) -> Result<(), DriverError> {
    ...
  }
}

// This is the generic driver API
impl Write for Driver {
  fn write(&mut self, bytes: &[u8]) -> Result<(), I2cError> {
    self.write_with_error_details(bytes)
      .map_err(|e| match e {
        DriverError::Gpio(_) => I2cError::BusError,
        DriverError::I2c(_) => I2cError::BusError,
      })
  }
}

The problem is that now the driver contains “prorietary” APIs that need to be called by the application. The developer of the application would need to fork or copy-paste the driver code in order to modify it.

If instead the e-h error enum allows wrapping a generic error, then the driver crate could simply pass through the underlying error, and the application could handle it if desired.

Of course this does not mean that we should not try to cover all possible error cases, but it still offers an escape hatch for developers that don’t want to go through the “propose-change-to-e-h > get-change-accepted > wait-for-release > update-driver > wait-for-release > update-application” cycle.

@ryankurte @eldruin So I was looking into your proposal to have a generic custom error kind. I do see a few problems with that approach:

  • You still have to parametrize each type use with the HAL impl specific Error which funny enough makes the implementation less universal despite being more generic. The current associated type does not have suffer from the problem this approach would introduce
  • It essentially allows a HAL impl to opt out of using the standard types by wrapping it in a Custom or Other type rendering the plan to a good deal toothless; we absolutely want to encourage implementers to use new shared error kinds and extend them within e-h if they’re not sufficient (I so think that with a good look at some powerful implementation we can get a very good idea of what should be included in a first throw)
  • I don’t think nesting errors makes a lot of sense for embedded applications. std applications are moving away from nesting due to useability issues and crates like anyhow making it really easy to get a nice chain of errors iff error variants are not manually nested and embedded applications are even more simplistic than that due to the requirement that the application actually needs to know about the used peripherals and even initialise them specifically; there’s no such thing as a library hidden several layers down magically initialising an peripheral and doing fancy stuff without the application knowing anything about that

I think with the new fallible-first approach it is very important to deliver a decent set of standard error kinds, too, and not leave it to impls to come up with a wild variety of custom errors (or none at all).

Thanks for your responses. I don’t have time to read them all in detail right now but here’s one paragraph which stuck out in the notifcation going by:

Regarding the suitability of a similar Error type for other protocols than I2C, I am unsure. I would have said I2C is the only “real” protocol that e-h supports which has errors of its own and not just kind of “hand off some bits, no strings attached” 😃 Nevertheless, maybe someone here has more experience on this and can think of some errors for SPI or so.

I can see other applications as well. For instance if you have an application running e.g. over a UART it would be very helpful being able to see buffer overruns so a driver could start over. For GPIO (where we’re discussing things like tri-state GPIOs and changing directions) it might be useful to know that a pin is in a wrong mode for some operation.

But even other than in drivers, whenever you’re doing real error handling in an application (other than the unfortunately common plain unwrapping) this is currently not at all portable so whenever you want to run an application on a different MCU and properly handle different error situations you’d have to write MCU specific error handling – which is not great.

To define the set of enum, what about checking a few existing errors in hal, and see how we would transfer it to the corresponding ErrorKind.

For example, the error used in https://github.com/stm32-rs/stm32f4xx-hal/pull/218/files look quite protocol related and should fit in the ErrorKind type.

OTOH, the DMABufferNotInDataMemory error in nrf-hal doesn’t seem appropriate for a dedicated entry in kind, and, I think, would enter in the Other case.

linux embedded hal is not really helpful as it seems to directly use std::io::Error as its error type.

Other seems to be the right name, as it is also used in std::io::ErrorKind::Other https://doc.rust-lang.org/std/io/enum.ErrorKind.html#variant.Other

So, maybe we can begin with a small set of variant in the embedded-error kind types, try to use it in a few hal, and see if it need additions. It would be semver compatible.

The only thing to really choose are:

  • if we add some “generic” error that apply to every errors (as OS related errors, as permission denied and the like, corresponding to the actual ImplError) and its name. ImplError is not really talkative, maybe GenericError or something like that?
  • if we use a “generic” error, does the Other variant belong to the generic error, or to the specific errors.

I am sure people will come up with more.

Which is encouraged. But if you can’t come up with additions in a hurry I have my doubts that we’re going to be swamped with an unwieldy number of requests. 😏

My ability to come up with examples right on the spot is beside the point. A high amount of requests within a short period of time (leading to “swamping”) is also beside the point. The point is that the number of ImplError error variants is expected to grow, which adds to my point of an unwieldy number of ever-so-slightly-different error variants.

A driver getting an OutputPin should not care / cannot care about whether the pin comes from the MCU or through an I2C multiplexer. It really cannot handle ImplError::PowerDown.

Maybe not PowerDown but it may want to retry a few times if it receives an ImplError::Timeout.

Again, Timeout is already part of I2cError, not ImplError. A generic driver would be able to handle it without needing ImplError.

While ImplError allows to get a bit more of information, it cannot tell exactly what went wrong without making ImplError overwhelming (because only the implementation knows), which is precisely my point.

It doesn’t need to tell exactly what went wrong but if it can point into the right direction that is a huge help. Without it all a non-device specific application can only do a best-effort problem resolution (or just crash), with a useful ImplError it has a ton more options.

If it can only point into the right direction, you will still need to process the implementation-specific error if you want to fix it or know what actually happened. We may as well cut the middle-man and just encourage applications to choose ImplError as their implementation error type if they are satisfied with the level of information. That would keep the errors simple.

Why force an additional set of somewhat unrelated errors into every error definition if we already have a mechanism to get the underlying information?

It seems like it is either overwhelming and thus difficult to use or a small extension of the available variants and thus imprecise.

I do not agree. Have you looked at the available kinds? Any kinds you would add from the top of your head?

I am sure people will come up with more.

Since generic drivers have no use for ImplError either, why bother adding it?

Actually they do. Examples include things like handling (or not being able to handle) NACKs, signalling Timeouts, disconnected devices, discovery problems… Those are even more relevant when stacking implementations, e.g. the driver for a GPIO expander connected via SPI would like to signal that a connectivity issue is the problem for your pin toggling problems.

Things like NACKs are already part of the I2cError enum, not ImplError. Same thing with I2cError::Timeout (although that is part of SMBus, not I2C).

A driver getting an OutputPin should not care / cannot care about whether the pin comes from the MCU or through an I2C multiplexer. It really cannot handle ImplError::PowerDown.

While ImplError allows to get a bit more of information, it cannot tell exactly what went wrong without making ImplError overwhelming (because only the implementation knows), which is precisely my point.

Since it does not provide a complete solution and is also not useful for generic drivers, I would avoid it and just let I2C methods return I2C-proper errors, or Other.

let applications/HALs define their own.

That is an declared anti-goal of this approach, pretty much the only thing I am not okay with. If some system wants to define their own proprietary errors to allow more fine grained handling or better diagnostics that is okay but it should be exception and not the norm.

MCU HALs can reuse an unified error definition like ImplError by importing it themselves form a central crate like embedded-error. I have nothing against that. Similar to the rand_core dependency, for example.

However, I would not make ImplError part of I2cError.

I think TryInto would also be ok (and technically more correct) but it would make the code a bit more complicated since the conversion will not be as seamless as with Into due to the need to unwrap the result.

I believe the seamlessness of the associated type + Into<_> bounds approach is one of the key elements here since you absolutely don’t want error handling to be awkward or people will find rather invent nasty shortcuts instead of doing it right.

Thanks @PTaylor-us, that sounds like an interesting solution. It would allow for using only OP’s Error without needing ImplError or Other(E) by introducing an error handling “hook” which could process any implementation-specific error and if appropriate return one of the standarized I2C errors. Maybe we can break the deadlock here, then.

I see a couple detail differences to what I understand from your comment, though:

  • I understand the error handler would need to be injected into the HAL or into drivers for “special” peripherals like an I2C switch or I/O expander. i.e. embedded-hal trait implementations.
  • I am not sure if we mean the same when we say “driver”. i.e. “I2C driver” vs. “EEPROM driver”.
  • Also, an EEPROM driver would be the place to make the NoAcknowledge -> WouldBlock transformation.

But anyway these are just unimportant details and the proposal still stands.

I think it would be most interesting if we can come up with just one error handler trait which we can use for all fallible things (i.e. GPIO, SPI, etc), although we have not extended the discussion to those traits just yet.

For I2C, I understand an error handler would provide the chance to process the implementation-specific error and would itself transform that into a variant of I2C::Error or even Ok. So I came up with this:

trait ErrorHandler<T, InError, OutError> {
    // takes a reference to the input error to avoid copy requirements
    fn handle(&self, e: &InError) -> Result<T, OutError>;
}

The trait type parameters allow for several implementations depending on the input/output errors that it should handle.

I have put together a demonstration of how an error handler could be injected into a bitbanging I2C implementation here in the playground.

Here is the handler implementation from there. It is a bit like Into<Error>:

impl ErrorHandler<(), digital::Error, i2c::Error> for Handler {
    fn handle(&self, e: &digital::Error) -> Result<(), i2c::Error> {
        match e {
            digital::Error::PinError => {
                // maybe log something
                Err(i2c::Error::BusError)
            },
        }
    }
}

Thoughts?

Of course the driver can pass through the first layer of the HAL-specific error, why not?

But then the driver needs to know about the HAL. If the driver should be generic (only using embedded-hal APIs, e.g. a driver crate published on crates.io) then that’s not possible without modifying the generic driver for the application-specific code.

If you want to handle HAL specific errors in the driver you will always need to do that… If the type is generic as suggested by @eldruin you need to know exactly what the type will because you can’t match on it either way.

The driver doesn’t need to handle the error. It would simply pass it on to the application, which can then decide to handle the error, or not. The driver isn’t able to do hardware specific error handling if it’s written in a generic way.

If you call std::fs::read, then you will get back a generic error that wraps the platform-specific error. A generic library can still handle certain classes of errors without needing to know what platform the user is on. It cannot handle platform-specific errors, it just passes the error to the application. If the application targets Windows, then that application can use the Windows specific errors to do additional error handling. Nobody would expect you to fork and modify the generic library so it knows about the Windows specific I/O error messages.

Note: My mental model is a hardware-specific application, using the hardware-specific HAL, but calling generic drivers (e.g. from crates.io) through embedded-hal.

I really don’t see what’s so bad about: (…) i2c.get_err()

What if the driver does this?

impl TempHumiSensor {
    /// Wake up the sensor, try to do a measurement, go to sleep.
    pub fn low_power_measure<I2C: Write>(i2c: &mut I2C) -> Result<u8, i2c::Error> {
        i2c.try_write(WAKEUP)?;
        let result = i2c.try_write(DO_MEASUREMENT);
        i2c.try_write(SLEEP)?;
        result
    }
}

This method would ensure that the sensor goes back to sleep even if the measurement failed. If you get back an error, and then call i2c.get_err(), how do you know whether it’s the error from the try_write(DO_MEASUREMENT) call or from the try_write(SLEEP) call?

Also, as mentioned above, asynchronous execution could also break such an API if I2C calls are interleaved.

If you want to have a proprietary application which cares about those implementation specific details then you can still offer an API in your bitbang-hal driver to provide more detailed information about the problem and ways to recover

Yes, that’s precisely the problem. A user would need to fork all driver crates and modify them with a custom API in order to provide detailed error information.

Especially if we want to promote embedded-hal in a commercial environment, it seems a bit weird to first advertise a “production-ready ecosystem of mature generic and compatible drivers” but then say “if you want to do error handling for your hardware-specific project, you will need to fork all generic drivers and adjust them to your needs”.

but that’s off the path for what embedded-hal can provide IMHO

No, embedded-hal can definitely provide a solution for this if we want (the solution with a generic parameter that @eldruin and others proposed). The only question is whether we want to sacrifice this use case in order for the syntax to look slightly more clean (without the type parameter).

Hmm, there is a mismatch in the types. I am not sure if it is intended.

Sorry, just me being sloppy.

No problem.

Err(i2c::Error::BusError(digital::Error(pin))) => // log diagnostics or so

That will not work. You either need have a dedicated type for the BusError which includes all possible variants, like digital::Error or you need to make it a generic type which is incredibly painful to use.

Yes, we would need a generic error variant like already proposed:

#[non_exhaustive]
pub enum Error<E> {
    /// An unspecific bus error occured
    BusError(E),
    // ...
}

This is nothing really new, all drivers already define a generic Error<E> type and to me it has been quite fine so far. For example, in a driver:

pub enum Error<E> {
  I2C(E),
  InvalidInputData,
}

Would now become:

pub enum Error<E> {
  I2C(i2c::Error<E>),
  InvalidInputData,
}

We now have an unified error layer in between but otherwise the driver code would see very few changes.

What would happen to your application if we did that? Well, if your application uses I2C with a bitbang-hal implementation a read might return i2c::Error(digital::Error(BogusPin)) while any other I2C peripheral implementation would return i2c::Error(BusError) (or a more specific Error) for the very same thing so your error handling needs to cater for all possible cases.

Hmm, there is a mismatch in the types. I am not sure if it is intended. The bitbang-hal::i2c::Read::try_read() method would return embedded_hal::i2c::Error::BusError(embedded_hal::digital::Error(BogusPin)). A hardware I2C peripheral would return embedded_hal::i2c::Error::BusError(Something else or just nothing).

If somebody is not interested in causes for the BusError and just wants to retry, they would write this:

use embedded_hal::i2c;
match i2c.try_read() {
  Err(i2c::Error::BusError(_)) => // retry or so, no dependencies on whatever is in there
  // ...
}

If somebody is interested in some underlying causes, then of course they have a dependency to whatever is inside, but this dependency is already there. e.g. they already know they are using bitbang-hal and that the error might be from the pins the application itself put into it. Here the code:

use embedded_hal::i2c;
match i2c.try_read() {
  Err(i2c::Error::BusError(digital::Error(pin))) => // log diagnostics or so
  Err(i2c::Error::BusError(e)) => // something else happened
  // ...
}

So the point is that code that does not care about underlying errors can just ignore them and have no dependencies and code that cares about them can use them.

Let’s take a look at a real public example I know of: In the e.ziclean-cube the MCU (stm32f101) is connected via GPIO to an I2C accelerometer (KXCJ9). In order to interact with the accelerometer, a bitbanging I2C implementation must be used (bitbang-hal). If the MCU hal returns an error when trying to set or read a pin during an I2C operation, how would you give that information to the application?

I do not know if it would be fine to just start over, but I think if this comes up frequently, it would be useful to the application to be able to know that the MCU pins are acting up, and not the accelerometer, for example, and that is the reason for some failures. This would immediately simplify diagnostics. I think it would be useful to make something like this somehow possible.

@therealprof i agree with most your points except for the error nesting bit, ime the flat approach only works in std because they nest Box<dyn Error> so you can handle flat cases without erasing useful underlying information. The critiques of the alternatives I proposed are totally valid, however, as noted above none of the modern rust error handling stuff works without alloc, which i do not think an acceptable compromise for us 😦

there’s no such thing as a library hidden several layers down magically initialising an peripheral and doing fancy stuff without the application knowing anything about that

I think it’s also important to remember that this is not only used in no_std contexts? I think linux-embedded-hal is a pretty good example, especially where async implementations are calling out to libc and mio.

I don’t think that solves any of the issues and adds more convolution to the API.

This solves the, there is a specific error that may occur during normal operation of the interface that a driver may wish to handle, problem. Which may be specific to I2C, and in a way that doesn’t restrict error types or delete error information, which afaik would resolve #181, and is generalisable to other cases similar to this.

Easy: if it makes sense, is not covered by an existing error kind we’ll accept it. There should be a very low bar to including new error kinds and since additions are non-breaking I would not worry about that at all.

The other question about this is how you ensure people use the correct error types, and how you interpret and use those both at the driver and application level? Say we end up with BusFaultA and BusFaultB or something, we loose abstraction in that the driver author now needs to be aware of what an underlying impl might return, and any addition to this enum that should be handled at the driver level requires a driver update, and the application author now needs to know both what the underlying impl can return, and what the driver handles / forwards. Some of this is implicit to having defined errors, some of this is a facet of the flat approach I think.

I think with the new fallible-first approach it is very important to deliver a decent set of standard error kinds, too, and not leave it to impls to come up with a wild variety of custom errors (or none at all).

A single flat non-generic set would simplify things like this which is nice, but at the same time, seems to make it functionally impossible to report meaningful errors (device disconnect, permissions, pin not mounted, etc.) at the application level, which does not seem to me to be tenable.

It would be great to improve the situation, but, I would posit that any improvement should not be worse / less-expressive than the current approach, even if only for a subset of uses 😕

From looking through the different HALs, error types that exist right now:

  • STM32L0: Overrun, Nack, PECError, BusError, ArbitrationLost
  • STM32L1: OVERRUN, NACK
  • STM32L4: Bus, Arbitration, Nack
  • STM32F0: OVERRUN, NACK, BUS
  • STM32F1: Bus, Arbitration, Acknowledge (Nack), Overrun
  • STM32F3: Bus, Arbitration
  • STM32F4: OVERRUN, NACK
  • STM32F7: Bus, Arbitration, Acknowledge, Overrun, Busy
  • STM32G0: Overrun, Nack, PECError, BusError, ArbitrationLost
  • nRFx (TWIM): TxBufferTooLong, RxBufferTooLong, Transmit, Receive, DMABufferNotInDataMemory
  • LPC8xx: EventTimeout, MasterArbitrationLoss, MasterStartStopError, MonitorOverflow, SclTimeout
  • Linux: Nix(nix::Error), Io(io::Error)
  • TM4C: Bus, Arbitration, DataAck, AdrAck

I don’t know enough about the different implementations (and SMBus, which seems to add some other error types) to know what all those variants mean, but it feels a bit wrong to throw away error information for error types that we as embedded-hal developers did not think of, or where platform-specific error handling could be added. Maybe some people with more experience can judge that.

I only found this RFC now, but independently suggested the same solution as @ryankurte / @eldruin:

https://github.com/rust-embedded/embedded-hal/issues/181#issuecomment-647401030

I think it’s important that the implementation can provide custom errors if it wants to, but that we can have implementation-independent error variants for things like Nack / ArbitrationLoss / etc so that I²C device drivers can use polling instead of fixed sleep times (when waiting for something to be processed).

This should probably be decided before the 1.0.0 release, right?