modular-bitfield: Allow returning bytes in MSB order
Currently (at least on my platform, ARMv7), bytes are returned in LSB order but my driver requires them to be MSB. How tricky would it be to add a configuration option for this?
Example:
#[bitfield]
pub struct MyBitfield {
    foo: B16,
}
let test = MyBitfield::new().with_foo(0x1234u16);
test.as_bytes() returns [0x34, 0x12] but I need it to be [0x12, 0x34]. While I could reverse the order of the result from as_bytes, it is error-prone and would IMO be better captured by the bitfield itself.
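Until such an option exists, one way to avoid reversing the array by hand is to pin the byte order explicitly with the standard integer methods. The sketch below uses only std and a plain u16; it does not call into modular-bitfield, and the helper names foo_wire_bytes and check_foo_encoding are hypothetical.

```rust
/// Hypothetical helper: encode a 16-bit field MSB-first for the wire,
/// independent of the host's endianness.
fn foo_wire_bytes(foo: u16) -> [u8; 2] {
    foo.to_be_bytes()
}

/// Hypothetical usage: call this from `main` or a test.
fn check_foo_encoding() {
    let foo = 0x1234u16;
    // LSB-first, the order reported in this issue.
    assert_eq!(foo.to_le_bytes(), [0x34, 0x12]);
    // MSB-first, the order the driver needs.
    assert_eq!(foo_wire_bytes(foo), [0x12, 0x34]);
}
```

This only covers a field that occupies the whole value; it does not address bit order within a packed struct.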
About this issue
- State: open
- Created 4 years ago
- Reactions: 3
- Comments: 27 (11 by maintainers)
Going a step further… It would be nice if the internal bit storage order could be reversed. I’m trying to write a driver for this (look at page 63+) and am finding that a crate like this would be quite helpful for defining how to speak with the device, but the crate’s LE ordering means I have to do a lot of fiddling inside the driver, even on the i8 register values. It gets way more obvious when you start looking at i16 and i24 values, which arise from combining some registers. Even with an into_be_bytes option, this crate wouldn’t let me store correct values in such cases.

I know you can swap the SPI controller’s read/write mode to support LSB, but most controllers I’ve seen are MSB by default. It’d be nice to have a “protocol mode” as a result (for lack of a better term).
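For illustration, this is roughly the kind of driver-side fiddling an MSB-first device forces for a signed 24-bit value spread over three registers. It is a std-only sketch; the function names and the high/mid/low register order are assumptions, not taken from the datasheet mentioned above.

```rust
/// Decode a signed 24-bit two's-complement value received MSB-first.
/// Assumed register order: high byte, middle byte, low byte.
fn i24_from_be_bytes(bytes: [u8; 3]) -> i32 {
    // Put the 24 bits in the top three bytes of an i32, then shift right.
    // `>>` on a signed integer is an arithmetic shift, so the sign is extended.
    let raw = i32::from_be_bytes([bytes[0], bytes[1], bytes[2], 0]);
    raw >> 8
}

/// Encode the low 24 bits of `value` MSB-first for writing back.
fn i24_to_be_bytes(value: i32) -> [u8; 3] {
    let b = value.to_be_bytes();
    // Drop the most significant byte, which only carries sign extension.
    [b[1], b[2], b[3]]
}
```

A “protocol mode” in the crate would let the struct definition carry this ordering instead of helpers like these.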
I’m interested in this feature as well. Getting bitfields and endianness to work correctly is very tricky and needs a library solution.
From my experience, the only way to handle this situation is to carefully define a bitfield layout in a way that’s independent of endianness. You do this by defining it in terms of the underlying representation. So taking the example from the previous comment and using #[repr(u16)], it has the following layout:

Notice that this layout works regardless of the machine endianness: within the 16 bits of a u16, the MSB is B1 and everything else is B2. No bytes are mentioned in this definition.

(Note, however, that the order of the fields within this representation is completely ambiguous. B1 could be the LSB, for example. In fact, GCC actually reverses the field layout based on endianness. This is a mistake IMO, and should not be repeated by this library. Pick one order, document it, and stick to that. Backwards compatibility will likely determine your choice.)

As a separate step, you define the byte order when this representation needs to be viewed as bytes. There are three options: host endian, little endian, and big endian. The representation in memory should always be in the host endianness (so the underlying u16 is just that), with options to read and write it explicitly in either big or little endian. In the end, that means providing from_le_bytes and from_be_bytes. Also, don’t forget from_ne_bytes for native endianness (like Rust does with primitive types, which is guaranteed to be fast). I’d also suggest renaming to to_{le,be,ne}_bytes for consistency with std.

The above solution works with every interface I’ve worked with and layout I’ve seen from a C or C++ compiler. It’s also compatible with how people do bitfields manually (bit shifts on integer values). (With the possible exception of mixed endianness mentioned above, though to be honest I don’t even know what that really means in this context. Regardless, I think it’s rare enough to ignore from a library perspective.)
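The representation-first approach described in this comment can be sketched by hand, without the macro. This is an assumption-laden illustration, not the crate’s API: the type name Reg, the accessor names, and the choice to put the 1-bit field in bit 15 are all hypothetical.

```rust
/// Hand-written stand-in for a u16-backed bitfield.
/// Assumed layout: bit 15 is `b1`, bits 14..=0 are `b2`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Reg(u16);

impl Reg {
    const B2_MASK: u16 = 0x7FFF;

    fn new(b1: bool, b2: u16) -> Self {
        // Fields are defined purely by shifts and masks on the integer,
        // so the definition never mentions bytes.
        Reg(((b1 as u16) << 15) | (b2 & Self::B2_MASK))
    }

    fn b1(self) -> bool {
        (self.0 >> 15) == 1
    }

    fn b2(self) -> u16 {
        self.0 & Self::B2_MASK
    }

    // Byte order is chosen only at the boundary, by delegating to u16.
    fn to_le_bytes(self) -> [u8; 2] { self.0.to_le_bytes() }
    fn to_be_bytes(self) -> [u8; 2] { self.0.to_be_bytes() }
    fn to_ne_bytes(self) -> [u8; 2] { self.0.to_ne_bytes() }
    fn from_le_bytes(b: [u8; 2]) -> Self { Reg(u16::from_le_bytes(b)) }
    fn from_be_bytes(b: [u8; 2]) -> Self { Reg(u16::from_be_bytes(b)) }
    fn from_ne_bytes(b: [u8; 2]) -> Self { Reg(u16::from_ne_bytes(b)) }
}
```

Because every conversion delegates to the primitive, the field accessors behave identically on little-endian and big-endian hosts; only the explicit to_/from_ calls decide what goes on the wire.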
This interpretation apparently departs from the current implementation, which uses byte arrays under the hood. To be frank, I think that’s a dubious implementation to begin with. However, it does allow for space optimization when the number of bits is something like 24 bits instead of 32, and it allows for very large bitfields. I’m not sure the advantages are significant (you can break up a bitfield if needed and the space overhead is likely minimal). As a compromise if that’s still desired, I suggest using a byte array when a representation isn’t provided, but consider not providing a way to access the bytes.
In summary, I recommend doing the following:
- into_bytes and from_bytes
- in_le_bytes, in_be_bytes, in_ne_bytes, from_le_bytes, from_be_bytes, and from_ne_bytes. These should basically just be calls to the equivalent function provided by the underlying type.

Yeah, in the meantime, I have even worked with protocols which use mixed LSB/MSB encoding (yes, I am talking to you, IEEE 802.15.4).
Probably the proper way is to introduce LE16 and BE16 types in place of B16 so that the endianness can be specified on a field-by-field basis. This gets very close to the zerocopy crate, but that one doesn’t have bitfield support (which is kind of the selling point of this library), so I don’t see a conflict here. We would need to define what the odd types (e.g. B15) mean for the two variants. For example:

How does the memory layout when foo is stored as LE differ from the layout when foo is stored as BE? What does it even mean to define endianness for 15 bits? Maybe we should just not support this for non-even fields (i.e. LE16 but no LE15)?

Hm, I think I see the use case (correct me if I’m wrong)… imagine it’s some HW register somewhere, you could:
vs. having to do a two-step process:
It’d prevent you from being able to atomically set/get all fields simultaneously, but that may be fine. In this case there’s only one interesting field anyway.