libipld: Block API
There is a new Block API in the JavaScript IPLD implementation, which I find quite nice.
I was thinking about using a similar one for Rust IPLD. I’m still implementing/playing around with the idea, but I thought I’d publish a draft early on:
Block
A Block is an IPLD object together with a CID. The data can be encoded and decoded.
All operations are cached. This means that encoding, decoding and CID calculation each happen at most once. All subsequent calls will use the cached result.
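The "at most once" behaviour can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `CachedBlock`, the `decode_runs` counter, the one-field `Ipld` stand-in and the toy UTF-8 "codec" are not part of the draft API.

```rust
/// Stand-in for an IPLD value; the real `Ipld` type is a much richer enum.
#[derive(Clone, Debug, PartialEq)]
pub struct Ipld(pub String);

/// Hypothetical block that caches its decoded form.
pub struct CachedBlock {
    raw: Vec<u8>,
    node: Option<Ipld>,
    /// Counts how often the toy decoder actually ran.
    pub decode_runs: u32,
}

impl CachedBlock {
    pub fn new(raw: Vec<u8>) -> Self {
        CachedBlock { raw, node: None, decode_runs: 0 }
    }

    /// Decode on first use, then serve the cached value.
    pub fn decode(&mut self) -> Ipld {
        if self.node.is_none() {
            self.decode_runs += 1;
            // Toy "codec": interpret the bytes as (lossy) UTF-8 text.
            let text = String::from_utf8_lossy(&self.raw).into_owned();
            self.node = Some(Ipld(text));
        }
        self.node.clone().expect("cache was filled above")
    }
}
```

Calling `decode()` twice on the same `CachedBlock` leaves `decode_runs` at 1, because the second call only clones the cached value.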
Methods
impl<'a> Block<'a>
pub fn new<R>(cid: Cid, raw: Vec<u8>) -> Self where R: Registry
Create a new Block from the given CID and raw binary data.
It needs a registry that contains the codec and hash algorithm implementations in order to be able to decode the data into IPLD.
pub fn encoder(node: Ipld, codec: &'a dyn Codec<Error = Error>, hash_alg: &'a dyn Hash) -> Self
Create a new Block from the given IPLD object, codec and hash algorithm.
No computation is done up front; the CID creation and the encoding are only performed when the corresponding methods are called.
pub fn decoder(raw: Vec<u8>, codec: &'a dyn Codec<Error = Error>, hash_alg: &'a dyn Hash) -> Self
Create a new Block from encoded data, codec and hash algorithm.
No computation is done up front; the CID creation and the decoding are only performed when the corresponding methods are called.
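A rough sketch of how the two lazy constructors could fit together is given here. The `Codec` and `Hash` trait shapes, the `Error` struct, the one-field `Ipld` stand-in, `Utf8Codec` and `SumHash` are all assumptions invented for this example, and `encode`/`decode` return a `Result` here, which differs from the draft signatures.

```rust
/// Stand-in for an IPLD value (assumption: the real type is a richer enum).
#[derive(Clone, Debug, PartialEq)]
pub struct Ipld(pub String);

/// Hypothetical error type shared by all codecs in this sketch.
#[derive(Debug, PartialEq)]
pub struct Error(pub String);

pub trait Codec {
    type Error;
    fn encode(&self, node: &Ipld) -> Result<Vec<u8>, Self::Error>;
    fn decode(&self, raw: &[u8]) -> Result<Ipld, Self::Error>;
}

pub trait Hash {
    fn digest(&self, data: &[u8]) -> Vec<u8>;
}

/// Toy codec: the node's text is stored as its UTF-8 bytes.
pub struct Utf8Codec;

impl Codec for Utf8Codec {
    type Error = Error;
    fn encode(&self, node: &Ipld) -> Result<Vec<u8>, Error> {
        Ok(node.0.as_bytes().to_vec())
    }
    fn decode(&self, raw: &[u8]) -> Result<Ipld, Error> {
        String::from_utf8(raw.to_vec())
            .map(Ipld)
            .map_err(|e| Error(e.to_string()))
    }
}

/// Toy hash: wrapping byte sum. Not a real multihash.
pub struct SumHash;

impl Hash for SumHash {
    fn digest(&self, data: &[u8]) -> Vec<u8> {
        vec![data.iter().fold(0u8, |acc, b| acc.wrapping_add(*b))]
    }
}

pub struct Block<'a> {
    node: Option<Ipld>,
    raw: Option<Vec<u8>>,
    codec: &'a dyn Codec<Error = Error>,
    /// Kept for a later `cid()` call; unused in this sketch.
    #[allow(dead_code)]
    hash_alg: &'a dyn Hash,
}

impl<'a> Block<'a> {
    /// Start from an IPLD object; nothing is encoded yet.
    pub fn encoder(node: Ipld, codec: &'a dyn Codec<Error = Error>, hash_alg: &'a dyn Hash) -> Self {
        Block { node: Some(node), raw: None, codec, hash_alg }
    }

    /// Start from encoded bytes; nothing is decoded yet.
    pub fn decoder(raw: Vec<u8>, codec: &'a dyn Codec<Error = Error>, hash_alg: &'a dyn Hash) -> Self {
        Block { node: None, raw: Some(raw), codec, hash_alg }
    }

    pub fn encode(&mut self) -> Result<Vec<u8>, Error> {
        if self.raw.is_none() {
            let node = self.node.as_ref().expect("a block holds a node or raw bytes");
            self.raw = Some(self.codec.encode(node)?);
        }
        Ok(self.raw.clone().expect("cache was filled above"))
    }

    pub fn decode(&mut self) -> Result<Ipld, Error> {
        if self.node.is_none() {
            let raw = self.raw.as_ref().expect("a block holds a node or raw bytes");
            self.node = Some(self.codec.decode(raw)?);
        }
        Ok(self.node.clone().expect("cache was filled above"))
    }
}
```

Under these assumptions, a block built with `encoder()` and one built with `decoder()` from the first block's bytes decode to the same value, and each side only does the work that was not already supplied at construction time.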
pub fn decode(&mut self) -> Ipld
Decode the Block into an IPLD object.
The actual decoding is only performed if the object doesn’t have a copy of the IPLD object yet. If that method was called before, it returns the cached result.
pub fn encode(&mut self) -> Vec<u8>
Encode the Block into raw binary data.
The actual encoding is only performed if the object doesn’t have a copy of the encoded raw binary data yet. If that method was called before, it returns the cached result.
pub fn cid(&mut self) -> Cid
Calculate the CID of the Block.
The CID is calculated from the encoded data. If it wasn’t encoded before, that operation will be performed. If the encoded data is already available, from a previous call of encode() or because the Block was instantiated via decoder(), then it isn’t re-encoded.
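The dependency of the CID on the encoded bytes can be sketched as follows. This is a toy under stated assumptions: `LazyBlock`, the run counters, the `u64` `Cid` stand-in and the FNV-1a `toy_hash` are hypothetical and only serve to show when work happens; a real CID also carries a version, a codec and a multihash.

```rust
/// Toy content address; a real CID carries version, codec and multihash.
#[derive(Clone, Debug, PartialEq)]
pub struct Cid(pub u64);

/// FNV-1a, used here only as a deterministic stand-in for a real hash.
fn toy_hash(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in data {
        h ^= u64::from(*b);
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

pub struct LazyBlock {
    node: Option<String>,
    raw: Option<Vec<u8>>,
    cid: Option<Cid>,
    pub encode_runs: u32,
    pub hash_runs: u32,
}

impl LazyBlock {
    /// Start from a (toy) IPLD node; bytes and CID are computed on demand.
    pub fn encoder(node: &str) -> Self {
        LazyBlock { node: Some(node.to_string()), raw: None, cid: None, encode_runs: 0, hash_runs: 0 }
    }

    /// Start from already encoded bytes; the encoder never has to run.
    pub fn decoder(raw: Vec<u8>) -> Self {
        LazyBlock { node: None, raw: Some(raw), cid: None, encode_runs: 0, hash_runs: 0 }
    }

    pub fn encode(&mut self) -> Vec<u8> {
        if self.raw.is_none() {
            self.encode_runs += 1;
            let node = self.node.as_ref().expect("a block holds a node or raw bytes");
            // Toy "codec": the UTF-8 bytes of the text.
            self.raw = Some(node.as_bytes().to_vec());
        }
        self.raw.clone().expect("cache was filled above")
    }

    pub fn cid(&mut self) -> Cid {
        if self.cid.is_none() {
            // Encodes only if the bytes are not cached yet.
            let raw = self.encode();
            self.hash_runs += 1;
            self.cid = Some(Cid(toy_hash(&raw)));
        }
        self.cid.clone().expect("cache was filled above")
    }
}
```

In this sketch, calling `cid()` on a block created from a node triggers exactly one encode and one hash, while a block created from raw bytes hashes those bytes directly and never encodes.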
About this issue
- State: closed
- Created 4 years ago
- Comments: 30 (16 by maintainers)
Next week 😕
Also reading the proposal again, the concept of a registry is weird in a compiled language. What’s wrong with something like this:
Not an expert but some high level thoughts: it seems like by implementing a set of independent traits one can “eat your cake and have it too” while both keeping the code idiomatic to Rust and also achieving the API signature that you want / are used to.
I think some careful thought about how to design the individual traits, along with as much helpful context as we can get, is going to be the trick here. This might also be something one can reason about and experiment with in either the tests or the examples folder, since it’s hard to design all this abstractly without building toward a use case or two, no matter how contrived and simple.
One last and rather important point I somehow forgot. Part of the value of IPLD is to be codec agnostic. That flexibility is lost if you build on codec-specific interfaces, and a lot of thought went into how the Block API could maintain that flexibility throughout the stack. It’s probably worth pulling in @Gozala since he provided a lot of the insight here when we were designing the new JS Block interface.
I was wondering what store is about and then found it in an older comment from @dvc94ch that I missed replying to.
The current direction of IPLD implementations is to move away from having block storage as a central piece. The data should be able to come from anywhere: memory, network or disk. Hence the idea of the Block API came up. You get the block from “somewhere” and then work with it.
I can speak pretty well to how this ends up being used in JavaScript.
Once you start layering different interfaces you end up with a lot of actors doing encoding, decoding, and/or hashing. Combining these into a single interface means that you can provide a common API to pass around that normalizes and caches all of this.
This also means that you can pass a single interface to any actor, regardless of whether or not they want the encoded state, the decoded state, or the content address for linking. All of these states can be generated just-in-time and then cached for future consumers of the block.
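To make the "single interface for every actor" point concrete in Rust terms, here is a small sketch. The `Block` shape, the `persist` and `link_to` consumers, the `u64` `Cid` stand-in and the byte-sum address are all hypothetical and are not taken from the JavaScript implementation or from the draft above.

```rust
use std::collections::HashMap;

/// Toy content address; not a real CID.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Cid(pub u64);

pub struct Block {
    node: String,
    raw: Option<Vec<u8>>,
    cid: Option<Cid>,
    /// Counts how often the toy encoder actually ran.
    pub encode_runs: u32,
}

impl Block {
    pub fn encoder(node: &str) -> Self {
        Block { node: node.to_string(), raw: None, cid: None, encode_runs: 0 }
    }

    pub fn encode(&mut self) -> Vec<u8> {
        if self.raw.is_none() {
            self.encode_runs += 1;
            // Toy "codec": the UTF-8 bytes of the text.
            self.raw = Some(self.node.as_bytes().to_vec());
        }
        self.raw.clone().expect("cache was filled above")
    }

    pub fn cid(&mut self) -> Cid {
        if self.cid.is_none() {
            let raw = self.encode();
            // Toy address: wrapping byte sum, not a real multihash.
            let sum = raw.iter().fold(0u64, |acc, b| acc.wrapping_add(u64::from(*b)));
            self.cid = Some(Cid(sum));
        }
        self.cid.clone().expect("cache was filled above")
    }
}

/// Consumer 1: a store that wants the address and the encoded bytes.
pub fn persist(store: &mut HashMap<Cid, Vec<u8>>, block: &mut Block) {
    store.insert(block.cid(), block.encode());
}

/// Consumer 2: a linker that only wants the content address.
pub fn link_to(block: &mut Block) -> Cid {
    block.cid()
}
```

Both consumers receive the same `Block` and ask only for the state they need. Whichever runs first pays for the encoding, and the other reuses the cached bytes, so `encode_runs` stays at 1 after both have run.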