wasm-c-api: C++ opaque handle implementation is based on undefined behaviour
If a program attempts to access the stored value of an object through a glvalue whose type is not similar ([conv.qual]) to one of the following types the behavior is undefined:52
- the dynamic type of the object,
- a type that is the signed or unsigned type corresponding to the dynamic type of the object, or
- a char, unsigned char, or std::byte type.
The implementation of opaque handles dynamically allocates an object of one type and casts it to an unrelated type. Accessing the object through a pointer to one of the API types (e.g. in order to call copy
or kind
) is then undefined behaviour.
Is there documentation somewhere which outlines the use of this strategy as opposed to inheritence- or composition-based ones which don’t break this rule? I realise that it may work on compilers which this has been tested with, but it seems somewhat risky to mandate UB through the interface, as implementations may change, or this may trigger sanitizer fails.
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Reactions: 8
- Comments: 17 (11 by maintainers)
Commits related to this issue
- Fix wasm.cc to build against the wasm.hh presently in https://github.com/WebAssembly/wasm-c-api/issues/119 . — committed to wasmerio/wasmer by nlewycky 4 years ago
Hi, I am UBsan, and I’m currently complaining about this.
There’s a straight-forward fix:
own<T>
fromunique_ptr<T>
tounique_ptr<T, Deleter>
where Deleter calls destroy()Foo::make()->destroy()
doesn’t compile (if it did, it would lead to a double-free).Then implementers can:
Example: wasm.hh has:
and impl-wasm.cc has:
A few comments on this approach:
A
unique_ptr<Derived, Deleter<Derived>>
doesn’t freely cast to aunique_ptr<Base, Deleter<Base>>
. I think the easiest fix it to pass the explicit T to make_own when doing a make + downcast combined as shown in my definition of make() above.With this approach the potential bug of deleting the object of base type is prevented:
which is necessary if we expect the implementer to extend the size of the object when deriving.
The other main alternative is to use
virtual
. I assume we’re avoiding that?I’ve mostly implemented it in the draft PR #161. There’s a pre-existing bug with Shared<> that I’m not sure what approach to take to address and am looking for input. wasm-v8.cc defines an explicit template specialization for
Shared<Module>::~Shared()
, which is a problem because no specialization was declared and the destructor (or nowdestroy()
in my approach) may be implicitly instantiated. I know this aspect of C++ templates is not well understood so I’ll give a primer:std::vector<char>
for instance. For implicit instantiations, multiple definitions is not a linker error and it is optional whether the compiler chooses to emit the the symbols to the .o file (for example, it may depend on the optimization level whether a function template is inlined into all callers so now there is no need for a symbol – this is what leads to most cases of mysterious C++ template linking errors).template Shared<int>;
).extern template Shared<int>;
).template<> struct Shared<int> { /* ... */ };
)template<> struct Shared<int>;
)(not discussed: partial templates or template templates)
It’s incorrect to call an explicit specialization when expecting an instantiation per https://eel.is/c++draft/temp.expl.spec#7.sentence-1 . So this combination: wasm.hh:
wasm-v8.cc:
is a non-starter. (If it helps, you can imagine that an explicit specialization has a different type signature, so it could be mangled differently, though both popular ABIs chose to mangle them the same which is why it still links.) Any code that might trigger implicit instantiation of the declaration of d’tor/destroy() must know that this is defined with an explicit specialization.
How would we do that? We could forbid C++ implementations from using an explicit specialization. We could tell users that it’s normal to have to include a second vendor-specific header file even though the entire interface defined in wasm.hh is portable at the source level. We could declare all the explicit specializations in wasm.hh and require all implementations to define for each specialization? We could replace
Shared<T>
with a non-template?