libc: How to deal with breaking changes on platform ? [BSDs related]

I am opening this issue on libc because this is where the problems will first show up. Depending on the solution chosen, modifications could be needed in the rustc repository too.

At OpenBSD, we don’t care about breaking API/ABI between releases. Once a release is done, the API/ABI is stable, but there is no guarantee that it will be compatible with the next release.

Currently, in the upcoming 6.2 version of OpenBSD (6.1-current), there is a breaking change that will affect libc: si_addr should be of type void *, not char * (caddr_t). Here is the current definition in libc.

Under OpenBSD, we deal with ABI under LLVM by using a triple like amd64-unknown-openbsd6.1. For Rust, we instead use an unversioned platform, so all OpenBSD versions are treated as defining the same ABI (which isn’t quite right).

Do you think it would be possible to switch from *-unknown-openbsd to *-unknown-openbsd6.0, *-unknown-openbsd6.1, … without having to duplicate all the code in libc for each target, and without having to add a new target in rustc for each OpenBSD release?

Any other ideas on how to deal with it?

About this issue

  • Original URL
  • State: open
  • Created 7 years ago
  • Reactions: 8
  • Comments: 85 (66 by maintainers)

Most upvoted comments

@asomers @semarie This is an artifact of myopic developers who have nothing more than a superficial understanding of the Linux ecosystem post-2010 - Linux is the post-M$ monoculture. This is going to be a pervasive issue over the longer term and thus runs counter to Rust’s mission statement as a general systems language. This isn’t rocket science and I hope we don’t end up needing to go down the Go “replace all the host libs” route.

Not only does it not solve OpenBSD’s use case; it doesn’t solve the use case where operating systems make changes that don’t break the ABI. Both FreeBSD and Linux occasionally change syscalls and provide backwards compatible syscalls with the old signature and syscall number but a new name. For example, FreeBSD 8’s “compat7.shmctl” syscall is identical to FreeBSD 7’s “shmctl”. Similarly, operating systems make changes to system libraries and provide backwards compatibility by bumping the SHLIB version and providing the old libraries as optional packages.

Currently libc handles neither of these cases. Either the libc binding tracks the new function’s signature, which breaks Rust programs at runtime on older versions, or the libc binding stays with the old signature, which breaks Rust programs at runtime on newer versions. It’s simply not possible for the current libc to compile correctly on multiple versions of an operating system. Your previously suggested solution is to simply remove a binding whenever the OS changes it. But that would break any crates that use the binding, violate semver, and still result in runtime failures for crates that use the old libc but were built on a new OS.

You suggest that libc’s consumers should be responsible for versioning issues, but I don’t think that’s possible. Let’s take stat(2), for example, which will likely change in FreeBSD 12. Suppose that when FreeBSD 12 is released, somebody tries to compile the nix crate on it. The linker will be satisfied that stat is present in libc, but the signature will be totally wrong, so nix will fail at runtime. Cargo won’t produce any kind of warning. If I understand you correctly, you suggest that stat should at this point be removed from libc. But that won’t fix nix until somebody updates that crate’s dependencies, and even then it will only change a runtime failure into a compile time failure. Must the nix developer then write a build script that checks __FreeBSD_version and reimplement all of stat’s FFI bindings for FreeBSD 12? That would finally fix the problem. But according to crates.io, libc has 1268 dependent crates, and all of them would have to independently write the same build script and add the same FFI bindings for stat on FreeBSD 12.
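To make the "build script that checks __FreeBSD_version" idea concrete, here is a minimal sketch of what each dependent crate would end up writing. It assumes the usual encoding of `__FreeBSD_version` (major * 100000 + minor * 1000 + a counter); the function names and the `freebsdNN` cfg names are hypothetical and are not taken from libc or nix.

```rust
// Hypothetical build.rs sketch: map a __FreeBSD_version value to a cfg flag.
// Assumes the encoding major * 100000 + minor * 1000 + counter.

/// Split a `__FreeBSD_version` value into (major, minor).
fn freebsd_major_minor(version: u32) -> (u32, u32) {
    (version / 100_000, (version / 1_000) % 100)
}

/// Build the Cargo directive that would enable e.g. `#[cfg(freebsd12)]`.
/// The `freebsdNN` cfg names are illustrative only.
fn cfg_directive(version: u32) -> String {
    let (major, _minor) = freebsd_major_minor(version);
    format!("cargo:rustc-cfg=freebsd{}", major)
}

fn main() {
    // A real build script would have to obtain this value from the target
    // system; it is hard-coded here so the sketch stays runnable anywhere.
    let version = 1_200_086; // an example value in the FreeBSD 12 range
    println!("{}", cfg_directive(version));
}
```

Note that obtaining the value from the build host is exactly the step that does not work when cross-compiling, which is part of why duplicating this in every crate is unattractive.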

Alternatively, libc could assume that all operating systems provide backwards but not forwards compatibility (sorry OpenBSD). Then it could pick a minimum supported version, and always link against that version’s shared libraries. Currently Cargo doesn’t provide a mechanism to specify an exact shared library version to link against, but that could be added. This would fix all of the runtime failures, but at substantial cost: dependent crates would lack access to new OS features, and both developers and users would have to install the compat library packages. Not only would new features that change APIs be unavailable, but the shared library lock would mean that entirely new functions would be unavailable as well, unlike the current situation where using newly added functions will generate link failures when building on an old OS.

In either case, developers will likely fork libc to update their favorite bindings, resulting in a Balkanization of libc and dependent crates that don’t support older OS versions.

I understand that cross-compilation is a really cool feature, but I fear that you’re underestimating the severity of this problem. Have you looked into how embedded cross development toolchains work? AFAIK the host system requires full headers for the target. Maybe Rust needs to do the same.

@alexcrichton I pushed a WIP branch to my GitHub repository. I hope the code will be more explicit than my explanation of what I called support for OS versions.

Tree is at https://github.com/semarie/rust/tree/target-os-version . Please note my code isn’t working for now.

Basically, it is:

  • extending Target to embed a (possibly empty) os-version string
  • exposing the string as a target_os_version symbol (in the same way as target_os)

It would then be possible to have conditional code on the OS version (OpenBSD 6.1 or OpenBSD 6.0), in the same way we have conditional code on the OS name (OpenBSD or FreeBSD).
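As a rough illustration of what such conditional code might look like, here is a sketch using the si_addr change from the original report. `target_os_version` is the cfg name proposed in this thread, not something rustc defines today, and the `SiAddr` alias is a hypothetical stand-in rather than the real libc definition. Because the cfg is unset, the first branch is the one that compiles.

```rust
// Sketch only: `target_os_version` is the cfg proposed in this thread;
// rustc does not define it, so the `not(...)` branch is selected here.
use std::os::raw::{c_char, c_void};

// OpenBSD 6.1 and earlier: si_addr is a caddr_t (char *).
#[cfg(not(target_os_version = "6.2"))]
pub type SiAddr = *mut c_char;

// OpenBSD 6.2: si_addr becomes void *.
#[cfg(target_os_version = "6.2")]
pub type SiAddr = *mut c_void;

fn main() {
    // Code that only stores or compares the pointer works with either alias.
    let addr: SiAddr = std::ptr::null_mut();
    println!("si_addr is null: {}", addr.is_null());
}
```

Recent compilers may warn about an unexpected cfg name here, since the name is not registered anywhere.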

@asomers How would that work exactly? Have code inside the libc crate which is protected by #[cfg(min_libc_version = ...)]?

Exactly. Then consumers can set the min_libc_version in their own .cargo/config files.
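A minimal sketch of that arrangement, assuming a custom cfg named `min_libc_version` as in the question above. The constant and its values are hypothetical; the cfg could be supplied with something like RUSTFLAGS='--cfg min_libc_version="12"' or through the build.rustflags key of a .cargo/config file.

```rust
// Sketch only: `min_libc_version` is the hypothetical cfg discussed above.
// Without the flag, the `not(...)` branch is compiled.

/// Width in bits of the inode type the bindings were compiled for.
#[cfg(min_libc_version = "12")]
pub const INO_BITS: u32 = 64;

/// Fallback used when no minimum version was requested.
#[cfg(not(min_libc_version = "12"))]
pub const INO_BITS: u32 = 32;

fn main() {
    println!("bindings built for a {}-bit ino_t", INO_BITS);
}
```

One limitation of this form is that a key="value" cfg only matches exact strings, so expressing "version 12 or newer" would need either one flag per version or a set of cumulative flags.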

@comex Can you distill the implications of your last post?

Yes, Rust has its own notions of target and outside of Linux and Windows that notion is wrong. I just want to be able to use Rust across different 3-tuples (OS/ABI/arch) as opposed to the current 2-tuple (OS/arch) with the ABI set to wherever they happened to copy the header values from.

@comex you are confusing ELF symbol version with soname versioning. Libraries bump their sonames due to major changes, like ncurses did two years ago and glibc did in 1997. But ELF symbol versioning is used to maintain backwards compatibility while making minor changes to the library. Glibc is very conservative in its use of ELF symbol versioning, but it does occasionally bump a symbol version. Look through its sources and you’ll find a few examples.

Rust’s difficulty comes from its extensive use of FFI and cross-compiling, which cause it to ignore all header files. The problem isn’t backwards or forwards compatibility per-se. Rather, it’s that libc simply cannot express bindings for more than one version of a library. Right now, it simply isn’t possible to build a Rust program that uses libc and will run on FreeBSD 12. It doesn’t matter what host you build it on. And if libc ever updates its bindings to the FreeBSD 12 ones, then it will no longer be possible to build a Rust program that will work on FreeBSD 11.

@asomers As an old timer I can tell you that Linux distros never used symbol versioning with great success and thus shared libraries were viewed as broken by design. At my work places they would turn off automatic RHEL security updates because it frequently crippled all their systems with incompatible symbol resolution. Moving to a frozen ABI is a (healthy) response to their inability to make versioning work. The Rust developers’ assumption that everything looks like Linux has led them to overlook the fact that the source contract is not the same as the ABI contract. Building for a given version of the OS assumes that you’re using the system headers from that version. FreeBSD will happily run dynamically linked binaries at least as far back as 4.x if not much further. However, if you’re building on version X you’re expected to use headers for version X, not whatever version you originally chose to copy the definitions from.

https://github.com/rust-lang/libc/issues/775

I don’t quite understand how Rust initially missed out on OS and ABI versioning quite so badly - deciding to assume that structures and types are immutable over time, or removing key structures from libc altogether. Nonetheless, Rust can conditionally compile based on configuration values, so what is stopping this?

So far people have proposed 4 basic solutions to this problem, and they’re all inadequate:

  • Adding a build script would break cross-compiling.
  • Changing target triples into target quadruples (eg amd64-unknown-openbsd6.1) would be way too much work.
  • Adding backwards-compat feature flags (eg freebsd-11-compat) doesn’t work for non-backwards compatible changes, such as OpenBSD’s.
  • Splitting libc into subcrates and keeping only the minimum functionality in libc itself isn’t really a solution at all; it just kicks the can down the road.

In addition, none of those solutions even attempt to address the problem faced by rare or proprietary OS forks. Right now if somebody makes a new OS fork, like Bitrig, TrueOS, or yet another shiny Android fork, the only way to get Rust support is to add a new target_os in Rust. That’s way too much work for forks that will only change a handful of syscalls. Worse, adding a new target_os isn’t even possible for proprietary forks like Isilon’s OneFS. The only way to solve the problem is to absolve rust-lang/libc from the responsibility of maintaining C library bindings for every single OS version in the world. OSes need the power to vendor their own libc crate. I propose a new way to do this, which would require a small change to Cargo but no changes to libc. Since it’s not a libc change per se, let’s discuss it on the forum here: https://internals.rust-lang.org/t/pre-rfc-global-source-replacement-in-cargo-for-os-bindings/9383 .

This is why this oversight really perplexes me. Having the version be an explicit part of the OS target is not hidden and is visible to anyone who has looked at a configure script for any cross-platform OSS program:

 | |                     \-+- 69334 mmacy make DIRPRFX=lib/clang/ all
 | |                       \-+= 69335 mmacy sh -e
 | |                         \-+- 69336 mmacy make all DIRPRFX=lib/clang/libllvm/
 | |                           |-+= 71356 mmacy sh -ev
 | |                           | \-+- 71357 mmacy c++ -O2 -pipe -I/home/mmacy/devel/build/usr/home/mmacy/devel
 | |                           |   \--- 71358 mmacy /usr/bin/c++ -cc1 -triple x86_64-unknown-freebsd11.1 -emit
 | |                           |-+= 71499 mmacy sh -ev
 | |                           | \-+- 71500 mmacy c++ -O2 -pipe -I/home/mmacy/devel/build/usr/home/mmacy/devel
 | |                           |   \--- 71501 mmacy /usr/bin/c++ -cc1 -triple x86_64-unknown-freebsd11.1 -emit
 | |                           |-+= 71769 mmacy sh -ev
 | |                           | \-+- 71770 mmacy c++ -O2 -pipe -I/home/mmacy/devel/build/usr/home/mmacy/devel
 | |                           |   \--- 71771 mmacy /usr/bin/c++ -cc1 -triple x86_64-unknown-freebsd11.1 -emit
 | |                           |-+= 71772 mmacy sh -ev
 | |                           | \-+- 71773 mmacy c++ -O2 -pipe -I/home/mmacy/devel/build/usr/home/mmacy/devel
 | |                           |   \--- 71774 mmacy /usr/bin/c++ -cc1 -triple x86_64-unknown-freebsd11.1 -emit
 | |                           |-+= 71778 mmacy sh -ev
 | |                           | \-+- 71779 mmacy c++ -O2 -pipe -I/home/mmacy/devel/build/usr/home/mmacy/devel
 | |                           |   \--- 71780 mmacy /usr/bin/c++ -cc1 -triple x86_64-unknown-freebsd11.1 -emit
 | |                           |-+= 71781 mmacy sh -ev
 | |                           | \-+- 71782 mmacy c++ -O2 -pipe -I/home/mmacy/devel/build/usr/home/mmacy/devel
 | |                           |   \--- 71783 mmacy /usr/bin/c++ -cc1 -triple x86_64-unknown-freebsd11.1 -emit
 | |                           |-+= 71784 mmacy sh -ev
 | |                           | \-+- 71785 mmacy c++ -O2 -pipe -I/home/mmacy/devel/build/usr/home/mmacy/devel
 | |                           |   \--- 71786 mmacy /usr/bin/c++ -cc1 -triple x86_64-unknown-freebsd11.1 -emit
 | |                           \-+= 71787 mmacy sh -ev
 | |                             \-+- 71788 mmacy c++ -O2 -pipe -I/home/mmacy/devel/build/usr/home/mmacy/devel
 | |                               \--- 71789 mmacy /usr/bin/c++ -cc1 -triple x86_64-unknown-freebsd11.1 -emit

@kev009 - Exactly. A big selling point of Rust is its efforts to eliminate UBs. In light of that, no one can really claim it to be “cross platform” until this issue is formally addressed.

This isn’t just a problem for OpenBSD. FreeBSD 12, when it comes out, will change a number of important types, like ino_t and struct stat. If libc’s policy is to only bind the greatest common denominator between versions, then over time it will shrink into irrelevance. Such a policy really just kicks the version compatibility can down the road.
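To see why a single set of bindings cannot cover both versions, consider this sketch with two simplified, hypothetical stand-ins for struct stat. Only the inode width differs (32-bit versus 64-bit); the real structures have many more fields, and the names here are invented for illustration.

```rust
use std::mem::size_of;

// Simplified, hypothetical stand-ins for two versions of `struct stat`.
// Only the inode width differs; real definitions have many more fields.
#[repr(C)]
pub struct StatOld {
    pub st_ino: u32, // 32-bit ino_t
    pub st_mode: u16,
}

#[repr(C)]
pub struct StatNew {
    pub st_ino: u64, // 64-bit ino_t
    pub st_mode: u16,
}

fn main() {
    // A binding compiled against one layout but run against the other
    // would pass a wrongly sized buffer and read fields at wrong offsets.
    println!(
        "old: {} bytes, new: {} bytes",
        size_of::<StatOld>(),
        size_of::<StatNew>()
    );
}
```

Since the compiler only ever sees one of the two definitions, the mismatch cannot be detected at compile or link time; it only shows up as wrong data at runtime.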

Would it be possible to generate bindings dynamically at build time? When writing Ruby bindings, I’ve always preferred that approach to FFI. If not, then I think libc needs a way to distinguish between OS versions, just as it currently distinguishes between OSes.

@gnzlbg @semarie’s proposal would work well for OpenBSD, but at great cost to the Rust ecosystem. Adding target_os’s for every version of FreeBSD and NetBSD would have even more cost. And it wouldn’t solve the problem for rare and proprietary OSes. This is why I proposed an alternate solution at https://internals.rust-lang.org/t/pre-rfc-global-source-replacement-in-cargo-for-os-bindings/9383 . In a nutshell:

  • Rust would work on any OS based on one of the supported ones, including weird forks like the Nintendo Switch.
  • Rustup would only support one version of each OS. For example, Rustup would only provide a toolchain for the latest OpenBSD. Older versions of OpenBSD would be supported through OpenBSD’s package manager.
  • Where possible, Rustup’s toolchain would be backwards compatible with a range of OS versions. For FreeBSD, that means the toolchain would target FreeBSD 11 (but could build crates targeting FreeBSD 12). In a few years Rustup would retarget its toolchain to FreeBSD 12.
  • libc would no longer have to worry about the differences between different platform versions. In the case of FreeBSD, libc would target one particular version (probably 11), and a FreeBSD-12 specific libc would be provided by FreeBSD itself.
  • Most of the code changes would be confined to Cargo.

@raphaelcohn please don’t rewrite every OS’s C library in Rust. That’s basically what Go tried, and it hasn’t gone smoothly. It turns out there’s some hard stuff in the C libraries. Plus, it doesn’t help libc’s portability problem. It just kicks the can down the road. As I described in that forum post, I think the solution is more decentralization; libc wouldn’t have to support every single version of every single OS if Cargo allowed OSes to provide their own libcs.

Would it be possible to back up a bit and determine the ideal support matrix? For example, should libc limit support to the actively supported versions of an operating system (e.g. FreeBSD 11 & 12, OpenBSD 6.3 & 6.5, DragonFlyBSD 5.4)? If so it seems like while there would be some churn in the actual flags, the maintenance burden shouldn’t be so high. With such a narrow support matrix the following config flags could, in theory, work: unix, bsd, freebsd (a.k.a. freebsd12), freebsd11, freebsd12.

In any case, IMO, the ABI ought to be indicated somehow, whether it is tied to the OS release (e.g. OpenBSD) or to the library itself (e.g. glibc 1 vs 2).

In either case the top-level README seems a bit out of date in terms of what’s actually being tested.

@raphaelcohn

However, like communism, these sorts of centralizing approaches always seem to fail in practice (unless there’s serious money to be made). It’s a bit like the sirens; the call is alluring and impossible to counter, yet the reality is often ugly.

I have no idea what you’re talking about. What centralizing?

The idea proposed is that, since glibc sometimes lags behind on wrapping Linux syscalls, maybe the Linux kernel devs should provide a Linux kernel syscalls library to ensure that they’re available in a timely manner. It’d be no more centralized than having the same people maintain both the kernel source and the kernel menuconfig to make sure they stay in sync, which obviously works out just fine.

@gnzlbg I agree it would be a backwards incompatible change; it really depends how it is handled. At the moment, libc is ‘stuck’ in a 0.2 series of releases and so there is an opportunity - although it would require a major effort. Something like 6 months full time, I suspect, just to get a new, robust and agreed shape together.

I was thinking a little more over the weekend inspired by

with its own release schedule and I think that would also allow a solution to version legacy and its associated maintenance. A new major release of a per-platform-like library would be free to drop support for an older version. Alignment with OS releases (or at least, with deprecation of long term support releases and the like) becomes quite possible, but historic code, because of cargo’s excellent versioning, could continue to work.

Such a splitting out would also make it easier to introduce pull requests for very platform specific features, and would lessen the burden on those maintaining libc to know all of its oddities on many platforms.

@ssokolow Interesting. However, like communism, these sorts of centralizing approaches always seem to fail in practice (unless there’s serious money to be made). It’s a bit like the sirens; the call is alluring and impossible to counter, yet the reality is often ugly.

I was pondering on how best to tackle a newer libc structure, too, to make it easier to maintain and observe where things are ‘missing’. Something that seems useful is to have a layout of code in files which mirrors the public headers. Perhaps headers could even map roughly to modules? There can’t be a 1:1 mapping, and there would always be exceptions, but it might make some things a touch easier.

@raphaelcohn I agree that these approaches have downsides. I agree with you that:

A better long-term approach may be to actually slim down libc to just a sub-set of POSIX, and then provide companion crates, eg for musl, glibc, ulibc, FreeBSD X.X, etc, which contain library-specific features. It may be that a companion crate might actually cover several very closely related libraries (eg musl + glibc + ulibc),

And for windows the winapi crate already does this. For macos there are actually a couple of system libraries, and these are not exposed in one but in many crates. So each platform could have its own crate (e.g. freebsd, openbsd, etc.) with its own release schedule, and as long as the fundamental types (e.g. c_int) are defined somewhere else, all of these can interoperate.

The downside of this approach is that “slim down libc” is a backwards incompatible change. We might be able to extract the parts of libc that libstd actually uses into its own crate, and make sure that libstd is as “version independent” as possible.

I am still sad about that.

Please remember that FreeBSD 12 currently uses a compatibility layer for ino_t (FreeBSD 11 uses a 32-bit type, and FreeBSD 12 a 64-bit one) and silently truncates inode values that are not representable in 32 bits. One day it will hurt, and it could hurt badly.
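The truncation hazard is easy to reproduce in isolation. The sketch below is not the FreeBSD compatibility layer itself; it only shows, with hypothetical helper names, what happens when a 64-bit inode number is narrowed to 32 bits, and how a checked conversion would surface the problem instead of hiding it.

```rust
use std::convert::TryFrom;

/// Narrow a 64-bit inode number the lossy way, keeping only the low 32 bits.
fn truncate_ino(ino: u64) -> u32 {
    ino as u32
}

/// Narrow a 64-bit inode number, returning None for values that do not fit.
fn checked_ino(ino: u64) -> Option<u32> {
    u32::try_from(ino).ok()
}

fn main() {
    let big: u64 = 0x1_0000_0002; // does not fit in 32 bits
    println!("truncated: {}", truncate_ino(big));
    println!("checked: {:?}", checked_ino(big));
}
```

With silent truncation, two distinct files can appear to share an inode number, which is the kind of failure that is hard to trace back to a stale binding.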

Refusing to accept the reality that OS introduces breaking changes from time to time, and enforcing that at libc level is bad.

@comex I think you’re starting down the right track now. The current Rust libc is trying to do something that is not really tractable, an impedance mismatch with C language calling and linkage conventions. In effect: lazy binding with a static set of bindings. That feels like something you might do in a dynamic language, and there you might not care much about correctness or performance and can further do runtime introspection or generation to smooth out the issues. So I do feel strongly that the current Rust behavior here is undefined behavior on all platforms, and I think it is unfair to characterize it as a BSD issue. It happens to work widely for now, but it is a major design flaw that just happens to be visible on BSD right now and will fester as Rust wants to grow.

Let’s step back to buildeng history for a bit. If I were a major ISV, I would want to support some forward range of OS versions. We’ll choose Red Hat EL. So I build my app on EL5, and it will work for probably a very long string of future versions, perhaps with compat packages that maintain an old sover, since sover ought to be bumped for ABI changes. You can substitute Windows or HP-UX or anything else for Red Hat EL; OS makers usually strive hard to keep that kind of forward compat. There is one exception, OpenBSD, where forward compat is not guaranteed at all, and binaries are expected to match the running major.minor. But OpenBSD’s choices aren’t really an issue for C language authors if you understand the root issue; they just choose no amount of forward compat. You build on the oldest platform you wish to support for all OSes. As an implementation detail this could be in chroots, jails, containers or just plain path, compiler and linker flag gymnastics for cross building. That’s how things have been done for a long time.

Now, there is a second and much less reliable case for building with backwards compat. In a big IDE this might be a hidden build process which looks like the above path/flag munging to target the compiler to the right runtime and linkage. In something like MSVCxx that might be by bundling a runtime environment with the installer. I’m not sure rust should bother with this at the moment, but it’s not something I’ve put a great deal of thought into.

Another thing I haven’t put a great deal of thought into is direct usage of syscalls by the language or libraries. Things discussed like soname and sover go out the window with that.

@semarie that’s not really a nice way to put it; it’s been demonstrated to be an issue on “priority” OSes like Linux and OS X. The fact that many things work at all rests on undefined behavior, and this should not fester.

The problem just got worse. Linux 4.11, released today, added a new system call: statx. Until libc learns to understand versions, it can’t add support for statx. I really think that cargo needs some sort of configure step analogous to autoconf’s configure. https://www.phoronix.com/scan.php?page=news_item&px=Linux-4.11-Statx-System-Call

Unfortunately I don’t really know how we’d handle this, I just figured that platforms wouldn’t do this.

If this happens a lot we’ll just need to document what’s wrong and stop adding new bindings, it’ll be up to crates to implement version compatibility.