runtime: Fallback host rid is broken on non-portable builds

Microsoft’s portable builds fall back to the portable rid when the /etc/os-release rid is not known.
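For readers unfamiliar with the term, here is a rough sketch of how an "/etc/os-release rid" such as fedora.38-x64 can be derived. It assumes the rid is simply `ID.VERSION_ID-arch`. The function name `os_release_rid` and the sample file content are hypothetical, and the real host applies extra per-distro normalization that is not modelled here.

```python
from typing import Optional


def os_release_rid(os_release_text: str, arch: str) -> Optional[str]:
    """Build an 'ID.VERSION_ID-arch' rid from os-release content (simplified assumption)."""
    fields = {}
    for line in os_release_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be quoted with double or single quotes.
        fields[key] = value.strip().strip('"').strip("'")

    distro = fields.get("ID")
    if not distro:
        return None  # No usable rid; the caller would have to fall back.
    version = fields.get("VERSION_ID")
    platform = f"{distro}.{version}" if version else distro
    return f"{platform}-{arch}"


# Hypothetical os-release content for a freshly upgraded distro.
FEDORA_38 = 'NAME="Fedora Linux"\nVERSION_ID=38\nID=fedora\n'
print(os_release_rid(FEDORA_38, "x64"))  # fedora.38-x64
```

Because the result depends entirely on what the distro writes into the file, a .NET build made before a distro release cannot know the resulting rid ahead of time.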

This doesn’t work for non-portable builds because the information for the portable rid is missing in Microsoft.NETCore.App.deps.json.

For example, below is the full runtimes section of a Fedora 37 build. It does not include a section for linux-x64 (which is the fallback host rid).

  "runtimes": {
    "fedora.37-x64": [
      "fedora.37",
      "fedora-x64",
      "fedora",
      "linux-x64",
      "linux",
      "unix-x64",
      "unix",
      "any",
      "base"
    ]
  }

For source-build, there should be an exact match between the rid found here and the /etc/os-release rid. That may not be the case:

  • The /etc/os-release rid is unexpected (as in https://github.com/dotnet/runtime/issues/79196).
  • The distro gets upgraded, and the new version does not (yet) include packages for that .NET version. The existing .NET build then does not know the new rid (for example, fedora.38-x64 is unknown).

In both cases, the Microsoft build continues to work, while source-built builds start failing.
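The difference between the two builds can be sketched with a small model of the rid lookup. This is a simplification, not the hostpolicy code: `resolve_rid_chain` is a hypothetical helper, and `PORTABLE_GRAPH` is an assumed, heavily trimmed excerpt of what a portable build's runtimes section might contain. Only `SOURCE_BUILT_GRAPH` is taken from the Fedora 37 listing above.

```python
# Runtimes graph from the Fedora 37 source-built Microsoft.NETCore.App.deps.json.
SOURCE_BUILT_GRAPH = {
    "fedora.37-x64": [
        "fedora.37", "fedora-x64", "fedora", "linux-x64",
        "linux", "unix-x64", "unix", "any", "base",
    ],
}

# ASSUMPTION: trimmed excerpt of a portable build's graph, which has an
# entry for the portable rid itself.
PORTABLE_GRAPH = {
    "linux-x64": ["linux", "unix-x64", "unix", "any", "base"],
}


def resolve_rid_chain(graph, os_release_rid, portable_fallback_rid):
    """Return the host rid followed by its fallbacks, most specific first."""
    host_rid = os_release_rid
    if host_rid not in graph:
        # Unknown platform: switch to the portable rid for this OS and arch.
        host_rid = portable_fallback_rid
    # If the chosen rid has no entry in the graph, there is nothing to fall back to.
    return [host_rid] + graph.get(host_rid, [])


# After upgrading to Fedora 38, the rid from /etc/os-release is unknown to both builds.
print(resolve_rid_chain(PORTABLE_GRAPH, "fedora.38-x64", "linux-x64"))
print(resolve_rid_chain(SOURCE_BUILT_GRAPH, "fedora.38-x64", "linux-x64"))
```

In this model the portable build still gets a full chain (linux-x64, linux, unix-x64, ...), while the source-built graph yields only linux-x64 with no further fallbacks, because linux-x64 is not a key in its runtimes section.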

We should include this information in the non-portable builds so the fallback works.

cc @ViktorHofer @ericstj @am11 @omajid

About this issue

  • State: closed
  • Created a year ago
  • Comments: 45 (45 by maintainers)

Most upvoted comments

I think we should work towards breaking free of /etc/os-release dependency. It does not scale at all.

Some context: in .NET 7, work was done to remove the hard dependency on procfs (/proc), and now the runtime and SDK don’t fail even if you unmount /proc. The reason is that, even on Linux, there is no guarantee that procfs is present or functioning everywhere, even though it is a highly recommended facility provided by the Linux kernel (on FreeBSD, for instance, /proc is optional and not present by default).

/etc/os-release, in comparison to /proc, is a fragile resource for an ecosystem to depend on. It is the wrong contract for any serious operation. It was invented as part of systemd, which not all distros use as their init system; regardless, most distros add /etc/os-release as part of their base installation. We are parsing it in a number of places and failing the execution if we find something (which we think is) wrong. If the fragility of /etc/os-release doesn’t concern us, then even better: we should write our own (equally fragile) ~/.dotnet/distro-info.json file and depend on that instead of something in /etc that we don’t control.

It also occurs to me if we adopt this plan that dotnet --info showing a distro-specific RID would make no sense, at least for a portable Linux build.

I just realized that the host itself does not print the RID at all - only the SDK. Although it does rely on the host, since that is what RuntimeInformation.RuntimeIdentifier comes from.

it is OK to support portable RIDs for that scenario.

The proposal has the host (distro-specific or not) supporting portable RIDs - via the algorithmic model - in all scenarios, right? (Hence distro-specific app and cross-plat plugins should still work.)

Switch host to algorithmic model.

The algorithmic model enables the runtime to consume fedora.37-x64 from nuget packages. I’m not sure there are many of these. Such packages aren’t scalable because you need to create assets per distro version. Providing this algorithm depends on whether you want to support this use-case.

The main use-case for the specific rid (fedora.37-x64) is to distinguish portable assets from source-built assets. This already works because the source-build runtime recognizes the rid, allowing it to use assets that have been source-built for that rid.

This means that such executable will only run on Linuxes where the specific RID exists in the shared framework

The source-built apphosts are anyhow not portable due to their glibc dependency. An apphost built for Fedora 37 won’t work on RHEL8 for example.

@richlander when the host doesn’t recognize the rid in /etc/os-release, it uses a fallback rid. That fallback rid is the portable rid that corresponds to the platform. On Fedora 36, for example, that is linux-x64 for both the Microsoft portable build and the source-build non-portable build. However, the runtime on the non-portable build doesn’t understand what linux-x64 means (it’s not defined in Microsoft.NETCore.App.deps.json, as shown in the first comment), which breaks the fallback on the non-portable build.

We’ll make the fallback work on non-portable by changing the fallback rid to match the non-portable rid.

The bottom line is that both portable and non-portable builds will be able to use portable assets independent of what’s in /etc/os-release.

@vitek-karas you can assign this to me.

Or to change the fallback rid to one that is present in Microsoft.NETCore.App.deps.json: the target rid.

I spent some time yesterday thinking about this issue and what you propose is exactly what I imagined would be the right solution. Instead of finding a portable RID, just make sure that the target RID is honored in all scenarios (direct match and host fallback).

cc @vitek-karas

Rather than changing Microsoft.NETCore.App.deps.json to include the fallback rid, we may want to change the hostpolicy so the fallback rid matches the source-build target rid.

@ViktorHofer the code for the fallback is here:

https://github.com/dotnet/runtime/blob/58719ec90b3bbae527dd81685bf8670b993fe8f9/src/native/corehost/hostpolicy/deps_format.cpp#L126-L136

It falls back to the corresponding portable rid defined here:

https://github.com/dotnet/runtime/blob/58719ec90b3bbae527dd81685bf8670b993fe8f9/src/native/corehost/hostmisc/pal.h#L78-L90

Information about this portable rid is not present in a source-built Microsoft.NETCore.App.deps.json. So the fallback is broken.

The fix is either to add the fallback rid to Microsoft.NETCore.App.deps.json, or to change the fallback rid to one that is already present in Microsoft.NETCore.App.deps.json: the target rid.
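The two options can be compared in a small model. This is only a sketch under assumptions: `pick_host_rid` is a hypothetical stand-in for the host's selection logic, the `linux-x64` fallback list used for option 1 is assumed rather than taken from a real build, and `TARGET_RID` stands for the rid the non-portable build was made for.

```python
# Runtimes graph of the Fedora 37 source-built Microsoft.NETCore.App.deps.json.
SOURCE_BUILT_GRAPH = {
    "fedora.37-x64": [
        "fedora.37", "fedora-x64", "fedora", "linux-x64",
        "linux", "unix-x64", "unix", "any", "base",
    ],
}


def pick_host_rid(graph, os_release_rid, fallback_rid):
    """Use the /etc/os-release rid if the graph knows it, else the fallback rid."""
    return os_release_rid if os_release_rid in graph else fallback_rid


UNKNOWN_RID = "fedora.38-x64"  # e.g. after a distro upgrade

# Current behavior: the portable fallback rid is not a key in the graph.
broken = pick_host_rid(SOURCE_BUILT_GRAPH, UNKNOWN_RID, "linux-x64")
print(broken, broken in SOURCE_BUILT_GRAPH)

# Option 1: add the portable fallback rid to the deps.json graph.
# ASSUMPTION: this fallback list is illustrative.
graph_option_1 = dict(SOURCE_BUILT_GRAPH)
graph_option_1["linux-x64"] = ["linux", "unix-x64", "unix", "any", "base"]
rid_1 = pick_host_rid(graph_option_1, UNKNOWN_RID, "linux-x64")
print(rid_1, rid_1 in graph_option_1)

# Option 2: keep the graph as-is and fall back to the build's target rid.
TARGET_RID = "fedora.37-x64"
rid_2 = pick_host_rid(SOURCE_BUILT_GRAPH, UNKNOWN_RID, TARGET_RID)
print(rid_2, rid_2 in SOURCE_BUILT_GRAPH)
```

Option 2 needs no change to the generated deps.json, since the target rid is by construction the one entry a non-portable build always has.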