distribution-spec: Allow registries to reject non-existent subjects in manifests

During conformance testing it was found that registries which require strong references between manifests and blobs fail conformance due to MUST language in the spec requiring acceptance of a manifest referencing a non-existent subject manifest. While subject fields may be described as a weak reference, listing and querying them at large scale may require a strong reference (such as a foreign key in a database), or a registry may simply be inheriting the data model used in 1.0, in which referenced objects (as viewed from the Merkle DAG) were always uploaded first.
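To illustrate the strong-reference case, here is a minimal sketch using SQLite. The table layout and the names `open_store` and `put_manifest` are hypothetical and not taken from any real registry; the point is only that a foreign key on the subject column makes the storage layer itself refuse a manifest whose subject has not been pushed yet.

```python
import sqlite3


def open_store():
    """Create an in-memory manifest table whose subject column is a foreign key."""
    db = sqlite3.connect(":memory:")
    # SQLite leaves foreign key enforcement off unless it is enabled per connection.
    db.execute("PRAGMA foreign_keys = ON")
    db.execute(
        "CREATE TABLE manifests ("
        " digest TEXT PRIMARY KEY,"
        " subject TEXT REFERENCES manifests(digest)"  # NULL means "no subject"
        ")"
    )
    return db


def put_manifest(db, digest, subject=None):
    """Return True if the row was stored, False if a constraint rejected it
    (for example because the subject digest is not in the table yet)."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute(
                "INSERT INTO manifests (digest, subject) VALUES (?, ?)",
                (digest, subject),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this layout, pushing a signature before its image fails, and the same push succeeds once the image manifest exists. That is the behavior the MUST language currently forbids.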

The arguments for the MUST language were to (1) support registries which may have reference-only repositories, storing content elsewhere, and (2) ensure referrers exist at manifest pull time, since there is no atomic way to upload referrers with manifests.

For (1) the burden will be on the client to handle this case on upload, as a registry is not required to support such repositories.

For (2) clients can retry or check for freshness when validation is a requirement, or clients can ensure tags are only updated once all content is available. Similar issues have occurred in the past with multi-platform images. If images were uploaded before all platforms were available, then clients could see a race condition between the platform they need being built and the image they pull having that platform available. The same solution could apply here: push by digest or to a temporary tag when pushing manifests that should not be considered fully available, and “tag” the manifest once complete (via upload of the manifest using a tag reference).
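The retry option for (2) could look roughly like the sketch below. It assumes the client has some `fetch` callable (hypothetical) that returns the body of the referrers API response for the image digest, which is an image index whose `manifests` descriptors carry an `artifactType`. The artifact type string used in the usage note is a made-up example.

```python
import json
import time


def has_referrer(referrers_body, artifact_type):
    """True if the referrers index lists a descriptor with the given artifactType."""
    index = json.loads(referrers_body)
    return any(
        d.get("artifactType") == artifact_type for d in index.get("manifests", [])
    )


def wait_for_referrer(fetch, artifact_type, attempts=3, delay=0.0):
    """Call fetch() up to `attempts` times until the wanted referrer shows up.

    `fetch` is a hypothetical callable returning the referrers response body.
    """
    for attempt in range(attempts):
        if has_referrer(fetch(), artifact_type):
            return True
        if attempt + 1 < attempts:
            time.sleep(delay)  # back off before asking again
    return False
```

A policy-enforcing client might call `wait_for_referrer(fetch, "application/vnd.example.signature.v1")` before admitting an image, and treat `False` as "not yet verifiable" rather than as a permanent failure.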

Changing the language from MUST to MAY makes the most sense here. Additionally, we can add guidance in the spec on how to perform manifest uploads more transactionally. In the future we could consider a more explicit way to create and manage transactions.

Related to https://github.com/opencontainers/distribution-spec/issues/340 https://github.com/opencontainers/distribution-spec/pull/341

About this issue

  • State: open
  • Created 10 months ago
  • Reactions: 3
  • Comments: 52 (33 by maintainers)

Most upvoted comments

I keep thinking about this problem, and I think we’re honestly really close to uploads that look transactional today. I’ll try to explain further, but hopefully this all makes sense. 🤞

For context, I’ll start with the main use case I think is really important for the issue at hand: being able to make sure a signature is available before users try to pull an image (so that policy can be enforced accurately). Without this, we’ll have frequent “brown outs” of image pulls as users race the push of the manifest vs the signature, and the frequency of users hitting those edge cases will increase with the number of users (cue my DOI maintainer hat where we’ve experienced exactly this with previous incarnations of our multi-architecture support and the angry users that generated).

Now my proposed (small, incremental) solution!

Uploads of all objects can occur entirely by digest, including manifests. The only way users can discover a manifest to pull (using only official OCI distribution-spec APIs) is by:

  1. querying a tag
  2. querying the tag listing API, then querying a tag from it
  3. already knowing the digest (in which case they very likely either have the object already, or looked up the digest via one of the previous means)

So if I push an object by digest, the chances of someone trying to pull it before I’m “ready” are very, very low. Thus, it’s only tagging that needs to be transactional, right? The only issue I see there is that “tagging” is not a direct action we can perform – it’s a side effect we can achieve by uploading a manifest by name instead of by digest (and manifests might not be small – possibly as much as 4194304 bytes, which is potentially a heavy upload just to update a pointer).

In other (shorter) words, I’m proposing a new (or updated) API endpoint for lightweight tagging of an existing uploaded manifest without having to upload the entire manifest contents again, and I believe this satisfies the need for a transactional API, hopefully in a way that’s easy for existing registries to implement.

In the signatures example, that means this could be our extended “transactional” flow:

  1. upload blobs (inc. config)
  2. upload image manifest (by digest)
  3. upload image signature (subject -> image manifest, by digest)
  4. update/add tag to point to new digest, signalling that the image is “ready” for use
    (but not having to re-upload potentially 4MiB of useless extra data to accomplish this flow)
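The flow above can be sketched as a request plan against the existing distribution-spec endpoints. Nothing here talks to a registry; `plan_push` and `digest_of` are hypothetical helper names, and `<upload-location>` stands in for the URL a registry returns in the `Location` header of the upload POST. Since the lightweight tagging endpoint is only a proposal, the last step still re-sends the full image manifest under the tag.

```python
import hashlib


def digest_of(data: bytes) -> str:
    """Content digest in the form used by the registry API."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def plan_push(repo, blobs, image_manifest, signature_manifest, tag):
    """Return (method, path, body_length) tuples in the order they would be sent."""
    steps = []
    # 1. Upload blobs (including the config): open an upload session, then PUT the bytes.
    for blob in blobs:
        steps.append(("POST", f"/v2/{repo}/blobs/uploads/", 0))
        steps.append(("PUT", f"<upload-location>?digest={digest_of(blob)}", len(blob)))
    # 2. Upload the image manifest by digest, so no tag points at it yet.
    steps.append(
        ("PUT", f"/v2/{repo}/manifests/{digest_of(image_manifest)}", len(image_manifest))
    )
    # 3. Upload the signature manifest (its subject names the image manifest), by digest.
    steps.append(
        ("PUT", f"/v2/{repo}/manifests/{digest_of(signature_manifest)}", len(signature_manifest))
    )
    # 4. "Tag" the image. Today this means sending the whole manifest again by tag.
    steps.append(("PUT", f"/v2/{repo}/manifests/{tag}", len(image_manifest)))
    return steps
```

The body length of the final step equals that of step 2, which is the redundant upload a dedicated tagging endpoint would avoid.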

While I would prefer, from a client perspective, to keep this MUST to assist in clients having the order option, there was never an intention to break 1.0 image registry storage systems with this MUST requirement. Loosening now seems the only appropriate course of action.

The direction depends on the perspective.

Images and the registry are designed around a Merkle tree. The reference artifact ends up being the top hash in the Merkle tree (even if it may not be the first hash clients request). Requiring a registry to accept only the top hash breaks the ability of a registry to verify the Merkle tree before accepting content. In terms of the “wrong” or “right” way, the reference design has good reasons for this direction. The “MUST” language breaks existing registry design and best practice though.

Any updates here?

data models we design follow use cases (vs) data models we design prescribe use cases.

In a world where we are building a greenfield application, this is true. However, after we have a working model, any new use cases fundamentally must take into consideration the legacy models and account for changes necessary to support the work.

References are a great concept. Sparse manifests are an interesting concept. But in a world where we have an existing data model, there are many ramifications on the registry side and a lot of undefined behaviors to address. Even with a MUST, we cannot pretend that it’ll be some perfect world of interoperability: there are several gray areas in the spec that can and will lead to different implementations that clients need to account for.

This proposal hinges on multiple prerequisites, each of which would need to be true:

  1. “Registries should be validating the content before allowing it to be pushed.” We’ve certainly allowed this in the past. But there is a request to better support sparse manifests so that a local copy of an image doesn’t require copying all of the platforms that are not used in a given environment (those running a local mirror for amd64/arm64 nodes don’t want to be forced to pull s390 and windows images). And we are seeing a push to support separate repositories for image metadata from the images themselves from both the notation and cosign communities. We also have the concept of external references to descriptors that are not pushed to a registry (previously used to support windows layers).
  2. “Removing a MUST makes the spec more flexible for various implementations.” That is a registry centered view, allowing multiple registry implementations. But the MAY language for registries becomes a MUST to client implementations. Effectively “clients that want portability MUST push an image manifest before any manifest that references it in the subject field.” There is no option we can pick that makes the spec more flexible, only a shift in who has more flexibility than the other.
  3. “Every descriptor in the manifest is automatically part of the DAG and a hard reference.” I have two concerns with this assumption.
    • First, not every descriptor is known in advance, and as a spec changes, content that was previously unknown and allowed suddenly becomes known and forbidden. That creates a lack of forward content portability that we should be careful about introducing.
    • I’m also not convinced that every descriptor should be part of the DAG. The subject descriptor was intentionally designed for a back reference API and not intended to be followed like other DAG content. If OCI wanted to create a block list manifest in the future, containing descriptors of known malicious content, this view would require that all of the malicious content is first pushed to a registry before the block list could be pushed. Instead of assuming every descriptor is part of the DAG, perhaps OCI should be providing better guidance when a descriptor is not part of that assumption.

I don’t believe we have a solution that finds the common ground between the two views here, and we’re unlikely to reach that point with more discussion. We’ve put the issue up for a vote to see if there was consensus, and the majority is leaning towards moving forward without a change to the spec. Given the community’s desire to get to a release, I’d suggest we either close this discussion or time box it to prevent it from continuing indefinitely.

storing signatures/attestations/sboms/etc in a repository separate from the image.

I must point out once again that changing a MUST to a MAY doesn’t forbid anything. The distribution-spec doesn’t disallow “sparse images” (index without all referenced content) today; it merely doesn’t mandate them.

This discussion seems to stem from the fact that the artifact reference points the “wrong” way. Digests must exist before the manifest because the manifest points at the digests. Following the same logic, artifacts point at the manifest, so the manifest should exist before the artifact.

There are a number of reasons behind the decision to have artifacts function the way they do. But by keeping the MUST here, it comes across like we want artifacts to act like dependencies of the manifest despite functionally existing the other way around. The increase in requests is a downside, but also a consequence of the initial design.

where the current spec prescribes a particular data-model that is breaking compared to the rest of the spec/object relationships.

data models we design follow use cases (vs) data models we design prescribe use cases.

Since we have had a lengthy discussion on this, had a vote that’s been open for several weeks, and the vote is leaning against this request, my suggestion is to close this issue and move on. I say that with a lot of hesitancy because I’d much rather find a solution where everyone meets in the middle. But in this case, it’s been very contentious because there is no middle option that we’ve been able to find.

+1 here, I want to point out again that this wasn’t a simple oversight or miss, it was a conscious decision to enable a common scenario that is in use today by real workloads - storing signatures/attestations/sboms/etc in a repository separate from the image.

This was captured as a core requirement in the earliest stages of this WG and the MUST was placed there for a reason, to support that scenario.

Looking over the suggestion, for myself this doesn’t offer any value over option 2, so my vote on the issue is unchanged. Concerns I have include:

  • This is a major breaking change to all of the existing 1.1-rc* implementations and would require clients and registries to do a rewrite, along with recreating the conformance tests, and returning to the community for feedback. It feels worse than the 999 change to me.
  • Registries could still reject the push of content with a subject pointing to a missing manifest, so the differing opinions on the DAG discussion are unresolved by this.
  • The OCI-Subject header wouldn’t be used and so every push of a manifest with a subject requires the full fallback processing, adding to client overhead and additional API round trips.
  • Registries with an eventually consistent backend cannot support conditional requests, so there’s no possibility of an eventually consistent referrers response to a referrers race if this is implemented on the client side, whereas that could be done with a server side solution.
  • Clients that work with a single artifactType were requesting the filter as an efficiency, and servers have the option of whether they want to support it. We had discussed whether this should be only a client side feature in the working group, and I don’t want to restart previously settled debates for something that a registry can opt out of.

Given that, I’m opposed to the proposal. I think it would have been worth considering during the working group, but this late in the release cycle, I feel it’s too disruptive to the community that has already written so much code both in the working group and now against the RC releases. A registry can decide to not implement the 1.1 spec, sticking with 1.0, where the subject field is not defined and not part of the DAG, and where clients push the fallback tag as an index. I think that gives an identical result to the proposal (content addressability, no separate API, no filtering, all client managed) while allowing clients to push content in either order.

Since we have had a lengthy discussion on this, had a vote that’s been open for several weeks, and the vote is leaning against this request, my suggestion is to close this issue and move on. I say that with a lot of hesitancy because I’d much rather find a solution where everyone meets in the middle. But in this case, it’s been very contentious because there is no middle option that we’ve been able to find.

My concern continues to be that a well behaved client, with cooperation of other well behaved clients, would not be able to ensure content is complete without performing a fully recursive query on every manifest. An upload could be interrupted at any point, triggering a future retry. When that happens with multi-platform images, clients know where they left off because of which manifests are already pushed, and can assume that child manifests and blobs have been pushed too. But with referrers, that assumption is no longer valid and every child manifest must also be checked for missing referrers on every single copy.
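A rough sketch of why the check becomes fully recursive: in the walk below, a manifest that is already present in the destination cannot be used to prune the traversal, because its referrers (or its children's referrers) may still be missing. The dictionaries stand in for registry queries and all names are hypothetical.

```python
def missing_content(root, children, referrers, present):
    """Return digests reachable from `root` that are absent from `present`.

    children:  digest -> list of child digests (from the manifest body)
    referrers: digest -> list of referrer digests (from the referrers API)
    present:   set of digests already in the destination
    """
    missing, seen, stack = [], set(), [root]
    while stack:
        digest = stack.pop()
        if digest in seen:
            continue
        seen.add(digest)
        if digest not in present:
            missing.append(digest)
        # No early exit for present manifests: with referrers in play,
        # "parent exists" no longer implies "everything below it exists".
        stack.extend(children.get(digest, []))
        stack.extend(referrers.get(digest, []))
    return sorted(missing)
```

For example, with an index and both platform manifests already copied but one platform's signature not yet pushed, the walk still has to visit every platform manifest to find the single missing referrer.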

@sudo-bmitch that is always going to be the case with referrers, as they were created in order to be able to add references at any time in the future, in order to support workflows such as signatures for later approvals, versus build time. This does make mirroring essentially very difficult without an additional API to give a feed of referrer changes, as there is no real definition of “complete” anyway.

Correct; how would you sign the tag with the existing data model today?