cargo: Tracking issue for Deduplicate Cargo workspace information RFC

This is a tracking issue for the RFC: Deduplicate Cargo workspace information (https://github.com/rust-lang/rfcs/pull/2906).

Documentation:

Steps:

Changes from RFC

  • Not locking workspace dependencies, see alexchrichton’s comment
  • Not inserting a version into path dependencies when publishing, see epage’s comment
  • Changed the table package fields inherit from, from workspace to workspace.package, see epage’s comment
  • Fields
    • Explicitly not supporting resolver, see epage’s comment
    • Supporting edition which the RFC showed inconsistent status for
    • Supporting rust-version which is newer than the RFC
    • Supporting include and exclude which the RFC couldn’t, see epage’s comment

Unresolved questions:

  • Do we stablize inheriting from workspace.package and workspace.dependencies or do we switch which tables we use? see epage’s comment
  • Preferred documentation style: RFC uses field = { workspace = true } but TOML allows field.workspace = true
  • Is the performance good enough?

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Reactions: 93
  • Comments: 47 (37 by maintainers)

Commits related to this issue

Most upvoted comments

This feature would be incredible 😄

If you then change your repository, you then have one place to update.

I see this less about reducing boiler plate and more about having one place to edit,. so long as the boilerplate isn’t too high.

In particular, I manage a good number of crates. They all follow the same pattern so whenever I improve the pattern, I update the first alphabetically. Every once in a while, I will then diff that first repo against all of the other repos, updating them. There are many times when I forget there are other crates in the repo and don’t update them. This will reduce the chance of that happening.

We can always add change to implicit in a new edition if we find that it is too much.

Team member @epage has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn’t been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

Ok I’ve got a moment to write up some instructions, I shall do so! I think this’ll be posted on TWiR soon to see if others are interested, so I’m hoping that I can help guide someone interested to implement this.

To get started, I’d probably break this down based on the reference level explanation:

Update [workspace] parsing

First you’ll probably want to implement [workspace] parsing updates.

That’ll be done by updating this structure with all the relevant fields. You’ll need to take care when handling the dependencies, being careful to share code with existing dependency parsing.

Once the fields are parsed you’ll want to carry them over to WorkspaceRootConfig which contains the format of each field that is persisted throughout the lifetime of Cargo itself. This’ll be where dependencies are elaborated (again, sharing code from before) into a Dependency structure for example.

At this point you should be able to write some tests. All our tests are in tests/testsuite/*.rs and you’ll probably want to make a new test suite at something like tests/testsuite/deduplicate_workspace.rs and add mod deduplicate_workspace; to tests/testsuite/main.rs. You can then run tests with cargo test deduplicate_workspace.

Placeholder { workspace = true }

Next you can tackle workspace = true directives in Cargo.toml. The best way to implement this is probably going to be making a generic thing like:

pub enum MaybeWorkspace<T> {
    Defined(T),
    Workspace { workspace: bool },
}

If possible, the best way to handle this will be to not persist this MaybeWorkspace type all throughout Cargo. Ideally a Manifest in Cargo would retain the actual value for each manifest. I’m not sure whether this is available at this time, but in theory you’ll want to “elaborate” a manifest with a WorkspaceRootConfig on-hand where you can resolve any MaybeWorkspace<T> with the &WorkspaceRootConfig next to it.

New dependency directives

To implement new dependency directives you’ll update this structure along with this method. Again ideally there will be a new &WorkspaceRootConfig argument which allows resolving all workspace = true dependencies by basically clone-ing the Dependency from the workspace.

Inferring version directives

For inferring version on path dependencies that will need to happen on the packaging phase of Cargo. First you’ll want to update where the error is generated today. Next you’ll want to update to_registry_toml which will update the manifest (or probably a clone of the current manifest) with version numbers.


And I think that should at least be enough to get almost all the way there! I’m sure there’s things that will likely come up in review and questions a long the way. The hardest part will probably be getting WorkspaceRootConfig present while parsing other manifests, that may require some refactoring in Cargo (or maybe there’s a better way I’m not seeing!)

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

Not all fields are as trivially updatable.

I’ve started work on this.

Thank you for the super detailed writeup, @alexcrichton . You can see my progress here, but for convenience I’ll keep this list updated:

EDIT: The PR is open, so please check there.

Aiming to complete this list relatively soon - hopefully ~tomorrow~ in the coming week.

A lot of the existing documentation for cargo prefers a = { b = c} over a.b = c. For instance, see this which specifies git deps as regex = { git = "whatev" }, or this part of the reference which specifies optional dependencies as gif = { version = "0.11.1", optional = true }

The way this is implemented in cargo-add:

  • When there is only a workspace = true, use dep.workspace = true
  • When other keys are set, use dep = { workspace = true, ... } To mirror the dep = "version" short-form and long-form

The main motivation for the preferred style is more all of the other keys (e.g. version.workspace = true) since no other keys will be set, making this style shorter and easier to read (in my opinion). We then chose to keep this consistent when going into the dependencies and to be like setting dep = "version". Keep in mind that unlike git, workspace will have a lot fewer keys that can be set with it (since they can only be set in the workspace dependency and not in the package dependency) so its a lower probability of adding more keys (optional being most likely).

In fact, I don’t think I’ve seen the a.b = true syntax used anywhere in the current cargo doc (though I may have missed it!)

Looks like dotted keys were added in 2018 (no idea how much toml-rs lagged behind that) and there is momentum for adopting change.

Plus, I suspect it doesn’t scale well if you have a lot of items in a table, so we should be judicious in where its used.

That is true, but mass updates of these fields are already pretty trivial sed -i 's#https://repo#https://new_repo' **/Cargo.toml.

While some things can be trivially updated using command line tools, first-party support that is guaranteed to work is usually a better solution.


I see this less about reducing boiler plate and more about having one place to edit,. so long as the boilerplate isn’t too high.

@epage is correct that this is about having one place to edit more specifically deduplicating manifest information across workspace members. Having users write repository.workspace = true makes this an opt-in feature where inheriting by default would be (more than likely) a breaking change and be opt-out. Solving both being opt-out and it being a breaking change would require extending the current implementation to be similar to:

[workspace.package]
repository = { value = "https://repo", default = true, override = false }

If that is something that should be done, I think it should be a different change set or have an RFC to discuss it. This is mostly because it is out of the scope of this RFC.


As for [package.metadata], the RFC mentions it just above this link:

For now the metadata key is explicitly left out (due to complications around merging table values), but it can always be added in the future if necessary.

Yep, dependencies are very useful, I’m currently testing using this for internal workspace references, it’s nice to be able to collapse from the below down to stylish-core.workspace = true.

stylish-core.path = "../core"
stylish-core.version = "0.1.0"
stylish-core.default-features = false

One other set of fields I just noticed I have duplicated everywhere is:

[package.metadata.docs.rs]
all-features = true
targets = ["x86_64-unknown-linux-gnu"]
rustdoc-args = ["--cfg", "docsrs"]

Reskimming the RFC discussion I don’t see any mention of package.metadata, and something like package.metadata.docs.rs.workspace = true syntax wouldn’t be possible for it as it could already have meaning inside the inner tables.

Deviations from the RFC are summarized in https://github.com/rust-lang/cargo/issues/8415#issuecomment-1112618913.

Specifically you are looking for

Not locking workspace dependencies, see https://github.com/rust-lang/rfcs/pull/2906#issuecomment-1095098143

FWIW, I like field = { workspace = true } better because:

  • It is visually faster to read the dep name and also read that it’s inheriting from workspace.
  • Lot of existing usage uses {}

Where as I find field.workspace = true is little bit difficult to parse since we need to trim the .workspace when we are glancing at it.

Because of the impact of conflicts between workspace fields intended to control a workspace with those that are intended for inheritance (metadata, include, exclude), a small part of me wonders if we should tweak this to have the inherited fields in an explicit table, like a workspace.inherit. While I would appreciate include support, the overall value of handling these existing cases seems low compared to the user-simplicity of workspace looking like package.

@kornelski brought this up in the RFC discussion:

publish = { value = false, inherit = true } is odd, because it’s not inheriting from anything, it sets itself up as a template. In a way that gives inherit a double meaning of being an import, or being a template.

Maybe the explicit transfer of properties from the workspace to members could be marked differently? e.g. with a template section:

[workspace.package-defaults]
publish = false

A separate section would also avoid future collisions between workspace and package keys.

I didn’t really see any other acknowledgement of this in the thread.

I brought up having these in a workspace.inherit. @kornelski mentioned workspace.package-defaults. I think to be consistent with dependencies, we should do workspace.package.

If we are ok with an extra layer of nesting, we could consider workspace.default.package and workspace.default.dependencies. We’d have to bike shed on the word default to convey the right semantics (“default” implies implicit while we do explicit inheritance).

I’m going to propose we switch the implementation to workspace.package and mark it as an open issue for resolving. This will unblock workspace.package.exclude which will unblock workspace.package.include.

For projects that need to toggle it we figured that you could disable default features in [workspace.dependencies] and then enable the default feature in directives as necessary. This should still give the same benefit of a workspace-level definition and dependencies enabling the default feature would be just a little more wordy.

@alexcrichton Ah, I see. That’s not bad at all, and addresses my main concern about managing the version numbers centrally. Thanks for clarifying that!

The piece I was missing is the fact that specifying a non-additive, non-default feature set at the package level will no longer require that default-features = false be set locally (in the package-level Cargo.toml file), as long as that value has been set at the workspace level.

I now see that it is right there in the text of the rfc that I quoted, but I suppose the way I was reading it was too heavily shaped by current practice (where default-features = false and features = [...] always appear as a pair in the non-additive case) 😃

@salewski it’s always something we can tweak at a later date (or even now before stabilization), but in our discussions amongst the Cargo team awhile back the conclusion was that we couldn’t really think of a non-surprising way of dealing with default-features. For projects that need to toggle it we figured that you could disable default features in [workspace.dependencies] and then enable the default feature in directives as necessary. This should still give the same benefit of a workspace-level definition and dependencies enabling the default feature would be just a little more wordy.

Hello, I’ve been making steady progress on this and wanted to give an update:

The WorkspaceRootConfig is not available when we’re parsing the manifest, so we have to locate and parse it somewhere in do_read_manifest(). Currently the logic for finding the root and parsing its Config.toml lives in struct Workspace itself and makes use of workspace.packages to cache parsed tomls.

I see a few different ways to proceed:

  1. Require a Workspace in do_read_manifest somehow. Workspace::new() currently parses the entire workspace, so this could get complicated.
  2. Extract a new ParserCache out of Workspace. It would sit in front of the Packages cache, accept a manifest_path argument, and return a deserialized TomlManifest.
  3. Ignore the packages registry entirely. This would be easiest, but worst-case would require parsing every toml in the workspace every time you parse any manifest. I don’t think this is a good solution, but mention it for completeness’ sake.

I’m not blocked - am currently leaning towards approach 2 - but please let me know if I’m missing anything or if there’s a better way to proceed.

Edit: In case anyone’s seeing this now, I went with 2 and it seems to work 😃

One place where this is very helpful is for git dependencies. Git dependencies have a lot of “noise” (much longer than version numbers) and are hard to edit automatically. If you have a mismatch you end up with duplicate versions, because cargo doesn’t know how to unify versions from different rev commits. Having a single place to edit git deps would be extremely useful.

Starting FCP for stablizing workspace inheritance where workspace package’s could share metadata and dependency definitions.

We sought community feedback and I have seen projects actively using it. Feedback has been positive though not very loud. We had questions from users in this thread that seemed to be resolved. With nothing known in the way, I felt it was time to propose to merge.

@rfcbot fcp merge

Deviations from the original RFC

  • Not locking workspace dependencies, see alexchrichton’s comment
  • Not inserting a version into path dependencies when publishing, see epage’s comment
  • Changed the table package fields inherit from, from workspace to workspace.package, see epage’s comment, this PR, and this thread
  • cargo-add, documentation, and tests prefer field.workspace = true over field = { workspace = true }
  • Added cargo-add support see this comment and this PR
  • Fields
    • Explicitly not supporting resolver, see epage’s comment
    • Supporting edition which the RFC showed inconsistent status for
    • Supporting rust-version which is newer than the RFC
    • Supporting include and exclude which the RFC couldn’t, see epage’s comment

We had questions on whether workspace.package is the best fit but we weren’t able to come up with anything better. It does improve upon the RFC putting inheritable package keys under workspace. As the usage of this spreads, if we find out this is too confusing, we can always deprecate the current table and put it under a new table.

One issue I often have in workspaces is that features are unified between all crates when using cargo check by itself, but not when using cargo check -p whatever. So you end up recompiling half the dependency tree when you try to build anything less than “all crates at once”. https://docs.rs/cargo-hakari/0.9.13/cargo_hakari/about/ works around this by having all crates depend on a workspace-hack crate which unifies the features even when using -p whatever.

Would that be addressed by this RFC? Would workspace = true unify features between all dependencies, even if they are not direct dependencies? Or does that need other changes to cargo?

Are we ok with the default style (e.g. docs, cargo-add) being foo.workspace = true over foo = { workspace = true }?

A lot of the existing documentation for cargo prefers a = { b = c} over a.b = c. For instance, see this which specifies git deps as regex = { git = "whatev" }, or this part of the reference which specifies optional dependencies as gif = { version = "0.11.1", optional = true }. In fact, I don’t think I’ve seen the a.b = true syntax used anywhere in the current cargo doc (though I may have missed it!)

I don’t really mind either way, but I think having a unified style for subkeys would make it easier to understand what’s going on.

I am not sure if I am capable enough to work on this, but I would like to follow the development to see how things work. Does https://github.com/alexcrichton/toml-rs also needs to be updated?