licensed: Difference between ignored and reviewed

Recently I got a question from my teammate:

What’s the difference between reviewed and ignored in .licensed.yml?

I found only the following:

# These dependencies are explicitly ignored.
ignored:
  bundler:
    - some-internal-gem

  bower:
    - some-internal-package

# These dependencies have been reviewed.
reviewed:
  bundler:
    - bcrypt-ruby

  bower:
  - classlist # public domain
  - octicons

For me it’s still not really clear when I should put a gem into ignored: and when into reviewed:. Can you guide me?

Probably this should be clarified in the docs or combined into a single category to avoid confusions and save time debating with colleagues.

About this issue

  • Original URL
  • State: closed
  • Created 5 years ago
  • Comments: 20 (20 by maintainers)

Commits related to this issue

Most upvoted comments

@sergey-alekseev @mlinksva

reviewed is meant for dependencies that need to be cached and checked, but do not have a license found that matches the allowed configured licenses.

ignored is meant for dependencies that shouldn’t be cached at all, and will not have cached metadata written to the repo.

Good question! I’ll update the docs when I have a chance, though I’d also welcome a PR 😄

@mlinksva @benbalter I’ve pared down https://github.com/jonabc/licensed-example/tree/master/vendor/gems/ruby/2.6.0/gems to only include the dependencies in the audit above.

I’m not expecting anything to be actionable for dependencies on the no license contents list, but for the others it’s not immediately clear which cases should be detected or not. If anyone gets a chance could you take a look and see if there is anything in there you’d expect that licensee would already detect, or would be reasonable to detect with additional work?

🙇

Makes sense, I haven’t thought about it. So Licensed can’t fetch those license files from local directories as they simply do not exist there. They can still be grabbed from repositories and filled in manually then when necessary.

I’ve updated https://github.com/jonabc/licensed-example/ to include the full content after bundle install --path vendor/gems . This is the “source of truth” that github/licensed uses to determine dependency data.

It’s mostly complete but won’t contain data on the ruby default gems like bundler and json

@sergey-alekseev I agree with you that the remote repositories have license files and/or licenses in the readme, but those files aren’t included in the distributed gem - you can check each gemspec spec.files property to see what is included.

The only special case here is bundler, which I’ve found to kind’ve do its own thing and may or may not be directly installed with ruby. For example my local bundler installation doesn’t have a README or LICENSE

➜  licensed-example git:(master) ✗ ls -l ~/.rbenv/versions/2.6.4/lib/ruby/gems/2.6.0/gems/bundler-1.17.2/
total 0
drwxr-xr-x  4 jonabc  staff  128 Oct  9 10:28 exe
➜  licensed-example git:(master) ✗ ls -l ~/.rbenv/versions/2.6.4/lib/ruby/2.6.0/bundler/
build_metadata.rb          deployment.rb              gem_helpers.rb             lockfile_generator.rb      ruby_dsl.rb                source.rb                  vendored_persistent.rb
capistrano.rb              deprecate.rb               gem_remote_fetcher.rb      lockfile_parser.rb         ruby_version.rb            source_list.rb             vendored_thor.rb
cli/                       dsl.rb                     gem_tasks.rb               match_platform.rb          rubygems_ext.rb            spec_set.rb                version.rb
cli.rb                     endpoint_specification.rb  gem_version_promoter.rb    mirror.rb                  rubygems_gem_installer.rb  ssl_certs/                 version_ranges.rb
compact_index_client/      env.rb                     gemdeps.rb                 plugin/                    rubygems_integration.rb    stub_specification.rb      vlad.rb
compact_index_client.rb    environment_preserver.rb   graph.rb                   plugin.rb                  runtime.rb                 templates/                 worker.rb
compatibility_guard.rb     errors.rb                  index.rb                   process_lock.rb            settings/                  ui/                        yaml_serializer.rb
constants.rb               feature_flag.rb            injector.rb                psyched_yaml.rb            settings.rb                ui.rb
current_ruby.rb            fetcher/                   inline.rb                  remote_specification.rb    setup.rb                   uri_credentials_filter.rb
definition.rb              fetcher.rb                 installer/                 resolver/                  shared_helpers.rb          vendor/
dep_proxy.rb               friendly_errors.rb         installer.rb               resolver.rb                similarity_detector.rb     vendored_fileutils.rb
dependency.rb              gem_helper.rb              lazy_specification.rb      retry.rb                   source/                    vendored_molinillo.rb

@sergey-alekseev thanks for your patience!

I just completed a quick audit of the example you provided. Here’s my findings for things without license contents, and those marked other

No license contents

This class of failures is what we’re trying to address in https://github.com/github/licensed/issues/189.

  • aws-eventstream, has license from package manager apache-2.0
  • aws-partitions, has license from package manager apache-2.0
  • aws-sdk-core, has license from package manager apache-2.0
  • aws-sdk-kms, has license from package manager apache-2.0
  • aws-sdk-s3, has license from pacakge manager apache-2.0
  • aws-sigv4, has license from package manager apache-2.0
  • bundler, has license from package manager mit
  • libv8, has license from package manager mit
  • paper_trail, has license from package manager mit
  • rails-settings-cached, has no license as part of released package at version 0.6.5 - https://github.com/huacnlee/rails-settings-cached/tree/v0.6.5. Upgrade to v2.0.0+ for license inclusion in package

“other” license classification

The list here shows each dependency that was classified as “other”, which files licensed found for each dependency, and the resulting classification from licensee for each file. In the case of other classifications for a file I also added a note on what I was able to tell on why a license wasn’t detected

The majority of the failures here were due to deviations from an official license template for various reasons. I think some of them might be actionable to fix in the license detection logic in licensee, but many are likely to continue to be painful.

It’s getting a little late in my day, but I’ll open an issue tomorrow over in licensee and link to this + the example repo and see if there are any thoughts on improvements.

  • activerecord
    • license - other. nonstandard additions to mit template, additional copyright
    • readme - mit
    • gemspec - mit
  • awesome_print
    • license - other. nonstandard additions to mit template from author contact info
    • readme - mit
    • gemspec - mit
  • bcrypt
    • license - other. multiple assertions
    • gemspec - mit
  • climate_control
    • license - mit
    • readme - other. does not reference license by title
    • gemspec - mit
  • concurrent-ruby
    • license - other. license includes markdown code block markers (```) and link to license
    • gemspec - mit
  • css_parser
    • license - other. nonstandard header
    • gemspec - mit
  • domain_name
    • license - other. multiple licenses (BSD-3, BSD-2, MPL-2)
    • readme - other. multiple licenses
    • gemspec - other. multiple licenses
  • exception_notification
    • license file - other. looks like it’s from inclusion of The MIT License (MIT) as part of the wordset because it’s after the copyright notice
    • readme - other. looks like copyright matching negates the rest of the assertion here maybe, i.e. dependency.readme_file.content_normalized == ""
    • gemspec - mit
  • htmlentities
    • license - other. content above copyright line causes detection mismatch
    • gemspec - mit
  • jquery-ui-rails
    • license - other. non standard mit license content jQuery UI as well as this gem are licensed under the MIT license (see jquery-ui/MIT-LICENSE.txt).
    • gemspec - mit
  • method_source
    • license - other. preamble above license causes detection mismatch
    • readme - mit
    • gemspec - mit
  • mime-types
    • license: other. nonstandard formatting of header, title, copyright causes misdetection
    • gemspec: mit
  • mime-types-data
    • license: other. nonstandard formatting of header, title, copyright causes misdetection
    • gemspec: mit
  • nokogiri
    • LICENSE.md - mit
    • LICENSE-DEPENDENCIES.md - other (multiple license assertions)
    • README - mit
    • gemspec - mit
  • paperclip
    • license - other. misunderstood formatting of header LICENSE above The MIT License
    • readme - mit
    • gemspec - mit
  • pathname-common_prefix
    • readme - other. ruby license for majority, gnu lgps for setup.rb
  • pg
    • license - other. bsd-3 or ruby license
    • gemspec - bsd-3
  • puma
    • license - other. misunderstood formatting of header / copyright
    • readme - bsd-3
    • gemspec - bsd-3
  • rack
    • license - other. missing OR COPYRIGHT HOLDERS from license text is the only difference I can see
    • gemspec - mit
  • rails-settings-cached
  • rubyzip
    • readme - other. ruby license
  • terrapin
    • license - other. misunderstood header format
    • readme - other. doesn’t reference the name of the license used
    • gemspec - mit
  • unf
    • license - bsd-2
    • readme - other. written as 2 clause bsd, where one of bsd 2-clause or bsd-2-clause was expected
  • websocket-driver
    • license - other. uses short form apache-2.0 reference. Should be detected?
    • gemspec - apache-2.0
  • websocket-extensions
    • license - other. uses short form apache-2.0 reference. Should be detected?
    • gemspec - apache-2.0

I believe my previous assumption was wrong if I understand your comment correctly. Then I’m not sure how to make licensed status green in this case. Probably with sources: http_accept_language.gemspec as you outlined. Can you please rephrase and clarify, so I can understand better what you mean?

That sounds like the best option for now as the most straightforward fix. Hopefully as we figure out https://github.com/github/licensed/issues/189, much of this will just fix itself automatically. If you have any specific preference beyond “just make it work please”, please do to comment on the issue 🙇 😄

Some examples added to https://github.com/sergey-alekseev/licensed-example

I forked your example 👍 I don’t have enough time to dig in at the moment but I’d love to look through this and see exactly why each other classification occurred. It’d be great if we could better support some of these scenarios

Going to answer as many of these as I can 😄

So if I understand correctly your proposal is to change project/.licenses/bundler/http_accept_language.dep.yml to the following:

Any content licensee detects from a README is already included in cached output, so I don’t think it’s that.

What would change in this example is that we’d be showing the license specification picked up from http_accept_language.gempsec. Still working through the best way to show that, but as a possible example

name: http_accept_language
version: 2.1.1
type: bundler
summary: Find out which locale the user preferes by reading the languages they specified
  in their browser
homepage: https://github.com/iain/http_accept_language
license: mit
licenses:
- sources: http_accept_language.gemspec
  text: s.license     = "MIT"
notices: []

If my understanding is correct, this is a slippery slope for the reasons outline by licensee. If others with better understanding of OSS licensing requirements give a 👍 that representing this content as a license is ok I’m happy to make that change.

For my project I had to change 36 .licenses/bundler/*.dep.yml files manually our of 171 dependencies (21%). 23 times I changed only license: other, 11 times only licenses: [] and 2 times both

Yeah that’s painful. Some feedback around the details of why you had so many adjustments needed would be great

  • license: other: Is the license text far enough off from the standard template for the license that a false negative is found? Are multiple licenses found? Something else?
  • licenses: []: Do all 11 of these not have any LICENSE type file and nothing in the README?

That said, licensed won’t overwrite the license field on dependency version updates if the (normalized) license contents are determined to be the same between the cached and new version. In our experience there is high likelihood that nothing changes re: license contents between versions of a dependency, so a majority of the pain in clarifying licenses is only experienced once.

However one interesting thing I noticed is Licensee fetching license well from a git source, but not locally

That’s a better question to ask at Licensee directly, though in my experience I’ve found this has to do with gem not installing all the files that Licensee uses from the repo as part of the package.

Does this mean that even though an external dependency has e.g. the MIT license but doesn’t have license text you can’t use it? At least licensed status will not succeed. I’d agree with this comment, but you know licensing requirements, so can you please clarify?

This is the part where I get to say I’m not a lawyer and don’t personally have an answer on what constitutes a binding agreement when using an OSS license. I don’t know @grosser personally and couldn’t tell you whether his comment is based in legal knowledge or personal opinion.

As I mentioned before, licensee does bring this up in case that helps?

I did a quick search to see whether you had any comments on that during the development, but looks like the requirement was added during the initial release of v1.0.0 in the first commit.

The requirement is older than my involvement in the product 😆 I asked awhile ago whether reviewed dependencies should still surface this error, and the answer was to keep the error in place. I don’t recall whether the question+answer was pre-v1 release or not, but I couldn’t find anything in the issue/PR repo history.

Aside: thank you @sergey-alekseev for making PRs (seen from your examples, also thanks for providing concrete examples!) to improve licensing documentation from projects where it is missing, silently saving time for developers using (as they should) license tooling (licensed or numerous others) who might otherwise need to ask for review of innocuous dependencies.

@sergey-alekseev what you described is intentional, though I’m not sure what the recommended remediation here is. @mlinksva any thoughts?

The goal is to have cached metadata for all external dependencies that are used by the shipped product. The ignored flag is to say that a dependency that’s being found isn’t shipped as part of the public product, or it’s something built by the project owner and isn’t an external dependency. An external dependency that’s found but doesn’t have license text shouldn’t be set as ignored

An interesting thing though. Chances are an external dependency doesn’t have a license info (examples: https://github.com/iain/http_accept_language/issues/72, https://github.com/grosser/i18n_data/pull/47, https://github.com/grosser/sort_alphabetical/pull/23). If added to reviewed then licensed status fails with “missing license text”. Hence they should be added to ignored. So I’m not sure whether “but do not have a license found that matches the allowed configured licenses.” suits well for reviewed 🤔