berry: [Bug] PnP API cannot be consumed if file located outside workspace

Describe the bug

require('pnpapi') fails when it is called from a file located outside the Yarn workspace. For example, IntelliJ consumes the PnP API this way (intellij-yarn-pnp-deps-tree-loader.js is located in the IDE installation folder), and after updating Yarn to 2.0.0-rc.22 the call fails. Unfortunately, this disables the IntelliJ Yarn 2 integration.

To Reproduce

  1. Create an empty package.json (just {}).
  2. Run yarn policies set-version berry; it will install Yarn 2.0.0-rc.22.
  3. Run yarn install.
  4. Create an api-client.js file with the following content:

     require('pnpapi');
     console.log('OK');

  5. Running node --require ./.pnp.js api-client.js outputs OK.
  6. Move api-client.js outside of the Yarn workspace with mv api-client.js ..
  7. Running node --require ./.pnp.js ../api-client.js fails with Error: Cannot find module 'pnpapi'.
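
A consumer that may run outside the workspace could guard the call instead of crashing. The following is a minimal sketch, not Yarn's or IntelliJ's actual code: the helper name tryLoadPnpApi is hypothetical, and it assumes the failure surfaces as a standard Node MODULE_NOT_FOUND error, as the message in the last step suggests.

```javascript
// Hypothetical helper: returns the PnP API if `pnpapi` can be resolved from
// this file, or null when Node reports that the module cannot be found.
function tryLoadPnpApi() {
  try {
    return require('pnpapi');
  } catch (error) {
    if (error && error.code === 'MODULE_NOT_FOUND') {
      return null;
    }
    throw error; // anything else is unexpected, so surface it
  }
}

const api = tryLoadPnpApi();
console.log(api === null ? 'pnpapi is not reachable from this file' : 'OK');
```

This only detects the problem; it does not make the API reachable from outside the workspace.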

Environment if relevant (please complete the following information):

  • OS: Linux
  • Node version 12.11.1
  • Yarn version 2.0.0-rc.22

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 22 (18 by maintainers)

Most upvoted comments

tldr: You need to go through createRequire, otherwise the require call is ambiguous.

I’ve thought some more about your question on require.resolve, and I now have a better answer which will also answer this other question.

First, consider that Yarn supports a mode where all projects on the disk load packages from the same global cache. Because of this, a problem appears when doing the following:

require('/home/segrey/WebstormProjects/untitled17/.yarn/cache/eslint-npm-6.8.0-d27045f313-1.zip/node_modules/eslint/lib/options');

Which dependency tree is this ESLint file part of? Since the cache may be shared, multiple projects on the disk may depend on this file, but we don’t know which one is the right one if we only have the path (and we need to know it in order to give ESLint access to its own dependencies). We could maybe default to the global PnP hook, but even then it might not be the right one ... and generally speaking, if something cannot be guaranteed to be true, we should assume it can’t be relied upon.

So a follow-up question is: how is Yarn able to make such require calls work from scripts located within a dependency tree? The answer is that it keeps track of the PnP API currently in use in each module, so when you call require it uses not only the path you pass as a parameter, but also the dependency tree of the script that calls require (as a bit of trivia, this information is available in module.pnpApiPath). With both pieces of information, we can disambiguate ESLint and be sure that we load the right version.

So one last question remains: what’s the difference between createRequire and require.resolve(..., {paths})? The answer is that you don’t actually use require.resolve alone … you typically use it followed by a require call, right? And as we’ve seen, the require calls use the context of the caller script in order to disambiguate the dependency, which means that you’ll always require ESLint as if it was part of the dependency tree of the caller (so in your case, the classic node_modules resolution). Which isn’t right.

By contrast, createRequire is different because it actually creates a new module with a new context. This new context will locate the right PnP API given an entry point, and because you’ll use the result of createRequire for both resolution and instantiation, you’ll load the following modules from the proper context.

To make things clearer, consider what happens if you run this code from a PnP project when the global cache is enabled:

const pathToMyESLint = require.resolve('eslint');
const pathToAnotherESLint = require.resolve('eslint', {
  paths: [pathToAnotherProject],
});

// Since we have a global cache, pathToMyESLint === pathToAnotherESLint. So
// what should happen when we do this?
const eslint = require(pathToAnotherESLint);

By contrast, if you use createRequire, then we keep all the information we need to disambiguate the calls:

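// createRequire is exported by Node's built-in 'module' package (Node 12.2+).
const {createRequire} = require('module');

const requireForOtherProject = createRequire(pathToAnotherProject + '/package.json');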

const pathToMyESLint = require.resolve('eslint');
const pathToAnotherESLint = requireForOtherProject.resolve('eslint');

// pathToMyESLint and pathToAnotherESLint are still equal, but this time we are
// able to load them through their own unique `require` contexts:
const myEslint = require(pathToMyESLint);
const anotherESlint = requireForOtherProject(pathToAnotherESLint);
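
The anchoring behaviour of createRequire can be tried without PnP at all. The sketch below is an illustration under stated assumptions: it builds two throwaway projects with a classic node_modules layout, and the names makeProject and dep are made up for the demo. With node_modules the two resolved paths differ, unlike in the global-cache case above, so this shows only that each require function resolves and loads from its own entry point.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const {createRequire} = require('module');

// Creates a throwaway project whose node_modules contains a package `dep`
// that exports `label`. Both the layout and the names are made up for this demo.
function makeProject(label) {
  const root = fs.mkdtempSync(path.join(os.tmpdir(), `project-${label}-`));
  const depDir = path.join(root, 'node_modules', 'dep');
  fs.mkdirSync(depDir, {recursive: true});
  fs.writeFileSync(path.join(root, 'package.json'), '{}');
  fs.writeFileSync(
    path.join(depDir, 'index.js'),
    `module.exports = ${JSON.stringify(label)};`,
  );
  return root;
}

const projectA = makeProject('a');
const projectB = makeProject('b');

// Each require function resolves as if it were called from that project's root.
const requireFromA = createRequire(path.join(projectA, 'package.json'));
const requireFromB = createRequire(path.join(projectB, 'package.json'));

// Resolution and loading both go through the same context.
const depFromA = requireFromA(requireFromA.resolve('dep'));
const depFromB = requireFromB(requireFromB.resolve('dep'));

console.log(depFromA, depFromB); // prints: a b
```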

I hope that makes sense - it’s fairly complex, so please feel free to ask me any question. I will try not to make my answers as long as this post 😅

Is it OK to run intellij-yarn-pnp-deps-tree-loader.js from IDE as node --require /path/to/.pnp.js intellij-yarn-pnp-deps-tree-loader.js? It works, but I’d like to ensure it’s a legitimate way to run a script in Yarn PnP environment.

Yep, we’ll keep supporting non-PnP accesses to the API. The .pnp.js will also be the correct way to load the runtime for at least the 2.x line, and probably longer (the only thing I can see that would make us change that is the builtin Node loaders, but that’s very much a WIP and it’ll take years before we seriously get there).
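
For a tool that launches such a script programmatically, the same command line can be sketched with Node's child_process module. The helper name runWithPreload is hypothetical, and the paths passed to it are placeholders; the only mechanism used is Node's standard --require preload flag.

```javascript
const {spawnSync} = require('child_process');

// Hypothetical helper: runs `scriptPath` in a child Node process with the
// given file preloaded through --require (for example a project's .pnp.js).
function runWithPreload(preloadPath, scriptPath, args = []) {
  const result = spawnSync(
    process.execPath,
    ['--require', preloadPath, scriptPath, ...args],
    {encoding: 'utf8'},
  );
  if (result.error) {
    throw result.error; // the child process could not be started
  }
  return {status: result.status, stdout: result.stdout, stderr: result.stderr};
}
```

A call such as runWithPreload('/path/to/.pnp.js', 'intellij-yarn-pnp-deps-tree-loader.js') would then mirror the command from the question.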

Is there a way to install an arbitrary Yarn 2 release (e.g. 2.0.0-rc18) using command line?

Never thought about it 🤔 I think you should be able to download any release via GitHub:

https://github.com/yarnpkg/berry/raw/%40yarnpkg/cli/<version>/packages/yarnpkg-cli/bin/yarn.js
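
Filling in the placeholder can be sketched as a small helper. The name yarnReleaseUrl is hypothetical, and the exact spelling of the version segment for a given release (for example 2.0.0-rc.18) is an assumption that should be checked against the repository's tags.

```javascript
// Hypothetical helper: fills the <version> placeholder of the URL pattern
// quoted above. The version is percent-encoded so it stays one path segment.
function yarnReleaseUrl(version) {
  return (
    'https://github.com/yarnpkg/berry/raw/%40yarnpkg/cli/' +
    encodeURIComponent(version) +
    '/packages/yarnpkg-cli/bin/yarn.js'
  );
}

console.log(yarnReleaseUrl('2.0.0-rc.18'));
```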

I’ll add support for this in yarn set version too 👍 I’ll also start publishing the releases on npm soon (by the end of the week, I’d say).

I think the regression occurred while I was working on the multi-tree improvement (#630). I think I’ll fix that by making require('pnpapi') from a non-PnP module return an object with only the new findApiPathFor method (which I still need to add to the documentation). Would that work for you? You could use it like this:

const api = require(`pnpapi`).findApiPathFor(process.cwd());