nx: Nx is very slow in a large monorepo (>500k files) due to globbing of `**/project.json`
Current Behavior
Reopening https://github.com/nrwl/nx/issues/9660
I’m trying to integrate nx in an existing large monorepo (>500k files), but it only has ~25 workspaces.
When nx initializes, it builds the list of workspaces by globbing both `**/project.json` and all the `workspaces` entries from package.json.
The `**/project.json` pattern forces the glob to crawl every file, which takes 5+ seconds in my case, and that cost is paid in the init phase of every command.
Dropping **/project.json from the glob makes the glob operation instantaneous.
A fix would be to have something similar to `getGlobPatternsFromPackageManagerWorkspaces` that reads only the package path entries from the package.json `workspaces` field and appends `project.json` to each of them.
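To make the proposal concrete, here is a minimal sketch of what such a helper could look like. The names `projectJsonPatternsFromWorkspaces` and `readProjectJsonPatterns` are hypothetical and are not part of Nx. It assumes `workspaces` is either an array of paths or an object with a `packages` array.

```typescript
import { readFileSync } from "node:fs";
import { join } from "node:path";

// Shape of the package.json "workspaces" field assumed by this sketch:
// either a plain array or an object with a "packages" array.
type WorkspacesField = string[] | { packages?: string[] };

// Hypothetical helper (not an Nx API): turn each workspace entry into a
// narrow "<entry>/project.json" pattern instead of a repo-wide "**/project.json".
export function projectJsonPatternsFromWorkspaces(
  workspaces: WorkspacesField | undefined
): string[] {
  const entries = Array.isArray(workspaces)
    ? workspaces
    : workspaces?.packages ?? [];
  return entries.map((entry) => {
    // Keep a leading "!" (negated entry) in front of the rewritten pattern.
    const negated = entry.startsWith("!");
    const base = (negated ? entry.slice(1) : entry).replace(/\/+$/, "");
    return `${negated ? "!" : ""}${base}/project.json`;
  });
}

// Hypothetical wrapper: read the root package.json and derive the patterns.
export function readProjectJsonPatterns(root: string): string[] {
  const pkg = JSON.parse(readFileSync(join(root, "package.json"), "utf8"));
  return projectJsonPatternsFromWorkspaces(pkg.workspaces);
}
```

With `workspaces: ["packages/*"]`, the glob would only need to look at `packages/*/project.json`, so it never descends into the rest of the repository. The trade-off is that projects living outside the listed workspace paths would not be discovered this way.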
Expected Behavior
To have a faster nx boot time.
Github Repo
No response
Steps to Reproduce
Any repo with hundreds of thousands of files should show the slowness.
Nx Report
> NX Report complete - copy this into the issue template
Node : 16.18.0
OS : darwin arm64
yarn : 3.2.4
nx : 15.2.4
@nrwl/angular : Not Found
@nrwl/cypress : Not Found
@nrwl/detox : Not Found
@nrwl/devkit : 14.0.0
@nrwl/esbuild : Not Found
@nrwl/eslint-plugin-nx : Not Found
@nrwl/expo : Not Found
@nrwl/express : Not Found
@nrwl/jest : 14.0.0
@nrwl/js : Not Found
@nrwl/linter : 14.0.0
@nrwl/nest : Not Found
@nrwl/next : Not Found
@nrwl/node : Not Found
@nrwl/nx-cloud : Not Found
@nrwl/nx-plugin : Not Found
@nrwl/react : Not Found
@nrwl/react-native : Not Found
@nrwl/rollup : Not Found
@nrwl/schematics : Not Found
@nrwl/storybook : Not Found
@nrwl/web : Not Found
@nrwl/webpack : Not Found
@nrwl/workspace : 14.0.0
typescript : 4.8.4
---------------------------------------
Local workspace plugins:
---------------------------------------
Community plugins:
About this issue
- State: closed
- Created 2 years ago
- Reactions: 6
- Comments: 21 (2 by maintainers)
I’m working on setting up a JS monorepo within a very large Ruby monolith, and running into this same issue. The `**/project.json` causes all of the directories to be crawled, while we only care about a small subset. I’ve tried leveraging `.nxignore` to speed it up, but it doesn’t seem to help. I think the problem lies in this section of code: https://github.com/nrwl/nx/blob/73bc2e1c915fac40e29db930297121646362733b/packages/nx/src/config/workspaces.ts#L630-L646
Note that the `globSync` call is made without taking `.nxignore` into consideration. The `ig` value is passed into `deduplicateProjectFiles`, but not until after we have already globbed the file system without considering `.nxignore`. Would it be possible to parse the `.nxignore` and add the values to `ALWAYS_IGNORE` before calling `globSync`? Something like this:
@vdumitraskovic I created this patch (yarn patch or patch-package):
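Neither the snippet nor the patch referred to in these comments is reproduced here. As an illustration only of the `.nxignore` idea described above, the following sketch turns ignore-file lines into glob-style ignore patterns and merges them with a default list. The function names are hypothetical, the `ALWAYS_IGNORE` values are stand-ins rather than Nx's real list, and the translation from ignore rules is deliberately simplified (negated `!` rules are skipped).

```typescript
// Stand-in for Nx's internal default ignore list; these values are assumptions.
const ALWAYS_IGNORE: string[] = ["node_modules", "**/node_modules", "dist"];

// Hypothetical helper: convert .nxignore content into glob ignore patterns.
// Simplified on purpose: comments, blank lines, and negations are skipped.
export function ignoreGlobsFromNxIgnore(content: string): string[] {
  const globs: string[] = [];
  for (const rawLine of content.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#") || line.startsWith("!")) continue;
    // A rule with a leading or inner slash is treated as relative to the root;
    // a bare name is treated as matching at any depth.
    const anchored =
      line.startsWith("/") || line.replace(/\/+$/, "").includes("/");
    const body = line.replace(/^\/+/, "").replace(/\/+$/, "");
    if (body === "") continue;
    const prefix = anchored ? "" : "**/";
    // Ignore the entry itself and everything underneath it.
    globs.push(`${prefix}${body}`, `${prefix}${body}/**`);
  }
  return globs;
}

// Merge the defaults with the .nxignore-derived patterns, dropping duplicates.
// The result is what would be handed to the glob call's ignore option.
export function buildIgnoreList(nxIgnoreContent: string): string[] {
  return [
    ...new Set([...ALWAYS_IGNORE, ...ignoreGlobsFromNxIgnore(nxIgnoreContent)]),
  ];
}
```

The point of building this list before globbing is that ignored directories can be skipped during the crawl, rather than being walked in full and filtered out afterwards.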
This is still an issue.
Hey, sorry for the lack of answers on this one. It’s certainly not stale, and the globbing performance is something that we are aware of. Generally, tuning `.nxignore` can help out quite a bit, but there are some similar issues whose fixes would probably work for this one too.
For instance, #13843 would be solved by a similar solution.
We don’t want to adopt the package.json `workspaces` field in particular because that ties Nx down to JS and requires a root package.json. Additionally, the current setup allows you to have some projects managed by npm workspaces and some only by Nx. I could see a world where we add a similar field into nx.json, but I’m not sure exactly what that would look like. I’ll try to get with @FrozenPandaz and see what his thinking is; we probably need some input from @vsavkin on this one too.