gridsome: How to debug/improve slow build performance

Description

I am building an e-commerce site with about 235,000 products and the build is taking 86 minutes. Since the FAQ says:

Gridsome can generate thousands of pages in seconds so you can build pretty large sites without any problems.

I’m wondering if the slow part is loading a large amount of data into the GraphQL data layer? In gridsome.server.js I have:

module.exports = function (api) {
  api.loadSource(({ addCollection }) => {
    // Loads the entire JSON file into memory at once.
    const Product = require('./src/data/products.json');

    const collection = addCollection({
      typeName: 'Product'
    })

    // Add one node per product row.
    for (const product of Product.rows) {
      collection.addNode(product);
    }
  })
}

And for my template path in gridsome.config.js I have:

  templates: {
    Product: '/:id/'
  },

And this is my output after running yarn build:

$ yarn build
yarn run v1.22.4
$ gridsome build
Gridsome v0.7.14

Initializing plugins...
Load sources - 154.96s
Create GraphQL schema - 2.2s
Create pages and templates - 28.2s
Generate temporary code - 0.07s
Bootstrap finish - 186.32s
Compile assets - 7.25s
Execute GraphQL (244162 queries) - 1643.12s
Write out page data (244162 files) - 1316.14s
Render HTML (244162 files) - 2008.86s
Process files (0 files) - 0s
Process images (9 images) - 10.33s


  Done in 5176.47s

Done in 5177.80s.

Is there a way to see what is taking so long and if there’s a way to improve it? Is Gridsome able to build sites this large efficiently?

Is there some way to take advantage of multiple cores?

Note: I actually have more data coming (category pages, etc), so it would be nice if I could figure this out 😃

About this issue

  • State: open
  • Created 4 years ago
  • Comments: 16 (10 by maintainers)

Most upvoted comments

@Kulcanhez The path argument was slow because it used a regex to find the node (to match with and without trailing slash). But a new version was just published (0.7.17) which should make path just as fast as id.

I think the id argument should be used in templates and other places where you already have the $id variable. Use path when you need to get a node but don’t know its id. Note that path is usually only available if the collection has a template, so v0.8 will add “filters” to single node fields, to query a node by slug, for example, or by any other unique field.

@srchulo I did some more tests and it turns out that querying a node by $id is much faster than $path. This query runs about 200k times in 5.5s instead of 934s on my computer.

query Product ($id: ID!) {
  product(id: $id) {
    title
    description
  }
}
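
To illustrate why a regex-based path lookup can be so much slower than an id lookup, here is a minimal sketch. It does not use Gridsome's actual store: the `nodes` array and the `findByPath` and `findById` helpers are hypothetical. It only assumes what is described above, namely that path matching uses a regex to accept the path with and without a trailing slash.

```javascript
// Hypothetical in-memory nodes; this is NOT Gridsome's real data store.
const nodes = [];
for (let i = 0; i < 1000; i++) {
  nodes.push({ id: String(i), path: `/${i}/` });
}

// Escape characters that have a special meaning inside a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Path lookup: build a regex that accepts an optional trailing slash,
// then scan the nodes until one matches (linear in the number of nodes).
function findByPath(path) {
  const normalized = path.replace(/\/+$/, '');
  const pattern = new RegExp(`^${escapeRegExp(normalized)}/?$`);
  return nodes.find(node => pattern.test(node.path));
}

// Id lookup: a single keyed lookup, independent of the number of nodes.
const nodesById = new Map(nodes.map(node => [node.id, node]));
function findById(id) {
  return nodesById.get(id);
}
```

With about 244k queries, a scan per query multiplies out quickly, while the keyed lookup stays roughly constant per query.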

For anyone who wants to look into this issue: https://github.com/noxify/gridsome-slow-performance

To me, it seems that the problem is located here: https://github.com/gridsome/gridsome/blob/master/gridsome/lib/app/build/executeQueries.js

@srchulo You can disable indices this way. They must be disabled before adding nodes. I don’t think it’s necessary to enable them again, since the index should be built automatically when needed.

const collection = addCollection('Product')
collection._collection.configureOptions({
  adaptiveBinaryIndices: false
})
api._app.store.nodeIndex.index.configureOptions({
  adaptiveBinaryIndices: false
})
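
Put together with the loadSource code from the question, the ordering could look roughly like the sketch below. `registerProducts` and its `rows` parameter are hypothetical names used for illustration. The `_collection` and `_app` properties are the internal fields from the snippet above, so they may change between Gridsome versions.

```javascript
// Sketch only: in gridsome.server.js this would be something like
//   module.exports = api => registerProducts(api, require('./src/data/products.json').rows)
function registerProducts(api, rows) {
  api.loadSource(({ addCollection }) => {
    const collection = addCollection('Product');

    // Disable the adaptive indices BEFORE any node is added.
    collection._collection.configureOptions({
      adaptiveBinaryIndices: false
    });
    api._app.store.nodeIndex.index.configureOptions({
      adaptiveBinaryIndices: false
    });

    // Only now add the nodes.
    for (const product of rows) {
      collection.addNode(product);
    }
  });
}
```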

Is it possible to make any more steps parallel?

The reason only those two steps run in workers is that they can run without access to the app instance. The node processes you see in Gatsby might be the image processor, which starts right after the bootstrap phase has finished. We could try the same in Gridsome sometime and see how it works 😃
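
As a rough illustration of what "running in workers" implies for work that does not need the app instance, the sketch below splits a list of pages into one chunk per core. This is not Gridsome's worker implementation. `splitIntoChunks` and the `pages` array are hypothetical, and each chunk would still have to be handed to a worker by some pool.

```javascript
const os = require('os');

// Split `items` into at most `workerCount` contiguous chunks of near-equal size.
function splitIntoChunks(items, workerCount) {
  const count = Math.max(1, Math.min(workerCount, items.length));
  const size = Math.ceil(items.length / count);
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Hypothetical page list; one chunk per available core.
const pages = Array.from({ length: 10 }, (_, i) => `/${i}/`);
const chunks = splitIntoChunks(pages, os.cpus().length);
console.log(`${pages.length} pages split into ${chunks.length} chunk(s)`);
```

This only works for steps whose input can be serialized and sent to another process, which matches the constraint described above.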

Would there be another way to do this without the large file in memory?

I’m not sure if the file size really was the issue. But I’ll do some more debugging to find out why things got slower.
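
One way to check whether reading the file or adding the nodes dominates the "Load sources" step is to time the two parts separately. The sketch below uses Node's built-in perf_hooks module. `timeStep` and `fakeCollection` are hypothetical stand-ins so the snippet runs on its own. In a real gridsome.server.js the two callbacks would wrap the `require(...)` call and the `addNode` loop.

```javascript
const { performance } = require('perf_hooks');

// Run `fn`, log how long it took, and return both the result and the duration.
function timeStep(label, fn) {
  const start = performance.now();
  const result = fn();
  const ms = performance.now() - start;
  console.log(`${label} - ${(ms / 1000).toFixed(2)}s`);
  return { result, ms };
}

// Stand-in for the collection returned by addCollection (hypothetical mock).
const fakeCollection = {
  nodes: [],
  addNode(node) { this.nodes.push(node); }
};

// Stand-in for require('./src/data/products.json').
const raw = JSON.stringify({ rows: [{ id: '1' }, { id: '2' }, { id: '3' }] });
const products = timeStep('Parse JSON', () => JSON.parse(raw)).result;

// Time the node insertion separately from the parsing.
const addTiming = timeStep('Add nodes', () => {
  for (const product of products.rows) {
    fakeCollection.addNode(product);
  }
});
```

If "Add nodes" accounts for most of the time, the file size itself is probably not the bottleneck.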