sequelize: High CPU usage on models associations

What is actually happening?

Our system uses multiple databases, and our connection pools are by user. For each user loggin all the models are created and associated using the sqlizr plugin approach. Our system has 100+ models and the above situation is blocking the cpu. All others requests stay waiting for this process finish.

cpu-block

What do you expect to happen?

We looking for a way to do this associations on login without blocking the CPU each time.

Is there another way to do this models associations? How can we improve it?

We’ve tried to cache the instances of model and reuses then, unsuccessfully. We’ve tried to rewrite and improve the sqlizr plugin, but does not solved our problem.

Please, see the snippet below of our implementation of the approach used by sqlizr.

Object.keys(models).forEach(function (modelName) {

	db[modelName] = models[modelName].call(sequelize, sequelize, Sequelize);
});

// create associations
Object.keys(db).forEach(function (modelName) {

	if ('associate' in db[modelName]) {

		db[modelName].associate(db);
	}
});

Which dialect am I using?

  • PostgreSQL

Which sequelize version am I using?

  • sequelize 3.20.0 but i’ve tested the 4.26.0
  • sequelize-cli ^2.3.1

About this issue

  • Original URL
  • State: open
  • Created 7 years ago
  • Comments: 15 (5 by maintainers)

Most upvoted comments

Thanks for listening, @sushantdhiman! We will try to do a performance analysis and arrange a test case as soon as possible. ✌️

Hi guys,

This problem was very worrying for us, so we kind found a solution (or almost), and I want to share with you. It’s so hard to do all the associations on login for each user. It take a long time and blocks the CPU on nodejs, so we try to do the associations on demand.

First we change all the pure include statement for a function called getInclude(). This method receive some informations about association, then it checks if the association was already made, if not, make it. After returns the ìnclude` object to execute de query. This way, on login event, only some associations are executed, and the rest as needed.

briefly, this method checks if the association was already made.

getAssociationByAlias(model, alias) {

	if (model.associations[alias]) {
		// if the association was already made, returns
		return model.associations[alias];
	}

	// Search on associations map if the current model can associate with the target model
	// model.getAssociations() returns a list of possible associations defined on the model it self
	const modelAssociation = model.getAssociations().find(association => association.options && association.options.as === alias);

	// If can, associate it and returns
	if (modelAssociation) {

		model[modelAssociation.type](model.sequelize.models[modelAssociation.model], modelAssociation.options);

		return model.associations[modelAssociation.options.as];
	}

	console.warn(`Error: ${model.name} hasn't an association with alias ${alias}!`);

	return null;
}

this method creates the include object.

getIncludeFrom(model, includeModel, required, options) {

	var alias;
	// Execute the verification of association, if not exists, creates
	const association = model.checkAssociation(includeModel, options.as);

	if (association) {

		alias = association.as;
	}

	let include = {
		model: this._sequelize.models[includeModel],
		as: alias,
		required,
		attributes: options.attributes,
		where: options.where || {},
		include: options.include || [],
	};

	return include;
}

In all queries, the include object was replaced by something like this code below

var findOptions = {
	include: [
		this.getInclude('VehicleModel', false, {
			as: 'model',
			attributes: ['id', 'name']
		})
	],
	where: {}
};

Was so hard for us to do all this changes on our system, and I confess it was a desperate measure. But works! Now the login is too much fast and the CPU is free. I know our situation is very singular, but if you guys are thinking in improvements on Sequelize, this is one of then. 😃

Glanced at sqlizr code seems ok to me.

Our system uses multiple databases, and our connection pools are by user. For each user loggin all the models are created and associated using the sqlizr plugin approach.

I dont know why you guys opted for this approach (performance or scalablility)? But it seems wrong to me, you dont want to open a connection per user OR load models per request.

Here is what you can do

  • Refactor and load all models on application startup and not per request.
  • Ditch your own pool if it solely used for performance/scalablility and not actually solving a unique domain problem

To maintain multiple db reference, if they are different just use different repo per db instance then use them in your application.

Like https://github.com/sequelize/express-example/blob/master/models/index.js

import db1 from 'db1'; 
import db2 from 'db2';

If they all are same database just use read replication

Finally it seems a userland problem not related to Sequelize. You should have someone thoroughly evaluate your setup and got with some simple approach

Topological sorting is only done by 3 methods: sequelize.sync, sequelize.truncate and sequelize.drop. Defining an association does not trigger toposorting

It sorts models based on which foreign key references which model, to be able to sync/truncate/drop them in the right order

If you’re not using these 3 methods, no toposorting is happening as it’s not computed until you call them