sequelize: MSSQL: Implicit conversion causes table scan -- Prepending N to strings precludes index use on varchar columns

What are you doing?

Querying a table and filtering on a varchar column. The issue is most easily illustrated using raw query code although it applies to both raw queries and model-generated ones.

create table names (
  name varchar(100) not null primary key clustered
)

sequelize.query('select * from names where name=:name', { replacements: { name: 'hello' } });

What do you expect to happen?

Sequelize to construct SQL without prepending N to :name

SQL sent to server:

select * from names where name='hello'

SQL Server performs an index seek using the index defined above.

What is actually happening?

N is prepended.

SQL sent to server:

select * from names where name=N'hello'

SQL Server performs a table scan after doing an implicit conversion on N'hello'.

Notes

I believe this problem was introduced in response to issue #3752 to handle Unicode strings in mssql. The problem is that automatically treating all strings as Unicode precludes index use on varchar columns, which can significantly hurt performance.
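To make the requested behaviour concrete, here is a minimal, library-independent sketch in which the N prefix is added only when the target column is known to be Unicode. The function name `toMssqlLiteral` and its `unicode` option are hypothetical and not part of Sequelize.

```javascript
// Hypothetical helper, not part of Sequelize: builds a T-SQL string literal
// and adds the N prefix only for Unicode (nvarchar/nchar) columns.
function toMssqlLiteral(value, { unicode = false } = {}) {
  // Escape embedded single quotes by doubling them, as T-SQL requires.
  const quoted = "'" + String(value).replace(/'/g, "''") + "'";
  return unicode ? 'N' + quoted : quoted;
}

console.log(toMssqlLiteral('hello'));                    // 'hello'
console.log(toMssqlLiteral('hello', { unicode: true })); // N'hello'
console.log(toMssqlLiteral("O'Brien"));                  // 'O''Brien'
```

With a varchar column, the first form lets the literal match the column type, so no implicit conversion is needed for the comparison.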

The code in question begins here.

Workaround

A workaround is possible for raw queries but not model-generated ones.

Example:

sequelize.query('select * from names where name=cast(:name as varchar(100))', { replacements: { name: 'hello' } });

Dialect: mssql
Database version: All (Tested SQL Server 2012/2016)
Sequelize version: 4.34.0
Tested with latest release: Yes

About this issue

State: open
Created 6 years ago
Reactions: 3
Comments: 20 (2 by maintainers)

Most upvoted comments

@rconstantine

You just need to add that code before you actually perform any queries. You don’t need to modify any other code.

Something like this should work (I think)

// in sequelize-fixes.js
const SqlString = require("sequelize/lib/sql-string");
SqlString.original_escape = SqlString.escape;
SqlString.escape = function escape(val, timeZone, dialect, format) {
  if (typeof val === 'string') {
    return SqlString.original_escape(new String(val), timeZone, dialect, format);
  }
  return SqlString.original_escape(val, timeZone, dialect, format);
}


// in index.js
import './sequelize-fixes'; // load the patch first, before any queries run
import Sequelize from 'sequelize';
import config from './inpatientDBConfig';

dsbert on Nov 16, 2019

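You can use the shorter CONVERT syntax for the workaround and omit the data length, as long as the value is no longer than 30 characters (CONVERT(varchar, ...) without a length defaults to varchar(30) and truncates longer values):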

sequelize.query('select * from courses where name = CONVERT(varchar, :name)', {
  replacements: {
    name: 'test1',
  }
})
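Because a CAST or CONVERT to varchar without a length defaults to 30 characters, one option is a small helper that always emits an explicit length. This is only a sketch: `varcharCast` is a hypothetical name, it just builds the SQL fragment, and the length you pass is assumed to match the column definition.

```javascript
// Hypothetical helper: wraps a named replacement in an explicit-length CAST
// so that values longer than 30 characters are not silently truncated.
function varcharCast(placeholder, length) {
  // Accept only simple identifier-like placeholder names.
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(placeholder)) {
    throw new Error(`Invalid placeholder name: ${placeholder}`);
  }
  // varchar(n) accepts n from 1 to 8000.
  if (!Number.isInteger(length) || length < 1 || length > 8000) {
    throw new Error(`Invalid varchar length: ${length}`);
  }
  return `CAST(:${placeholder} AS varchar(${length}))`;
}

const sql = `select * from courses where name = ${varcharCast('name', 100)}`;
console.log(sql);
// select * from courses where name = CAST(:name AS varchar(100))
```

The resulting string can be passed to sequelize.query together with the same replacements object as in the example above.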

Here is how you would do it with model-generated queries:

const value = sequelize.escape('test1');
Course.findAll({
  where: {
    name: {
      $eq: sequelize.literal(`CONVERT(varchar, ${value})`),
    }
  }
})

However, I agree you should be able to specify whether a string value and STRING data type is ASCII or Unicode. Perhaps something like:

const ascii = sequelize.ascii('test1'); // 'test1'
const unicode = sequelize.unicode('test1'); // N'test1'

const Course = sequelize.define('course', {
  name: Sequelize.STRING, // Defaults to NVARCHAR for mssql (maintain backward compatibility)
  nameAscii: Sequelize.ASCII, // Clearly uses VARCHAR type
  nameUnicode:  Sequelize.UNICODE, // Clearly uses NVARCHAR type
});

// select * from courses where nameAscii='test1' or nameUnicode=N'test1'
sequelize.query('select * from courses where nameAscii=:nameAscii or nameUnicode=:nameUnicode', {
  replacements: {
    nameAscii: sequelize.ascii('test1'), 
    nameUnicode: sequelize.unicode('test1'),
  },
});

// SELECT [id], [name], [createdAt], [updatedAt] FROM [courses] AS [course] 
// WHERE ([course].[nameAscii] = 'test1' OR [course].[nameUnicode] = N'test1');

Course.findAll({
  where: {
    [Sequelize.Op.or]: [
      {
        nameAscii: {
          $eq: sequelize.ascii('test1'),
        },
      },
      {
        nameUnicode: {
          $eq: sequelize.unicode('test1'),
        },
      },
    ],
  },
});

jerfowler on Feb 27, 2018

I actually solved for this by creating a custom data types folder and I created a STRING_ASCII type so that I would not have to use the “literal” function everywhere.

import Sequelize from 'sequelize';

class STRING_ASCII extends Sequelize.STRING {
  constructor(length, binary) {
    super(length, binary);
    this.escape = false;
    this.key = 'STRING_ASCII';
  }

  toSql() {
    if (!this._binary) {
      return `VARCHAR(${this._length})`;
    }
    return `BINARY(${this._length})`;
  }

  _stringify(value, options) {
    if (this._binary) {
      return Sequelize.BLOB.prototype._stringify(value);
    }
    let data = options.escape(value);
    data = data.substring(0, 1) === 'N' ? data.substring(1) : data;
    return data;
  }
}

export default STRING_ASCII;

Then where I initialize Sequelize I import all my custom data types and loop over them to make them available.

import DataTypes from '../../datatypes';
...
for (const dataType in DataTypes) {
    Sequelize[dataType] = DataTypes[dataType];
  }

I am not sure if I did everything right, but it's been working for me.

I will say that while being able to extend the data types with your own is a great feature, this should not be necessary for out-of-the-box MSSQL support, and it should still be addressed in the core code.

nomadinjax on Dec 7, 2018

BTW, I got the following to work, which is shorter than the in-model solution presented above. I didn't need to use the escape function.

lr_rslt_id: { [Op.eq]: Sequelize.literal('WBC') }

rconstantine on Dec 7, 2018

I would like to follow the same pattern that we already use: http://docs.sequelizejs.com/manual/tutorial/models-definition.html#data-types

Sequelize.STRING.ASCII
Sequelize.STRING.UNICODE // same as Sequelize.STRING

sushantdhiman on Apr 7, 2018

@dsbert your code works with all string comparisons, which is great. But it can cause problems with nvarchar columns that store multilingual data (specifically Korean, Japanese, and Chinese).

Attaching a monkey patch that checks the column name and decides whether the N prefix should be removed.

const queryGenerator = require("sequelize/lib/dialects/abstract/query-generator");

const originalEscape = queryGenerator.prototype.escape;
const modifiedEscape = function(value, field, options) {
  const originalVal = originalEscape.call(this, value, field, options);
  // Strip the leading N only for string values bound to the listed varchar columns.
  if (typeof value === 'string' && ['practice_id'].includes(field?.name)) {
    return originalVal.substring(0, 1) === 'N' ? originalVal.substring(1) : originalVal;
  }
  return originalVal;
};
queryGenerator.prototype.escape = modifiedEscape;
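The per-column decision in the patch above can be isolated into a pure function that is easy to test without a database. This is a hedged sketch: `ASCII_COLUMNS` and `stripUnicodePrefix` are hypothetical names, and the column list is an assumption you would replace with your own varchar columns.

```javascript
// Hypothetical, library-independent version of the per-column check.
const ASCII_COLUMNS = new Set(['practice_id']); // assumed varchar columns

function stripUnicodePrefix(escaped, fieldName) {
  const isAsciiColumn = ASCII_COLUMNS.has(fieldName);
  // Strip only when the literal really starts with N' (not any leading N),
  // so values such as NULL pass through unchanged.
  if (isAsciiColumn && typeof escaped === 'string' && escaped.startsWith("N'")) {
    return escaped.slice(1);
  }
  return escaped;
}

console.log(stripUnicodePrefix("N'abc'", 'practice_id')); // 'abc'
console.log(stripUnicodePrefix("N'한국어'", 'title'));      // N'한국어'
console.log(stripUnicodePrefix('NULL', 'practice_id'));   // NULL
```

Checking for the two-character sequence N' rather than a bare leading N is a slightly stricter test than the one in the patch.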

ashish19977 on Jun 16, 2023

Based on some of the above comments, you can monkey patch this out using the following snippet.

const SqlString = require("sequelize/lib/sql-string");
SqlString.original_escape = SqlString.escape;
SqlString.escape = function escape(val, timeZone, dialect, format) {
  if (typeof val === 'string') {
    return SqlString.original_escape(new String(val), timeZone, dialect, format);
  }
  return SqlString.original_escape(val, timeZone, dialect, format);
}

dsbert on Jul 9, 2019

The escaping and the 'N' prefix are applied based on the result of calling typeof on the parameter. A primitive string yields 'string', whereas a String object yields 'object', so wrapping the parameter in a String object bypasses the statement that adds the 'N' prefix.

Therefore:

sequelize.query('select * from names where name=:name',
                { replacements: { name: 'hello' } });

becomes:

sequelize.query('select * from names where name=:name',
                { replacements: { name: new String('hello') } });

…and the prefix is gone.

Read through the escape() function to see where it gets applied.
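The typeof distinction this relies on can be checked in plain JavaScript. The `escapeSketch` function below is a simplified stand-in written for illustration; it is not Sequelize's actual escape() code and only mirrors the branch described above.

```javascript
// Simplified stand-in for a typeof-based escape routine. This is NOT
// Sequelize's actual code; it only mirrors the branch described above.
function escapeSketch(val) {
  const quoted = "'" + String(val).replace(/'/g, "''") + "'";
  // Primitive strings take this branch and get the N prefix.
  if (typeof val === 'string') {
    return 'N' + quoted;
  }
  // String objects report typeof 'object', so they skip the prefix.
  return quoted;
}

console.log(typeof 'hello');                    // string
console.log(typeof new String('hello'));        // object
console.log(escapeSketch('hello'));             // N'hello'
console.log(escapeSketch(new String('hello'))); // 'hello'
```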

mvoorberg on May 30, 2019

Oh geez what the heck is going on here?! Now I know why the performance of our API has been tanking lately. Wow… Does nobody actually use MSSQL in production? This behavior is positively atrocious.

EDIT: Why is the label on this “feature?” This is a straight-up bug.

ianthetechie on Mar 11, 2019