better-sqlite3: Result of query with comment is inconsistent and random
There seem to be an inconsistent bug in the comments parsing method.
Having the following raw SQL query (making it a one liner doesn’t change anything either):
const query = `SELECT *
FROM user_satellites_satellite
FULL JOIN satellite ON user_satellites_satellite.satelliteId = satellite.id
WHERE user_satellites_satellite.userId = '${user.id}'
AND (satellite.name LIKE '%${searchQuery}%' OR satellite.description LIKE '%${searchQuery}%');`;
with searchQuery
being (random
can be anything):
random') UNION SELECT CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), tbl_name, CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR), CAST(1 AS VARCHAR) FROM sqlite_master; --
And executing it with query()
await this.dataSource.query(query);
Will either return the correct and expected data or the error
The supplied SQL string contains more than one statement
Logging with NestJS shows the following query is getting executed (query logged being always the same):
The exact same query being executed multiple times over and over, e.g. by spamming, will make the query return correct data after some time, which makes the bug inconsistent:
Switching the SQLite driver to the sqlite
- being installed with npm install sqlite3
- driver makes the query always work and return the expected result, hence is not related to TypeORM itself.
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 18 (11 by maintainers)
Hello, fellow developers! I happened upon this most fascinating of issue, which intrigued me so much I elected to abandon an hour of perfectly good sleep in favour of debugging the crap out of this.
The first thing I did was to insert a bunch of ugly
printf
statements directly into better-sqlite3.cpp, which mostly led me nowhere. However, after many twists and turns, I made the following modification, allowing me to see every single character processed as part of the query tail:And in my trusty terminal, I began to see something… intriguing. Something unexpected. A pattern of noise just past the last character emerges:
Is this an out-of-bounds read? Surely,
v8::String::Utf8Value
cannot possibly produce a string without zero terminator. But what if… Ah. It is as clear as day now. The nested loops are at fault! Let us consider a simplified double-loop:When the condition
(b = *tail)
is false, then we knowtail
points to a zero byte. So far, so good. This is how we detect the end of the string, of course. The inner loop thus exits, and we proceed to the next iteration of the outer loop. The outer loop performs its increment:++tail
… and the world ceases to be beautiful. Nowtail
points almost certainly to garbage. The once innocent*tail
invokes behaviour of an undefined nature. Probably. Certainly the trusty processor reads something, which it usually interprets as a query character.The solution to this loopy confusion requires more thought than my 1am brain can produce. After some sleep, perhaps a pull request will present itself…
It’s nice to see the chosen one show up in times of need 🙏
Your explanation: 🧑🍳 🤌 💯
Thanks for digging into this!
FWIW ChatGPT is entirely useless telling me for the third time that I need to escape
searchQuery
because I’m apparently using two statements which better-sqlite3 does no support. Or it insists there might be race conditions between multiple processes or database connections after repeatedly being told it’s a single-use in-memory database.