mikro-orm: High memory profile and very slow
Describe the bug
I’ve integrated Mikro-ORM in a benchmark suite and noticed a memory leak and very slow execution times.
The benchmark basically creates 10k objects, calls persist() on each, and then flushes. After that it reads all 10k back from the database. With Mikro-ORM, the first 10k insert takes 7 seconds, the second 11 seconds, and the third 21 seconds. Memory climbs rapidly as well: 166 MB, then 278 MB, then 417 MB.
Raw output:
round 1
0.13634226180139436 ops/s Mikro-ORM insert 7,334.482989996672 ms, 166.2109375 MB memory
7.202537523174704 ops/s Mikro-ORM fetch 138.83995699882507 ms, 175.12109375 MB memory
round 2
0.09214466685562421 ops/s Mikro-ORM insert 10,852.500032007694 ms, 278.3046875 MB memory
9.313704069546839 ops/s Mikro-ORM fetch 107.36866798996925 ms, 286.4140625 MB memory
round 3
0.04624159269974236 ops/s Mikro-ORM insert 21,625.552703022957 ms, 417.4453125 MB memory
10.529599470788153 ops/s Mikro-ORM fetch 94.97037401795387 ms, 419.08984375 MB memory
round 4
0.03031239307193595 ops/s Mikro-ORM insert 32,989.80709397793 ms, 481.515625 MB memory
15.424512427769912 ops/s Mikro-ORM fetch 64.83187100291252 ms, 482.24609375 MB memory
round 5
0.023591314642157822 ops/s Mikro-ORM insert 42,388.48131901026 ms, 548.6171875 MB memory
9.086378374941388 ops/s Mikro-ORM fetch 110.05484899878502 ms, 550.55859375 MB memory
To Reproduce
Steps to reproduce the behavior: execute the following TypeScript code:
import 'reflect-metadata';
import {Entity as MikroEntity, MikroORM, PrimaryKey, Property, ReflectMetadataProvider} from 'mikro-orm';
import {performance} from 'perf_hooks';

export async function bench(times: number, title: string, exec: () => void | Promise<void>) {
    const start = performance.now();
    for (let i = 0; i < times; i++) {
        await exec();
    }
    const took = performance.now() - start;
    process.stdout.write([
        (1000 / took) * times, 'ops/s',
        title,
        took.toLocaleString(undefined, {maximumFractionDigits: 17}), 'ms,',
        process.memoryUsage().rss / 1024 / 1024, 'MB memory'
    ].join(' ') + '\n');
}

@MikroEntity({collection: 'mikro'})
export class MikroModel {
    @PrimaryKey()
    _id!: any;

    @Property() id2!: number;
    @Property() ready?: boolean;
    @Property() tags: string[] = [];
    @Property() priority: number = 0;

    @Property()
    name: string;

    constructor(name: string) {
        this.name = name;
    }
}

(async () => {
    const count = 10_000;
    const orm = await MikroORM.init({
        entities: [MikroModel],
        dbName: 'bench-insert',
        type: 'mongo',
        metadataProvider: ReflectMetadataProvider,
        clientUrl: 'mongodb://localhost:27017'
    });

    for (let j = 1; j <= 15; j++) {
        console.log('round', j);
        await orm.em.remove(MikroModel, {}, true);

        await bench(1, 'Mikro-ORM insert', async () => {
            for (let i = 1; i <= count; i++) {
                const user = new MikroModel('Peter ' + i);
                user.id2 = i;
                user.ready = true;
                user.priority = 5;
                user.tags = ['a', 'b', 'c'];
                await orm.em.persist(user);
            }
            await orm.em.flush();
        });

        await bench(1, 'Mikro-ORM fetch', async () => {
            const items = await orm.em.find(MikroModel, {});
        });
    }

    await orm.close();
})();
Additional Information
The same benchmark with TypeORM has the following output.
round 1
0.44799316696261965 ops/s TypeORM insert 2,232.1769030094147 ms, 112.78515625 MB memory
14.745251303605988 ops/s TypeORM fetch 67.81844401359558 ms, 116.30859375 MB memory
round 2
0.5518305396852811 ops/s TypeORM insert 1,812.1505210101604 ms, 157.296875 MB memory
10.61794395223521 ops/s TypeORM fetch 94.18019199371338 ms, 125.12890625 MB memory
round 3
0.5414625441330295 ops/s TypeORM insert 1,846.8498159945011 ms, 139.37890625 MB memory
7.658094507705051 ops/s TypeORM fetch 130.58078598976135 ms, 140.234375 MB memory
round 4
0.5847921986324484 ops/s TypeORM insert 1,710.009132027626 ms, 141.4921875 MB memory
5.850612679085713 ops/s TypeORM fetch 170.92226999998093 ms, 154.796875 MB memory
Note: TypeORM has no UnitOfWork, so inserting 10k items happens via typeorm.manager.save(TypeOrmModel, items), which is naturally faster. Here is a benchmark of an ORM with a UnitOfWork for a better reference.
round 1
4.781501792938342 ops/s Marshal insert 209.1393129825592 ms, 70.2734375 MB memory
16.781018394176918 ops/s Marshal fetch 59.591139018535614 ms, 86.30859375 MB memory
round 2
7.130604420772976 ops/s Marshal insert 140.24056601524353 ms, 100.296875 MB memory
32.69504859809784 ops/s Marshal fetch 30.58567100763321 ms, 107.0390625 MB memory
round 3
8.989556662613543 ops/s Marshal insert 111.2401909828186 ms, 113.4609375 MB memory
49.55669555919416 ops/s Marshal fetch 20.178907990455627 ms, 121.6015625 MB memory
round 4
8.519063107519148 ops/s Marshal insert 117.38379999995232 ms, 111.03515625 MB memory
48.701026779872905 ops/s Marshal fetch 20.533447980880737 ms, 112 MB memory
round 5
10.715570703516867 ops/s Marshal insert 93.32214099168777 ms, 114.41015625 MB memory
52.12080869452746 ops/s Marshal fetch 19.186195015907288 ms, 115.3828125 MB memory
Without Identity Map
When I add `orm.em.clear();` right before the `bench(1, 'Mikro-ORM fetch', ...)` call, I essentially disable the Identity Map for the fetch, which I also do for all other benchmarks (TypeORM/Marshal). The memory leak is then not as bad, but fetch times increase drastically.
round 1
0.14027045884962996 ops/s Mikro-ORM insert 7,129.08482798934 ms, 167.1484375 MB memory
2.160074054846749 ops/s Mikro-ORM fetch 462.9470909833908 ms, 160.453125 MB memory
round 2
0.0850145411912262 ops/s Mikro-ORM insert 11,762.693604975939 ms, 166.82421875 MB memory
2.45439285240989 ops/s Mikro-ORM fetch 407.43273800611496 ms, 181.80859375 MB memory
round 3
0.08564632189772563 ops/s Mikro-ORM insert 11,675.92463800311 ms, 321.8828125 MB memory
2.701008498703899 ops/s Mikro-ORM fetch 370.2320820093155 ms, 330.484375 MB memory
round 4
0.09211291201040393 ops/s Mikro-ORM insert 10,856.241304010153 ms, 346.94140625 MB memory
2.7796832507085085 ops/s Mikro-ORM fetch 359.7532199919224 ms, 352.234375 MB memory
round 5
0.08652217583452965 ops/s Mikro-ORM insert 11,557.73060899973 ms, 191.89453125 MB memory
2.7081659883433256 ops/s Mikro-ORM fetch 369.253585010767 ms, 183.26171875 MB memory
round 6
0.08637979298259872 ops/s Mikro-ORM insert 11,576.781623005867 ms, 352.48828125 MB memory
2.5409797655486086 ops/s Mikro-ORM fetch 393.5489819943905 ms, 355.92578125 MB memory
Expected behavior
Consistent insert times. Quicker inserts.
Versions
| Dependency | Version |
|---|---|
| node | 14.5.0 |
| typescript | 3.9.6 |
| mikro-orm | 3.6.15? |
| mongodb | 3.6.0 |
Questions
Is there anything I did wrong in the code? It feels weird to me that orm.em.persist is an async function, so maybe it’s related to that.
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 3
- Comments: 43 (26 by maintainers)
Commits related to this issue
- perf(core): improve speed of inserting new items Related: #732 — committed to mikro-orm/mikro-orm by B4nan 4 years ago
- perf(core): use Set instead of array for stacks in UoW Related: #732 — committed to mikro-orm/mikro-orm by B4nan 4 years ago
- perf(core): use Set instead of array for cycle lookups Related: #732 — committed to mikro-orm/mikro-orm by B4nan 4 years ago
- perf(core): simplify `MetadataStorage.find()` method Related: #732 — committed to mikro-orm/mikro-orm by B4nan 4 years ago
- perf(core): do not generate sql when logging disabled Related: #732 — committed to mikro-orm/mikro-orm by B4nan 4 years ago
- perf(core): reduce usage of `wrap` helper internally Related: #732 — committed to mikro-orm/mikro-orm by B4nan 4 years ago
- perf(core): use actual Map for identity maps Related: #732 — committed to mikro-orm/mikro-orm by B4nan 4 years ago
- perf(core): improve performance of QB a bit Related: #732 — committed to mikro-orm/mikro-orm by B4nan 4 years ago
- perf(core): optimize QB for simple cases Related: #732 — committed to mikro-orm/mikro-orm by B4nan 4 years ago
- perf(core): implement bulk deletes UoW will now group change sets based on the type and entity. Instead of ordering all the change sets based on the type and entity, we can now travers those groups b... — committed to mikro-orm/mikro-orm by B4nan 4 years ago
- perf(core): implement bulk deletes (#757) UoW will now group change sets based on the type and entity. Instead of ordering all the change sets based on the type and entity, we can now travers those ... — committed to mikro-orm/mikro-orm by B4nan 4 years ago
- refactor: prepare `UoW.commit()` for possible bulk inserts Related: #732 — committed to mikro-orm/mikro-orm by B4nan 4 years ago
- perf: move reference to metadata to entity prototype + more improvements Various small performance improvements: - adding meta.root to get rid of excessive usage of `Utils.getRootEntity()` - store li... — committed to mikro-orm/mikro-orm by B4nan 4 years ago
- perf: move reference to metadata to entity prototype + more improvements Various small performance improvements: - adding meta.root to get rid of excessive usage of `Utils.getRootEntity()` - store li... — committed to mikro-orm/mikro-orm by B4nan 4 years ago
- perf: move reference to metadata to entity prototype + more improvements (#843) Various small performance improvements: - adding meta.root to get rid of excessive usage of `Utils.getRootEntity()` -... — committed to mikro-orm/mikro-orm by B4nan 4 years ago
- perf(core): use batch inserts in UoW (postgres & mongodb) UoW will now batch all inserts statements if the driver supports returning all of their PKs back (so postgres and mongodb currently). This a... — committed to mikro-orm/mikro-orm by B4nan 4 years ago
- perf(core): use batch inserts in UoW (postgres & mongodb) (#865) UoW will now batch all inserts statements if the driver supports returning all of their PKs back (so postgres and mongodb currently).... — committed to mikro-orm/mikro-orm by B4nan 4 years ago
- perf(core): use bulk inserts in all drivers This implements bulk inserts for mysql and sqlite too. Related: #732 — committed to mikro-orm/mikro-orm by B4nan 4 years ago
- perf(core): use faster way to check number of object keys Related: #732 — committed to mikro-orm/mikro-orm by B4nan 4 years ago
- perf(core): use batching in uow deletes Now the batch delete will respect the `batchSize`. Related: #732 — committed to mikro-orm/mikro-orm by B4nan 4 years ago
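Several of the commits above introduce batching in the UnitOfWork. The core idea, sketched here with hypothetical names (this is not Mikro-ORM's internal code), is to group change sets by entity type so the flush can issue one bulk statement per group instead of one round trip per entity:

```typescript
// A change set pairs an entity type with the data to write.
interface ChangeSet {
    type: string;
    payload: Record<string, unknown>;
}

// Group change sets by entity type so each type can be flushed with
// a single bulk insert (e.g. mongo's insertMany) rather than 10k
// individual inserts.
function groupByType(changeSets: ChangeSet[]): Map<string, Record<string, unknown>[]> {
    const groups = new Map<string, Record<string, unknown>[]>();
    for (const cs of changeSets) {
        const group = groups.get(cs.type) ?? [];
        group.push(cs.payload);
        groups.set(cs.type, group);
    }
    return groups;
}
```

A flush would then run something like `collection.insertMany(groups.get('MikroModel'))` once per group, which is the batching the commits above implement for postgres, mongodb, mysql, and sqlite.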
I just want to say something: both of you guys are unbelievable. I’ve been watching this topic from the beginning and I’ve learned a lot. Thank you.
Reading this thread 2 years later is still a thrill!
Thank you both so much!
4.1 is out, benchmark updated with numbers for all drivers; here are the most up-to-date ones for sqlite (total avg 290ms -> 215ms, so another ~25% improvement):
https://github.com/mikro-orm/mikro-orm/releases/tag/v4.1.0 https://github.com/mikro-orm/benchmark
edit: lol, the remove numbers were wrong, i knew it was too good to be true :]
Just chiming in to say that I’m impressed! That’s really nice to see ppl focusing on performances like you do. Congrats to both of you 🙏
There was quite a lot of room for improvement for sure; here are new numbers for the upcoming v4.1 :] I am stuck with sqlite3 for now, I don’t see an easy way to use better-sqlite with knex, so that probably also has an effect on your numbers. But this looks good enough already.
Here are numbers for postgres:
I guess I can now get back to working on features and bugs, lol.
Will update the benchmark once v4.1 is out.
V8 has a JIT engine. This engine makes JavaScript as fast as we know it; without it, it is up to 30x slower. It also makes certain code almost as fast as native C/C++ code, which is pretty damn fast. However, some programs are far too complex to optimize efficiently, for example when you have loops, dynamic property access/writes, or too many stacks/frames. In the case of an ORM this is not easily optimizable. To fully utilize the power of the V8 JIT engine, you either need to write better JavaScript code, or better: you generate JavaScript code in a way that the V8 JIT engine can easily optimize further. For example in change detection (but that applies to a wide variety of algorithms, like serialization/deserialization, snapshotting, validation, etc.), instead of having a loop over your properties and doing something like this:
you generate a function that looks like that:
This can be perfectly optimized by the V8 JIT engine and runs way faster. How much faster? It almost capped out the maximum possible (999m ops/s) at 931m ops/s, which is 63 times faster than the simple function. That’s quite a lot. This JIT approach was used heavily in Marshal for things like validation, decoding/encoding from JSON to class, or from Mongo types to class, and even a custom BSON parser that is 13x faster than the official BSON parser. Having these JIT methods for costly operations makes your stuff incredibly fast. That’s why Marshal / Marshal ORM is by far the fastest ORM/serializer/validator.
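The two code snippets referenced above did not survive extraction; an illustrative sketch of the contrast (property names and helpers are hypothetical, not Marshal's actual code) might look like this:

```typescript
// Generic change detection: a loop with dynamic property access.
// V8 struggles to specialize this, since keys and object shapes vary.
function genericChanged(last: Record<string, any>, current: Record<string, any>, props: string[]): boolean {
    for (const prop of props) {
        if (last[prop] !== current[prop]) {
            return true;
        }
    }
    return false;
}

// JIT-friendly variant: generate one function per entity type with the
// property accesses unrolled into a straight-line expression, so V8
// can inline and specialize it.
function buildChangedDetector(props: string[]): (last: any, current: any) => boolean {
    const checks = props.map(p => `last.${p} !== current.${p}`).join(' || ');
    return new Function('last', 'current', `return ${checks};`) as (last: any, current: any) => boolean;
}

const changed = buildChangedDetector(['id2', 'name', 'priority']);
```

The generated body is just `return last.id2 !== current.id2 || last.name !== current.name || ...`, with no loop and no dynamic key lookup, which is the kind of code the JIT can fully optimize.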
Got a PoC for the `update case` solution working, looks like it works fine with all of sqlite/mysql/postgres. Will need to spend some time supporting optimistic locks and composite keys there, but that should not be a problem. Not sure if something like this can also be achieved in mongo?
Current numbers with the batch update in sqlite:
edit: also implemented batching in deletes to get around the variable limit in sqlite
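The batch-update idea relies on SQL's CASE expression to apply different values per row in one statement. A minimal, hedged sketch of the SQL generation (table and column names are hypothetical, and a real implementation must parameterize or escape values rather than inlining them):

```typescript
// Build a single UPDATE that applies a different value per row via
// CASE WHEN, instead of issuing one UPDATE statement per entity.
// WARNING: illustrative only; values are inlined without escaping.
function buildBatchUpdate(
    table: string,
    pk: string,
    column: string,
    changes: Map<number, string>,
): string {
    const ids = [...changes.keys()].join(', ');
    const cases = [...changes.entries()]
        .map(([id, value]) => `when ${pk} = ${id} then '${value}'`)
        .join(' ');
    return `update ${table} set ${column} = case ${cases} else ${column} end where ${pk} in (${ids})`;
}

const sql = buildBatchUpdate('mikro', '_id', 'name', new Map([[1, 'Peter'], [2, 'Anna']]));
// One round trip updates both rows with different values.
```

Batching the IDs also runs into per-statement variable limits (sqlite's in particular), which is why the deletes mentioned above are chunked by a batch size as well.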
Thanks again for everything, this looks fantastic already!
Or actually even better, when I refactored things to not need internal entity uuid, we are getting closer to the typeorm numbers.
For fetching this is more than 100% improvement from 4.0.2 (~200ms vs ~80ms).
https://github.com/mikro-orm/benchmark/commit/bcf82ec731b71a63e2bffcbc50f582ee2c2c2ed2#diff-04c6e90faac2675aa89e2176d2eec7d8
New numbers for rc.5, again a few minor improvements. This is what I get with the update bench added (measuring only the flush + clear):
And here without clearing, so faster reads but higher memory usage:
So the changes definitely helped there as well.
Will probably try to spend a bit more time on this, and publish my version of the benchmark when rc.2 is out, so you can also try.
Nice! We should also add uow/update into the benchmark, right after the `fetch` section. I tested and got:
Might be worth optimizing as well.
I guess you are right, will implement that too; in the end it is configurable, so one can always disable it if they want a different lock mode…
Things can be even better 😉
Yes, sqlite, but behind knex, so can’t just swap it with better-sqlite3 easily… But it could be another driver independent of knex.
Btw it’s still WIP (not pushed yet), but I made more improvements in the internal merging and diffing workflow, current numbers are even better (~30% faster reads):
Here are the numbers when using a `Set` for the persist/remove/orphanRemove/extraUpdate stacks.