typeorm: Performance problems when saving multiple documents
Problem
In our system we noticed slowdowns when we needed to insert a large number of entities (up to 10,000). To pinpoint the exact problem we created a test script that measures the insert speed of TypeORM.
For the test we used the save() method inside a transaction (see here); a rough sketch of that setup is shown below. The screenshot further down shows the results of the test for different numbers of documents.
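The benchmark script itself is not included in this issue; the following is only a minimal sketch of what a "save() inside a transaction" run might look like (the `./document.entity` import path and the `insertWithTypeORM` helper name are illustrative, not from the original script):

```ts
import { createConnection } from 'typeorm';
import { Document } from './document.entity';

// Times a single benchmark run: one save() call for the whole array,
// wrapped in a transaction, returning the elapsed milliseconds.
async function insertWithTypeORM(documents: Document[]): Promise<number> {
  const connection = await createConnection(); // connection options read from ormconfig
  const start = Date.now();
  await connection.transaction(async manager => {
    // save() accepts an array of entities; this is the call whose cost
    // grows faster than linearly as the array gets larger
    await manager.save(documents);
  });
  return Date.now() - start;
}
```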

We compare against raw queries using pg-promise. The expected behavior is that TypeORM is slower than pg-promise but still scales linearly. What the graph shows instead is that TypeORM exhibits O(n^2) runtime complexity. Inserting 16000 documents takes about 7.5 minutes using TypeORM.
The table below gives a more detailed view (times are in milliseconds); all numbers are the average of 10 runs.
| Documents | PG-Batch | TypeORM | PG-Promise | PG-InsertHelper |
|---|---|---|---|---|
| 500 | 179 | 824 | 250 | 54 |
| 1000 | 268 | 1931 | 469 | 93 |
| 2000 | 476 | 12602 | 993 | 130 |
| 4000 | 863.1 | 58696 | 2015.7 | 216.5 |
| 8000 | 1618.9 | 116990.8 | 3886.1 | 385.9 |
| 16000 | 3281.2 | 438171.8 | 7706.2 | 736.8 |
What can be seen in the table is that performance is fine when inserting a smaller number of documents, but degrades rapidly as the count grows.
After every test run the Node.js process was restarted. If we do all the test runs in the same process (in a loop), the results are slightly better on average, but the O(n^2) runtime complexity is still there. See the Google Sheet Machine 2 (continuous) for those results.
Solution
Currently not known.
Environment
- TypeORM: 0.1.1
- Database: Postgres 9.6
- Node: 6.11
- OS: Ubuntu
Entity used:
```ts
import { Entity, PrimaryColumn, Column } from 'typeorm';

@Entity()
export class Document {
    @PrimaryColumn('text')
    id: string;

    @Column('text')
    docId: string;

    @Column('text')
    label: string;

    @Column('text')
    context: string;

    @Column({ type: 'jsonb' })
    distributions: Distribution[];

    @Column({ type: 'timestamp with time zone' })
    date: Date;
}

export interface Distribution {
    weight: string;
    id: number;
    docId: number;
}
```
More information
About this issue
- Original URL
- State: closed
- Created 7 years ago
- Reactions: 17
- Comments: 22 (13 by maintainers)
Sharing this here as it might help someone who is still seeing slow execution times. Our solution was to manually chunk our entities using the chunk method from lodash instead of the built-in TypeORM chunk option (250 items per chunk delivered the fastest execution time for our use case). After that we saved the chunks in a loop using TypeORM's .insert() method instead of .save(), as it was faster. Basic structure example:
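The commenter's original snippet is not preserved in this excerpt; the following is only an illustrative sketch of the approach described above, reusing the `Document` entity from earlier (the `insertInChunks` name and import path are assumptions):

```ts
import { chunk } from 'lodash';
import { getRepository } from 'typeorm';
import { Document } from './document.entity';

// Split the entities into fixed-size chunks and insert each chunk separately.
async function insertInChunks(documents: Document[], chunkSize = 250): Promise<void> {
  const repository = getRepository(Document);
  for (const batch of chunk(documents, chunkSize)) {
    // insert() issues a plain INSERT and skips the cascade/listener/reload
    // work that save() performs, which is why it is faster here
    await repository.insert(batch);
  }
}
```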
Added this table to show the execution times we got with each of the different approaches.
No, you can't. Use the alpha version; there is nothing wrong with it.
I made even more fixes in `0.2.0-alpha.5` and here are the results for 10.000 documents insertion:

@MaikelH I was referring to the `PG-Batch` example, the one that uses `db` directly and then uses `Promise.all`, not the one with the `t.batch` 😃 When you do it like that, you force all available connections out of the pool, so none will be available to do anything in parallel until the operation is over. The approach with the `task`/`tx` will use only one physical connection from the pool. See Chaining Queries. From that article:
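The quoted passage from the Chaining Queries article is not reproduced here. As a rough illustration of the two pg-promise patterns being contrasted (table and column names are placeholders, not from the benchmark):

```ts
import pgPromise from 'pg-promise';

const pgp = pgPromise();
const db = pgp('postgres://localhost:5432/benchmark'); // placeholder connection string

// Pattern 1: calling db directly inside Promise.all lets every query grab its
// own connection, draining the pool until the whole operation finishes.
function insertWithPromiseAll(rows: { id: string; label: string }[]) {
  return Promise.all(
    rows.map(r => db.none('INSERT INTO document(id, label) VALUES($1, $2)', [r.id, r.label]))
  );
}

// Pattern 2: db.tx() (or db.task()) chains all queries on a single physical
// connection; t.batch() settles the queries and resolves or rolls back the transaction.
function insertWithTx(rows: { id: string; label: string }[]) {
  return db.tx(t =>
    t.batch(rows.map(r => t.none('INSERT INTO document(id, label) VALUES($1, $2)', [r.id, r.label])))
  );
}
```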