firebase-ios-sdk: Slow firestore listener performance with large data sets

  • Xcode version: 9.4.1
  • Firebase SDK version: 5.3.0
  • Firebase Component: Firestore
  • Component version: 0.12.4

I’ve been experiencing surprisingly slow performance from Firestore listeners on larger data sets (thousands of documents). There are three issues that seem related; let me know if you’d like me to break this up into separate issues.

All testing was done using a simple sample app running on an iPhone 7.

Slow listener response when querying large data set

For a data set with thousands of documents, it takes several seconds for the initial update from the listener. For example, a listener querying a data set of 3000 simple one-key documents takes 1-2 seconds for the initial listener update closure to be called. From what I’ve seen, this appears to be simply the time it takes to query the local cache and return the 3000 documents, which seems surprisingly slow for locally persisted data. I’ve tested this both online and offline with the same results. (A minimal sketch of the measurement follows the steps below.)

I found this closed issue which seems similar, but I wasn’t sure if it was the exact same problem or not: https://github.com/firebase/firebase-ios-sdk/issues/413

Steps to reproduce:

  1. Create a listener that queries a data set of 3000 documents.
  2. Measure the time it takes for the first listener update to be called.
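
For concreteness, here is a minimal sketch of the measurement (the collection name “items” and its one-key documents are placeholders for whatever schema the test app uses):

```swift
import FirebaseFirestore

// Time from listener registration to the first snapshot callback.
// "items" is a placeholder collection holding ~3000 one-key documents.
let db = Firestore.firestore()
let start = Date()
var sawFirstUpdate = false

db.collection("items").addSnapshotListener { snapshot, error in
    guard let snapshot = snapshot else {
        print("Listener error: \(String(describing: error))")
        return
    }
    if !sawFirstUpdate {
        sawFirstUpdate = true
        let elapsed = Date().timeIntervalSince(start)
        print("First update: \(snapshot.documents.count) docs after \(elapsed)s")
    }
}
```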

Paging with limit() does not help (actually a bit slower)

I looked into using limit(to: 100) on my listener query to return only the first 100 results, thinking this might get faster results from the listener. But it was actually a bit slower. Should paging with limit() improve the performance of the listener query? (A sketch of the limited query follows the steps below.)

Steps to reproduce:

  1. Create a listener that queries a data set of 3000 documents, with .limit(to: 100) added to the end of the query.
  2. Measure the time it takes for the first listener update to be called.
  3. It will take slightly longer than without the limit (maybe 10% longer).
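
The same measurement with the limit appended; only the query line differs from the sketch above:

```swift
import FirebaseFirestore

// Identical timing measurement, but asking for only the first 100 results.
// "items" is again a placeholder collection name.
let db = Firestore.firestore()
let start = Date()

db.collection("items").limit(to: 100).addSnapshotListener { snapshot, _ in
    guard let snapshot = snapshot else { return }
    let elapsed = Date().timeIntervalSince(start)
    print("First update: \(snapshot.documents.count) docs after \(elapsed)s")
}
```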

Listener performance is worse immediately after adding the large data set

If I set up the listener after adding the 3000 documents from the app (without a delete/reinstall in between), the initial listener query takes MUCH longer, around 20 seconds. This happens even after a relaunch of the app, and after I’ve made sure all the documents have been written to the cloud. If I then delete and reinstall the app, the listener is back to the 1-2 second timing. It almost feels as if, after adding the 3000 documents, the local cache is left in some degraded-performance state, and a reinstall lets it re-cache everything cleanly (but I have no evidence that this is what is actually happening).

Steps to reproduce:

  1. From an empty database, add 3000 simple documents from the app. Wait for all documents to be written to the cloud.
  2. Create a listener to query that same set of 3000 documents.
  3. Measure the time it takes for the first listener update to be called.
  4. It will take much longer, around 20 seconds.
  5. Force-quit the app and repeat steps 2-4; same result.
  6. Delete and re-install the app, then repeat steps 2-3; the measured time is back to 1-2 seconds.

I first ran into these issues with our own in-development app. But I’ve been able to consistently reproduce them using a very simple test app (which I can share if that would help).


Most upvoted comments

@KabukiAdam I think we’ve got this reproduced locally now. The issue seems to stem from looking up the 3000 documents locally in the underlying leveldb datastore, specifically when searching for local mutations to the documents (of which there are none in this case, so those searches all fail, as expected).

Compacting the underlying database during startup seems to help this particular situation quite a bit; when I change the code to do that, step 2 goes from ~20s to ~1.2s, as you would otherwise expect. But I’m not sure about the other implications of that change. We’ll have to discuss this a bit internally and see what the best approach is. I’ll update this issue again when we’ve figured that out.

@mikelehen Thank you for the prompt and helpful reply.

Regarding the listener performance with a large data set, it’s good to hear that my results are not unexpected. However, I can’t help but feel this is still “slow” relative to what one would typically expect for locally persisted data queries (similar queries using Core Data or SQLite are ~100 times faster). I agree that 3000 documents is far more than would ever fit on a screen, but it is fairly common to have a very long list of items in the UI and allow the user to scroll through it quickly (even if only a small subset appears on screen at once). And there are cases where even a screenful of data could number 100 documents (for example, “events” on a month-view calendar), which would take ~0.1 seconds to query, just barely fast enough given the expected UI latencies for native mobile apps. Anyway, that’s just some extra feedback on performance in general; if there’s a better place to give this kind of feedback, I’d be happy to do so and go into more detail. 😃

Regarding the issue of paging with limit(), I’m very glad to hear that there are plans to implement client-side indexing to improve performance proportionally. Is there any information about the plan/timeframe for implementing this? I know time estimates (especially public ones) are somewhat tricky, but it would be extremely helpful to have some idea of the timeframe you’re shooting for. (Our app will depend on paging to handle queries over these larger data sets.)

Regarding the issue of listener performance being worse right after adding the large data set: in my tests, I was doing 3000 individual setData() calls in a tight loop (sketched below). And yes, I can confirm that the completion block is called for each write.
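
For reference, the write loop looked roughly like this (the collection and field names are placeholders for the test app’s actual schema):

```swift
import FirebaseFirestore

// 3000 individual setData() calls fired in a tight loop, counting
// completions to confirm that every write is acknowledged.
let db = Firestore.firestore()
var completed = 0

for i in 0..<3000 {
    db.collection("items").document("doc-\(i)").setData(["value": i]) { error in
        if let error = error {
            print("Write \(i) failed: \(error)")
        }
        completed += 1
        if completed == 3000 {
            print("All 3000 writes acknowledged")
        }
    }
}
```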

I’ve pushed a copy of our test app to Github here: https://github.com/KabukiAdam/FirestoreListenerTest

Steps to reproduce using the above test app:

  1. Run the app and tap “Test Large Data Set” to create the 3000 documents. The on-screen log shows when the callback fires for each created document. Check the Firebase console to confirm that all documents have been created.
  2. Force-quit the app, disable networking (to guarantee a local query), open the app, and tap Create Listener. Takes ~20s. NOTE: if you instead tap Create Listener at the end of step 1, this step only takes ~7s; not sure why there’s a difference.
  3. Delete the app from the device (to start with an empty local cache), enable networking, open the app, and tap Create Listener. The first time, it needs to query from the cloud, so it takes a bit longer, but subsequent Create Listener taps take only ~1.2s.
  4. Force-quit the app, disable networking (to guarantee a local query), open the app, and tap Create Listener. Takes ~1.2s, and stays at ~1.2s from then on.

Results summary: ~1.2s seems to be the “normal” time, but I only see this after a delete/re-install. The listener time right after adding the documents is much longer (and it never seems to come down on its own).

Thanks again for your help with this.

We are facing the same issue: we have 10 collections with 500-3000 docs each on average, all of which must be listened to when the app starts. When the number of docs in each is small, e.g. 100 each, performance is fine, but at 500-3000 docs on average (each doc is under 2 KB) it becomes extremely slow, both online and offline.

Profiling: [profiler screenshot]
Local DB cache: 23 MB [screenshot]

@rsgowman We’re interested in how you compact leveldb as a quick remedy right now. Can you give us a tip on that? Thanks.

@KabukiAdam Thank you for this detailed performance feedback! This is useful and I appreciate you taking the time to provide it. I haven’t dug in too deep, but I’ll briefly respond with my initial analysis.

Your first point (slow w/ large data set) is semi-expected. We may be able to optimize this some, but on the order of ~1 second per 1000 documents is probably expected right now. We generally assume apps will be querying smaller result sets (on the order of what fits on a screen).

Your second issue (paging with limit() does not help) is expected right now, but we have planned improvements that should help a lot. Currently the client does not implement its own indexing of the data, which means that in order to satisfy your limit(), we need to load all of the data, sort it, and then apply the limit in memory (sketched conceptually below). As you noticed, this is just as slow as (or slower than) querying the full data set, unfortunately. Once we implement client-side indexing, limit(100) should be ~30x faster than querying the full 3000 documents.
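
Conceptually (this is only an illustrative sketch, not the actual SDK code), the unindexed path behaves like this:

```swift
// Illustrative sketch only, not actual SDK code: without a client-side
// index, a limit query must load and sort the entire matching set from
// the cache before the limit can be applied.
struct CachedDocument {
    let id: String
    let value: Int  // stand-in for the query's order-by field
}

func runLimitQuery(cache: [CachedDocument], limit: Int) -> [CachedDocument] {
    let all = cache                                  // 1. read every cached document
    let sorted = all.sorted { $0.value < $1.value }  // 2. sort the full set in memory
    return Array(sorted.prefix(limit))               // 3. apply the limit last
}
```

With client-side indexing, step 1 could instead read only the first `limit` entries in index order, which is where the ~30x estimate comes from.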

Your third item (listener performance worse immediately after adding the large data set) is very surprising to me. I could imagine some performance quirks while those writes have not yet been committed to the backend, but once they’re committed I don’t know why there would be a lingering performance issue. Can you confirm your completion block(s) were called? And were you using the WriteBatch API (sketched below) to do multiple writes at once, or just 3000 individual setData() calls or similar? If it’s easy to provide a repro app, that would make it much faster for us to investigate. Otherwise we’ll try to get this assigned to somebody for repro and investigation in the coming weeks.
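
For reference, batched writes look roughly like this (a single batch is capped at 500 operations, so 3000 documents need several batches; the collection and field names are placeholders):

```swift
import FirebaseFirestore

// Batched writes: up to 500 operations per WriteBatch, committed as a unit.
let db = Firestore.firestore()
let total = 3000
let batchSize = 500

for chunkStart in stride(from: 0, to: total, by: batchSize) {
    let batch = db.batch()
    for i in chunkStart..<min(chunkStart + batchSize, total) {
        let ref = db.collection("items").document("doc-\(i)")
        batch.setData(["value": i], forDocument: ref)
    }
    batch.commit { error in
        if let error = error {
            print("Batch starting at \(chunkStart) failed: \(error)")
        }
    }
}
```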

Thanks again for the feedback!