amplify-js: DataStore Query Performance Very Slow
Before opening, please confirm:
- I have searched for duplicate or closed issues and discussions.
- I have read the guide for submitting bug reports.
- I have done my best to include a minimal, self-contained set of instructions for consistently reproducing the issue.
JavaScript Framework
React Native
Amplify APIs
DataStore
Amplify Categories
api
Environment information
# Put output below this line
Describe the bug
We are porting our AppSync/GraphQL based app to DataStore and what we are finding is that queries on the cached database are very slow. We are also finding that pagination of the queries makes no difference to the performance of the queries.
There are many tables with the largest table having about 8,000 records and most other tables having less than 1000 records. There are only 5 tables/models that have data.
Whether we are connected to the network or working offline it makes no difference to the performance.
In issue #6994 it talks of a first time slowing in performance, but we see consistently slow performance.
I have recorded a video of the performance we are seeing - both online and offline - to give you an idea of the problem we are facing.
https://user-images.githubusercontent.com/6362888/120909860-f74b7180-c6bc-11eb-97dc-300f8d19ab4d.mp4
Expected behavior
The database is local so we should be seeing almost instant responses to our queries. The performance of our existing AppSync GraphQL cache gives us almost instant responses so we expect the same or better performance than the existing AppSync GraphQL database.
Reproduction steps
Not sure how to replicate this as the database I have has 48 models and 200 GSI’s and takes about 6 hours of incremental (manual) pushes to deploy into a new environment.
Code Snippet
// Put your code below this line.
Log output
// Put your logs below this line
aws-exports.js
No response
Manual configuration
No response
Additional configuration
No response
Mobile Device
iPhone 7
Mobile Operating System
14.5
Mobile Browser
Not relevant
Mobile Browser Version
No response
Additional information and screenshots
No response
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Reactions: 1
- Comments: 110 (46 by maintainers)
@sacrampton, @jeremy-white, @mjaydeep01 - this feature has been officially released as part of
aws-amplify@4.2.6! The documentation change will follow shortly, but in the meantime, here’s how you can enable it in your app:(Note: this new storage adapter is only compatible with React Native CLI-generated apps. Expo support will be added in the future)
Install:
Enable in your app:
I’m closing this issue, as the new storage adapter addresses the performance concerns expressed in the issue. Please create a new issue if you require assistance with this feature.
Hi @iartemiev - one final update from my testing. On the desktop you ship a storage adapter for IndexedDB which is a NoSQL database. So we did a test on the desktop with the large 30K+ Asset data set described above using the full schema.
It took about 11 minutes for the full dataset to download - but after that a query of the Asset table took 2 seconds. This compares with 64 seconds on Mobile App (React Native).
Hi @iartemiev - thank you for going above and beyond to make this happen
Hi @iartemiev - thanks for that - did the trick now. So resolved all issues around the count. Thanks so much as always.
Still noticing the delta-sync on SQLite is slower than before. Not sure if you were seeing that.
@sacrampton, I don’t think there’s anything inherently wrong with it. It’s hard to know for sure without being familiar with your app code and knowing what values are being passed and what data is expected to be present in the local store. DataStore doesn’t use a cache. Every time you execute
DataStore.query, it performs a SQL query against your local database.If I were debugging this, I would check that each of the operands has the expected value by logging them out to the console.
You can also try removing fields from your predicates one by one to determine which one is causing all the results to be filtered out.
Try something like the following in the same place as where you’re normally attempting to query the local records (and before the sync):
If you have a
falsein any of those 4 console logs, that should give you the problematic field.Lastly, if you have debug logs on, you should see the SQL SELECT statement that is generated by that predicate. Does it look correct?
Just a couple more things I’ve hit
Hit this for HAS_MANY connections defined as
as well as
I’ve been able to get past this by changing
isRequiredto false in schema.js . . . this could also be my misunderstanding but I was under the impression that [String!] meant that the items in the array had to be defined but the attribute itself could be undefined, so I was only using ! to be extra cautious. I can certainly work around this one as well but just wanted to raise it as it works as I’d expect with AsyncStorage but fails against SQLiteI’m also seeing
For fields I define as AWSJSON and have values that are arbitrary JSON . . . I haven’t been able to workaround this other than just removing those fields from schema.js so if you have a better workaround I would love to hear it.
I really appreciate your work on this @iartemiev as I think it will be a HUGE improvement!
Hi @iartemiev - thanks so much for the suggestion - will follow this and let you know how we go.
Hi @iartemiev - the performance of the AsyncStorage adapter basically makes it unusable on anything but exceptionally small data sets. So we are already an order of magnitude better off using the SQLite adapter. We will be going forward with the SQLite adapter and keep testing over the next week or so - but looks pretty good with where we are at now. I am very comfortable to go forward with SQLite adapter.
By the way, getting TOTAL’s is really important to us. For example, to work out how many assets are in a work package - we want to be able to just query the “COUNT” of those records that meet that query. Seems to be a standard function of SQLite, but not available in DataStore’s API. Is there a simple way you can enable the total number of items to be returned (not the items - just the total) given this seems to be a standard function of SQLite?
Hi @iartemiev - we have built a signed APK to test the Android version on device and now are seeing performance comparable to what we were seeing on iOS.
Hi @iartemiev - good news - SQLite now working and previously with our full schema it wouldn’t even load 31K assets - and with the massively trimmed down schema it was 64 seconds to query 31K assets. Now we can use the full schema and we are seeing 2 seconds to query 31K assets. Still testing, but looking really promising.
Hello again, @sacrampton! We’ve released a special version of AmplifyJS (aws-amplify@rn-sqlite) containing the new SQLite adapter. We’re still considering this to be an experimental feature, hence the special tag/version.
We’ve tested this adapter thoroughly on our end with up to 350k records. Querying on a predicate that returns a subset of data or using pagination and/or sorting is blazing fast (a fraction of a second in our testing). If you want to return the entire contents of a large table (over 10k records), i.e., querying without a predicate, that still takes a while (10+ seconds), but that is expected.
There is currently a bit of occasional funky behavior and console warnings when deleting relational records. This is because cascading deletes are not yet implemented. You would need to manually delete from the last descendant up to avoid warnings. During my testing, these did not break the application. We will be adding cascading delete functionality shortly to bring this to feature parity with the other adapters.
Please let us know in this thread if you run into issues, and we will get them resolved ASAP. We will include this adapter in the stable release soon but wanted to give you a chance to test and benchmark early to ensure that your performance needs are met.
Here are the installation instructions:
(note: AsyncStorage is still a required dep, as other parts of Amplify continue to rely on it)
In order to restore the current/async-storage version:
Hi @iartemiev - the team has checked using await AsyncStorage.getItem(key).
The code used before looks like this
This results that come are shown below
After applying null validation we are getting these field in asset record. Here is the code after.
And this is the data we are seeing
So it is clear that the null fields are gone after updating the code in aws-amplify module - but the speed doesn’t change.
Quick update: I have been able to repro performance issues similar to those described above when using a model with 10 fields containing values + 100 fields with
nullvalues and > 30k records in the local store. Will be investigating causes and potential solutions.Thank you, that is helpful. We’ll be doing additional library benchmarking today in order to pinpoint the issue and will provide an update with our findings.
New Video - here is exactly the same application - with a few hundred records loaded versus over 7,000 in the previous video.
Seems loading more than a trivial amount of data is grinding it to a halt. What you see below is my expectation for what we should be getting with a larger data set - even 100K records.
https://user-images.githubusercontent.com/6362888/120977328-8e482480-c7b6-11eb-9c10-61daf9a3076a.mp4