amplify-js: DataStore Query Performance Very Slow

Before opening, please confirm:

JavaScript Framework

React Native

Amplify APIs

DataStore

Amplify Categories

api

Environment information

# Put output below this line


Describe the bug

We are porting our AppSync/GraphQL based app to DataStore and what we are finding is that queries on the cached database are very slow. We are also finding that pagination of the queries makes no difference to the performance of the queries.

There are many tables with the largest table having about 8,000 records and most other tables having less than 1000 records. There are only 5 tables/models that have data.

Whether we are connected to the network or working offline it makes no difference to the performance.

In issue #6994 it talks of a first time slowing in performance, but we see consistently slow performance.

I have recorded a video of the performance we are seeing - both online and offline - to give you an idea of the problem we are facing.

https://user-images.githubusercontent.com/6362888/120909860-f74b7180-c6bc-11eb-97dc-300f8d19ab4d.mp4

Expected behavior

The database is local so we should be seeing almost instant responses to our queries. The performance of our existing AppSync GraphQL cache gives us almost instant responses so we expect the same or better performance than the existing AppSync GraphQL database.

Reproduction steps

Not sure how to replicate this as the database I have has 48 models and 200 GSI’s and takes about 6 hours of incremental (manual) pushes to deploy into a new environment.

Code Snippet

// Put your code below this line.

Log output

// Put your logs below this line


aws-exports.js

No response

Manual configuration

No response

Additional configuration

No response

Mobile Device

iPhone 7

Mobile Operating System

14.5

Mobile Browser

Not relevant

Mobile Browser Version

No response

Additional information and screenshots

No response

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 110 (46 by maintainers)

Most upvoted comments

@sacrampton, @jeremy-white, @mjaydeep01 - this feature has been officially released as part of aws-amplify@4.2.6! The documentation change will follow shortly, but in the meantime, here’s how you can enable it in your app:

(Note: this new storage adapter is only compatible with React Native CLI-generated apps. Expo support will be added in the future)

Install:

$ npm install aws-amplify @aws-amplify/datastore-storage-adapter react-native-sqlite-storage aws-amplify-react-native amazon-cognito-identity-js @react-native-community/netinfo @react-native-async-storage/async-storage
$ npx pod-install

Enable in your app:

import { DataStore } from 'aws-amplify';
import { SQLiteAdapter } from '@aws-amplify/datastore-storage-adapter';

DataStore.configure({
  storageAdapter: SQLiteAdapter
});

I’m closing this issue, as the new storage adapter addresses the performance concerns expressed in the issue. Please create a new issue if you require assistance with this feature.

Hi @iartemiev - one final update from my testing. On the desktop you ship a storage adapter for IndexedDB which is a NoSQL database. So we did a test on the desktop with the large 30K+ Asset data set described above using the full schema.

It took about 11 minutes for the full dataset to download - but after that a query of the Asset table took 2 seconds. This compares with 64 seconds on Mobile App (React Native).

Hi @iartemiev - thank you for going above and beyond to make this happen

Hi @iartemiev - thanks for that - did the trick now. So resolved all issues around the count. Thanks so much as always.

Still noticing the delta-sync on SQLite is slower than before. Not sure if you were seeing that.

@sacrampton, I don’t think there’s anything inherently wrong with it. It’s hard to know for sure without being familiar with your app code and knowing what values are being passed and what data is expected to be present in the local store. DataStore doesn’t use a cache. Every time you execute DataStore.query, it performs a SQL query against your local database.

If I were debugging this, I would check that each of the operands has the expected value by logging them out to the console.

You can also try removing fields from your predicates one by one to determine which one is causing all the results to be filtered out.

Try something like the following in the same place as where you’re normally attempting to query the local records (and before the sync):

const newAsset = await DataStore.save(new Asset({...assetData}));
const data = await DataStore.query(Asset, newAsset.id);

console.log(data.assetParentAssetId === mobileActiveLocation?.id, data.assetParentAssetId, mobileActiveLocation?.id);
console.log(data.assetPackageId === userPackageId, data.assetPackageId, userPackageId);
console.log(data.assetPlantId === userPlantId, data.assetPlantId, userPlantId);
console.log(data.isDeleted !== true, data.isDeleted);

If you have a false in any of those 4 console logs, that should give you the problematic field.

Lastly, if you have debug logs on, you should see the SQL SELECT statement that is generated by that predicate. Does it look correct?

Just a couple more things I’ve hit

Object {
  "code": 6,
  "message": "NOT NULL constraint failed: TestModel.items",
}

Hit this for HAS_MANY connections defined as

    items: [Item!] @connection(keyName: "byTestModel", fields: ["id"])

as well as

    items: [String!]

I’ve been able to get past this by changing isRequired to false in schema.js . . . this could also be my misunderstanding but I was under the impression that [String!] meant that the items in the array had to be defined but the attribute itself could be undefined, so I was only using ! to be extra cautious. I can certainly work around this one as well but just wanted to raise it as it works as I’d expect with AsyncStorage but fails against SQLite

I’m also seeing

Error: Field items should be of type string, object received.

For fields I define as AWSJSON and have values that are arbitrary JSON . . . I haven’t been able to workaround this other than just removing those fields from schema.js so if you have a better workaround I would love to hear it.

I really appreciate your work on this @iartemiev as I think it will be a HUGE improvement!

Hi @iartemiev - thanks so much for the suggestion - will follow this and let you know how we go.

Hi @iartemiev - the performance of the AsyncStorage adapter basically makes it unusable on anything but exceptionally small data sets. So we are already an order of magnitude better off using the SQLite adapter. We will be going forward with the SQLite adapter and keep testing over the next week or so - but looks pretty good with where we are at now. I am very comfortable to go forward with SQLite adapter.

By the way, getting TOTAL’s is really important to us. For example, to work out how many assets are in a work package - we want to be able to just query the “COUNT” of those records that meet that query. Seems to be a standard function of SQLite, but not available in DataStore’s API. Is there a simple way you can enable the total number of items to be returned (not the items - just the total) given this seems to be a standard function of SQLite?

Hi @iartemiev - we have built a signed APK to test the Android version on device and now are seeing performance comparable to what we were seeing on iOS.

Hi @iartemiev - good news - SQLite now working and previously with our full schema it wouldn’t even load 31K assets - and with the massively trimmed down schema it was 64 seconds to query 31K assets. Now we can use the full schema and we are seeing 2 seconds to query 31K assets. Still testing, but looking really promising.

Hello again, @sacrampton! We’ve released a special version of AmplifyJS (aws-amplify@rn-sqlite) containing the new SQLite adapter. We’re still considering this to be an experimental feature, hence the special tag/version.

We’ve tested this adapter thoroughly on our end with up to 350k records. Querying on a predicate that returns a subset of data or using pagination and/or sorting is blazing fast (a fraction of a second in our testing). If you want to return the entire contents of a large table (over 10k records), i.e., querying without a predicate, that still takes a while (10+ seconds), but that is expected.

There is currently a bit of occasional funky behavior and console warnings when deleting relational records. This is because cascading deletes are not yet implemented. You would need to manually delete from the last descendant up to avoid warnings. During my testing, these did not break the application. We will be adding cascading delete functionality shortly to bring this to feature parity with the other adapters.

Please let us know in this thread if you run into issues, and we will get them resolved ASAP. We will include this adapter in the stable release soon but wanted to give you a chance to test and benchmark early to ensure that your performance needs are met.

Here are the installation instructions:

(note: AsyncStorage is still a required dep, as other parts of Amplify continue to rely on it)

yarn add aws-amplify@rn-sqlite amazon-cognito-identity-js@rn-sqlite aws-amplify-react-native@rn-sqlite react-native-sqlite-storage @react-native-async-storage/async-storage

# to ensure we're not left with any of the previous versions
rm -rf node_modules yarn.lock package-lock.json
yarn

npx pod-install

In order to restore the current/async-storage version:

yarn remove react-native-sqlite-storage
yarn add aws-amplify@latest amazon-cognito-identity-js@latest aws-amplify-react-native@latest

# to ensure we're not left with any of the previous versions
rm -rf node_modules yarn.lock package-lock.json
yarn

npx pod-install

Hi @iartemiev - the team has checked using await AsyncStorage.getItem(key).

The code used before looks like this codeBefore

This results that come are shown below resultBefore

After applying null validation we are getting these field in asset record. Here is the code after. codeAfter

And this is the data we are seeing resultAfter

So it is clear that the null fields are gone after updating the code in aws-amplify module - but the speed doesn’t change.

Quick update: I have been able to repro performance issues similar to those described above when using a model with 10 fields containing values + 100 fields with null values and > 30k records in the local store. Will be investigating causes and potential solutions.

Thank you, that is helpful. We’ll be doing additional library benchmarking today in order to pinpoint the issue and will provide an update with our findings.

New Video - here is exactly the same application - with a few hundred records loaded versus over 7,000 in the previous video.

Seems loading more than a trivial amount of data is grinding it to a halt. What you see below is my expectation for what we should be getting with a larger data set - even 100K records.

https://user-images.githubusercontent.com/6362888/120977328-8e482480-c7b6-11eb-9c10-61daf9a3076a.mp4