apollo-client: Umbrella issue: Cache invalidation & deletion
Is there a nice and easy way to set a condition to invalidate a cached query against? Think about time-to-live (TTL) or custom conditions.
For example (pseudo-code warning):
query(...).invalidateIf(fiveMinutesHavePassed())
or
query(...).invalidateIf(state.user.hasNewMessages)
`forceFetch` serves its purpose, but I think the cache invalidation condition should be able to live close to the query (cache) itself. That way I don't need to check manually whether a forceFetch is required when I rerender a container. The query already knows when it's outdated.
About this issue
- Original URL
- State: closed
- Created 8 years ago
- Reactions: 44
- Comments: 72 (20 by maintainers)
Popping in on this… I’d like to add that I think this is very badly needed. I’m actually surprised it’s not baked into the core of the framework. Without the ability to invalidate parts of the cache, it’s almost impossible to cache a paginated list.
Thinking about this some more: What I’d like to see is something very much like the API for updateQueries:
The `invalidateQueries` hook is called with exactly the same arguments as `updateQueries`; however, the return result is a boolean, where `true` means the query result should be removed from the cache and `false` means the cache should be unchanged.
This strategy is very flexible, in that it allows the mutation code to decide on a case-by-case basis which queries should be invalidated based on both the previous query result and the result of the mutation.
(Alternatively, you could overload the existing updateQueries to handle this by returning a special sentinel value instead of an updated query result.)
The effect of invalidating a query removes the query results from the cache but does not immediately cause a refetch. Instead, a refetch will occur the next time the query is run.
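For illustration, a hypothetical `invalidateQueries` option in that spirit might look like the sketch below; the option itself, the `DELETE_WIDGET` mutation, and the `WidgetList` query name are all made up, not part of Apollo Client.

```js
import gql from 'graphql-tag';

// `client` is an existing ApolloClient instance.
const DELETE_WIDGET = gql`
  mutation DeleteWidget($id: ID!) {
    deleteWidget(id: $id) {
      id
    }
  }
`;

client.mutate({
  mutation: DELETE_WIDGET,
  variables: { id: '42' },
  // Hypothetical option: called with the same arguments as updateQueries,
  // but returns a boolean. `true` drops the cached result of that query,
  // `false` leaves the cache unchanged.
  invalidateQueries: {
    WidgetList: (previousResult, { mutationResult }) =>
      previousResult.widgets.some(
        widget => widget.id === mutationResult.data.deleteWidget.id
      ),
  },
});
```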
I’ve read tons of workarounds, but that’s not solving the problem.
You have to cherry-pick a bunch of workarounds (or build them yourself) for basic usage of this library. It's as if it were built for the "to-do" demo app and nothing more complex than that.
We need an official implementation or at least documentation guiding users on how to deal with all of these edge cases (if you can call Cache#remove an edge case).
I’m more than willing to help on anything, but I’m afraid if I just fork and try to implement this, it will be forgotten like this issue or the 38 currently open PRs…
Right now I’m mentally cursing who chose this library for a production system 😕
@jvbianchi we are nearing the launch of a new store API and network stack which should allow for fine grained cache control!
I’d like to delete some specific nodes in my cache store for my application’s logout, for security, and also force them to refresh from the network the next time they are accessed via queries.
Would it make sense to have invalidation and deletion methods symmetrical to the other imperative store API methods?
For example:
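(Both method names and the query below are hypothetical; neither `deleteQuery` nor `invalidateQuery` exists in Apollo Client today.)

```js
import gql from 'graphql-tag';

// `client` is an existing ApolloClient instance.
const CURRENT_USER = gql`
  query CurrentUser {
    currentUser {
      id
      email
      authToken
    }
  }
`;

// Hard delete: remove the selected nodes from the cache entirely.
client.deleteQuery({ query: CURRENT_USER });

// Soft invalidation: keep the nodes, but mark them as stale.
client.invalidateQuery({ query: CURRENT_USER });
```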
Nodes selected by the first (delete) call above would be deleted from the cache, and would be refreshed from the network the next time they are accessed by a query with a cache-first, cache-and-network, or network-only fetch policy. A query with a cache-only policy would not find those nodes.
Nodes selected by the second (invalidate) call would be marked as stale in the cache, and would be refreshed from the network when accessed by a query with a cache-first, cache-and-network, or network-only fetch policy. A query with a cache-only policy would still see those stale nodes.
Is cache invalidation (evict) ready to use on 2.0? Any docs on that?
Can anyone explain how this issue will be affected by the 2.0? Will it be significantly easier with 2.0 to refetch/invalidate queries?
Thanks!
I too am pretty disappointed that we still must manually specify queries to update. As @pleunv said, it's true that
> this is a huge maintenance nightmare and having to keep track of this interconnected web only gets nastier as your application grows/changes.
I’m hopeful for an apollo future where we will not have to worry about this specific problem quite as much 🙏
Would very much appreciate a cache expire time option or a TTL feature.
@swernerx yes, it's still on the radar! We're trying to simplify the core and the API before we add more features though, so this won't be in 1.0, but it could very well be in 1.1!
There is a need to automate garbage collection inside the cache. The cache presents a very limited API to the world, mostly allowing reads/writes through queries and fragments. Let's look at the queries for a moment.
A query has a concept of being active. Once a query is activated and its results are fetched, it normalises the response into the cache. It cannot delete it, because other queries might end up using the same results. What a query can't do is reference other queries, so there is no way to make cycles. This makes a reference-counter-based GC viable.
Let's suppose that the underlying cache object holds a reference counter. Whenever a result is written to or materialised from the cache, the query can collect all referenced objects, hold them in a private `Set`, and increase their reference counts. Every materialisation would fully clear and repopulate that `Set`, adjusting refcounts accordingly. To prune a specific query's data and enable potential garbage collection, you adjust the refcount of all associated objects and clear that `Set`. Once in a while the cache could simply filter out all keys that have a refcount of 0. That eviction could easily be triggered by a few different strategies.
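A minimal sketch of that bookkeeping, written outside of Apollo purely to illustrate the idea (class and method names are made up):

```js
// Reference-counting sketch: each active query tracks the cache keys its
// result references; keys that no query references become collectable.
class RefCountedCache {
  constructor() {
    this.data = new Map();      // cache key -> normalized object
    this.refCounts = new Map(); // cache key -> number of referencing queries
  }

  retain(key) {
    this.refCounts.set(key, (this.refCounts.get(key) || 0) + 1);
  }

  release(key) {
    this.refCounts.set(key, (this.refCounts.get(key) || 1) - 1);
  }

  // Called whenever a query's result is (re)materialised: swap the query's
  // private Set of referenced keys and adjust refcounts accordingly.
  updateQueryRefs(previousKeys, nextKeys) {
    previousKeys.forEach(key => this.release(key));
    nextKeys.forEach(key => this.retain(key));
  }

  // Eviction pass: drop every key that no active query references anymore.
  gc() {
    for (const [key, count] of this.refCounts) {
      if (count <= 0) {
        this.data.delete(key);
        this.refCounts.delete(key);
      }
    }
  }
}
```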
`readFragment` would have to be further constrained: it may fail if no query holds on to the requested object, simply because the data might already have been evicted from the cache. The remaining issue is with `writeFragment` when there is no backing query to hold on to its value, as it gives no guarantee that the data will actually be persisted for any length of time. I'm not sure there is any use-case other than activating a query immediately after some fragments were written, and we can easily make that scenario work.
@TSMMark exactly
We have list queries which can be paginated, sorted and filtered by more than one param. In order to do optimistic updates on a list item, we need to keep track of the set of variables corresponding to every single list where this item appears - that is, all combinations of sorting, pagination and filtering params (!).
What are we doing wrong?
+1
I'm feeling the need for a field-based invalidation strategy. As all resources are available in a normalized form under `state.apollo.data` (based on dataId), and going further with the proposal from @viridia, I believe we could reach a field-based invalidation system much like the following. Given a schema, a data id resolver, Person resources with ids 1, 2 and 3 sitting in the Apollo store, and a query somewhere in the system, I could then perform a mutation to change the relative of person 1 to be person 3 and force invalidation - all sketched below:
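(The following reconstruction is illustrative; in particular the `invalidate` mutate option is the proposal, not an existing Apollo Client API.)

```js
import gql from 'graphql-tag';

// Given a schema along these lines (server-side):
const typeDefs = gql`
  type Person {
    id: ID!
    name: String
    relative: Person
  }

  type Query {
    person(id: ID!): Person
  }

  type Mutation {
    changeRelative(id: ID!, relativeId: ID!): Person
  }
`;

// With a data id resolver such as:
const dataIdFromObject = object => `${object.__typename}:${object.id}`;

// Person resources with ids 1, 2 and 3 end up in the normalized store roughly as:
// 'Person:1': { id: '1', relative: { type: 'id', id: 'Person:2' } },
// 'Person:2': { id: '2' },
// 'Person:3': { id: '3' },

// And having a query somewhere in the system such as:
const PERSON_QUERY = gql`
  query Person($id: ID!) {
    person(id: $id) {
      id
      name
      relative {
        id
        name
      }
    }
  }
`;

// A mutation changing the relative of person 1 to person 3 could then force
// field-based invalidation of Person:1's `relative` field. `client` is an
// existing ApolloClient instance; `invalidate` is the proposed option.
client.mutate({
  mutation: gql`
    mutation ChangeRelative($id: ID!, $relativeId: ID!) {
      changeRelative(id: $id, relativeId: $relativeId) {
        id
      }
    }
  `,
  variables: { id: '1', relativeId: '3' },
  invalidate: { 'Person:1': ['relative'] }, // proposed, not an existing option
});
```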
Edit: I do understand that `updateQueries` is a valid method for this use case, but `updateQueries` only fits mutations whose result brings all the data you need to update the queries, which is almost never true.
I think that we "just" need a way to update the cache by `__typename:id` instead of by a specific query + variables 😉 So even if you have an infinite number of variable combinations (for example a `filter` param), it's no longer a problem. Something like this:
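(The shape below mirrors `writeFragment` with an explicit `id`, reusing the Person example; it's only meant to illustrate addressing the cache by `__typename:id`.)

```js
import gql from 'graphql-tag';

// `client` is an existing ApolloClient instance.
client.writeFragment({
  // Address the object by its normalized cache key (__typename:id) instead
  // of going through a specific query + variables.
  id: 'Person:1',
  fragment: gql`
    fragment UpdatedRelative on Person {
      relative {
        id
      }
    }
  `,
  data: {
    __typename: 'Person',
    relative: { __typename: 'Person', id: '3' },
  },
});
```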
Note that with this solution, you can also update every cached result (and this is exactly what I wanted!)
@helfer Would adding a timestamp to each cache result break something in the library? If a client were able to grab the timestamp of a query result, it could use the fetch policy to decide on an application basis whether data is deemed stale. For example, if an app regards data older than n seconds as out of date, you could fetch with `cache-only`, check the timestamp, and then issue a fetch with `network-only`. That way the semantics of cache invalidation can be pushed up to the app.
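One way to apply that idea at the application level, assuming the app keeps its own per-query timestamps (Apollo does not expose one); `TODOS_QUERY` and the five-minute TTL are stand-ins.

```js
import gql from 'graphql-tag';

const TODOS_QUERY = gql`
  query Todos {
    todos {
      id
      text
      done
    }
  }
`;

const MAX_AGE_MS = 5 * 60 * 1000; // treat data older than 5 minutes as stale
const lastFetched = {};           // per-query timestamps maintained by the app

async function fetchTodos(client) {
  const stale =
    !lastFetched.todos || Date.now() - lastFetched.todos > MAX_AGE_MS;

  const result = await client.query({
    query: TODOS_QUERY,
    // Serve from the cache while fresh; go to the network once the TTL expires.
    fetchPolicy: stale ? 'network-only' : 'cache-first',
  });

  if (stale) lastFetched.todos = Date.now();
  return result;
}
```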
I have a very simple scenario:
I've read several issues around here but still can't find a good way of doing this, except manually deleting the data from the store in the `update` method for a mutation. Since the cache is stored per set of variables, I don't know which list the data should be added to, so it's just better to invalidate all loaded lists for that field. However, there doesn't seem to be an API for this. I can use `writeQuery` to ADD data, but how do I remove a whole field from the cache? This issue is from 2016 and we still don't have an API to remove things from the cache… what can I do to change that?
Ok, I've worked on this issue and ended up with a project to solve field-based cache invalidation: apollo-cache-invalidation. This is how you would use it, following the example from my previous comment:
As you can see, the `invalidateFields` method is just a higher-order function that creates a valid `update` option for the `client.mutate` method. It receives a function, which will be called with the same arguments `update` receives. Instead of the structured object I proposed in my previous comment, it must return field paths whose keys can be dynamic - each key in a path can be a string, a regex, or a function. Further documentation can be found on the project's page.
Keep in mind this can be used for dynamic cache-key invalidation at any level, so to invalidate the `relative` field on every person, one could simply use an invalidating path whose first key is a regex matching every `Person` entry instead of a single id. If you find this useful, please provide some feedback.
I’ve spent a whole day trying to figure out how to delete something from my cache/store. Is there a solution for this? I have finished 90% of my app with Apollo and this hit me right in the face. There really is no way to delete something?
Wouldn't changing this to delete the object, instead of having no effect, be sufficient here? For my case at least, it would be a very elegant, easy and intuitive solution.
I think the best way to solve the delete & cache issue is to add a GraphQL directive:
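(The directive name and the `deleteAsset` mutation below are assumptions.)

```js
import gql from 'graphql-tag';

// Hypothetical cache-control directive; @deleteFromCache does not exist.
const DELETE_ASSET = gql`
  mutation DeleteAsset($id: ID!) {
    deleteAsset(id: $id) @deleteFromCache {
      id
    }
  }
`;
```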
Execution of this operation should delete `Asset:{id}` from the cache.
Also wondering if there has been any change on this subject after 2.0. Starting to implement some optimistic updates here and there and currently not sure how to deal with data removal. I would prefer my mutations not to have knowledge of which queries need to be updated (sometimes multiple are involved, and they can be rather complex; plus it causes issues when new queries get added and you forget to reference them in the necessary mutations), so I would like to avoid the `withQuery` route where possible and instead work directly on fragments. Is there any possibility (or plans) to directly remove a specific fragment from the cache? I don't think this one should be closed yet 🙂
Let me try making a proposal for a solution - if the maintainers are OK with it, I or someone else could work on a PR to implement it in `apollo-cache-inmemory`. Originally, I wanted to start with the existing evict function, but I don't think it'll work without breaking changes, so I may as well call it something different.
Let's call it `deleteQuery` and `deleteFragment`, to mirror the existing `readQuery`/`writeQuery`/`readFragment`/`writeFragment` functions. I'll just start with `deleteQuery` and assume `deleteFragment` works mostly the same way. You could use it like this, after adding a `widget`, for example:
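(A sketch of roughly what that call might look like; `deleteQuery`, the `gizmo`/`widgets` schema, and the predicate-style variables are all part of the proposal, not an existing API.)

```js
import gql from 'graphql-tag';

// `client` is an existing ApolloClient instance. No subfields are selected
// under `widgets`, so the whole cached Gizmo1.widgets entry would be removed,
// and variables may be predicates instead of literals.
client.deleteQuery({
  query: gql`
    query DeleteWidgets($gizmo: ID!, $page: Int, $search: String) {
      gizmo(id: $gizmo) {
        widgets(page: $page, search: $search)
      }
    }
  `,
  variables: {
    gizmo: '15',                        // literal: matches only gizmo "15"
    page: () => true,                   // function: matches any page
    search: value => value === 'hello', // function: matches one search term
  },
});
```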
A couple of important notes here:
- `widgets` returns a `WidgetList`, but we haven't provided any subfields. This tells Apollo to wipe out the entire entry of `Gizmo1.widgets` rather than just a specific subfield.
- `variables` values can now be functions of type `(input: any) => boolean`. (This only works for `deleteQuery`/`deleteFragment`, of course.) The best way to walk through how this works is an example: if Apollo goes into the cache and sees a value cached as `Gizmo1.widgets({"page":0,"search":"hello"})`, it will call the functions with `page(0)` and `search("hello")`. `variables` can also be provided as normal literals; `gizmo: "15"` is equivalent to `gizmo: value => value === "15"`. If all variables match the field in the cache, the field is matched and removed.
- After items have been removed from the cache in this way, any currently active queries that are displaying this data will automatically and immediately refetch.
The part of this I'm least certain about is the ability for a query to be "incomplete" and not specify subfields - some feature needs to exist so that you can clear an entire array instead of, say, just the `id` and `name` fields of every entry in the array, but this particular solution would break a lot of tooling that tries to parse GraphQL queries against the schema.
I'm using this workaround, but it is pretty bad: I need to refetch all my queries again because my client store is empty, but at least this way my app is always consistent. We need a real solution ASAP.
The function below invalidates the cache for a given query by deleting all instances from the store. For example, if there were a query field called `widgetById` that accepted an integer id parameter, then the following command could be executed to clear the cache of all related queries:
`this.deleteStoreQuery('widgetById');`
I’d really like to see this functionality built in as well.
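A minimal sketch of such a helper, using InMemoryCache's public `extract()` and `restore()` methods (the original implementation may differ):

```js
// Remove every cached instance of a root-level field, regardless of the
// variables it was called with, by pruning a snapshot of the normalized cache.
function deleteStoreQuery(cache, fieldName) {
  const data = cache.extract(); // snapshot of the normalized cache contents
  Object.keys(data.ROOT_QUERY || {})
    .filter(key => key === fieldName || key.startsWith(`${fieldName}(`))
    .forEach(key => {
      delete data.ROOT_QUERY[key];
    });
  cache.restore(data); // write the pruned snapshot back
}

// e.g. deleteStoreQuery(client.cache, 'widgetById');
```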
@dallonf you could experiment with the more sophisticated mutation option `update`. It gives you direct access to the cache, so you can accomplish pretty much everything you would need.
Sorry to promote my project here again, but this is exactly the kind of situation I built it for: apollo-cache-invalidate. Basically, following the schema you presented, you could invalidate all your paginated results (because they truly are invalid now) at once.
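Something like the following, using a regex path so that every cached `widgets(...)` entry is dropped regardless of its variables (details may differ from the library's docs):

```js
import { invalidateFields } from 'apollo-cache-invalidation';

// Invalidate ROOT_QUERY.widgets({...}) for every combination of variables.
const update = invalidateFields(() => [['ROOT_QUERY', /^widgets/]]);

// Then pass it as the mutation's update option:
// client.mutate({ mutation: ADD_WIDGET, update });
```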
But - and here goes a big but - this currently invalidates the cache only for non-instantiated queries, meaning it only works if the `widgets` query is not currently active on the page. I have a pull-request working on this issue. Hope it helps.
I like the idea of differentiating between `invalidate` and `delete`.
After using https://github.com/lucasconstantino/apollo-cache-invalidation, though, I'm not convinced that an API perfectly parallel to `writeQuery`/`writeFragment` is sufficient, since it only targets fields with one particular set of arguments… here's an example of why that's important.
Let's say I have a `widgets(page: Int = 0): [Widget]` field in my root `Query` type. When I query this, I'll get `ROOT_QUERY.widgets({"page": 0})` added to the cache, as well as entries for `"page": 1`, page 2, and so on.
Now let's say a mutation comes along and adds or deletes a Widget somewhere in the middle of that list. There's no sane way to simulate that client-side, so I need to invalidate the entire `widgets` field so it can be re-fetched. With an `invalidateQuery` API, the best I could do is invalidate one page of it, which would leave the cache in an inconsistent state.
I'm not sure, though, that regexes (as used in apollo-cache-invalidation) are the right approach either. Ideally I'd be able to pass a function that takes in a field's args and returns whether it should be removed from the cache? I have no idea what that might look like, though.
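Purely as a strawman, it might look something like this; `invalidateField` does not exist, it is only the shape of the idea.

```js
// `cache` is the InMemoryCache instance. Invalidate a field whenever its
// cached arguments satisfy a predicate, regardless of which variables were used.
cache.invalidateField({
  id: 'ROOT_QUERY',
  fieldName: 'widgets',
  // Called once per cached argument set, e.g. { page: 0 }, { page: 1 }, ...
  match: args => typeof args.page === 'number',
});
```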
I’m not sure how much this really adds to the conversation, but I spent a whole lot of time typing this out in a dupe issue, so I may as well add it here. 😃 Here is a use case my team commonly runs into that is not well covered by the existing cache control methods and would greatly benefit from field-based cache invalidation:
Let’s say I have a paginated list field in my schema:
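(Type and argument names below are illustrative.)

```js
import gql from 'graphql-tag';

const typeDefs = gql`
  type Widget {
    id: ID!
    name: String!
  }

  type Query {
    # Server-side sorting is complex and cannot be replicated on the client.
    widgets(limit: Int!, offset: Int!): [Widget!]!
  }
`;
```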
There is a table in the app powered by this `Query.widgets` field. The user can customize the page size (aka `$limit`) as well as paginate through the list (`$offset`), so there is an essentially unbounded number of possible permutations for this query. Let's also say, for the sake of argument, that the sorting logic of this field is complex and cannot be simulated client-side (but even in simple sorting cases, I'm not convinced it's reasonable to do this client-side; pagination really is a difficult caching problem). So let's throw a simple mutation into this mix…
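Say, something like this (names are illustrative):

```js
import gql from 'graphql-tag';

const ADD_WIDGET = gql`
  mutation AddWidget($name: String!) {
    addWidget(name: $name) {
      id
      name
    }
  }
`;
```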
When I fire this mutation, there is really no telling where the new widget will be inserted into the list, given the aforementioned complex sorting logic. The only logical update I can make to the state of the store is either to flag the entire `widgets` field as invalid and needing a refetch, or to invalidate every instance of the query, regardless of its variables. Unless I'm missing something, there doesn't seem to be any way to handle this case in Apollo Client. `refetchQueries`, as well as the new imperative store manipulation functions, requires you to specify variables, and `updateQueries` of course only works on active queries - and even in react-apollo, where all queries are kept active, only one instance of the query (that is, with one set of variables) will be active at a time.
@helfer I've renamed the project to apollo-cache-invalidation. I'll look at your other considerations in the morning 😉
Thanks for the productive feedback!
@lucasconstantino An additional caveat of `updateQueries` that's worth ~~harping on~~ pointing out is that it only works with queries that are currently active; it won't help you if the affected data will be fetched by a future query.
This really needs to get resolved. It's not even an edge case or a rare usage scenario. Every single CRUD app will need to deal with this issue. Why is it so hard to write a `deleteFragment` function and broadcast updates to all queries subscribed to it? I have a chat app where a user wants to delete a single message. Sounds like a really common scenario. I don't want to refetch all messages nor update the query. I just want to find the message fragment by id and delete it.
Here's my workaround; I would love something like this built in, of course done better so I don't need to set manual IDs: https://gist.github.com/riccoski/224bfaa911fb8854bb19e0a609748e34
The function stores reference IDs in the cache along with a timestamp, then checks against it to determine the fetch policy.
I've talked with @stubailo about being able to write `undefined` to invalidate queries/data - he says it should work but it doesn't, so this might be something that can be added as a good interim solution.
@yopcop Sure, my solution only works to update and delete a part of a query, but it's better than the current solution (and I'm also aware that this is a hard problem and no easy solution exists). Sometimes it is definitely easier to invalidate previous queries.
For this kind of complex query, I think a powerful solution could be the ability to query the queries themselves.
Example to illustrate:
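(`client.findQueries` below does not exist; it's only meant to illustrate "querying the queries".)

```js
// `client` is an existing ApolloClient instance. Hypothetically: look up every
// cached query for a given root field, whatever its variables, then rewrite
// each cached result with the classic client.writeQuery().
client
  .findQueries({ fieldName: 'widgets' }) // hypothetical lookup
  .forEach(({ query, variables, data }) => {
    client.writeQuery({
      query,
      variables,
      data: {
        ...data,
        // e.g. drop a deleted widget from every cached list
        widgets: data.widgets.filter(widget => widget.id !== '42'),
      },
    });
  });
```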
So basically it works like a `filter` that returns an array of queries, so you can `.map` over them and use the classic `client.writeQuery()`. (That said, I've never put my hands on the Apollo code base, so I really don't know whether it's possible; it's just to share ideas 😉)
@fabien0102 With this solution, I don't think you can add a result to a query, though. If I made a query like `posts(fromDate: "2018-01-01", toDate: "2018-03-31")` and I create a new post with `date: "2018-03-20"`, I would like to invalidate the query. I can add it manually to the query results, but if the filters get complicated, it can be a lot of work. Invalidating the query would be much easier, if I don't mind the extra requests made to refresh it.
@anton-kabysh Interesting. I think this might create confusion about what exactly is being removed, though: the cached information, or the entity itself.
@dr-nafanya I think there is a method available for this: `client.cache.reset()`
@Draiken the function in my comment above deletes the whole field from the cache regardless of variables. I agree it’s frustrating that there still isn’t a proper solution.
I have the same problem. Any date for the launch of the new store API?
@lucasconstantino Aha, thanks! That does solve my use case for now. (I had tried your module, but didn’t realize the regex was necessary to capture field arguments).
@nosovsh apollo-cache-invalidate will basically purge current data from the cache. In the current state of things, it will work as expected for new queries the user eventually performs on that removed data, but if there are any active observable queries (queries currently watching for updates) related to the removed data, those queries will not refetch and will keep serving the same old data they had prior to the cache clearing. To solve this problem, I'm working on a pull-request to apollo-client to let the user decide when new data should be refetched in case something goes stale: https://github.com/apollographql/apollo-client/pull/1461. It is still a work in progress, and I'm not sure how long it will take for something similar to land in core.
@helfer I’ve looked into your second observation, and I think I found a dead end here.
Studying the QueryManager class, and specifically the `queryListenerForObserver` method, I've realized the way I'm doing the cache cleaning doesn't really work for current observers. This is quite odd, for I thought I had it working on a project using react-apollo. I'll look into that later, though. About the stale data being returned, I don't really understand why a refetch isn't triggered when stale data is found. In which scenarios could some data be missing after having been fulfilled before, and why would the user want that old data rather than fresh data in that case? I'm talking about lines 412-418 of the QueryManager, for context. Testing it locally, I was able to fire a refetch from that exact spot using `storedQuery.observableQuery.refetch()`, which did solve the problem for the apollo-cache-invalidation approach.
The big problem here, I think, is that field-based invalidation isn't really compatible with the approaches Apollo Client currently has for cache clearing or altering. Both `refetchQueries` and `updateQueries` rely on the code performing the mutation knowing exactly which queries (and even which variables) are to be updated, meaning that code needs to know quite a lot in order to clear the cache. The idea behind a field-based invalidation system is to make the mutation code aware of the structure of the store, but not of the other code performing queries on that same store. I would like to make all ObservableQueries understand that some of their data is now invalid, but I don't see how when all I have is a path in the store - nothing really related to the queries used to build the observables. Basically, I'm fighting the current design when trying to move ahead with this approach.
Well, as far as I could tell, apollo-cache-invalidation cannot on its own fix the stale-data problem, meaning I would have to pull-request Apollo Client on at least some minor routines. What do you think? Should I proceed, or am I missing something here?
By the way: I guess being informed that `refetchQueries` now does trigger refetches on non-active queries (supposedly) makes the apollo-cache-invalidation project a bit more specific to some use cases.