streamlit: In-memory caching of instances of user-defined classes does not preserve class identity
Summary
When streamlit reruns a file that contains a class definition, the class object is re-created in memory. A cached instance of this class, however, remains an instance of the old class object. A newly created instance therefore belongs to a different class than a cached one, which can lead to hard-to-debug errors.
Steps to reproduce
- Run this code with `streamlit run`:

```python
from enum import Enum

import streamlit as st


class A(Enum):
    Var1 = 0


@st.cache
def get_enum_dict():
    return {A.Var1: "Hi"}


look_up_key = A.Var1
cached_value = get_enum_dict()

st.write("class id of look_up_key: {}".format(id(look_up_key.__class__)))
st.write("class id of cached key: {}".format(id(list(cached_value.keys())[0].__class__)))
st.write(cached_value[look_up_key])
```
- Rerun by pressing ‘r’
Expected behavior:
Rerunning should print the same id for the class of look_up_key and for the key in cached_value, and the code should still print “Hi” at the end.
Actual behavior:
On the initial run the code prints the same id twice and the look-up in the dictionary succeeds.
But on rerun the class ids differ and a `KeyError: <A.Var1: 0>` is raised.
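The underlying failure can be reproduced without Streamlit at all. The sketch below simulates a rerun by `exec`-ing the same class definition twice (names like `first_run`/`rerun` are illustrative, not from Streamlit internals): each execution produces a brand-new class object, and `Enum` members compare by identity, so a key from the "rerun" no longer matches a cached key from the first run.

```python
# Simulate Streamlit re-executing the script source on a rerun.
source = """
from enum import Enum

class A(Enum):
    Var1 = 0
"""

first_run, rerun = {}, {}
exec(source, first_run)   # first script run
exec(source, rerun)       # simulated rerun: A is redefined

cached = {first_run["A"].Var1: "Hi"}  # value kept alive by the cache
key = rerun["A"].Var1                 # look-up key built on the rerun

# first_run["A"] and rerun["A"] are two distinct class objects, so the
# identity-based equality of Enum members makes the dict look-up fail.
```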
Is this a regression?
no
Debug info
- Streamlit version: 0.71.0
- Python version: 3.8.3
- Using Conda
- OS version: Mac OS 10.15.7
- Browser version: Firefox 82.0.3 (64-Bit)
Additional information
This bug is not unique to Enums; it happens with any user-defined class that gets re-evaluated. I had the same problem with other classes, but this example is the easiest to reduce.
Ideas on how to fix it
Pickling and unpickling the cached object causes its class reference to be updated to the new definition.
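A minimal sketch of why the pickle round-trip helps, assuming the class lives at module level (pickle stores instances by module and qualified name and resolves the class again at load time):

```python
import pickle


class A:
    pass


old_instance = A()
payload = pickle.dumps(old_instance)


class A:  # simulate a script rerun redefining the class
    pass


fresh_instance = pickle.loads(payload)

# pickle resolves the class by name at load time, so the unpickled
# object is bound to the *current* A, while the original instance
# still points at the stale class object.
```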
A very helpful short-term band-aid would be a separate `st.cache` option that forces pickling and unpickling for the in-memory cache as well. That way the user could selectively circumvent the bug for the problematic types.
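Such an option could behave roughly like the hypothetical decorator below (a sketch, not Streamlit's API): the cache stores pickled bytes and unpickles on every access, so every returned object is re-bound to the current class definitions.

```python
import functools
import pickle


def cache_with_pickle_roundtrip(func):
    """Hypothetical sketch of the proposed option: keep the in-memory
    cache as pickled bytes and unpickle on retrieval, so cached objects
    always re-bind to the class definitions of the current run."""
    store = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in store:
            store[args] = pickle.dumps(func(*args))
        # a fresh, class-repointed copy on every access
        return pickle.loads(store[args])

    return wrapper
```

With something like this, the `get_enum_dict` cache above would return keys bound to the freshly defined `A` after every rerun, at the cost of a copy per access.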
Long term I have two ideas, though I do not know how feasible they are. Walk the object hierarchy of every cached value and
- apply in-memory pickling only selectively, to classes whose definitions live in files that might be rerun during a session, or
- “hot-patch” the `__class__` field upon retrieval from the cache. I do not know whether that is reliable in Python or whether there are unintended side effects to it.
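The second idea might look something like the hypothetical helper below (`repoint_class` is an illustrative name, not a Streamlit API): look up the class of the same name in the re-executed namespace and reassign `__class__`. Note that assigning `__class__` only works for layout-compatible classes and is known to fail for some types, so this is a sketch of the concept, not a robust fix.

```python
import sys


def repoint_class(obj, namespace=None):
    """Hypothetical sketch of the "hot-patch" idea: re-bind
    obj.__class__ to the class of the same name from the (possibly
    re-executed) defining module or an explicit namespace."""
    cls = obj.__class__
    if namespace is None:
        module = sys.modules.get(cls.__module__)
        namespace = vars(module) if module is not None else {}
    current = namespace.get(cls.__name__, cls)
    if current is not cls and isinstance(current, type):
        # May raise TypeError for incompatible layouts (e.g. Enum
        # members); a real implementation would need to handle that.
        obj.__class__ = current
    return obj
```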
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 11
- Comments: 19 (6 by maintainers)
Commits related to this issue
- Add Enum coercion to options elements, if input Enum classes "identical" but redefined on script run (#7408) * Enum classes will be special-cased in all streamlit elements that accept an `option` seq... — committed to streamlit/streamlit by Asaurus1 8 months ago
Thank you so much for posting this. This was a very aggravating bug to track down. The stack trace would show that enums which are supposed to be identical were not. I was so confused and frustrated. This bug made it difficult for me to use `streamlit` with a mature code base that relied on `enum` hashing for various data operations.

Just want to add another voice to this. I’ve been bit by this as well, wanting to do branching based on `isinstance`. I also want to be able to use `Enum`s in my code, but have had to give up on that. I want to be able to write library code that is agnostic to the UI I put on top of it. This is the number one issue that stops me from doing that with Streamlit.
This issue is underrated.
Hey, @jrieke!
The fix provided is for hashing; this issue refers to the value fetched from the caching mechanisms. The PR you are referencing wouldn’t fix the problem referenced here!
Can you please re-open this?
There are plenty of examples in this thread of replicating the issue if you need further confirmation that this isn’t fixed.
Hi all,
I created a `streamlit` solution for this enum problem: https://github.com/streamlit/streamlit/compare/develop...FloWide:streamlit:enum_support

The concept:
- The `__import__` function in the `__builtins__` functions is patched.
- When `__import__` gets `enum` as an importable target, it returns a specific enum module.
- That module behaves like `enum` and inherits from the original classes, but the classes’ metaclass is extended.
- An `Enum`, cached behind an `experimental_singleton` call, returns the same class through all sessions and runs.

@LukasMasuch @kmcgrady Can you take another look at this longstanding bug? This issue has been open for over two years, which is the second-longest of all 40 open P2 bugs. Resolving this would fix a major weakness in the app’s performance and usability.
Why are `enum`s, a core Python feature, not supported? 😭 This continues to cause problems a year later…