streamlit: Images sometimes do not appear

Summary

The site randomly shows “0” for images inserted via st.image() and MediaFileManager logs “Missing file”.

More information

I have a couple of jpg images embedded with st.image(), and at random they fail to render; a 0 is shown in their place. Reloading the page or rerunning the code fixes the problem. When I reload the page often enough, one or several of the images break again and just show a 0.

In the terminal with --log_level error I receive

MediaFileManager: Missing file d8a7ff62725a8ab1609c9335ba2e85375f491027d91b3badb27a6ccd

In my complex multi-page streamlit app this happens very often: if I reload the page two or three times, one out of five images is likely broken. The simpler toy example below takes much longer to show the undesired behaviour, but it does so fairly consistently.

Steps to reproduce

Run this code (ideally with --log_level error):

import streamlit as st
if st.checkbox('checkbox'):
    st.image("foo.jpg")

Toggle that checkbox repeatedly (this method is quicker than reloading the page) and look at the console output. Maybe rerun the code once in a while. (Of course you need to put any jpg image named “foo.jpg” in the same folder).

Actual behaviour

Sooner or later (5-50 clicks) the image will not be shown. Instead, in its place a 0 appears. A “MediaFileManager: Missing file” error is shown in the terminal.

Expected behaviour

Images should always be shown.

Debug info

  • Streamlit version: 0.57.1
  • Python version: 3.6.9

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 12
  • Comments: 64 (7 by maintainers)

Most upvoted comments

Below is my investigation on why I believe this happens and in my next reply, the easy way around it.

With my Streamlit app hosted on Google Cloud Platform using Cloud Run (with a Load Balancer in front of Cloud Run), I could NOT get around this issue while using st.image(); images still show the “0”. This holds even with “Session Affinity” (a Cloud Run setting) enabled, which, by the way, does not guarantee that a user ends up on the exact instance they first landed on when they opened your app.

So I looked into this further and the core issue lies with st.image() and how it functions internally. Whether you try using st.image() to load an image locally within your code or you download an image from outside your code, it will randomly have a “0” appear. Definitely will occur on Cloud Run.

The problem is that when your user first opens the app on Cloud Run, the app launches in a container on, let's say, “Instance A”. The image may then appear with no problem at all. However, if the user refreshes their browser, Cloud Run may internally route them to “Instance B”, and when that happens the “0” shows up in place of the image.

Behind the scenes, several things happen when you invoke st.image(). As you can see in the Streamlit source code, in streamlit/runtime/media_file_storage.py under the class MediaFileStorage, images loaded by st.image() get a unique ID attached to the media file.

So let's go through this.

  • “Instance A” on Cloud Run successfully loads dog.jpg and Streamlit gives that image a unique ID of 123abc.

  • User refreshes his browser and user gets routed to “Instance B” by Cloud Run.

  • “Instance B” attempts to load dog.jpg by referencing 123abc but that ID doesn’t exist in “Instance B”, and there you get a “0”.
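
The sequence above can be mimicked with a toy model. This is only a sketch, not Streamlit's actual code: `Instance`, `run_script`, and `serve_media` are hypothetical names, and deriving the ID from a SHA-224 hash of the image bytes is an assumption (the IDs in the logs are 56 hex characters long, which is consistent with it).

```python
import hashlib


class Instance:
    """Toy stand-in for one container; media lives only in its own memory."""

    def __init__(self, name):
        self.name = name
        self.media = {}  # media ID -> image bytes

    def run_script(self, image_bytes):
        # Hypothetical: register the image and return the ID the browser will request.
        # Assumption: the ID is a SHA-224 hex digest of the bytes.
        media_id = hashlib.sha224(image_bytes).hexdigest()
        self.media[media_id] = image_bytes
        return media_id

    def serve_media(self, media_id):
        # Mimics the HTTP fetch for the media file: 200 if known, 404 otherwise.
        return 200 if media_id in self.media else 404


instance_a = Instance("A")
instance_b = Instance("B")

# The script runs on Instance A, which hands the browser an ID for dog.jpg.
dog_id = instance_a.run_script(b"fake dog.jpg bytes")

status_same = instance_a.serve_media(dog_id)   # request lands on A again
status_other = instance_b.serve_media(dog_id)  # request is routed to B
print(status_same, status_other)
```

In this model the second request fails simply because Instance B never registered the ID, which matches the “Missing file” log line and the 404s reported elsewhere in this thread.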

“How and why are the container instances connected on Cloud Run?”

They are not, whatsoever. But your browser doesn’t seem to know that. It still thinks you are connected to “Instance A” when you are actually on “Instance B”.

This may have something to do with how WebSockets work. Streamlit uses WebSockets, a communication protocol that is essentially a direct connection between your browser and the Streamlit app. The connection with “Instance A” may be maintained even when you land on “Instance B” (which perhaps creates a separate WebSocket connection as well?). I think this may be where things get mixed up.

To summarize: the browser requests dog.jpg from “Instance B” using the ID 123abc, but the code running there has no idea that 123abc refers to dog.jpg. Only “Instance A” knows, because that is where the ID was created in the first place.

This further seems to be the case because if I keep refreshing my browser the image may actually appear again, like magic. But it’s not magic, it’s because I ended up in “Instance A” once again.

Any thoughts on this @kmcgrady ?

We’ve QAed a lot of scenarios and feel comfortable that the change is an improvement. The change has been merged, and it will be in the next release.

Same issue on my side with version 0.61. It happens when displaying two pie charts created by the same function called twice. The first st.pyplot is not displayed.

Edit: adding a sleep(0.2) after the pyplot() call seems to work as a workaround.

The solution is quite simple and fortunately Streamlit already has a built in way to get around the “0” image issue.

The method below was the ONLY way I could get my images to load consistently, 100% of the time, on Cloud Run (while using a Load Balancer as well). The alternative fix would be modifying the internal functionality of st.image().

Directions: https://docs.streamlit.io/library/advanced-features/static-file-serving

  1. Edit your .streamlit/config.toml and put in [server] enableStaticServing = true
  2. Create a directory called static in your root project directory. If main.py is your Streamlit program make sure the directory static is within the same directory as main.py.
  3. Place an image into static, dog.jpg for example
  4. In main.py you would load dog.jpg by putting in your code: st.markdown('<img src="app/static/dog.jpg" style="width:100%">', unsafe_allow_html=True)
  5. Done! Rinse and repeat for every other image you have afterwards.
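
To avoid retyping the tag from step 4 for every image, it can be built by a small helper. This is a sketch only: `static_img_tag` is a hypothetical name, and the `app/static/` prefix is taken from step 4 above.

```python
import html
from urllib.parse import quote


def static_img_tag(filename, width="100%"):
    """Build an <img> tag for a file placed in the app's static directory."""
    # URL-quote the file name so spaces and special characters stay valid in src.
    src = "app/static/" + quote(filename)
    return f'<img src="{src}" style="width:{html.escape(width, quote=True)}">'


tag = static_img_tag("dog.jpg")
print(tag)

# In main.py (needs Streamlit and enableStaticServing = true in config.toml):
#   import streamlit as st
#   st.markdown(static_img_tag("dog.jpg"), unsafe_allow_html=True)
```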

The downside is that you must store all your static content (images) alongside your code. But if you don’t have many images and they aren’t large, it won’t be a big deal.

Hopefully we get a better way to resolve this in the future. 😃

Passing the path directly produces an error and shows a 0 icon on the page:

st.image("C:\\Users\\91865\\Desktop\\Streamlit Demo\\data\\sal.jpg")

Opening the file with PIL first and passing the image object works totally fine:

from PIL import Image
image = Image.open('C:\\Users\\91865\\Desktop\\Streamlit Demo\\data\\sal.jpg')
st.image(image)

I don’t see why this happens.

Having the same “0” issue on the latest version of Streamlit (1.20.0). This fixed it, thanks!

I was having the same issue in 1.23.0 and opening the image this way solved my issue!

FYI in case other folks come across this, we were able to fix our images loading issue with this: https://cloud.google.com/run/docs/configuring/session-affinity

@JohnMachado11 Thank you. You’re a legend.

Issue is here still…

I see many more cases of this issue reported in this thread. To be clear, here’s what goes on under the hood:

  • images are loaded and stored in memory per script run.
  • images are only removed when the latest script run does not use images from the previous script run.

With that in mind, this leads to some important outcomes.

  • If you run your Streamlit app as multiple replicas behind a load balancer, you will see issues. The image is stored in the memory of the replica where the script ran and to which the session is connected. If the request for the image is received by a different replica, that replica will not be able to serve it. The best solution is to enable some sort of sticky routing. We have some thoughts on this, but a proper solution will take longer.
  • If you assume that images remain available after the script run that generated them, this may present a problem. I imagine this is a less common use case.
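
The per-run lifetime described above can be pictured with a toy model. This is a sketch under stated assumptions, not the real MediaFileManager: `MediaStore`, `add`, and `end_run` are hypothetical names, and the cleanup rule is a simplified reading of the two bullets on how images are stored and removed.

```python
class MediaStore:
    """Toy model: keep only the media referenced by the most recent script run."""

    def __init__(self):
        self.files = {}           # media ID -> bytes
        self.current_run = set()  # IDs referenced during the run in progress

    def add(self, media_id, data):
        self.files[media_id] = data
        self.current_run.add(media_id)

    def end_run(self):
        # Drop everything the run that just finished did not reference.
        self.files = {k: v for k, v in self.files.items() if k in self.current_run}
        self.current_run = set()


store = MediaStore()

# Run 1: the script shows an image, so it is kept after the run.
store.add("foo-id", b"fake image bytes")
store.end_run()
after_run_1 = "foo-id" in store.files

# Run 2: the script shows no image, so the old one is removed.
store.end_run()
after_run_2 = "foo-id" in store.files

print(after_run_1, after_run_2)
```

Under this model, a browser that still requests the old ID after the second run would get nothing back, which is the second outcome listed above.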

Outside of the above, if you have a simple code example or repro steps that demonstrate the issue, that would be very helpful.

For those using GKE, you can resolve this by making sure the application container’s replica count is set to one:

spec:
  replicas: 1

I was having this same issue when deploying to GCP App Engine and was able to solve it with help from the @kmcgrady post above (thank you!). The key is multiple instances. If you use the default configs on App Engine, you are going to have multiple instances (VMs) underlying your application, and this is going to cause 404s from the HTTP image fetch, which happens outside the websocket. When I force the instance count to 1, this issue goes away and all of the images render properly. Of course, I lose autoscaling, which isn’t ideal, but at least the app works properly.

Thanks Suvoo. I just encountered this issue and your workaround using PIL fixed it. Streamlit version 0.73.0, Windows 10. What baffles me is that the st.image/st.sidebar.image call is in a page base class for a multi-page app that has been running reliably for over a year. I’m getting this while testing a new page standalone, subclassed from the same base class as the others. It’s the first time I’ve ever seen this issue. Weird.

@tvst This is happening in the application we talked about a while back. Here’s a screen snip of a test. The sidebar image is using Suvoo’s workaround, the main frame one is the same filepath but provided directly to st.image. The sidebar one is called from the base class, the main frame from the sub-classed application page

[screenshot]

Kevin

Hey @tylerjrichards, just want to follow up. I spent some time today investigating it. I am reasonably confident about the problem and can reproduce it. In essence:

  • One specific process is running Streamlit and images (from pyplot in this case) are generated and stored in memory for that process.
  • Streamlit Share has multiple processes and a system designed to have messages/requests target the same process to grab the correct custom generated image/video/media.
  • That process fails when making image (and probably other media) requests. It’s targeting any process, and if it targets the wrong one, the image will not be found. This is why you see some requests work and some don’t (that may seem random).
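
The difference between targeting the same process and targeting any process can be sketched as follows. This illustrates the general idea of sticky routing only; it is not how Streamlit Share is implemented, and `sticky_route`, `random_route`, and the session identifier are hypothetical.

```python
import hashlib
import random


def sticky_route(session_id, n_processes):
    """Always map the same session to the same process index."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_processes


def random_route(n_processes, rng):
    """Pick any process, as in the failing case described above."""
    return rng.randrange(n_processes)


n_processes = 4
session_id = "session-42"  # hypothetical session identifier

# The process that ran the script and holds the image in memory.
owner = sticky_route(session_id, n_processes)

# 100 media requests routed each way.
sticky_targets = {sticky_route(session_id, n_processes) for _ in range(100)}
rng = random.Random(0)
random_targets = {random_route(n_processes, rng) for _ in range(100)}

print(sticky_targets == {owner}, len(random_targets))
```

With sticky routing every request reaches the owning process; with arbitrary routing the requests are spread over several processes, so some of them miss the one that holds the image.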

I had trouble reproducing it by uploading the same code to my personal Streamlit Share, because there was no problem there. That is because my service runs as only one process. When an app gets a lot of attention (as in your case, when we marketed it for Streamlit Share), we add more resources to it, and now multiple processes are causing the issue. I was able to demonstrate this on my system by increasing the number of processes.

So the easy workaround for you is to delete the app and redeploy it (thereby making it seem like a new single process app for now). I believe that would work and would bypass some internal communication chain to reduce resources. If the app gets more attention, we can increase those resources again, but hopefully we’ll have a fix in time. If you do that, let me know if that works/doesn’t work. If that’s unacceptable, I can look through my end and see if I can reduce the resources, just a little more red tape.

To summarize for @chakra-ai: I wonder if your Azure setup is creating multiple processes (more than one) and thus the same problem as above. If that’s the case, I don’t have a clear solution. We are spending a lot of time figuring out the solution for Streamlit Share, so I can’t speak for Azure, but I am raising it with the team as a big point to consider for prioritization at the very least.

@masolin @rbracco @MauGal @KyotoSunshine @madpowah @emanoelbarreiros @m-ad @danielvarga @JohnPaton @thomelane @sabualkaz

I’ve been working on the MediaFileManager missing file issues. I think I have a solution, but because this has been a tricky piece of code, I’m looking for some testers. Would you be willing to try it out with your use cases? I would greatly appreciate it. One thing I do recommend: if you are using pyplot, supply a figure to st.pyplot. See our documentation for this. It might save some headaches.

You can download the update from my dropbox.

https://www.dropbox.com/s/scqhnpo41ld8fla/streamlit-0.67.1-py2.py3-none-any.whl?dl=0

First you uninstall streamlit

$ pip uninstall streamlit

Then you install the package using the wheel package.

$ pip install wheel
$ pip install PATH/TO/streamlit-0.67.1-py2.py3-none-any.whl

Once you are done testing it, you can uninstall it just like any other version.

$ pip uninstall streamlit

Curious about the code? I used it to help implement Range Requests on the server: https://github.com/streamlit/streamlit/pull/1967

I’m really hoping I cover all the issues you have found. 😊

Tried it with the current streamlit-nightly==0.62.1.dev20200621 and the issue still appears there as well, though perhaps a bit less often. Also, in my case, reloading (by pressing “r”), does not help to make the image appear.

Still experiencing an issue with this one, even on nightly (v0.59.1.dev20200506).

Can see the issue with streamlit hello (Animation Demo) running on a web server. More of an issue over https, compared to http, but still get dropped frames either way.

Seeing these errors scattered in the console…

GET https://example.com/media/fc5c3cc4b370b667d407fa2ac2a808348815d93eae8229f04ba030a1.jpeg 404
GET https://example.com/media/e8c9d0a5997976f844e0886a2702f6f083647828c73e8354fcb64c06.jpeg 404
GET https://example.com/media/55c9e43090bd59da85388e171d1f5d558775f5aca80567b7d019ae53.jpeg 404
...