gradio: Output file is written to /tmp, which doesn't work in a cluster environment

Describe the bug

I’m using Gradio 3.14, and there is an issue with output files written to /tmp. After /join processing, Gradio returns the path to the output file.


Then it tries to fetch that file, but the fetch fails. In a cluster environment, such as multiple pods on K8s, requests can be routed to a different pod every time.

In this case, the request was routed to a different pod, which logged this error:

ValueError: File cannot be fetched: /tmp/output_vsre9e5bfed6ae3cdf251047d5d5ddb1bc5e6d4b1d0.mp4. All files must contained within the Gradio python app working directory
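
For illustration, the failure mode described here can be imitated locally without Kubernetes. This is only a sketch: the Pod class, its methods, and the file name are hypothetical stand-ins, and the existence check below is not Gradio's actual validation logic. It only shows that a path handed out by one replica means nothing on another replica with its own private filesystem.

```python
import os
import tempfile


class Pod:
    """Hypothetical stand-in for one replica with its own private /tmp."""

    def __init__(self, name):
        self.name = name
        # Each pod gets a separate directory, like separate container filesystems.
        self.root = tempfile.mkdtemp(prefix=f"{name}_")

    def write_output(self, filename, data):
        """Write a result file locally and return only its name to the client."""
        with open(os.path.join(self.root, filename), "wb") as handle:
            handle.write(data)
        return filename

    def fetch(self, filename):
        """Serve a file, but only if it exists on this pod's filesystem."""
        path = os.path.join(self.root, filename)
        if not os.path.isfile(path):
            raise ValueError(f"File cannot be fetched: {filename}")
        with open(path, "rb") as handle:
            return handle.read()


pod_a = Pod("pod_a")
pod_b = Pod("pod_b")

# The processing request lands on pod_a, which writes the output file.
name = pod_a.write_output("output.mp4", b"video-bytes")

# A follow-up request to the same pod succeeds.
same_pod_result = pod_a.fetch(name)

# A follow-up request routed to the other pod fails.
try:
    pod_b.fetch(name)
    other_pod_error = None
except ValueError as exc:
    other_pod_error = str(exc)
```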

I think the only way to fix this problem is to return the output file in the websocket message.

Is there an existing issue for this?

  • I have searched the existing issues

Reproduction

Run Gradio 3.14 on Kubernetes with at least 2 pods

Screenshot

No response

Logs

2022-12-28T05:25:55.012+0000 raw_response = await run_endpoint_function(                                                                                                                              
2022-12-28T05:25:55.013+0000 File "/usr/local/lib/python3.9/site-packages/fastapi/routing.py", line 163, in run_endpoint_function                                                                     
2022-12-28T05:25:55.015+0000 return await run_in_threadpool(dependant.call, **values)                                                                                                                 
2022-12-28T05:25:55.016+0000 File "/usr/local/lib/python3.9/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool                                                                    
2022-12-28T05:25:55.020+0000 return await anyio.to_thread.run_sync(func, *args)                                                                                                                       
2022-12-28T05:25:55.021+0000 File "/usr/local/lib/python3.9/site-packages/anyio/to_thread.py", line 31, in run_sync                                                                                   
2022-12-28T05:25:55.023+0000 return await get_asynclib().run_sync_in_worker_thread(                                                                                                                   
2022-12-28T05:25:55.025+0000 File "/usr/local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread                                                        
2022-12-28T05:25:55.027+0000 return await future                                                                                                                                                      
2022-12-28T05:25:55.028+0000 File "/usr/local/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 867, in run                                                                              
2022-12-28T05:25:55.030+0000 result = context.run(func, *args)                                                                                                                                        
2022-12-28T05:25:55.031+0000 File "/usr/local/lib/python3.9/site-packages/gradio/routes.py", line 272, in file                                                                                        
2022-12-28T05:25:55.033+0000 raise ValueError(                                                                                                                                                        
2022-12-28T05:25:55.034+0000 ValueError: File cannot be fetched: /tmp/output_vsre9e5bfed6ae3cdf251047d5d5ddb1bc5e6d4b1d0.mp4. All files must contained within the Gradio python app working directory

System Info

- Gradio 3.14

Severity

annoying

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 57 (23 by maintainers)

Most upvoted comments

@MrNocTV I believe this line should be at the top of your Python script (anywhere before the Gradio app is created). You don’t necessarily need to create a config; you could just hardcode the path.

@abidlabs, at the moment we are applying IP hash so that requests from a client consistently go to one pod. However, this might cause imbalance when the workload increases.
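
As a rough sketch of the IP hash workaround mentioned above: the client address is hashed and mapped onto the list of pods, so the same client always reaches the same pod. The function name, pod names, and addresses are hypothetical, and a real ingress or load balancer implements this internally rather than in application code.

```python
import hashlib
from collections import Counter


def pick_pod(client_ip, pods):
    """Map a client IP to a pod deterministically (hypothetical helper)."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(pods)
    return pods[index]


pods = ["pod-0", "pod-1", "pod-2"]

# The same client always lands on the same pod, so its files are found again.
first = pick_pod("10.0.0.7", pods)
second = pick_pod("10.0.0.7", pods)

# Many users behind one NAT address all end up on a single pod,
# which is one way the imbalance mentioned above can arise.
requests = ["203.0.113.5"] * 90 + ["198.51.100.1"] * 5 + ["192.0.2.9"] * 5
load = Counter(pick_pod(ip, pods) for ip in requests)
```

The stickiness comes at the cost of even distribution: load follows the distribution of client addresses rather than the capacity of the pods.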

@abidlabs

Thank you a lot for your support 😄

Same thing goes for the option analytics_enabled=False, we remove it manually because it can be overridden by user’s co

We do allow setting analytics_enabled via an environment variable, but we would always let the user have the final say. Throughout Gradio’s codebase, we respect the following priority system: default value < environment variable < user-provided parameter value.
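
A minimal sketch of that priority order, assuming a boolean setting. The helper resolve_setting and the way the variable is parsed are illustrative only and are not taken from Gradio's codebase; the variable name GRADIO_ANALYTICS_ENABLED is used here as an assumed example.

```python
import os


def resolve_setting(default, env_var, user_value=None, environ=None):
    """Resolve a boolean setting: default < environment variable < user value."""
    environ = os.environ if environ is None else environ
    # An explicitly passed parameter always wins.
    if user_value is not None:
        return user_value
    # Otherwise fall back to the environment variable, if it is set.
    raw = environ.get(env_var)
    if raw is not None:
        return raw.strip().lower() in ("1", "true", "yes")
    # Finally, use the library default.
    return default


env = {"GRADIO_ANALYTICS_ENABLED": "false"}  # assumed variable name

from_default = resolve_setting(True, "GRADIO_ANALYTICS_ENABLED", environ={})
from_env = resolve_setting(True, "GRADIO_ANALYTICS_ENABLED", environ=env)
from_user = resolve_setting(
    True, "GRADIO_ANALYTICS_ENABLED", user_value=True, environ=env
)
```

Under this scheme a platform operator can change the default through the environment, but application code that passes the parameter explicitly still overrides it.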

We could in theory do something similar for the file_directories variable, though I’m inclined against it for now since it could lead to some very insecure settings (if the environment variable is configured improperly, it could allow users to access any file on the Gradio app developer’s computer). Let’s keep it explicit for now. Glad to hear that we’ve finally managed to make it work.

I’ve removed the tempfile and kept file_directories="/root/cache"; that is the only way I could make it work.

I’m figuring that out 😄 please wait a bit.

@MrNocTV thanks for sharing the error. Since the file has been created, it looks like Gradio is blocking access to it for some reason. Can you try one more thing: in the launch() method, can you try passing this directory to the file_directories parameter? This parameter allows the developer to specify additional directories that the Gradio app can serve files from. So your code would look something like this:

...
demo.launch(file_directories="/root/cache")

You could be even more permissive and try something like file_directories="/root" or file_directories="/".

About its security: I have quite a few concerns about this part, since you can basically read any file now by just changing the URL path, e.g. from /file=/root/cache/tmp735ebskw.wav to another file.

Actually that should not be the case. Gradio only allows access to files that it created (or are in the Gradio working directory). You can check this yourself by changing the path to a different file.
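
To make the idea concrete, here is a hedged sketch of the kind of check being described: a requested path is served only if it is a file the app created or if it resolves to a location inside an allowed directory. The function is_servable and its arguments are hypothetical and do not reproduce Gradio's actual implementation.

```python
import os
import tempfile
from pathlib import Path


def is_servable(requested, allowed_dirs, created_files=()):
    """Return True if the path is app-created or inside an allowed directory."""
    # Resolve symlinks and ".." segments before comparing anything.
    target = Path(requested).resolve()
    if any(target == Path(p).resolve() for p in created_files):
        return True
    for directory in allowed_dirs:
        if Path(directory).resolve() in target.parents:
            return True
    return False


# A throwaway directory stands in for a cache directory such as /root/cache.
cache_dir = tempfile.mkdtemp(prefix="cache_")

inside = os.path.join(cache_dir, "tmp735ebskw.wav")
escaped = os.path.join(cache_dir, "..", "secret.txt")

inside_ok = is_servable(inside, [cache_dir])
escaped_ok = is_servable(escaped, [cache_dir])
```

Resolving the path first is what keeps a request containing ".." from escaping the allowed directory. Note that under a check like this, every file inside an allowed directory is reachable, which is why the directory list should stay as narrow as possible.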

Changing tempfile.tempdir changes it not only for Gradio but for other libraries as well, I suppose?

Yes, so perhaps this is something that we can let users configure within the Gradio library.
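
The process-wide effect of tempfile.tempdir can be seen with the standard library alone. In this sketch a throwaway directory stands in for a shared location such as /root/cache; the variable names are illustrative.

```python
import os
import tempfile

# Remember the current value so the override can be undone afterwards.
original_tempdir = tempfile.tempdir

# A throwaway directory stands in for a shared volume such as /root/cache.
shared_dir = tempfile.mkdtemp(prefix="shared_cache_")

try:
    # This is a module-level global: every library in this process that
    # relies on the tempfile defaults is affected, not just one package.
    tempfile.tempdir = shared_dir

    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
        created_path = tmp.name

    reported_dir = tempfile.gettempdir()
finally:
    tempfile.tempdir = original_tempdir
```

Because the override is global, it has to be set before any code creates temporary files, which matches the advice to place it at the top of the script before the Gradio app is created.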


@MrNocTV could you try the solution @zolero provided and let us know if it works for you as well?

You should see a “reopen” button at the bottom of the page – I’ll go ahead and reopen it myself

@MrNocTV just FYI I have some bandwidth tomorrow or Friday to take a look if you are able to set up a reproducible environment. Thanks!

It should definitely be fixable, but it’s a difficult problem to reproduce and test given that Kubernetes is involved. If you have a simple way we could reproduce this, it would be super helpful in getting this fixed.