gcsfuse: GCSFUSE hanging while trying to write large files (~6GB)
Describe the issue
I am trying to cp
a relatively large single file, around 6GB, through GCSFUSE, and it just hangs forever. I am able to use gsutil/gcloud-storage-rsync to copy the same file, so I am sure there’s no bandwidth problem here. I have also ensured that there’s enough space in --temp-dir=/tmp
to write a 6GB file. Doing ls from separate terminal is also stuck.
I don’t see this issue with smaller files around 200MB.
I think this issue is similar to https://github.com/GoogleCloudPlatform/gcsfuse/issues/715.
This is greatly limiting our use of GCSFUSE, hence requesting to please have look urgently.
To Collect more Debug logs
Relevant GCSFUSE logs while doing cp
:
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002906 connection.go:416] \u003c- WriteFile (inode 4, PID 0, handle 0, offset 5228199936, 1048576 bytes)\n","timestampSeconds":1702017866,"timestampNanos":109021661}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x000028f0 connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":109514277}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x000028f2 connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":109968981}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002908 connection.go:416] \u003c- WriteFile (inode 4, PID 0, handle 0, offset 5229248512, 1048576 bytes)\n","timestampSeconds":1702017866,"timestampNanos":110087270}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x0000290a connection.go:416] \u003c- WriteFile (inode 4, PID 0, handle 0, offset 5230297088, 1048576 bytes)\n","timestampSeconds":1702017866,"timestampNanos":110350364}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x000028f4 connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":110457680}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x0000290c connection.go:416] \u003c- WriteFile (inode 4, PID 0, handle 0, offset 5231345664, 1048576 bytes)\n","timestampSeconds":1702017866,"timestampNanos":110798379}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x000028f8 connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":112676976}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x000028f6 connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":112803534}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x000028fa connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":112841686}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x000028fc connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":112869553}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x0000290e connection.go:416] \u003c- WriteFile (inode 4, PID 0, handle 0, offset 5232394240, 1048576 bytes)\n","timestampSeconds":1702017866,"timestampNanos":112983238}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002910 connection.go:416] \u003c- WriteFile (inode 4, PID 0, handle 0, offset 5233442816, 1048576 bytes)\n","timestampSeconds":1702017866,"timestampNanos":113224758}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002912 connection.go:416] \u003c- WriteFile (inode 4, PID 0, handle 0, offset 5234491392, 1048576 bytes)\n","timestampSeconds":1702017866,"timestampNanos":113465212}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002914 connection.go:416] \u003c- WriteFile (inode 4, PID 0, handle 0, offset 5235539968, 1048576 bytes)\n","timestampSeconds":1702017866,"timestampNanos":113721204}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x000028fe connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":114112281}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002900 connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":114194980}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002916 connection.go:416] \u003c- WriteFile (inode 4, PID 0, handle 0, offset 5236588544, 1048576 bytes)\n","timestampSeconds":1702017866,"timestampNanos":114386752}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002918 connection.go:416] \u003c- WriteFile (inode 4, PID 0, handle 0, offset 5237637120, 1048576 bytes)\n","timestampSeconds":1702017866,"timestampNanos":114636800}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002902 connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":114689493}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x0000291a connection.go:416] \u003c- WriteFile (inode 4, PID 0, handle 0, offset 5238685696, 1048576 bytes)\n","timestampSeconds":1702017866,"timestampNanos":114994381}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002904 connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":115484631}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x0000291c connection.go:416] \u003c- WriteFile (inode 4, PID 0, handle 0, offset 5239734272, 1048576 bytes)\n","timestampSeconds":1702017866,"timestampNanos":115805222}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002906 connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":116074374}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x0000291e connection.go:416] \u003c- WriteFile (inode 4, PID 0, handle 0, offset 5240782848, 1048576 bytes)\n","timestampSeconds":1702017866,"timestampNanos":116348745}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002908 connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":116654494}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002920 connection.go:416] \u003c- WriteFile (inode 4, PID 0, handle 0, offset 5241831424, 1048576 bytes)\n","timestampSeconds":1702017866,"timestampNanos":116973897}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x0000290a connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":117296953}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x0000290c connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":117931049}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002914 connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":118536042}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x0000290e connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":119144253}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002910 connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":119719868}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002912 connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":120347024}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002916 connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":120978491}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002918 connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":121618115}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x0000291a connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":122236082}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x0000291c connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":122920430}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002920 connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":123971391}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x0000291e connection.go:498] -\u003e OK ()\n","timestampSeconds":1702017866,"timestampNanos":124041422}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002922 connection.go:416] \u003c- FlushFile (inode 4, PID 1809725)\n","timestampSeconds":1702017866,"timestampNanos":124142730}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"gcs: Req 0x9: \u003c- StatObject(\".bfuse/proj/scratch/gupttaru/file0.txt\")\n","timestampSeconds":1702017866,"timestampNanos":127842104}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"gcs: Req 0x9: -\u003e StatObject(\".bfuse/proj/scratch/gupttaru/file0.txt\") (140.90789ms): OK\n","timestampSeconds":1702017866,"timestampNanos":268764912}
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"gcs: Req 0xa: \u003c- CreateObject(\".bfuse/proj/scratch/gupttaru/file0.txt\")\n","timestampSeconds":1702017866,"timestampNanos":268833249}
The logs stop being generated at this point, i.e. at CreateObject
, even though the cp
command hasn’t been completed. On the bucket, I just a see an object with 0 bytes present.
Doing ls
from a separate terminal is also stuck, and the last GCSFUSE log statement is the following:
{"name":"root","levelname":"DEBUG","severity":"DEBUG","message":"fuse_debug: Op 0x00002954 connection.go:416] \u003c- LookUpInode (parent 3, name \"file0.txt\", PID 1820773)\n","timestampSeconds":1702018841,"timestampNanos":720568546}
System (please complete the following information):
- OS: RHEL8 (version 8.9)
- Platform: On-premise machine.
- Version: gcsfuse version 1.1.0 (Go version go1.20.4)
About this issue
- Original URL
- State: open
- Created 7 months ago
- Comments: 18
@tarun360 the fix for this is in https://github.com/GoogleCloudPlatform/gcsfuse/pull/1562 and is currently planned to release in Jan 2024 release, on Jan 22.
Without the above change, the data upload traffic wasn’t being routed via the proxy server (verified by checking the proxy-server logs), so I suppose that might be related to some sort of rate-limiting by the firewalls for public endpoints, which lead to
write: connection timed out
.With the above change, I am able to upload the 6GB file, so that probably implies that data is being routed via the proxy server (although I am yet to check proxy logs, but I am pretty sure that is the case since there’s no other explaination of why this change would fix this issue).
Hi @tarun360, When dealing with large files, GCSFuse may require a significant amount of time to complete the upload. For instance, writing a 6GB file can take around 15 minutes. During this process, ls calls may appear to be stuck, but other read/write operations within the directory should remain functional.
Could you please confirm if GCSFuse is truly stuck or simply experiencing a longer upload duration due to the file size?
Thanks!