velox: AsyncDataCache Failed to write to SSD
Bug description
Expected behavior: the cache write should not fail, or it should retry and succeed the next time around (which we think it does?).
System information
config-native.properties
discovery.uri=http://coordinator:8080
http-server.http.port=8080
presto.version=${PRESTO_BUILD_VERSION}
system-memory-gb=108
query-memory-gb=108
query.max-memory-per-node=108GB
memory-arbitrator-kind=SHARED
async-data-cache-enabled=true
async-cache-ssd-gb=200
async-cache-ssd-path=/opt/presto-server/async_data_cache
Relevant logs
I1110 17:01:53.951222 37 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 8 entries
I1110 17:01:53.951247 37 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 7 entries
I1110 17:01:53.951253 37 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 4 entries
E1110 17:01:53.952140 157 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache1, fd: 8, size: 8, offset: 0, error code: 22, error string: Invalid argument
E1110 17:01:53.952679 158 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache3, fd: 10, size: 5, offset: 0, error code: 22, error string: Invalid argument
E1110 17:01:53.953361 159 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache7, fd: 14, size: 26, offset: 0, error code: 22, error string: Invalid argument
I1110 17:01:53.953380 159 SsdCache.cpp:122] [SSDCA] Wrote 13MB, 6439.5522 MB/s
I1110 17:01:54.251892 67 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 12 entries
I1110 17:01:54.251917 67 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 14 entries
E1110 17:01:54.252444 173 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache1, fd: 8, size: 8, offset: 638834, error code: 22, error string: Invalid argument
E1110 17:01:54.252723 174 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache3, fd: 10, size: 5, offset: 150444, error code: 22, error string: Invalid argument
E1110 17:01:54.253002 175 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache5, fd: 12, size: 5, offset: 0, error code: 22, error string: Invalid argument
E1110 17:01:54.253219 176 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache7, fd: 14, size: 46, offset: 12894771, error code: 22, error string: Invalid argument
I1110 17:01:54.253237 176 SsdCache.cpp:122] [SSDCA] Wrote 24MB, 19356.297 MB/s
I1110 17:01:54.602331 79 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 17 entries
I1110 17:01:54.602358 79 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 17 entries
E1110 17:01:54.602947 177 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache1, fd: 8, size: 8, offset: 1277668, error code: 22, error string: Invalid argument
E1110 17:01:54.603561 178 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache3, fd: 10, size: 5, offset: 300888, error code: 22, error string: Invalid argument
E1110 17:01:54.603894 179 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache5, fd: 12, size: 40, offset: 2369490, error code: 22, error string: Invalid argument
E1110 17:01:54.604246 180 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache7, fd: 14, size: 43, offset: 35266960, error code: 22, error string: Invalid argument
I1110 17:01:54.604266 180 SsdCache.cpp:122] [SSDCA] Wrote 56MB, 30882.986 MB/s
I1110 17:01:54.743993 27 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 18 entries
I1110 17:01:54.744015 27 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 18 entries
E1110 17:01:54.744616 181 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache1, fd: 8, size: 8, offset: 1916502, error code: 22, error string: Invalid argument
E1110 17:01:54.744943 182 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache3, fd: 10, size: 5, offset: 451332, error code: 22, error string: Invalid argument
E1110 17:01:54.745416 183 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache5, fd: 12, size: 40, offset: 21324664, error code: 22, error string: Invalid argument
E1110 17:01:54.745930 184 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache7, fd: 14, size: 10, offset: 60240568, error code: 22, error string: Invalid argument
I1110 17:01:54.745949 184 SsdCache.cpp:122] [SSDCA] Wrote 72MB, 39146.96 MB/s
I1110 17:01:55.060570 35 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 20 entries
I1110 17:01:55.060596 35 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 20 entries
E1110 17:01:55.061105 185 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache1, fd: 8, size: 8, offset: 2555336, error code: 22, error string: Invalid argument
E1110 17:01:55.061133 158 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache5, fd: 12, size: 38, offset: 40279838, error code: 22, error string: Invalid argument
E1110 17:01:55.061131 157 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache3, fd: 10, size: 5, offset: 601776, error code: 22, error string: Invalid argument
E1110 17:01:55.061165 159 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache7, fd: 14, size: 86, offset: 67108864, error code: 22, error string: Invalid argument
I1110 17:01:55.061183 159 SsdCache.cpp:122] [SSDCA] Wrote 96MB, 172370.77 MB/s
I1110 17:01:55.209815 63 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 34 entries
I1110 17:01:55.209839 63 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 21 entries
I1110 17:01:55.209843 63 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 24 entries
E1110 17:01:55.209985 173 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache1, fd: 8, size: 8, offset: 3194170, error code: 22, error string: Invalid argument
E1110 17:01:55.209985 176 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache7, fd: 14, size: 5, offset: 131424093, error code: 22, error string: Invalid argument
E1110 17:01:55.209985 175 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache5, fd: 12, size: 5, offset: 62884746, error code: 22, error string: Invalid argument
E1110 17:01:55.210000 174 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache3, fd: 10, size: 5, offset: 752220, error code: 22, error string: Invalid argument
I1110 17:01:55.210224 174 SsdCache.cpp:122] [SSDCA] Wrote 134MB, 342029.25 MB/s
I1110 17:01:55.378504 73 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 34 entries
I1110 17:01:55.378527 73 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 25 entries
I1110 17:01:55.378531 73 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 28 entries
E1110 17:01:55.378670 177 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache1, fd: 8, size: 8, offset: 3833004, error code: 22, error string: Invalid argument
E1110 17:01:55.378672 178 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache3, fd: 10, size: 5, offset: 902664, error code: 22, error string: Invalid argument
E1110 17:01:55.378720 180 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache7, fd: 14, size: 79, offset: 134217728, error code: 22, error string: Invalid argument
E1110 17:01:55.378731 179 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache5, fd: 12, size: 80, offset: 67108864, error code: 22, error string: Invalid argument
I1110 17:01:55.378760 179 SsdCache.cpp:122] [SSDCA] Wrote 134MB, 610031 MB/s
I1110 17:01:55.627291 35 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 40 entries
I1110 17:01:55.627314 35 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 30 entries
I1110 17:01:55.627318 35 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 31 entries
E1110 17:01:55.627404 181 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache1, fd: 8, size: 8, offset: 4471838, error code: 22, error string: Invalid argument
E1110 17:01:55.627404 182 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache3, fd: 10, size: 5, offset: 1053108, error code: 22, error string: Invalid argument
E1110 17:01:55.627432 184 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache7, fd: 14, size: 5, offset: 193252191, error code: 22, error string: Invalid argument
E1110 17:01:55.627425 183 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache5, fd: 12, size: 5, offset: 128007078, error code: 22, error string: Invalid argument
I1110 17:01:55.627463 183 SsdCache.cpp:122] [SSDCA] Wrote 177MB, 1280411.4 MB/s
I1110 17:01:55.897388 66 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 47 entries
I1110 17:01:55.897418 66 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 32 entries
I1110 17:01:55.897424 66 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 35 entries
E1110 17:01:55.897541 185 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache1, fd: 8, size: 8, offset: 5110672, error code: 22, error string: Invalid argument
E1110 17:01:55.897547 157 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache3, fd: 10, size: 5, offset: 1203552, error code: 22, error string: Invalid argument
E1110 17:01:55.897559 159 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache7, fd: 14, size: 9, offset: 195621704, error code: 22, error string: Invalid argument
E1110 17:01:55.897567 158 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache5, fd: 12, size: 5, offset: 130376538, error code: 22, error string: Invalid argument
I1110 17:01:55.897629 158 SsdCache.cpp:122] [SSDCA] Wrote 215MB, 1091453.8 MB/s
I1110 17:01:56.178414 38 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 54 entries
I1110 17:01:56.178440 38 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 35 entries
I1110 17:01:56.178447 38 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 38 entries
E1110 17:01:56.178542 173 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache1, fd: 8, size: 8, offset: 5749506, error code: 22, error string: Invalid argument
E1110 17:01:56.178551 174 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache3, fd: 10, size: 5, offset: 1353996, error code: 22, error string: Invalid argument
E1110 17:01:56.178632 176 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache7, fd: 14, size: 82, offset: 201326592, error code: 22, error string: Invalid argument
E1110 17:01:56.179170 175 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache5, fd: 12, size: 73, offset: 134217728, error code: 22, error string: Invalid argument
I1110 17:01:56.179189 175 SsdCache.cpp:122] [SSDCA] Wrote 258MB, 365410.6 MB/s
I1110 17:01:56.757359 72 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 61 entries
I1110 17:01:56.757385 72 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 45 entries
I1110 17:01:56.757391 72 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 51 entries
E1110 17:01:56.757524 178 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache3, fd: 10, size: 5, offset: 1504440, error code: 22, error string: Invalid argument
E1110 17:01:56.757853 179 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache5, fd: 12, size: 5, offset: 196825215, error code: 22, error string: Invalid argument
E1110 17:01:56.759240 180 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache7, fd: 14, size: 9, offset: 262340773, error code: 22, error string: Invalid argument
E1110 17:01:56.759820 177 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache1, fd: 8, size: 8, offset: 6388340, error code: 22, error string: Invalid argument
I1110 17:01:56.759845 177 SsdCache.cpp:122] [SSDCA] Wrote 282MB, 120865.21 MB/s
I1110 17:01:57.447929 137 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 74 entries
I1110 17:01:57.447961 137 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 60 entries
I1110 17:01:57.447966 137 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 67 entries
E1110 17:01:57.448081 181 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache1, fd: 8, size: 8, offset: 7027174, error code: 22, error string: Invalid argument
E1110 17:01:57.448083 182 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache3, fd: 10, size: 5, offset: 1654884, error code: 22, error string: Invalid argument
E1110 17:01:57.448158 183 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache5, fd: 12, size: 77, offset: 201326592, error code: 22, error string: Invalid argument
E1110 17:01:57.448166 184 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache7, fd: 14, size: 90, offset: 268435456, error code: 22, error string: Invalid argument
I1110 17:01:57.448194 184 SsdCache.cpp:122] [SSDCA] Wrote 330MB, 1533892.4 MB/s
I1110 17:01:58.095772 208 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 84 entries
I1110 17:01:58.095801 208 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 70 entries
I1110 17:01:58.095806 208 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 77 entries
E1110 17:01:58.095908 185 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache1, fd: 8, size: 8, offset: 7666008, error code: 22, error string: Invalid argument
E1110 17:01:58.095918 158 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache5, fd: 12, size: 7, offset: 262222070, error code: 22, error string: Invalid argument
E1110 17:01:58.095911 157 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache3, fd: 10, size: 5, offset: 1805328, error code: 22, error string: Invalid argument
E1110 17:01:58.095943 159 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache7, fd: 14, size: 9, offset: 329870069, error code: 22, error string: Invalid argument
I1110 17:01:58.095957 159 SsdCache.cpp:122] [SSDCA] Wrote 373MB, 2643966.5 MB/s
I1110 17:01:58.492918 203 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 85 entries
I1110 17:01:58.492944 203 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 70 entries
I1110 17:01:58.492950 203 AsyncDataCache.cpp:517] [SSDCA] Limiting SSD save batch to 78 entries
E1110 17:01:58.493048 173 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache1, fd: 8, size: 8, offset: 8304842, error code: 22, error string: Invalid argument
E1110 17:01:58.493070 174 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache3, fd: 10, size: 5, offset: 1955772, error code: 22, error string: Invalid argument
E1110 17:01:58.493113 175 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache5, fd: 12, size: 80, offset: 268435456, error code: 22, error string: Invalid argument
E1110 17:01:58.493167 176 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache7, fd: 14, size: 93, offset: 335544320, error code: 22, error string: Invalid argument
I1110 17:01:58.493193 176 SsdCache.cpp:122] [SSDCA] Wrote 392MB, 1707764.8 MB/s
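To make the pattern in these logs easier to see, here is a minimal sketch that groups the failed-write offsets by cache file and prints the step between consecutive failures. The regex is written against the line format shown above; the names `summarize_failures` and `offset_deltas` are hypothetical helpers for this illustration, not part of Velox or Presto.

```python
import re
from collections import defaultdict

# Matches the "Failed to write to SSD" lines in the format shown in the logs above.
FAILED_WRITE = re.compile(
    r"Failed to write to SSD, file name: (?P<file>[^,\s]+), fd: (?P<fd>\d+), "
    r"size: (?P<size>\d+), offset: (?P<offset>\d+), error code: (?P<code>\d+)"
)


def summarize_failures(log_text):
    """Return {file name: [offsets of failed writes, in log order]}."""
    offsets = defaultdict(list)
    for line in log_text.splitlines():
        match = FAILED_WRITE.search(line)
        if match:
            offsets[match.group("file")].append(int(match.group("offset")))
    return dict(offsets)


def offset_deltas(offsets):
    """Differences between consecutive offsets."""
    return [later - earlier for earlier, later in zip(offsets, offsets[1:])]


# Three consecutive failures for two of the cache files, copied from the logs above.
SAMPLE = """\
E1110 17:01:53.952140 157 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache1, fd: 8, size: 8, offset: 0, error code: 22, error string: Invalid argument
E1110 17:01:53.952679 158 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache3, fd: 10, size: 5, offset: 0, error code: 22, error string: Invalid argument
E1110 17:01:54.252444 173 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache1, fd: 8, size: 8, offset: 638834, error code: 22, error string: Invalid argument
E1110 17:01:54.252723 174 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache3, fd: 10, size: 5, offset: 150444, error code: 22, error string: Invalid argument
E1110 17:01:54.602947 177 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache1, fd: 8, size: 8, offset: 1277668, error code: 22, error string: Invalid argument
E1110 17:01:54.603561 178 SsdFile.cpp:396] [SSDCA] Failed to write to SSD, file name: /opt/presto-server/async_data_cache3, fd: 10, size: 5, offset: 300888, error code: 22, error string: Invalid argument
"""

failures = summarize_failures(SAMPLE)
for name, offsets in sorted(failures.items()):
    print(name, offsets, offset_deltas(offsets))
```

In the logs above, the reported offset for async_data_cache1 moves forward by 638834 between consecutive failures, and the one for async_data_cache3 by 150444, even though every one of those writes failed with error code 22.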
About this issue
- State: closed
- Created 8 months ago
- Comments: 22 (17 by maintainers)
Update from my earlier comment: we cannot rely on the offset increasing, because it seems to increase whether the insert succeeds or fails.
In SsdFile.cpp[307] we can see that the offset is already incremented. This happens before we have even called pwritev.
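The ordering described here can be illustrated with a toy model. This is only a sketch of the general pattern, not Velox's actual `SsdFile` code: `ToyCacheFile`, its two write methods, and `always_einval` are hypothetical names, and the second method is shown purely for contrast, not as a proposed fix.

```python
import errno


class ToyCacheFile:
    """Hypothetical stand-in for a cache file writer (not Velox's SsdFile)."""

    def __init__(self, write_fn):
        # write_fn(offset, data) returns the number of bytes written
        # or raises OSError on failure.
        self._write_fn = write_fn
        self.offset = 0

    def write_advance_first(self, data):
        # The offset is moved forward before the write is attempted,
        # so it grows even when the write fails.
        start = self.offset
        self.offset += len(data)
        try:
            self._write_fn(start, data)
            return True
        except OSError:
            return False

    def write_advance_on_success(self, data):
        # For contrast: the offset only moves once the write has succeeded.
        start = self.offset
        try:
            written = self._write_fn(start, data)
        except OSError:
            return False
        self.offset = start + written
        return True


def always_einval(offset, data):
    # Simulates a write that always fails with error code 22 (EINVAL).
    raise OSError(errno.EINVAL, "Invalid argument")


eager = ToyCacheFile(always_einval)
eager.write_advance_first(b"abcd")
print("advance first, after failed write:", eager.offset)

lazy = ToyCacheFile(always_einval)
lazy.write_advance_on_success(b"abcd")
print("advance on success, after failed write:", lazy.offset)
```

With the first ordering, a growing offset in the logs says nothing about whether any data reached the file, which is why success logging is needed to tell the two cases apart.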
@meharanjan318 I have not tried to add success logging myself. Will attempt it now and get back to you. Thank you for trying it out!