restic: runtime: out of memory: cannot allocate 536870912-byte block (556793856 in use)
Likely supersedes #526
Output of restic version
# restic.debug version
debug enabled
restic 0.6.0-rc.1 (v0.6.0-rc.1-1-g37b107b)
compiled with go1.8.3 on linux/arm
How did you start restic exactly? (Include the complete command line)
# /usr/local/sbin/restic.debug -r rest:http://qnap.ch.polyware.nl:8000 backup -x --tag CH / /boot --exclude "/var/log/*,/var/tmp/*,/tmp/*" --mem-profile=/tmp
What backend/server/service did you use?
rest
Expected behavior
Making a backup.
Actual behavior
Failed backup: out of memory.
# /usr/local/sbin/restic.debug -r rest:http://qnap.ch.polyware.nl:8000 backup -x --tag CH / /boot --exclude "/var/log/*,/var/tmp/*,/tmp/*" --mem-profile=/tmp
debug enabled
enter password for repository:
runtime: out of memory: cannot allocate 536870912-byte block (556793856 in use)
fatal error: out of memory
runtime stack:
runtime.throw(0x4c85ae, 0xd)
/usr/local/go/src/runtime/panic.go:596 +0x70
runtime.largeAlloc(0x1ffffe00, 0x6ae01, 0x711f50)
/usr/local/go/src/runtime/malloc.go:809 +0xe0
runtime.mallocgc.func1()
/usr/local/go/src/runtime/malloc.go:702 +0x30
runtime.systemstack(0x10a1ca80)
/usr/local/go/src/runtime/asm_arm.s:264 +0x80
runtime.mstart()
/usr/local/go/src/runtime/proc.go:1132
goroutine 117 [running]:
runtime.systemstack_switch()
/usr/local/go/src/runtime/asm_arm.s:209 +0x4 fp=0x10ba7a7c sp=0x10ba7a78
runtime.mallocgc(0x1ffffe00, 0x44ea60, 0x10b66e01, 0x10adab18)
/usr/local/go/src/runtime/malloc.go:703 +0x9e0 fp=0x10ba7ad8 sp=0x10ba7a7c
runtime.makeslice(0x44ea60, 0x1ffffe00, 0x1ffffe00, 0x960, 0x0, 0x0)
/usr/local/go/src/runtime/slice.go:54 +0x68 fp=0x10ba7afc sp=0x10ba7ad8
bytes.makeSlice(0x1ffffe00, 0x0, 0x0, 0x0)
/usr/local/go/src/bytes/buffer.go:201 +0x58 fp=0x10ba7b18 sp=0x10ba7afc
bytes.(*Buffer).ReadFrom(0x10ba7b74, 0x6e16f0, 0x10aa3fc0, 0x200, 0x0, 0x10a80400, 0x0)
/usr/local/go/src/bytes/buffer.go:173 +0x270 fp=0x10ba7b50 sp=0x10ba7b18
io/ioutil.readAll(0x6e16f0, 0x10aa3fc0, 0x200, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
/usr/local/go/src/io/ioutil/ioutil.go:33 +0xd0 fp=0x10ba7bc8 sp=0x10ba7b50
io/ioutil.ReadAll(0x6e16f0, 0x10aa3fc0, 0x10aa3fc0, 0x6e16f0, 0x10aa3fc0, 0x0, 0x0)
/usr/local/go/src/io/ioutil/ioutil.go:42 +0x34 fp=0x10ba7bf0 sp=0x10ba7bc8
restic/backend.LoadAll(0x6e5fe0, 0x10ae0150, 0x4c1338, 0x5, 0x10b2dcc0, 0x40, 0x0, 0x0, 0x0, 0x0, ...)
/tmp/restic-build-847077805/src/restic/backend/utils.go:28 +0x104 fp=0x10ba7c2c sp=0x10ba7bf0
restic/repository.(*Repository).LoadAndDecrypt(0x10adc180, 0x4c1338, 0x5, 0x1daf0f27, 0x39e92cc6, 0x9c59383f, 0x97399bea, 0x21ed0b22, 0xb2d24563, 0xa65b3a67, ...)
/tmp/restic-build-847077805/src/restic/repository/repository.go:57 +0x120 fp=0x10ba7d50 sp=0x10ba7c2c
restic/repository.LoadIndexWithDecoder(0x6e6e20, 0x10adc180, 0x1daf0f27, 0x39e92cc6, 0x9c59383f, 0x97399bea, 0x21ed0b22, 0xb2d24563, 0xa65b3a67, 0x8d242524, ...)
/tmp/restic-build-847077805/src/restic/repository/index.go:525 +0xc8 fp=0x10ba7de0 sp=0x10ba7d50
restic/repository.LoadIndex(0x6e6e20, 0x10adc180, 0x1daf0f27, 0x39e92cc6, 0x9c59383f, 0x97399bea, 0x21ed0b22, 0xb2d24563, 0xa65b3a67, 0x8d242524, ...)
/tmp/restic-build-847077805/src/restic/repository/repository.go:352 +0x48 fp=0x10ba7e34 sp=0x10ba7de0
restic/repository.(*Repository).LoadIndex.func1(0x1daf0f27, 0x39e92cc6, 0x9c59383f, 0x97399bea, 0x21ed0b22, 0xb2d24563, 0xa65b3a67, 0x8d242524, 0x10b62080, 0x8d242524, ...)
/tmp/restic-build-847077805/src/restic/repository/repository.go:320 +0x4c fp=0x10ba7ec0 sp=0x10ba7e34
restic/repository.ParallelWorkFuncParseID.func1(0x10b622c0, 0x40, 0x10b62080, 0x10b4ff00, 0x1)
/tmp/restic-build-847077805/src/restic/repository/parallel.go:89 +0x148 fp=0x10ba7f54 sp=0x10ba7ec0
restic/repository.FilesInParallel.func1(0x10b689c0, 0x10b620c0, 0x10b34820, 0x10b62080, 0x10b62380)
/tmp/restic-build-847077805/src/restic/repository/parallel.go:53 +0xd4 fp=0x10ba7fd4 sp=0x10ba7f54
runtime.goexit()
/usr/local/go/src/runtime/asm_arm.s:1017 +0x4 fp=0x10ba7fd4 sp=0x10ba7fd4
created by restic/repository.FilesInParallel
/tmp/restic-build-847077805/src/restic/repository/parallel.go:63 +0x130
goroutine 1 [chan receive]:
restic/repository.(*Repository).LoadIndex(0x10adc180, 0x4e9798, 0x10b2c3c0)
/tmp/restic-build-847077805/src/restic/repository/repository.go:339 +0x124
main.runBackup(0x0, 0x0, 0x0, 0x10ad8a80, 0x3, 0x3, 0x0, 0x0, 0x0, 0x1, ...)
/tmp/restic-build-847077805/src/cmds/restic/cmd_backup.go:375 +0x330
main.glob..func2(0x70d1b8, 0x10b397c0, 0x2, 0xa, 0x0, 0x0)
/tmp/restic-build-847077805/src/cmds/restic/cmd_backup.go:38 +0xcc
github.com/spf13/cobra.(*Command).execute(0x70d1b8, 0x10b396d0, 0xa, 0xa, 0x70d1b8, 0x10b396d0)
/tmp/restic-build-847077805/src/github.com/spf13/cobra/command.go:631 +0x308
github.com/spf13/cobra.(*Command).ExecuteC(0x70cf88, 0x10ade2b8, 0x44e820, 0x10b34180)
/tmp/restic-build-847077805/src/github.com/spf13/cobra/command.go:710 +0x220
github.com/spf13/cobra.(*Command).Execute(0x70cf88, 0x9, 0x10b1bf74)
/tmp/restic-build-847077805/src/github.com/spf13/cobra/command.go:669 +0x1c
main.main()
/tmp/restic-build-847077805/src/cmds/restic/main.go:60 +0x88
goroutine 5 [syscall]:
os/signal.signal_recv(0x0)
/usr/local/go/src/runtime/sigqueue.go:116 +0x154
os/signal.loop()
/usr/local/go/src/os/signal/signal_unix.go:22 +0x14
created by os/signal.init.1
/usr/local/go/src/os/signal/signal_unix.go:28 +0x30
goroutine 6 [chan receive]:
restic.init.1.func1.1()
/tmp/restic-build-847077805/src/restic/lock.go:265 +0x9c
created by restic.init.1.func1
/tmp/restic-build-847077805/src/restic/lock.go:268 +0x24
goroutine 7 [select, locked to thread]:
runtime.gopark(0x4e9cdc, 0x0, 0x4c27d7, 0x6, 0x18, 0x2)
/usr/local/go/src/runtime/proc.go:271 +0xfc
runtime.selectgoImpl(0x10a26fa0, 0x0, 0xc)
/usr/local/go/src/runtime/select.go:423 +0x116c
runtime.selectgo(0x10a26fa0)
/usr/local/go/src/runtime/select.go:238 +0x10
runtime.ensureSigM.func1()
/usr/local/go/src/runtime/signal_unix.go:434 +0x2bc
runtime.goexit()
/usr/local/go/src/runtime/asm_arm.s:1017 +0x4
goroutine 8 [chan receive]:
restic.init.2.func1(0x10a12440)
/tmp/restic-build-847077805/src/restic/progress_unix.go:17 +0x38
created by restic.init.2
/tmp/restic-build-847077805/src/restic/progress_unix.go:21 +0x90
goroutine 17 [chan receive]:
main.CleanupHandler(0x10ad6040)
/tmp/restic-build-847077805/src/cmds/restic/cleanup.go:62 +0x38
created by main.init.1
/tmp/restic-build-847077805/src/cmds/restic/cleanup.go:25 +0x90
goroutine 18 [chan receive]:
github.com/pkg/profile.Start.func7(0x10ad8ae0)
/tmp/restic-build-847077805/src/github.com/pkg/profile/profile.go:206 +0x90
created by github.com/pkg/profile.Start
/tmp/restic-build-847077805/src/github.com/pkg/profile/profile.go:212 +0x18c
goroutine 33 [select]:
net/http.(*persistConn).readLoop(0x10a7e280)
/usr/local/go/src/net/http/transport.go:1599 +0x7b4
created by net/http.(*Transport).dialConn
/usr/local/go/src/net/http/transport.go:1117 +0x814
goroutine 34 [select]:
net/http.(*persistConn).writeLoop(0x10a7e280)
/usr/local/go/src/net/http/transport.go:1704 +0x394
created by net/http.(*Transport).dialConn
/usr/local/go/src/net/http/transport.go:1118 +0x830
goroutine 105 [IO wait]:
net.runtime_pollWait(0x76ee4088, 0x72, 0x10b57000)
/usr/local/go/src/runtime/netpoll.go:164 +0x44
net.(*pollDesc).wait(0x10b984fc, 0x72, 0x6e20e0, 0x6e0138)
/usr/local/go/src/net/fd_poll_runtime.go:75 +0x28
net.(*pollDesc).waitRead(0x10b984fc, 0x10b57000, 0x1000)
/usr/local/go/src/net/fd_poll_runtime.go:80 +0x24
net.(*netFD).Read(0x10b984c0, 0x10b57000, 0x1000, 0x1000, 0x0, 0x6e20e0, 0x6e0138)
/usr/local/go/src/net/fd_unix.go:250 +0x148
net.(*conn).Read(0x10b648d0, 0x10b57000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
/usr/local/go/src/net/net.go:181 +0x58
net/http.(*persistConn).Read(0x10b5c460, 0x10b57000, 0x1000, 0x1000, 0x4, 0x67d2c, 0x10b63140)
/usr/local/go/src/net/http/transport.go:1316 +0x174
bufio.(*Reader).fill(0x10ae0cf0)
/usr/local/go/src/bufio/bufio.go:97 +0xf4
bufio.(*Reader).Peek(0x10ae0cf0, 0x1, 0x10bcada8, 0x1, 0x0, 0x10b98680, 0x0)
/usr/local/go/src/bufio/bufio.go:129 +0x58
net/http.(*persistConn).readLoop(0x10b5c460)
/usr/local/go/src/net/http/transport.go:1474 +0x160
created by net/http.(*Transport).dialConn
/usr/local/go/src/net/http/transport.go:1117 +0x814
goroutine 106 [select]:
net/http.(*persistConn).writeLoop(0x10b5c460)
/usr/local/go/src/net/http/transport.go:1704 +0x394
created by net/http.(*Transport).dialConn
/usr/local/go/src/net/http/transport.go:1118 +0x830
goroutine 44 [select]:
main.refreshLocks(0x711a70, 0x10b0bf00)
/tmp/restic-build-847077805/src/cmds/restic/lock.go:70 +0x218
created by main.lockRepository
/tmp/restic-build-847077805/src/cmds/restic/lock.go:47 +0x294
goroutine 45 [semacquire]:
sync.runtime_Semacquire(0x10b689cc)
/usr/local/go/src/runtime/sema.go:47 +0x24
sync.(*WaitGroup).Wait(0x10b689c0)
/usr/local/go/src/sync/waitgroup.go:131 +0x94
restic/repository.FilesInParallel(0x76ee6d88, 0x10ae0150, 0x4c1338, 0x5, 0x14, 0x10b34820, 0x0, 0x0)
/tmp/restic-build-847077805/src/restic/repository/parallel.go:66 +0x14c
restic/repository.(*Repository).LoadIndex.func2(0x10b62040, 0x10b0bfc0, 0x10adc180, 0x10b689b0)
/tmp/restic-build-847077805/src/restic/repository/repository.go:336 +0x9c
created by restic/repository.(*Repository).LoadIndex
/tmp/restic-build-847077805/src/restic/repository/repository.go:337 +0x100
goroutine 138 [IO wait]:
net.runtime_pollWait(0x76ee4100, 0x72, 0x12c06000)
/usr/local/go/src/runtime/netpoll.go:164 +0x44
net.(*pollDesc).wait(0x10a13cfc, 0x72, 0x6e20e0, 0x6e0138)
/usr/local/go/src/net/fd_poll_runtime.go:75 +0x28
net.(*pollDesc).waitRead(0x10a13cfc, 0x12c06000, 0x1000)
/usr/local/go/src/net/fd_poll_runtime.go:80 +0x24
net.(*netFD).Read(0x10a13cc0, 0x12c06000, 0x1000, 0x1000, 0x0, 0x6e20e0, 0x6e0138)
/usr/local/go/src/net/fd_unix.go:250 +0x148
net.(*conn).Read(0x10b34ae8, 0x12c06000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
/usr/local/go/src/net/net.go:181 +0x58
net/http.(*persistConn).Read(0x10b3e6e0, 0x12c06000, 0x1000, 0x1000, 0x4, 0x67d2c, 0x10b632c0)
/usr/local/go/src/net/http/transport.go:1316 +0x174
bufio.(*Reader).fill(0x10a16e10)
/usr/local/go/src/bufio/bufio.go:97 +0xf4
bufio.(*Reader).Peek(0x10a16e10, 0x1, 0x12c57da8, 0x1, 0x0, 0x10b63200, 0x0)
/usr/local/go/src/bufio/bufio.go:129 +0x58
net/http.(*persistConn).readLoop(0x10b3e6e0)
/usr/local/go/src/net/http/transport.go:1474 +0x160
created by net/http.(*Transport).dialConn
/usr/local/go/src/net/http/transport.go:1117 +0x814
goroutine 139 [select]:
net/http.(*persistConn).writeLoop(0x10b3e6e0)
/usr/local/go/src/net/http/transport.go:1704 +0x394
created by net/http.(*Transport).dialConn
/usr/local/go/src/net/http/transport.go:1118 +0x830
And dmesg:
May 27 01:32:44 rp1 kernel: [1779339.099132] apache2 invoked oom-killer: gfp_mask=0x24201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), nodemask=0, order=0, oom_score_adj=0
May 27 01:32:47 rp1 kernel: [1779339.099149] apache2 cpuset=/ mems_allowed=0
May 27 01:32:47 rp1 kernel: [1779339.099183] CPU: 2 PID: 776 Comm: apache2 Not tainted 4.9.24-v7+ #993
May 27 01:32:47 rp1 kernel: [1779339.099189] Hardware name: BCM2835
May 27 01:32:47 rp1 kernel: [1779339.099236] [<8010fb3c>] (unwind_backtrace) from [<8010c058>] (show_stack+0x20/0x24)
May 27 01:32:47 rp1 rsyslogd-2007: action 'action 17' suspended, next retry is Sat May 27 01:34:17 2017 [try http://www.rsyslog.com/e/2007 ]
May 27 01:32:50 rp1 kernel: [1779339.099259] [<8010c058>] (show_stack) from [<80455200>] (dump_stack+0xd4/0x118)
May 27 01:32:50 rp1 kernel: [1779339.099281] [<80455200>] (dump_stack) from [<8026c998>] (dump_header+0x9c/0x1f4)
May 27 01:32:50 rp1 kernel: [1779339.099306] [<8026c998>] (dump_header) from [<80210364>] (oom_kill_process+0x3e0/0x4e4)
May 27 01:32:50 rp1 kernel: [1779339.099327] [<80210364>] (oom_kill_process) from [<802107cc>] (out_of_memory+0x124/0x334)
May 27 01:32:50 rp1 kernel: [1779339.099348] [<802107cc>] (out_of_memory) from [<802158f0>] (__alloc_pages_nodemask+0xcf4/0xdd0)
May 27 01:32:50 rp1 kernel: [1779339.099370] [<802158f0>] (__alloc_pages_nodemask) from [<8020e694>] (filemap_fault+0x474/0x674)
May 27 01:32:50 rp1 kernel: [1779339.099390] [<8020e694>] (filemap_fault) from [<8030c74c>] (ext4_filemap_fault+0x3c/0x50)
May 27 01:32:50 rp1 kernel: [1779339.099412] [<8030c74c>] (ext4_filemap_fault) from [<8023d844>] (__do_fault+0x7c/0x100)
May 27 01:32:50 rp1 kernel: [1779339.099430] [<8023d844>] (__do_fault) from [<80241b70>] (handle_mm_fault+0x5fc/0xde4)
May 27 01:32:50 rp1 kernel: [1779339.099452] [<80241b70>] (handle_mm_fault) from [<807193f4>] (do_page_fault+0x33c/0x3b0)
May 27 01:32:50 rp1 kernel: [1779339.099471] [<807193f4>] (do_page_fault) from [<801012a8>] (do_PrefetchAbort+0x44/0xa8)
May 27 01:32:50 rp1 kernel: [1779339.099488] [<801012a8>] (do_PrefetchAbort) from [<80718ea4>] (ret_from_exception+0x0/0x1c)
May 27 01:32:50 rp1 kernel: [1779339.099497] Exception stack(0xb2a89fb0 to 0xb2a89ff8)
May 27 01:32:50 rp1 kernel: [1779339.099509] 9fa0: 000f4240 00000000 00000003 00011175
May 27 01:32:50 rp1 kernel: [1779339.099524] 9fc0: 7ea3aad4 7ea3aacc 7ea3aad0 00000000 00000004 76912080 76c44cb0 553e0f60
May 27 01:32:50 rp1 kernel: [1779339.099536] 9fe0: 76ec9274 7ea3aa30 54bcd908 54bc3f68 20000010 ffffffff
May 27 01:32:50 rp1 kernel: [1779339.099543] Mem-Info:
May 27 01:32:51 rp1 kernel: [1779339.099573] active_anon:103001 inactive_anon:102938 isolated_anon:32
May 27 01:32:51 rp1 kernel: [1779339.099573] active_file:344 inactive_file:665 isolated_file:0
May 27 01:32:51 rp1 kernel: [1779339.099573] unevictable:0 dirty:0 writeback:0 unstable:0
May 27 01:32:51 rp1 kernel: [1779339.099573] slab_reclaimable:1659 slab_unreclaimable:4074
May 27 01:32:51 rp1 kernel: [1779339.099573] mapped:1022 shmem:686 pagetables:772 bounce:0
May 27 01:32:51 rp1 kernel: [1779339.099573] free:4085 free_pcp:357 free_cma:0
May 27 01:32:51 rp1 kernel: [1779339.099599] Node 0 active_anon:412004kB inactive_anon:411852kB active_file:1376kB inactive_file:2660kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:4088kB dirty:0kB writeback:0kB shmem:2744kB writeback_tmp:0kB unstable:0kB pages_scanned:37542 all_unreclaimable? yes
May 27 01:32:51 rp1 kernel: [1779339.099630] Normal free:16340kB min:16384kB low:20480kB high:24576kB active_anon:412004kB inactive_anon:411752kB active_file:1376kB inactive_file:2660kB unevictable:0kB writepending:0kB present:901120kB managed:880564kB mlocked:0kB slab_reclaimable:6636kB slab_unreclaimable:16296kB kernel_stack:1840kB pagetables:3088kB bounce:0kB free_pcp:1428kB local_pcp:316kB free_cma:0kB
May 27 01:32:51 rp1 kernel: lowmem_reserve[]: 0 0
May 27 01:32:51 rp1 kernel: [1779339.099651] Normal: 129*4kB (UME) 86*8kB (E) 50*16kB (UME) 50*32kB (ME) 26*64kB (UME) 51*128kB (UME) 8*256kB (UME) 3*512kB (ME) 1*1024kB (M) 0*2048kB 0*4096kB = 16404kB
May 27 01:32:51 rp1 kernel: 1747 total pagecache pages
May 27 01:32:51 rp1 kernel: [1779339.099758] 62 pages in swap cache
May 27 01:32:51 rp1 kernel: [1779339.099767] Swap cache stats: add 95904, delete 95842, find 35132/41154
May 27 01:32:51 rp1 kernel: [1779339.099772] Free swap = 0kB
May 27 01:32:51 rp1 kernel: [1779339.099776] Total swap = 102396kB
May 27 01:32:51 rp1 kernel: [1779339.099782] 225280 pages RAM
May 27 01:32:51 rp1 kernel: [1779339.099787] 0 pages HighMem/MovableOnly
May 27 01:32:51 rp1 kernel: [1779339.099792] 5139 pages reserved
...
Last few messages in the debug log:
2017/05/29 03:12:50 debug/round_tripper_debug.go:87 debug.loggingRoundTripper.RoundTrip 103 ------------ HTTP RESPONSE ----------
HTTP/1.1 206 Partial Content
Content-Length: 449555642
Accept-Ranges: bytes
Content-Range: bytes 0-449555641/449555642
Content-Type: application/octet-stream
Date: Mon, 29 May 2017 01:12:50 GMT
2017/05/29 03:12:50 repository/index.go:454 repository.DecodeIndex 111 Start decoding index
2017/05/29 03:12:50 repository/index.go:486 repository.DecodeIndex 111 done
2017/05/29 03:12:54 repository/index.go:454 repository.DecodeIndex 108 Start decoding index
2017/05/29 03:13:01 repository/index.go:486 repository.DecodeIndex 108 done
Steps to reproduce the behavior
Take a Raspberry Pi 3, run Apache and a domotics system (neither of which is a small memory user), then try to make a backup and watch it OOM.
Do you have any idea what may have caused this?
Excessively sized mallocs? A 512 MB alloc is rather steep… `runtime: out of memory: cannot allocate 536870912-byte block (556793856 in use)`. I mean, it did not even start scanning, so it was most likely still reading indexes. Which then confuses me, because there should be no index of that size. <pause> I take that back:
-rw------- 1 admin administ 428.7M May 28 15:19 270faf1dc62ce9393f38599cea9b3997220bed216345d2b2673a5ba62425248d
-rw------- 1 admin administ 6.1M May 29 00:33 bef6d99358953f642115c290b1dd312ced2e2f7b34650524ac14156ce2a036f7
-rw------- 1 admin administ 48.7K May 29 00:34 e91b8b4186b086b8def47025cb88efc12a0ef04ef91f885e72d0c5af801b4724
and this was after a prune earlier today.
So… does this mean restic is impossible to use on systems with <= 1 GB of memory, like the Raspberry Pi 3? Or on the many small VMs out there? That seems like a showstopper for wider deployment.
About this issue
- Original URL
- State: closed
- Created 7 years ago
- Reactions: 1
- Comments: 48 (40 by maintainers)
Don’t worry about it 🙂 I get to see crypto break regularly during my day job as a pentester (sometimes I’m the one breaking it), so I’m very sensitive to these issues. On the other hand I know that most people don’t know all the details from the top of their heads (which is fine).
Checking the MAC only after data has been processed enables vulnerabilities. For example, Padding Oracle vulnerabilities rely on the fact that processing is done before a MAC is checked (if there even is one). For these vulnerabilities, leaking one bit of information about the plaintext enables decrypting the data and even encrypting arbitrary other data, without knowing the encryption key. It’s really astonishing. In general, verifying the authenticity of the ciphertext before even attempting to decrypt the data is an established best practice.
For a while there was a discussion about whether encrypt-then-MAC (the MAC is on the outside, so it can be checked before decryption) or MAC-then-encrypt (the MAC is encrypted, so the data needs to be decrypted before the MAC can be checked) is better. From my point of view this has been settled, and the former is clearly better. Here's a question on Crypto Stack Exchange: https://crypto.stackexchange.com/questions/202/should-we-mac-then-encrypt-or-encrypt-then-mac
I’m also experiencing this issue using the b2 backend. Repository is right around 1.5TB total, index directory is 6.8GB (2274 files, largest one <30MB). restic rapidly takes up ~23.5GB memory, then it seems to stabilize.
Personally I think the drive for keeping the internal data in JSON format is a mistake. The data is encrypted, so not directly readable anyway, and `restic cat xxx` can marshal to JSON and show that instead. Doing that would allow restic to use the more performant gob encoders, or whatever else we come up with.

Just to be clear, as I hinted in https://github.com/restic/restic/issues/979#issuecomment-343651588 above, we must verify the MAC before attempting to process the contents in any way. In particular, we must not decrypt or even JSON-decode data for which the MAC has not been verified yet.
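The gob-instead-of-JSON idea could look roughly like this; `blobRecord`, `encode`, and `toJSON` are hypothetical names, and a real index entry has many more fields:

```go
package main

import (
	"bytes"
	"encoding/gob"
	"encoding/json"
	"fmt"
)

// blobRecord is a hypothetical, heavily trimmed stand-in for an index entry.
type blobRecord struct {
	ID     string
	Offset uint32
	Length uint32
}

// encode stores records in the compact binary gob format.
func encode(recs []blobRecord) ([]byte, error) {
	var buf bytes.Buffer
	err := gob.NewEncoder(&buf).Encode(recs)
	return buf.Bytes(), err
}

// toJSON decodes the gob data and re-marshals it as JSON -- what a
// human-facing command like `restic cat` could do for display.
func toJSON(data []byte) ([]byte, error) {
	var recs []blobRecord
	if err := gob.NewDecoder(bytes.NewReader(data)).Decode(&recs); err != nil {
		return nil, err
	}
	return json.Marshal(recs)
}

func main() {
	data, _ := encode([]blobRecord{{ID: "270faf1d", Offset: 0, Length: 4096}})
	out, _ := toJSON(data)
	fmt.Println(string(out)) // [{"ID":"270faf1d","Offset":0,"Length":4096}]
}
```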
I don’t see how we can do that while having a streaming architecture (and I’ve tried; look for `DecryptReader` in the git history). We can (and should) build a streaming architecture, but after decryption. That’s why I was so interested in the memory usage of the JSON decoder independent of the source of the data (e.g. the `[]byte`).

I’ve had a quick look at the code in your branch, and it looks like the code in there does exactly what we must not do: start decrypting data before verifying the MAC. That’s terribly insecure and violates established best practice in crypto. I won’t merge any code which does that.
My ideas for improving the situation are:

- … `rebuild-index`), which are then processed sequentially. Each index file can then be limited to < 100 MiB.

I really appreciate your effort!
Yeah, that code still generates a big `[]byte` and then streams from it, so no surprises there.

Using the `indexJsonStreamer` is not going to help unless the entire workflow from backend -> json can be streamed. To that end, I experimented a bit with a `LoadAndDecryptStream()` method on repository. You can grab it here.
I had trouble with two things:

1. The MAC bytes are stored at the end of the file, and those bytes should not be fed to the decryption func. So we need a stream that “reserves” the last n bytes from the end of a stream. I wrote a “ReservedReadCloser”. I’m not entirely confident it covers every edge case, but I wrote a quick unit test to cover the obvious ones.
2. I don’t see a stream-compatible way to do `poly1305.Verify()`. Maybe I didn’t look hard enough? For the moment, my code just punts on this with a TODO.
It’s not exactly apples-to-apples, since poly1305 is not implemented, but here is a quick test with a 256MB index. Note that this does not involve JSON decoding at all; it just writes the data to stdout.
Hey, I did some experimentation and @mholt’s comment above seems to be correct. You do need to do a tiny bit of manual parsing (to navigate brackets and whatnot). But once you do that, you can call `Decode(&thing)` on just a small part of the JSON stream and not need to keep the entire JSON in memory all at once.

I wrote a stream-friendly version of LoadIndex and it performs way better on RAM usage than the current code, which uses `json.Unmarshal`. Check the memory logging in `TestLoadGiantIndexUnmarshal` vs `TestLoadGiantIndexStreaming`.

Unfortunately this is not a drop-in replacement, because much of restic’s other Load* funcs deal in `[]byte` rather than `Reader`. And I know you’re looking into a more general solution (multi-level indices, or whatever). But maybe something to think about for the future.

That’s unfortunately not applicable: it loads the whole (large) object into a buffer and then parses it. I’ve benchmarked that.