restic: Cannot find docs for the exclude file syntax

I can’t seem to find any docs on how the exclude file syntax is parsed.

For example: does it support wildcards? Regex? How does it differentiate between files and directories? Are path prefixes needed, and if so, relative to what (the cwd or the root)?

Some examples:

.qiv-trash (directory that could be anywhere on the filesystem)
.DS_Store
lost+found/
._*
desktop.ini
Thumbs.db (file that could be anywhere on the filesystem)
.Trash-* (the asterisk could be any number, is it needed?)
.tmp$ (file ending in .tmp)
~$ (file ending in a tilde)
~/.cache/ (cache directory in user home dir, using tilde syntax)
/full/path/to/directory/.syncthing/index*

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 3
  • Comments: 19 (10 by maintainers)

Most upvoted comments

To answer a few of your questions already:

  • All patterns are tested against the full path of a file/dir to be saved
  • Relative paths/patterns will match anywhere below the path to be saved
  • At the moment there’s no way to distinguish between a file and a directory, so --exclude foo will exclude any files and directories named foo. The same goes for --exclude foo/.

From your excludes file:

  • ._* will match all files and directories whose names start with a dot and an underscore
  • desktop.ini will match all files called desktop.ini exactly. So desktop.ini.bak is not excluded and is saved in the snapshot.
  • .Trash-* excludes files/dirs named .Trash-, .Trash-foobar, etc.
  • .tmp$ excludes all files/dirs literally named .tmp$, that is a dot, followed by tmp, followed by a dollar sign. No regexp expansion.
  • ~$ excludes all files/dirs literally named tilde dollar. For excluding all files/dirs ending in a tilde, use *~.
  • ~/.cache excludes the directory .cache in all dirs called tilde. For excluding the cache directory in your home directory only, use $HOME/.cache. Tilde is not expanded by restic. Environment variables are expanded, but only in a file read via --exclude-file; on the command line the shell expands both.
  • /full/path/to/directory/.syncthing/index* excludes all things with names starting with index below /full/path/to/directory/.syncthing.
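
To make the glob-versus-regexp distinction concrete, here is a small sketch in Python. It uses fnmatch.fnmatchcase as a stand-in for Go's filepath.Match (the two agree on * and ? but differ in some details), the helper name_matches is hypothetical, and the file names are made up. It only compares single names, not full paths.

```python
from fnmatch import fnmatchcase

def name_matches(pattern, name):
    """Shell-style (glob) match of one file or directory name.

    Stand-in for Go's filepath.Match; not restic's actual code.
    """
    return fnmatchcase(name, pattern)

# (pattern, hypothetical name) pairs taken from the excludes file above.
examples = [
    ("._*", "._resource"),              # prefix wildcard
    ("desktop.ini", "desktop.ini.bak"), # exact names only, so no match
    (".Trash-*", ".Trash-"),            # * may match zero characters
    (".tmp$", "build.tmp"),             # $ is a literal character, not an anchor
    (".tmp$", ".tmp$"),                 # only this literal name matches
    ("*~", "notes.txt~"),               # the glob way to say "ends in a tilde"
]

for pattern, name in examples:
    print(f"{pattern!r:14} vs {name!r:18} -> {name_matches(pattern, name)}")
```

The point of the sketch is only that regexp metacharacters such as $ carry no special meaning in a glob pattern.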

Hi, have you seen

Patterns use filepath.Glob internally, see filepath.Match for syntax. Additionally ** excludes arbitrary subdirectories. Environment-variables in exclude-files are expanded with os.ExpandEnv.

in https://github.com/restic/restic/blob/master/doc/manual.rst?

I think this should answer your questions.

I read the golang doc and I think the end-user (me!) isn’t going to know what restic is comparing an exclude to internally - is it the full path (e.g. /home/me/blah) or a path from the repository root (/blah or blah), or relative to the cwd (me/blah when I am at /home)?

The matching code there is modeled after what a shell would do. Ask yourself: if the file /home/user/secret2 exists, what would ls /home/user/secret print (provided the file secret does not exist)? It would report that there is no such file, because secret and secret2 are different names.

In more formal terms: if the pattern starts with a /, it is absolute and must match at the beginning of the string under test. So the pattern /home/user/secret does not match /home/user/somemount[...]: the pattern is not a prefix of the string.

You can imagine for yourself that the pattern and the file path are both split into their respective components:

  • /home/user/secret is split into [ROOT, "home", "user", "secret"] and the file /home/user/somemount/home/user/secret is split into [ROOT, "home", "user", "somemount", "home", "user", "secret"]. The string ROOT is used in this example to mark the root directory. You can see that the pattern is not contained in the file name.
  • Let’s look at the file /home/user/secret2, which is split into [ROOT, "home", "user", "secret2"]. Again you can see that the pattern is not contained in the file name.
  • For the file /home/user/secret/secret.txt, which is split into [ROOT, "home", "user", "secret", "secret.txt"] you can see that the pattern is indeed contained in the file name, right at the beginning: [ROOT, "home", "user", "secret", ...], therefore the pattern matches and the file is excluded.
  • Let’s say we have a relative exclude pattern of secret/secret.txt, which is split into ["secret", "secret.txt"]. You can see that this pattern can be found in the list for the file /home/user/secret/secret.txt, starting at offset 3: [ROOT, "home", "user", "secret", "secret.txt"], so the pattern matches.

When you have wildcards (*, ? and so on) in a path component, they are also tested. So for your first example, a pattern of /home/user/secret* would match the path /home/user/secret2.
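
The component-wise comparison described above can be sketched in a few lines of Python. This is not restic's implementation, only a model of the explanation: the names ROOT, split_path and matches are made up for illustration, fnmatch.fnmatchcase stands in for Go's filepath.Match, and the ** wildcard is left out.

```python
from fnmatch import fnmatchcase

ROOT = object()  # marker for the root directory, as in the lists above

def split_path(p):
    """Split a path or pattern into components, with ROOT first if absolute."""
    parts = [c for c in p.split("/") if c]
    return ([ROOT] if p.startswith("/") else []) + parts

def component_matches(pat, comp):
    """ROOT only equals ROOT; everything else is a glob match on one name."""
    if pat is ROOT or comp is ROOT:
        return pat is comp
    return fnmatchcase(comp, pat)

def matches(pattern, path):
    """True if the pattern's components occur in the path's components.

    An absolute pattern may only match at offset 0; a relative pattern
    may match at any offset.
    """
    pat, comps = split_path(pattern), split_path(path)
    if not pat:
        return False
    last = 0 if pat[0] is ROOT else len(comps) - len(pat)
    for off in range(last + 1):
        window = comps[off:off + len(pat)]
        if len(window) == len(pat) and all(
            component_matches(p, c) for p, c in zip(pat, window)
        ):
            return True
    return False

for pattern, path in [
    ("/home/user/secret", "/home/user/somemount/home/user/secret"),
    ("/home/user/secret", "/home/user/secret2"),
    ("/home/user/secret", "/home/user/secret/secret.txt"),
    ("secret/secret.txt", "/home/user/secret/secret.txt"),
    ("/home/user/secret*", "/home/user/secret2"),
]:
    print(pattern, "vs", path, "->", matches(pattern, path))
```

The five calls correspond to the cases discussed above: the first two do not match, the last three do.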

You have to specify the excludes as --exclude foo/bar/t.txt --exclude foor/bar2/1.txt, unless "foo/bar/t.txt foor/bar2/1.txt" is a single filename. Alternatively, use an exclude file as described in https://restic.readthedocs.io/en/stable/040_backup.html#excluding-files

Are negative excludes possible à la .gitignore? Say I want to exclude all content in directories named .meteor except for the nested dir .meteor/local/db, could I do this?

/etc/restic/excludes:

.meteor/
!.meteor/local/db

restic backup --exclude-file=/etc/restic/excludes

All these examples should be documented in the manual, I think.

Ah, I’m afraid that’s still not completely correct. I’ll describe how restic evaluates the exclude patterns. Let’s suppose that restic is run by a user in his home directory (/home/user) like this:

$ restic backup --exclude='*.bak' --exclude='/home/user/secret' --exclude='extra' ~

Then restic will see the following command line arguments (after expansion by the shell):

["restic", "backup", "--exclude=*.bak", "--exclude=/home/user/secret", "--exclude=extra", "/home/user"]

Then, it starts traversing /home/user. The following list describes what happens when the named file/dir is seen. restic always tests the complete path against the patterns:

  • file /home/user/foo.bak: The pattern *.bak matches and the file is not saved. The pattern is not absolute so it matches everywhere for all files ending in .bak.
  • dir /home/user/secret: The absolute pattern /home/user/secret matches, so the dir is not saved and not traversed
  • dir /home/user/foo/home/user/secret: No pattern matches, so the dir is saved.
  • dir /home/user/work/extra: The pattern extra matches, the dir is not saved.

I hope that this is a bit clearer now, I’ll add a section to the manual describing the process. The key take-away point is that the patterns are evaluated against the full path of the files during backup. So if you want to match a single directory, use the complete path, otherwise it may match several times somewhere.
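
The walkthrough above can be replayed with a small Python simulation. It is a sketch under stated assumptions, not restic's code: the directory tree is a made-up nested dict, the functions excluded and walk are hypothetical, fnmatch.fnmatchcase stands in for Go's filepath.Match, and every path is assumed to be absolute. It shows that an excluded directory is neither saved nor traversed.

```python
from fnmatch import fnmatchcase

def excluded(path, patterns):
    """True if any pattern matches the absolute `path` component-wise."""
    comps = path.strip("/").split("/")
    for pattern in patterns:
        pat = pattern.strip("/").split("/")
        # Absolute patterns are anchored at the root; relative ones may start anywhere.
        starts = [0] if pattern.startswith("/") else range(len(comps) - len(pat) + 1)
        for s in starts:
            window = comps[s:s + len(pat)]
            if len(window) == len(pat) and all(
                fnmatchcase(c, p) for p, c in zip(pat, window)
            ):
                return True
    return False

def walk(tree, prefix, patterns, saved):
    """Visit a nested dict (dirs are dicts, files are None), pruning excluded entries."""
    for name, child in sorted(tree.items()):
        path = f"{prefix}/{name}"
        if excluded(path, patterns):
            continue  # not saved and, for a directory, not traversed either
        saved.append(path)
        if isinstance(child, dict):
            walk(child, path, patterns, saved)
    return saved

# Hypothetical layout of /home/user mirroring the cases listed above.
home = {
    "foo.bak": None,
    "secret": {"key.txt": None},
    "foo": {"home": {"user": {"secret": {}}}},
    "work": {"extra": {"notes.txt": None}, "report.txt": None},
}

patterns = ["*.bak", "/home/user/secret", "extra"]
saved = walk(home, "/home/user", patterns, [])
for p in saved:
    print(p)
```

In this model /home/user/foo/home/user/secret is kept, while /home/user/secret and /home/user/work/extra are dropped together with everything below them.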

Any further questions? 😃

Thanks @fd0

So with “current directory”, you don’t mean the directory I was in when I launched the backup, but the directory that restic is currently in examining the files (apart from excludes that begin with a slash). Got it.

The behaviour for files and directories is slightly unexpected. I would have expected --exclude foo/ to back up the directory but not its contents, and --exclude foo to back up neither. Not sure why; from rsync, I guess.

My examples missed an important one: spaces! I guess I need to escape those and shell metacharacters with a backslash.

I ended up copying lots of these ones: https://gist.github.com/jult/e2eaedad6b9e29d95977fea0ddffae7d

Are comments allowed in the excludes file? Edit: https://github.com/restic/restic/pull/916/commits/c796d84fca48feea91ca3e85fbf38e16f764a468 looks like a hash is the comment character.
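
As a rough model of how such a file might be read, here is a Python sketch. Only two things come from this thread: a leading hash marks a comment, and environment variables are expanded while tilde is not. Skipping blank lines and trimming whitespace are assumptions, read_patterns is a hypothetical helper, and os.path.expandvars is a stand-in for Go's os.ExpandEnv (one difference: expandvars leaves undefined variables untouched, whereas os.ExpandEnv replaces them with an empty string).

```python
import os

def read_patterns(text):
    """Return the exclude patterns found in the text of an excludes file."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()                 # assumption: surrounding whitespace is ignored
        if not line or line.startswith("#"):
            continue                        # assumption: blank lines are skipped; "#" is a comment
        patterns.append(os.path.expandvars(line))  # expands $HOME, leaves ~ alone
    return patterns

os.environ["HOME"] = "/home/user"  # pinned so the example is reproducible

sample = """\
# editor backups
*~
$HOME/.cache

~/.cache
"""

print(read_patterns(sample))
# prints ['*~', '/home/user/.cache', '~/.cache']
```

The last entry stays as ~/.cache, which per the explanation earlier in the thread would match a .cache entry inside any directory literally named ~.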

Thanks for raising this issue, I think you have a valid point. The manual should explain the exclude filters without referencing godoc.org, and more examples are necessary.