influxdb: Remote restore not working - "DB metadata not changed. database may already exist"

InfluxDB 1.5 (Alpine Docker version)

backup

influxd backup -portable -database "metrics-cadvisor" -host $DATABASE_HOST:$DATABASE_PORT "$BACKUP_PATH"

influxd backup -portable -database "metrics-health" -host $DATABASE_HOST:$DATABASE_PORT "$BACKUP_PATH"

restore

influxd restore -portable -database "metrics-cadvisor" -host $DATABASE_HOST:$DATABASE_PORT "$BACKUP_PATH"

influxd restore -portable -database "metrics-health" -host $DATABASE_HOST:$DATABASE_PORT "$BACKUP_PATH"

error

2018/03/16 17:18:44 error updating meta: DB metadata not changed. database may already exist
restore: DB metadata not changed. database may already exist
Restore failed

I tried deleting the 2 databases before restoring, and it didn’t work.
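
For reference, dropping the two databases through the influx CLI would look roughly like the commands below. Note that the CLI talks to the HTTP API (typically port 8086), not the 8088 backup/restore port used above; the port value here is an assumption:

# drop both databases via the HTTP API before retrying the restore
influx -host "$DATABASE_HOST" -port 8086 -execute 'DROP DATABASE "metrics-cadvisor"'
influx -host "$DATABASE_HOST" -port 8086 -execute 'DROP DATABASE "metrics-health"'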

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 8
  • Comments: 15 (5 by maintainers)

Most upvoted comments

I don’t want to complain about open source software, and I greatly appreciate the work you are doing, but is incremental restore planned? Incremental backups without a way to restore them incrementally, short of intensive manual work, seem rather cumbersome and a huge oversight.

@mcappadonna yes, that’s correct.

Also, to add to @petetnt’s great work: before you run the SELECT * INTO operation suggested by @aanthony1243, you’ll want to (a rough sketch follows this list):

  1. disable query-timeout
  2. set some sort of CPU limit on InfluxDB so it doesn’t render the machine completely unusable during the restore
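
A rough sketch of those two tweaks, assuming a stock influxdb.conf and an InfluxDB container named influxdb (both are assumptions; adjust to your deployment):

# 1. in influxdb.conf, [coordinator] section: "0s" disables the per-query timeout
#      query-timeout = "0s"
#    then restart influxd
# 2. cap the CPU available to the container so the host stays responsive
docker update --cpus 1.5 influxdb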

I just tried an incremental restore of about 1.6M data points on a 2 core machine, and it took about 2 minutes. This operation is slower than the initial /usr/bin/influxd restore command by one or two orders of magnitude, which makes sense.

Edit: We’re solidly in the “low” category on this page – 18k series with appropriately sized hardware – and doing this SELECT * INTO query on a single day’s incremental backup takes down the machine by eating all the RAM. I’m afraid I’m going to have to invest a couple more days and write some custom code to stream rows between databases in a “nice” way – chunking by series and by shard. I’ve looked into Kapacitor and export/import (the data becomes too large on disk), and neither of them solves the problem.

Edit2: I found a way to merge everything from one database into another: pipe the output of an export command directly to an import command. Uses an extra ~500MB of RAM on my machine.

# source and target databases: merge everything from DB_FROM into DB_TO
DB_FROM=r_air
DB_TO=r_air2

# a named pipe keeps the exported data separate from influx_inspect's own stdout output
fifo_name=fifo-${DB_FROM}-to-${DB_TO}
mkfifo $fifo_name

# export the source database as line protocol into the pipe, in the background
influx_inspect export -datadir /var/lib/influxdb/data -waldir /var/lib/influxdb/wal -database $DB_FROM -out $fifo_name &

# rewrite the database name in the DDL and context headers, then import into the target
cat $fifo_name \
    | sed -e "s/^CREATE DATABASE ${DB_FROM} WITH/CREATE DATABASE ${DB_TO} WITH/" \
    | sed -e "s/^# CONTEXT-DATABASE:${DB_FROM}$/# CONTEXT-DATABASE:${DB_TO}/" \
    | influx -import -path /dev/stdin

rm $fifo_name

Note that you can’t use -out /dev/stdout on the influx_inspect command, because it also writes its own progress output to stdout, which corrupts the stream for influx -import. A named pipe is required.

The only downside: it’s extremely slow. The number of lines to process is somewhere around 3/4 the number of bytes in the gzipped backup, so a 1GB backup will require processing about 750M lines, and at 100k lines/sec on a modest machine, that’s about 2 hours.

Hi @entone, try:

influxd restore -portable -db "brood" -newdb "brood1" -host localhost:8088 ~/Desktop/influx-backup-test-2

To clarify: -db identifies the database in the backup file that you want to restore, and -newdb is the name to give to the imported database. If -newdb is not given, it defaults to the original name. However, you must restore to a database name that doesn’t already exist: if the original db already exists in the system, the restore will fail, which is why you need both -db and -newdb in this case.
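
Applied to the databases from the original report, the whole workaround might look roughly like this; the _restored suffix and the autogen retention policy name are assumptions, and the influx CLI steps connect to the HTTP API (default port 8086) rather than the 8088 restore port:

# restore the backup under a temporary name, since "metrics-cadvisor" already exists
influxd restore -portable -db "metrics-cadvisor" -newdb "metrics-cadvisor_restored" -host $DATABASE_HOST:$DATABASE_PORT "$BACKUP_PATH"

# copy the restored data into the live database (fully qualified names, so no -database flag is needed)
influx -host "$DATABASE_HOST" -execute 'SELECT * INTO "metrics-cadvisor"."autogen".:MEASUREMENT FROM "metrics-cadvisor_restored"."autogen"./.*/ GROUP BY *'

# drop the temporary database once the copy has been verified
influx -host "$DATABASE_HOST" -execute 'DROP DATABASE "metrics-cadvisor_restored"'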