Off-site backups are important, and even though I know this, I rarely implement
them on my own servers. Lately, I’ve been setting up rsnapshot to do hourly and
daily backups locally (to the same server), and I only occasionally do manual
backups to remote servers. I decided to install duply on all of the servers and
virtual machines that I care about and have them back up to a single backup
server. This backup server will also do daily encrypted backups to Amazon S3,
effectively giving me 3 redundant layers of backups.
If you haven’t heard of Duplicity or Duply before, Duply is basically a wrapper
for Duplicity which makes it easier to manage. Duplicity itself is similar to
rsnapshot, except that it uses tar to efficiently store differences between
backups (instead of hardlinks). Here’s the description from the man page:
Duplicity incrementally backs up files and directory by encrypting tar-format
volumes with GnuPG and uploading them to a remote (or local) file server.
Currently local, ftp, ssh/scp, rsync, WebDAV, WebDAVs, HSi and Amazon S3
backends are available. Because duplicity uses librsync, the incremental
archives are space efficient and only record the parts of files that have
changed since the last backup. Currently duplicity supports deleted files, full
Unix permissions, directories, symbolic links, fifos, etc., but not hard links.
I wrote this mainly as a reference for myself when I need to set duply up on
another server, but it might be useful for others as well.
CentOS 5 / 6 Instructions
Install the EPEL repo:
# CentOS 6:
rpm -Uvh http://download.fedoraproject.org/pub/epel/6/i386/epel-release-6-5.noarch.rpm
# CentOS 5:
rpm -Uvh http://download.fedoraproject.org/pub/epel/5/i386/epel-release-5-4.noarch.rpm
Install duplicity:
yum --enablerepo=epel install duplicity
Install duply:
Get the URL for the latest version from http://duply.net/?title=Duply-downloads, download it to your server, extract it, copy duply to /usr/local/bin/duply, and then make it executable with chmod +x /usr/local/bin/duply:
wget http://dev.justynshull.com/duply_1.5.5.1.tgz
tar xvzf duply_1.5.5.1.tgz
cp duply_1.5.5.1/duply /usr/local/bin/duply
chmod +x /usr/local/bin/duply
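Before going further, it can help to confirm that everything landed on the PATH. This is a minimal sketch, not part of duply itself; the function name `missing_tools` is made up for illustration, and the list of tools to check is an assumption based on the setup described here.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every command that is NOT on PATH.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the tool can be found
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example (prints nothing if everything is installed):
#   missing_tools duplicity duply gpg rsync
```

Any name it prints still needs to be installed before the first backup run.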
Note: Bug 675234 is a request to have duply put into the Fedora repo, but there
is also a .spec and source rpm if you wish to build an rpm yourself.
Set up a basic Duply profile
mkdir /etc/duply && chmod 700 /etc/duply
duply testvm1 create
Note: If you don’t create /etc/duply, then it will use $HOME/.duply by default.
Take a look at the options in /etc/duply/testvm1/conf and configure it to your
liking. There are many different TARGET formats you can use, including ssh,
rsync over ssh, ftp, and even Amazon S3. You can view them all here: URL Formats
This is what mine usually look like (without encryption) using rsync over ssh:
# egrep -v '^#|^$' /etc/duply/testvm1/conf
GPG_KEY='disabled'
TARGET='rsync://backups.justynshull.com//home/testvm1/backups'
TARGET_USER='testvm1'
SOURCE='/'
MAX_AGE=3M
MAX_FULL_BACKUPS=3
MAX_FULLBKP_AGE=30D
VOLSIZE=3500
DUPL_PARAMS="$DUPL_PARAMS --full-if-older-than $MAX_FULLBKP_AGE "
DUPL_PARAMS="$DUPL_PARAMS --volsize $VOLSIZE "
DUPL_PARAMS="$DUPL_PARAMS --include=/etc \
--include=/home \
--include=/root \
--include=/var/www \
--include=/var/lib/mysql \
--include=/var/log \
--exclude=/** "
I prefer to use multiple --include= options rather than fill
/etc/duply/testvm1/exclude with every directory I *don’t* want backed up.
Either way will work, though.
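For comparison, here is a rough sketch of the exclude-file route. It assumes duply hands the profile's exclude file to duplicity as a globbing file list, where lines starting with `+ ` include a path and lines starting with `- ` exclude it, so a whitelist can be written there too. The helper name `write_exclude_file` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write an exclude file that whitelists the given
# directories and excludes everything else.
write_exclude_file() {
    exclude_file="$1"
    shift
    : > "$exclude_file"                       # start with an empty file
    for dir in "$@"; do
        printf '+ %s\n' "$dir" >> "$exclude_file"
    done
    printf '%s\n' '- **' >> "$exclude_file"   # anything not listed above is skipped
}

# Example:
#   write_exclude_file /etc/duply/testvm1/exclude \
#       /etc /home /root /var/www /var/lib/mysql /var/log
```

If you go this way, the --include/--exclude line in DUPL_PARAMS should come out of the conf so the two lists don't fight each other.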
Also, if you’re going to use rsync or ssh/sftp, I’d recommend setting up the
backup server so that you can log in with ssh keys, and generating a separate
key for each server you’re backing up from. You’ll also need to have ssh’d into
the backup server as that user at least once to avoid errors about the target
host key.
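The key setup could look something like the sketch below. The function name `make_backup_key` and the key comment are made up for illustration; the host and user are the ones from the example profile. The key has no passphrase because cron has to be able to use it unattended.

```shell
#!/bin/sh
# Hypothetical helper: create a passphrase-less RSA key for unattended
# backups, unless a key already exists at that path.
make_backup_key() {
    key_file="$1"
    comment="$2"
    if [ -e "$key_file" ]; then
        echo "key already exists: $key_file" >&2
        return 0
    fi
    ssh-keygen -q -t rsa -b 4096 -N '' -C "$comment" -f "$key_file"
}

# Example, run once on the server being backed up:
#   make_backup_key /root/.ssh/id_rsa "duply@testvm1"
#   ssh-copy-id -i /root/.ssh/id_rsa.pub testvm1@backups.justynshull.com
#   ssh testvm1@backups.justynshull.com true   # also records the host key
```

The last command doubles as the "ssh in at least once" step, since it adds the backup server to known_hosts.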
Encryption
Duply/Duplicity supports using GPG to encrypt volumes before uploading them to the remote server. The easiest way to enable encryption is to put this in your conf:
# comment out GPG_KEY from earlier
#GPG_KEY='disabled'
GPG_PW='secret_password'
This will encrypt the volumes using the passphrase you put in GPG_PW, but you
can refer to the documentation for how to set it up to use actual gpg keys.
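If you'd rather not invent a passphrase by hand, something like this sketch can generate one. The function name `gen_passphrase` is hypothetical, and reading 32 bytes from /dev/urandom is just one reasonable choice.

```shell
#!/bin/sh
# Hypothetical helper: print a random passphrase suitable for GPG_PW.
gen_passphrase() {
    # 32 random bytes, base64-encoded so the result is plain printable text
    head -c 32 /dev/urandom | base64
}

# Example:
#   gen_passphrase
#   chmod 600 /etc/duply/testvm1/conf   # the conf now holds a secret
```

Keep a copy of the passphrase somewhere other than the server itself; without it the encrypted backups can't be restored.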
Including MySQL Backups
If you’re running mysql on the server, you should consider adding something similar to this to /etc/duply/testvm1/pre, which gets run automatically by duply, to dump all databases before backing up the server.
#!/bin/sh
# Dump every database except information_schema to its own .sql file.
mkdir -pv /root/db_backups
for db in $(mysql -uroot -e 'show databases' -s --skip-column-names | grep -v 'information_schema');
do
mysqldump -uroot "$db" > "/root/db_backups/$db.sql";
# pause between dumps
sleep 10;
done
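One thing to watch in the database list: `grep -v 'information_schema'` matches substrings, so it would also skip a real database whose name happens to contain that text. A slightly stricter filter might look like the sketch below. The function name `filter_dbs` is made up, and dropping performance_schema as well is an assumption that only matters on MySQL versions that have it.

```shell
#!/bin/sh
# Hypothetical helper: read database names on stdin and drop MySQL's
# virtual schemas, matching whole lines only.
filter_dbs() {
    grep -vx -e 'information_schema' -e 'performance_schema'
}

# Example use in the pre script's loop:
#   for db in $(mysql -uroot -e 'show databases' -s --skip-column-names | filter_dbs); do
#       mysqldump -uroot "$db" > "/root/db_backups/$db.sql"
#   done
```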
Run your first backup:
duply testvm1 backup
Automate it
If all goes well (no errors), then you should be okay to set up a cron job to run duply backup.
# crontab -l
30 3 * * * /usr/local/bin/duply testvm1 backup
30 5 * * sun /usr/local/bin/duply testvm1 backup_verify_purge --force
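Cron output is easy to lose, so it may be worth wrapping the duply calls in something that keeps a log. This is only a sketch; the function name `run_logged` and the log path in the example are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run a command, append its output to a log file with
# start/end timestamps, and return the command's exit status.
run_logged() {
    log_file="$1"
    shift
    echo "$(date '+%Y-%m-%d %H:%M:%S') start: $*" >> "$log_file"
    status=0
    "$@" >> "$log_file" 2>&1 || status=$?
    echo "$(date '+%Y-%m-%d %H:%M:%S') exit status: $status" >> "$log_file"
    return "$status"
}

# Example, from a small script that cron calls instead of duply directly:
#   run_logged /var/log/duply-testvm1.log /usr/local/bin/duply testvm1 backup
```

Because the exit status is passed through, anything that checks the result of the cron job still sees a failed backup as a failure.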
Purge Old Backups
If you use the above crontab, the 2nd line will run once a week, purging old backups from the remote server. If you’re worried about keeping too many backups, you might want to run it more often and also lower the retention settings (MAX_AGE, MAX_FULL_BACKUPS) in the duply profile configuration.
Verify Backups
Duply makes it easy to see what you’re currently backing up.
To see a list of backups stored on the remote server:
duply testvm1 status
To see a list of files that have changed since the last backup:
duply testvm1 verify
List all files in the backup from one day ago (leave out the age to show the latest):
duply testvm1 list 1D
Restore Backups
It’s just as easy to restore complete or partial backups.
Restore the entire latest backup to /tmp/restore:
duply testvm1 restore /tmp/restore
Restore backup from 7 days ago to /tmp/restore:
duply testvm1 restore /tmp/restore 1W
Restore single file or directory to /tmp/restore:
duply testvm1 fetch home/justyns /tmp/restore
# When using 'fetch', make sure you leave off the leading slash.
Restore a file from a month ago:
duply testvm1 fetch home/justyns/plans_for_world_dom.txt /home/justyns/plans_for_world_dom.txt 1M
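A restore is only useful if what comes back matches what went in, so an occasional test restore is worth the time. Here is a minimal sketch for comparing a restored directory against the live one; the function name `compare_trees` and the paths in the example are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list files that differ between a live directory and
# its restored copy. Returns 0 only when the two trees match.
compare_trees() {
    live_dir="$1"
    restored_dir="$2"
    diff -rq "$live_dir" "$restored_dir"
}

# Example:
#   duply testvm1 fetch etc /tmp/restore/etc
#   compare_trees /etc /tmp/restore/etc && echo "restore matches"
```

Files that changed since the last backup will show up as differences, so this works best on directories that don't change often.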