Another Incremental Back-up script?
Recently, we published a similar guide on creating an incremental back-up script, which was nice, but could be improved. The retention period of the previous script was determined by changing the interval and the amount of increments. E.g. a daily interval and 14 increments translated to a retention period of 14 days. But what if we want a retention period of a month? Or ten years? This would mean 30 or 3650 increments, respectively.
Even with incremental back-ups, that many increments need lots of disk space. Moreover, it's simply not dynamic, thus not suitable as a proper back-up solution. We've created a new iteration, allowing you to set multiple intervals (e.g. hourly, daily, weekly, monthly, yearly) and set the amount of increments per interval.
This will hopefully be the last back-up script you'll need in a long time. It's the only one I'm using anyways.
TL;DR: The entire script
#!/bin/bash
# Title: Perfacilis Incremental Back-up script
# Description: Create back-ups of dirs and dbs by copying them to Perfacilis' back-up servers
# We strongly recommend to put this in /etc/cron.hourly/backup
# Author: Roy Arisse <support@perfacilis.com>
# See: https://www.perfacilis.com/blog/systeembeheer/linux/rsync-daily-weekly-monthly-incremental-back-ups.html
# Version: 0.12
# Usage: bash /etc/cron.hourly/backup
readonly BACKUP_LOCAL_DIR="/backup"
readonly BACKUP_DIRS=($BACKUP_LOCAL_DIR /home /root /etc /var/www)
readonly RSYNC_TARGET="username@backup.perfacilis.com::profile"
readonly RSYNC_DEFAULTS="-trlqpz4 --delete --delete-excluded --prune-empty-dirs"
readonly RSYNC_EXCLUDE=(tmp/ temp/)
readonly RSYNC_SECRET='RSYNCSECRETHERE'
readonly MYSQL="mysql --defaults-file=/etc/mysql/debian.cnf"
readonly MYSQLDUMP="mysqldump --defaults-file=/etc/mysql/debian.cnf -E -R --max-allowed-packet=512MB -q --single-transaction -Q --skip-comments"
# Amount of increments per interval and duration per interval resp.
readonly -A INCREMENTS=([hourly]=24 [daily]=7 [weekly]=4 [monthly]=12 [yearly]=5)
readonly -A DURATIONS=([hourly]=3600 [daily]=86400 [weekly]=604800 [monthly]=2419200 [yearly]=31536000)
# ++++++++++ NO CHANGES REQUIRED BELOW THIS LINE ++++++++++
set -e
export LC_ALL=C
log() {
MSG=`echo $1`
logger -p local0.notice -t `basename $0` -- $MSG
# Interactive shell
if tty -s; then
echo $MSG
fi
}
check_only_instance() {
# Assign file handle Bob to this script, because we can
exec 808<$0
# Test flock directly: with "set -e" a failing flock would abort the script before $? is checked
if ! flock -n 808; then
log "Already running"
exit 0
fi
}
prepare_local_dir() {
[ -d $BACKUP_LOCAL_DIR ] || mkdir -p $BACKUP_LOCAL_DIR
}
prepare_remote_dir() {
local TARGET="$1"
local RSYNC_OPTS=$(get_rsync_opts)
local EMPTYDIR=$(mktemp -d)
local DIR TREE
if [ -z "$TARGET" ]; then
echo "Usage: prepare_remote_dir remote/dir/structure"
exit 1
fi
# Remove options that delete empty dir
RSYNC_OPTS=$(echo "$RSYNC_OPTS" | sed -E 's/--(delete|delete-excluded|prune-empty-dirs)//g')
for DIR in ${TARGET//\// }; do
TREE="$TREE/$DIR"
rsync $RSYNC_OPTS $EMPTYDIR $RSYNC_TARGET/${TREE/#\//}
done
rm -rf $EMPTYDIR
}
get_last_inc_file() {
local PERIOD="$1"
if [ -z "$PERIOD" ]; then
echo "Usage: ${FUNCNAME[0]} daily"
exit 1
fi
echo "$BACKUP_LOCAL_DIR/last_inc_$PERIOD"
}
get_next_increment() {
local PERIOD="$1"
local LIMIT="${INCREMENTS[$PERIOD]}"
local LAST NEXT INCFILE
if [ -z "$PERIOD" -o -z "$LIMIT" ]; then
echo "Usage: get_next_increment period"
echo "- period = 'hourly', 'daily', 'weekly', 'monthly', 'yearly'"
exit 1
fi
INCFILE=$(get_last_inc_file $PERIOD)
if [ -f "$INCFILE" ]; then
LAST=$(cat "$INCFILE" | tr -d "\n")
fi
if [ -z "$LAST" ]; then
echo 0
return
fi
NEXT=$(($LAST+1))
if [ "$NEXT" -ge "$LIMIT" ]; then
echo 0
return
fi
echo $NEXT
}
# Return biggest interval to backup
get_interval_to_backup() {
local NOW=$(date +%s)
local LAST PERIOD INCFILE DURATION DIFF
local TODO=""
# Sort associative array: biggest first
for PERIOD in "${!DURATIONS[@]}"; do
echo "${DURATIONS["$PERIOD"]} $PERIOD"
done | sort -rn | while read DURATION PERIOD; do
# Skip disabled intervals
if [[ ${INCREMENTS[$PERIOD]} -eq 0 ]]; then
continue;
fi
LAST=0
INCFILE=$(get_last_inc_file $PERIOD)
if [ -f "$INCFILE" ]; then
LAST=$(date -r "$INCFILE" +%s)
fi
DIFF=$(($NOW - $LAST))
if [ $DIFF -ge $DURATION ]; then
echo "$PERIOD"
break
fi
done
}
get_rsync_opts() {
local EXCLUDE=`dirname $0`/rsync.exclude
local SECRET=`dirname $0`/rsync.secret
local OPTS="$RSYNC_DEFAULTS"
if [ ! -z "$RSYNC_EXCLUDE" ]; then
if [ ! -f $EXCLUDE ]; then
printf '%s\n' "${RSYNC_EXCLUDE[@]}" > $EXCLUDE
chmod 600 $EXCLUDE
fi
OPTS="$OPTS --exclude-from=$EXCLUDE"
fi
if [ ! -z "$RSYNC_SECRET" ]; then
if [ ! -f $SECRET ]; then
echo $RSYNC_SECRET > $SECRET
chmod 600 $SECRET
fi
OPTS="$OPTS --password-file=$SECRET"
fi
echo "$OPTS"
}
backup_packagelist() {
local TODO=$(get_interval_to_backup)
if [ -z "$TODO" ]; then
return
fi
log "Back-up list of installed packages"
dpkg --get-selections > $BACKUP_LOCAL_DIR/packagelist.txt
}
backup_mysql() {
local TODO=$(get_interval_to_backup)
local DB
if [ -z "$TODO" ]; then
return
fi
if [ -z "$MYSQL" -o -z "$MYSQLDUMP" ]; then
log "MySQL not set up, skipping database backup."
return
fi
log "Back-up mysql databases:"
for DB in `$MYSQL -e 'show databases' | grep -v 'Database'`; do
if [ $DB = 'information_schema' -o $DB = 'performance_schema' ]; then
continue
fi
log "- $DB"
$MYSQLDUMP $DB | gzip > $BACKUP_LOCAL_DIR/$DB.sql.gz
done
}
backup_folders() {
local RSYNC_OPTS=$(get_rsync_opts)
local DIR TARGET INC INCDIR
local VANISHED='^(file has vanished: |rsync warning: some files vanished before they could be transferred)'
local PERIOD=$(get_interval_to_backup)
if [ -z "$PERIOD" ]; then
log "No intervals to back-up yet."
exit
fi
INC=$(get_next_increment $PERIOD)
log "Moving $PERIOD back-up to target: $INC"
prepare_remote_dir "current"
for DIR in ${BACKUP_DIRS[@]}; do
TARGET=${DIR/#\//}
TARGET=${TARGET//\//_}
# Make path absolute if target is not RSYNC profile
# Also remove "user@server:" for SSH setups
INCDIR="/$PERIOD/$INC/$TARGET"
if [ -z "$RSYNC_SECRET" ]; then
INCDIR="${RSYNC_TARGET##*:}$INCDIR"
fi
log "- $DIR"
rsync $RSYNC_OPTS --backup --backup-dir=$INCDIR \
$DIR/ $RSYNC_TARGET/current/$TARGET 2>&1 | (egrep -v "$VANISHED" || true)
done
}
signoff_increments() {
local STARTTIME="$1"
local PERIOD=$(get_interval_to_backup)
local INC INCFILE
INC=$(get_next_increment $PERIOD)
INCFILE=$(get_last_inc_file $PERIOD)
echo $INC > "$INCFILE"
touch -t "$STARTTIME" "$INCFILE"
}
cleanup() {
rm -f `dirname $0`/rsync.exclude
rm -f `dirname $0`/rsync.secret
}
main() {
starttime=$(date +%Y%m%d%H%M.%S)
log "Back-up initiated at `date`"
trap "cleanup" EXIT
check_only_instance
prepare_local_dir
backup_packagelist
backup_mysql
backup_folders
signoff_increments $starttime
log "Back-up completed at `date`"
}
main
How it works
Setting increments
The INCREMENTS variable stores the amount of increments to save per period. The DURATIONS variable stores how long, in seconds, a period is. This variable only needs changing if you want to alter the duration or add new periods.
In INCREMENTS, you can set the amount to "0" to disable that period. For every increment you include, a folder on the back-up target location is created automatically.
Keep in mind both vars are associative arrays, so make sure the formatting is right. If you're interested, Andy Balaam's blogpost is a great explanation. If you're not interested, just look at the current formatting and change as you wish.
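As a minimal sketch of how these two associative arrays behave (Bash 4 or newer): the keys and values below mirror the script's defaults, except that yearly is set to 0 purely to illustrate a disabled period. The helper name enabled_intervals is hypothetical and not part of the script.

```bash
#!/bin/bash
# Same structure as in the script; yearly is 0 here only to show a disabled period.
declare -A INCREMENTS=([hourly]=24 [daily]=7 [weekly]=4 [monthly]=12 [yearly]=0)
declare -A DURATIONS=([hourly]=3600 [daily]=86400 [weekly]=604800 [monthly]=2419200 [yearly]=31536000)

# Hypothetical helper: print every period with a non-zero amount of increments,
# sorted from the longest to the shortest duration.
enabled_intervals() {
  local period
  for period in "${!INCREMENTS[@]}"; do
    if [ "${INCREMENTS[$period]}" -gt 0 ]; then
      echo "${DURATIONS[$period]} $period"
    fi
  done | sort -rn | cut -d' ' -f2
}

enabled_intervals
```

Because associative arrays have no guaranteed order, the sketch sorts on duration before printing, in the same spirit as the script's get_interval_to_backup function.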
Installation
Copy the contents of the entire script into a file named "backup" and store it in "/etc/cron.hourly":
sudo nano /etc/cron.hourly/backup
sudo chmod +x /etc/cron.hourly/backup
Don't forget, if you've created an hourly or even shorter period, the script needs to be called more often. Save the file somewhere else and call it accordingly from /etc/crontab (or any other method you like).
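To decide how often cron has to call the script, it can help to look at the shortest enabled period. The sketch below is a rough illustration, not part of the script: the quarterhourly period, its values and the function name shortest_period are made up for this example.

```bash
#!/bin/bash
# Hypothetical configuration with an extra 15-minute period added.
declare -A INCREMENTS=([quarterhourly]=8 [hourly]=24 [daily]=7)
declare -A DURATIONS=([quarterhourly]=900 [hourly]=3600 [daily]=86400)

# Print the duration (in seconds) of the shortest enabled period.
shortest_period() {
  local period shortest=""
  for period in "${!DURATIONS[@]}"; do
    # Skip disabled periods (0 increments)
    [ "${INCREMENTS[$period]:-0}" -gt 0 ] || continue
    if [ -z "$shortest" ] || [ "${DURATIONS[$period]}" -lt "$shortest" ]; then
      shortest="${DURATIONS[$period]}"
    fi
  done
  echo "$shortest"
}

echo "Shortest enabled period: $(shortest_period) seconds"
```

With a 900-second period, a line such as `*/5 * * * * root /usr/local/sbin/backup` in /etc/crontab would be one way to call the script more often than that period; the path is only an example.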
The following variables probably need changing:
- BACKUP_LOCAL_DIR: Folder to keep required tracking files;
- BACKUP_DIRS: The folders you want to have back when your computer or server dies, don't remove $BACKUP_LOCAL_DIR;
- RSYNC_TARGET: Where the actual back-up should be stored, i.e. the remote, possibly off-site, location;
- RSYNC_SECRET: Optional, if the Rsync profile on the remote server requires a secret;
- MYSQL: Either leave as is or replace --defaults-file=/etc/mysql/debian.cnf with -uUSERNAME -pPASSWORD parameters;
- MYSQLDUMP: Same as the MYSQL variable.
The following variables only need to be checked:
- INCREMENTS: Change the amount of increments you want to save in addition to the full back-up per period.
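The retention per period follows from multiplying the amount of increments by the duration of that period. A small sketch of that arithmetic, using the script's default values; the helper name retention_days is hypothetical.

```bash
#!/bin/bash
declare -A INCREMENTS=([hourly]=24 [daily]=7 [weekly]=4 [monthly]=12 [yearly]=5)
declare -A DURATIONS=([hourly]=3600 [daily]=86400 [weekly]=604800 [monthly]=2419200 [yearly]=31536000)

# Hypothetical helper: retention of one period in whole days.
retention_days() {
  local period="$1"
  echo $(( ${INCREMENTS[$period]} * ${DURATIONS[$period]} / 86400 ))
}

for period in hourly daily weekly monthly yearly; do
  echo "$period: $(retention_days "$period") days"
done
```

Note that the default monthly duration of 2419200 seconds equals 28 days, so 12 monthly increments span 336 days rather than a full calendar year.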
Full and Incremental Back-ups
The first time the script runs, it creates one full back-up and stores it in the current folder.
The next run, the first increment, it stores that increment in a folder named after the period and the increment number, e.g. daily/1. The previous versions of files modified since the last run are moved to this folder and the latest copy is stored in the "current" folder. For the amount of given increments for a period, a new increment is created every run, e.g. daily/2, daily/3, etc.
Finally, when the amount of increments has been reached, a new full back-up is stored in the "current" folder. After that, the increment folders are updated one by one.
To make it easier to find files modified before a certain date or time, each folder's timestamp is updated to match the time it ran.
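The way the increment counter wraps around can be sketched in isolation. The function below is a hypothetical stand-alone version of the counter logic in get_next_increment, taking the last increment and the limit as arguments instead of reading them from the tracking file and the INCREMENTS array.

```bash
#!/bin/bash
# Hypothetical stand-alone version of the counter logic:
# $1 = last completed increment (empty if none yet), $2 = amount of increments.
next_increment() {
  local last="$1" limit="$2" next
  if [ -z "$last" ]; then
    echo 0
    return
  fi
  next=$(( last + 1 ))
  if [ "$next" -ge "$limit" ]; then
    # Limit reached: start over at the first increment folder
    echo 0
  else
    echo "$next"
  fi
}

# Walk through ten runs with 7 daily increments.
last=""
for run in 1 2 3 4 5 6 7 8 9 10; do
  last=$(next_increment "$last" 7)
  echo "run $run -> daily/$last"
done
```

On the very first run the counter is 0; at that point the "current" folder is presumably still empty, so there are no older file versions to move yet.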
Back-up to a local folder or USB disk instead of a remote Rsync server
The RSYNC_TARGET variable dictates the remote, preferably off-site, location for the back-up. For a local target, point it at the directory where the disk is mounted, not at the device itself. For example, if your back-up USB disk /dev/sdc1 is mounted at /mnt/usb (an example path, use lsblk to find out where it's mounted), change it as follows:
readonly RSYNC_TARGET="/mnt/usb"
readonly RSYNC_DEFAULTS="-trlqz4 --delete --delete-excluded --prune-empty-dirs"
readonly RSYNC_EXCLUDE=(/tmp /temp)
readonly RSYNC_SECRET=""
Don't forget to empty the RSYNC_SECRET variable, to ensure it all works as it should.
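Why the secret has to be empty can be seen in the script's backup_folders function: only when RSYNC_SECRET is empty does it prefix the increment folder with the target path. Below is a stand-alone sketch of that logic; the function name incdir_for and the paths /mnt/usb and /backups are assumptions made for illustration.

```bash
#!/bin/bash
# Hypothetical stand-alone version of how backup_folders builds --backup-dir.
# Arguments: rsync target, rsync secret, period, increment, target folder name.
incdir_for() {
  local rsync_target="$1" rsync_secret="$2" period="$3" inc="$4" target="$5"
  local incdir="/$period/$inc/$target"
  if [ -z "$rsync_secret" ]; then
    # No secret: not an rsync profile, so make the path absolute on the target
    # and strip a leading "user@server:" for SSH setups.
    incdir="${rsync_target##*:}$incdir"
  fi
  echo "$incdir"
}

incdir_for "/mnt/usb" "" daily 1 home
incdir_for "user@server:/backups" "" daily 1 home
incdir_for "username@backup.perfacilis.com::profile" "secret" daily 1 home
```

The three calls print /mnt/usb/daily/1/home, /backups/daily/1/home and /daily/1/home respectively. With a secret left in place for a local target, the increments would end up in /daily/... on the root filesystem instead of on the back-up disk.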
What's BACKUP_LOCAL_DIR for?
The back-up script keeps track of which increment it last completed, by storing a file per period in the BACKUP_LOCAL_DIR folder, e.g. "last_inc_hourly", "last_inc_daily", etc. The timestamp of these files is used to determine when that increment was created, to see if the period has elapsed. If you remove these files, or the dir entirely, the script will start with "current" as explained above.
This ensures that if a run was missed (because your laptop was powered off, or because your server was rebooting) the next increment is created as soon as it powers on again, though not sooner than the given period duration.
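The timestamp check can be sketched on its own as follows. This is an illustration under assumptions, not the script's exact code: the function name is_period_due is hypothetical, a temporary directory stands in for BACKUP_LOCAL_DIR, and the date -r FILE syntax assumes GNU date.

```bash
#!/bin/bash
# Hypothetical stand-alone check: is a period due, given its tracking file?
# $1 = tracking file, $2 = period duration in seconds.
is_period_due() {
  local incfile="$1" duration="$2"
  local now last=0
  now=$(date +%s)
  if [ -f "$incfile" ]; then
    # Modification time of the tracking file as a Unix timestamp (GNU date syntax)
    last=$(date -r "$incfile" +%s)
  fi
  # A missing file leaves last at 0, so the period always counts as due
  [ $(( now - last )) -ge "$duration" ]
}

# Demo with a temporary directory standing in for BACKUP_LOCAL_DIR.
demo_dir=$(mktemp -d)
touch "$demo_dir/last_inc_hourly"

if is_period_due "$demo_dir/last_inc_hourly" 3600; then
  echo "hourly: due"
else
  echo "hourly: not due yet"
fi

if is_period_due "$demo_dir/last_inc_daily" 86400; then
  echo "daily: due"
else
  echo "daily: not due yet"
fi

rm -rf "$demo_dir"
```

In the demo, the hourly tracking file was just touched, so that period is not due yet, while the daily tracking file does not exist and is therefore treated as due.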
Finally, this folder contains a local copy of all created database dumps.
Conclusion
Is this the final back-up script we'll ever create? Probably not, there's always room for improvement. This script allows you to create proper back-ups you can rely on, that span big retention periods, without requiring an unhealthy amount of disk space. In our opinion, it's a healthy mix between incremental and full back-ups, allowing for proper disaster recovery.
The current script will only function on Linux systems, or at least systems running Bash. It's unknown if it will run using the Windows Subsystem for Linux on Windows 10. Therefore we made a Windows PowerShell Rsync backup script instead.
Finally, the script is probably not suitable for the less tech-savvy among us, but that, in my humble opinion, might be a pretty good user filter in itself: if you can't get it running, don't use it.
Perfacilis Back-up Service
The Perfacilis Back-up service alerts you when a back-up didn't finish in the scheduled time-frame, or if a lot of data changes at once (which might indicate an encrypter virus). You only pay for the amount of space your back-ups use.
Want to learn more?
Changelog
- 2021-02-01
  - Bump version to 0.7.1
  - Removed $RSYNC_TARGET from --backup-dir=$RSYNC_TARGET/$PERIOD/$INC/$TARGET
- 2021-04-06
  - Removed log/ logs/ *.log from exclusion list, it's good practice to back-up log files as well.
- 2021-04-25
  - Bump version to 0.8.
  - Removed current dir per increment folder, only one current dir inside the root directory will be created.
- 2021-05-26
  - Bump version to 0.8.1.
  - Added check to enforce only one running instance, using pidof check.
- 2021-09-24
  - Bump version to 0.8.2.
  - Added --single-transaction, as suggested by mysqldump manpage and ClusterEngine's article on dumping live MySQL databases.
  - Using shorthand options -E, -R, -q and -Q for mysqldump command, because it gets too long.
  - Remove trailing slash after $EMPTYDIR in rsync $RSYNC_OPTS $EMPTYDIR/ $RSYNC_TARGET/${TREE/#\//}, ensuring folder timestamps are properly set.
- 2022-03-07
  - Bump version to 0.9 code name Doug.
  - No longer overwrite current dir if increment is 0, so we have true incremental backups.
  - Fixed setting LAST_INC_XXX files modified time to ensure intervals are more closely met.
  - Added -p argument to rsync command to make sure file permissions are set.
  - Created GitHub repository to keep track of changes.
  - Special thanks to Doug for his suggestions!
- 2022-03-17
  - Bump version to 0.9.1
  - Check if $MYSQL or $MYSQLDUMP are empty, if so don't make mysql backups.
- 2022-03-29
  - Bump version to 0.9.2 code name Gabriel.
  - Fixed checking for empty $RSYNC_SECRET
  - Special thanks to Gabriel for his suggestions!
- 2022-05-09
  - Bump version to 0.9.3
  - Making touch BSD-compatible, using xave's suggestions.
- 2022-06-03
  - Bump version to 0.10
  - Using flock instead of pidof for more BSD-compatibility
- 2022-06-17
  - Bump version to 0.10.1
  - More BSD compatibility, now for date command.
  - Bump version to 0.10.2
  - Fix for local backups: Force increment dirs on target.
- 2022-06-27
  - Bump version to 0.11
  - Fixed some off by one errors, backup updates only one interval per run.
  - Thanks to mgoerens for his suggestion
- 2022-06-28
  - Bump version to 0.11.1
  - Magic to sort associative array, to be sure the biggest (most important) interval is backed up