CommunityData:Backups (nada): Difference between revisions

Revision as of 01:16, 25 December 2015

nada.com.washington.edu (our main Internet-connected research server at the University of Washington) has about 14TB of available disk space. The disks are in a RAID5 configuration, which means that a hardware failure on a single drive will not cause data loss. If more than one drive fails through bad luck, if a physical accident (e.g., a fire) destroys the machine, or if someone accidentally deletes files they need, RAID will not help. As a result, we have backups.

Although we would love to back up everything, backing up all 14TB would cost about $140/month! As a result, we are trying to be smart about what we back up and how. This page discusses the current backup setup and strategy.
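
The $140/month figure follows from the storage price quoted on this page (about $0.01/GB). A minimal sketch of that arithmetic, assuming the price is charged per GB per month and that 1TB is counted as 1000GB; the function name and the 10TB figure are only illustrative:

```python
# Rough monthly storage cost at the price quoted on this page.
# Assumptions: $0.01 per GB per month, decimal units (1 TB = 1000 GB).
GB_PER_TB = 1000
PRICE_PER_GB_MONTH = 0.01


def monthly_cost(terabytes, price_per_gb=PRICE_PER_GB_MONTH):
    """Return the approximate monthly cost in dollars of storing `terabytes`."""
    return terabytes * GB_PER_TB * price_per_gb


print(f"Backing up all 14TB: ${monthly_cost(14):.2f}/month")
# Hypothetical case: 4TB of data moved out of the backup set.
print(f"Backing up 10TB:     ${monthly_cost(10):.2f}/month")
```

Every terabyte kept out of the backup set saves roughly $10/month under these assumptions.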

Backups on Nada

Nada backups are full filesystem-wide backups made with Duplicity. They are incremental, built using rdiff backup (think rsync), encrypted with a GPG key under Mako's control, and stored in Google Nearline storage, which costs about $0.01/GB per month. Backups run once at the beginning of each week.

Everything is backed up except for the directories listed in /root/duplicitity_exclude. That file is authoritative and this page may be out of date, but the following files/directories were excluded at the time this page was written:

/mnt
/media
/mit
/nonexistent
/openafs_cache_fs
/tmp
/var/log
/var/lib/mysql
/var/lib/mongodb
/var/lib/redis
/var/lib/postgresql
/var/spool
/var/tmp
/var/cache
/lost+found
/lolo
/cdrom
/floppy
/proc
/sys
/root/.cache
/root/nobackup
/home/*/.cache
/home/*/nobackup
/home/awjordan
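
To check whether a given path falls under one of these exclusions, the list can be approximated in a few lines of Python. This is only a sketch: it assumes that `*` matches a single path component and that excluding a directory excludes everything beneath it. It does not reproduce Duplicity's actual file-selection rules, and the `EXCLUDES` list is an abbreviated copy of the list above, not the contents of the real exclude file.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Abbreviated copy of the exclusion list on this page (illustrative only).
EXCLUDES = [
    "/tmp",
    "/var/log",
    "/var/lib/mysql",
    "/lost+found",
    "/root/nobackup",
    "/home/*/.cache",
    "/home/*/nobackup",
    "/home/awjordan",
]


def is_excluded(path, patterns=EXCLUDES):
    """Return True if `path` is an excluded entry or lies beneath one.

    Patterns are compared component by component, so "*" stands for
    exactly one directory name (an assumption of this sketch).
    """
    parts = PurePosixPath(path).parts
    for pattern in patterns:
        pattern_parts = PurePosixPath(pattern).parts
        if len(parts) < len(pattern_parts):
            continue  # path is shallower than the pattern, cannot match
        if all(fnmatchcase(c, p) for c, p in zip(parts, pattern_parts)):
            return True
    return False


for candidate in ["/home/alice/nobackup/big.csv", "/home/alice/project/notes.txt"]:
    print(candidate, "->", "excluded" if is_excluded(candidate) else "backed up")
```

The user name `alice` and the file names are hypothetical.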

Backing up Databases

MySQL Backups

Although /var/lib/mysql is excluded, some MySQL databases are backed up by a separate incremental backup script that calls Percona XtraBackup. These incremental MySQL backups are created once each week, before the duplicity backup script runs. To add a new MySQL database to the backup list, edit the following files:

/usr/local/sbin/mysql_backup_full
/usr/local/sbin/mysql_backup_inc
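
The contents of these two scripts are not reproduced on this page. As a rough illustration of what a full versus an incremental Percona XtraBackup invocation looks like, the sketch below only builds the command lines and does not run them. The directory paths and database names are hypothetical, and the real scripts may use different options.

```python
def xtrabackup_command(target_dir, databases, incremental_basedir=None):
    """Build an xtrabackup command line as a list of arguments.

    With `incremental_basedir` set, the backup contains only changes made
    since the backup stored in that directory; otherwise it is a full backup.
    """
    cmd = [
        "xtrabackup",
        "--backup",
        f"--target-dir={target_dir}",
        # Restrict the backup to the listed databases (space-separated).
        "--databases=" + " ".join(databases),
    ]
    if incremental_basedir is not None:
        cmd.append(f"--incremental-basedir={incremental_basedir}")
    return cmd


# Hypothetical paths and database names, for illustration only.
dbs = ["wiki_db", "survey_db"]
full = xtrabackup_command("/backups/mysql/full", dbs)
inc = xtrabackup_command("/backups/mysql/inc1", dbs,
                         incremental_basedir="/backups/mysql/full")
print(" ".join(full))
print(" ".join(inc))
```

In this sketch, adding a database to the backup amounts to adding its name to the list passed as `databases`, which is presumably the kind of change the two scripts above need.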

Minimizing Backup Size

If you have large datasets that are unlikely to change or be replaced, store a copy of the data in the /com/ directory on Hyak, keep the working files in /home/<YOUR NAME>/nobackup, and symlink them from a more convenient location.
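
The move-and-symlink step can be sketched as follows. This is a minimal example, not an established tool on Nada: the function name is hypothetical, and it does not make the copy on Hyak, which should be done first.

```python
import shutil
from pathlib import Path


def park_in_nobackup(dataset, nobackup_dir):
    """Move `dataset` into `nobackup_dir` and leave a symlink at its old path.

    Returns the new location of the data. Refuses to overwrite an
    existing entry with the same name in `nobackup_dir`.
    """
    dataset = Path(dataset)
    nobackup_dir = Path(nobackup_dir)
    nobackup_dir.mkdir(parents=True, exist_ok=True)

    target = nobackup_dir / dataset.name
    if target.exists():
        raise FileExistsError(f"{target} already exists")

    shutil.move(str(dataset), str(target))
    # The old path keeps working, but the data now lives in the excluded directory.
    dataset.symlink_to(target, target_is_directory=target.is_dir())
    return target


# Example call (hypothetical user and dataset names):
# park_in_nobackup("/home/alice/projects/study/raw_dumps", "/home/alice/nobackup")
```

Because /home/*/nobackup is on the exclusion list, the data itself is no longer uploaded, while the symlink keeps existing scripts and paths working.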