Backup and Recovery

From FnordWiki

Revision as of 15:57, 1 June 2025

Current state of affairs

(Early June 2025)

There are 30+ computers in the house: mostly Debian GNU/Linux machines, both physical and virtual, plus 4 or so Windows machines, one MacBook Pro running macOS, 4+ phones, and probably a few other things. There is no centralized backup for any of these, though redundant storage protects well against individual hardware failures. Redundancy does not help against a fat-finger event or a Windows malware infection, though.

A very quick survey of the Linux systems counts approximately 2615Gbytes (about 2.6Tbytes) of data in use. Much of this is likely to be duplicated picture archives.

General thoughts

Accepted wisdom:

  • Three copies of your data
  • Two different media
  • One copy off site
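
The three rules above can be checked mechanically against a planned set of copies. The following is only a minimal sketch: the `Copy` class, its field names, and the example plan are hypothetical illustrations, not a description of an existing setup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of a data set. All values used below are illustrative."""
    name: str      # hypothetical label for the copy
    medium: str    # e.g. "disk", "tape", "cloud"
    offsite: bool  # True if the copy is stored away from the house


def satisfies_3_2_1(copies):
    """Return True if the copies meet the 3-2-1 rule."""
    enough_copies = len(copies) >= 3                       # three copies
    enough_media = len({c.medium for c in copies}) >= 2    # two media
    has_offsite = any(c.offsite for c in copies)           # one off site
    return enough_copies and enough_media and has_offsite


# Hypothetical plan: live data, an on-site disk backup, and a tape off site.
plan = [
    Copy("live data", "disk", offsite=False),
    Copy("backup server", "disk", offsite=False),
    Copy("tape in storage", "tape", offsite=True),
]
print(satisfies_3_2_1(plan))
```

Dropping the tape copy from this plan fails all three rules at once, which is why the off-site copy on a different medium tends to be the one worth planning first.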

Data classification

  • Not replaceable, unique
    • Irreplaceable photos
    • Important financial records
    • Schoolwork
    • Project work (like this Wiki's contents)
    • Email
    • Disk encryption passphrases
  • Inconvenient to replace
    • Server configuration files (/etc and the like)
    • Not sure what else might go here
  • Mundane
    • Not sure what goes here at all

Moving forward

Requirements:

  • Put family members at ease that a hardware failure, malware event, or physical catastrophe will not destroy a lifetime's irreplaceable computer data
  • RPO (recovery point objective) of one day. No more than one day's data might be lost.
  • RTO (recovery time objective) of one week. It might take some time to retrieve the data, but the process will work eventually.
  • Multiple eggs, multiple baskets: No single technology or vendor holds all the data.
  • Any dependencies on external data vendors (cloud backup companies, etc.) should be minimal in dollar expenditure.
    • Some vendors are cheap for upload and storage, but charge for retrieval. These are still worth exploring.
  • External storage entities are unable to examine or retrieve the data being stored.
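
The one-day RPO is easy to monitor once the time of each machine's newest backup is known. The sketch below assumes those timestamps are already available from whatever tool ends up doing the backups; the host names and times are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

RPO = timedelta(days=1)  # recovery point objective from the requirements


def rpo_violations(last_backup, now, rpo=RPO):
    """Return the hosts whose newest backup is older than the RPO, sorted."""
    return sorted(host for host, ts in last_backup.items() if now - ts > rpo)


# Hypothetical hosts and timestamps, fixed so the example is repeatable.
now = datetime(2025, 6, 2, 12, 0, tzinfo=timezone.utc)
last_backup = {
    "host-a": now - timedelta(hours=6),  # within the RPO
    "host-b": now - timedelta(days=3),   # stale, violates the RPO
}
print(rpo_violations(last_backup, now))
```

A check like this could run daily and send mail when the returned list is not empty. The RTO is harder to automate and is probably best verified by an occasional test restore.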

Existing resources:

  • A server that has been designated solely for backup purposes: bitkeeper-0. This is on-site in the garage data center.
    • Approx 7Tbytes of usable space available for spooling backup data, in a 4-disk (2+2) RAIDZ2 ZFS pool.
  • A Dell PowerVault MD3060e external disk enclosure with 60x 2Tbyte SATA disk drives installed. This device can be powered off independently of bitkeeper-0, providing a mostly offline data storage space. Depending on configuration, this will provide 60-90 Tbytes of capacity for backup space.
  • One Ceph cluster (fnord-201802) with approximately 12Tbytes of usable space available. This storage is exposed through an Amazon S3-compatible API, a network filesystem, and raw block devices. As this is written, usable space in this cluster is being increased to approximately 15Tbytes.
  • One HP MSL 4048 tape library equipped with 2x LTO-5 drives. This library holds 48 tape cartridges, each of 1.5Tbytes capacity. Tape cartridges are removable and may be taken off site for long-term storage, perhaps to a trusted friend's house, a climate-controlled storage facility, or a bank safe deposit box.
  • One IT nerd. Who likes Linux (especially Debian), UNIX in general, and doing data storage.
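
As a rough cross-check of the capacity figures in this list, the arithmetic can be sketched as below. The two disk layouts (2-disk mirrors and 4-disk single-parity groups) are assumptions chosen only because they reproduce the 60 and 90 Tbyte ends of the stated range; the actual configuration is not decided here, and filesystem overhead is ignored.

```python
def usable_tb(total_disks, disk_tb, group_size, redundant_per_group):
    """Raw usable capacity when disks are split into equal redundancy groups."""
    if total_disks % group_size != 0:
        raise ValueError("total_disks must be a multiple of group_size")
    groups = total_disks // group_size
    return groups * (group_size - redundant_per_group) * disk_tb


# Disk enclosure: 60 disks of 2 Tbytes each (from the list above).
mirrors = usable_tb(60, 2.0, group_size=2, redundant_per_group=1)  # assumed layout
parity = usable_tb(60, 2.0, group_size=4, redundant_per_group=1)   # assumed layout

# Tape library: 48 cartridges of 1.5 Tbytes each.
tape_tb = 48 * 1.5

# Data currently in use on the Linux systems, from the quick survey.
data_tb = 2615 / 1000

print(mirrors, parity, tape_tb)
print(round(tape_tb / data_tb, 1))  # full copies of today's data per tape set
```

Under these assumptions, either the enclosure or one full load of tapes holds today's data many times over, which leaves room for version history and growth.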

Tools to be used:

  • Bareos, a free software backup and recovery package that supports Linux (for both servers and clients) and Windows clients. This provides a centralized client/server backup service.
  • restic, another free software backup and recovery package. This one is more distributed in nature and has other features that make it attractive, too.
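
To give a feel for the restic side, here is a minimal sketch that builds the command line for a `restic backup` run without executing it. The repository location, paths, tag, and exclude pattern are hypothetical placeholders, and the helper `restic_backup_argv` is an illustration rather than part of restic.

```python
import shlex


def restic_backup_argv(repo, paths, tags=(), excludes=()):
    """Build (but do not run) the argument list for a `restic backup` call."""
    argv = ["restic", "-r", repo, "backup"]
    for tag in tags:
        argv += ["--tag", tag]
    for pattern in excludes:
        argv += ["--exclude", pattern]
    argv += list(paths)
    return argv


argv = restic_backup_argv(
    repo="/srv/restic-repo",   # hypothetical repository location
    paths=["/etc", "/home"],   # hypothetical paths to back up
    tags=["nightly"],
    excludes=["*.tmp"],
)
print(shlex.join(argv))  # shell-quoted form, suitable for logging or review
```

The repository would first need to be created with `restic init`, and restic needs the repository password at run time, for example through the `RESTIC_PASSWORD_FILE` environment variable. The resulting list could then be passed to `subprocess.run` from a nightly job.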