How many backups do you need?

There is a very easy and simple 3-2-1 backup rule:

  • 3: Each file should exist 3 times (original + 2 backups)
  • 2: Use 2 different types of storage systems (e.g. internal disk + tape / cloud / ...)
  • 1: Have one copy offsite (to survive e.g. natural disasters, issues with your cloud / infrastructure provider, etc.)

Examples

Where your data is | Where the first backup is | Where the second backup is                              | Comment
Internal disk      | Internal disk             | Amazon S3 (region with enough distance to your server)  |
Amazon S3          | Amazon S3 (other region)  | Azure Storage (other region)                            | At least one of the backups needs to be in a different region

Cloud accounts for backups

AWS

We use AWS S3 to store the backups

Create S3 Bucket

S3 -> Create bucket -> General purpose ->
 Bucket name (good idea to use a common prefix for all your S3 buckets that is unlikely to be taken by somebody else)
 ACLs disabled
 Block all public access
 Bucket Versioning (protects against deletion or modifications of files for some time)
 Encryption type: Server-side encryption with Amazon S3 managed keys (we encrypt our files before upload anyhow, so not really relevant)
 Bucket Key Enabled
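
If you prefer the command line, the same bucket setup could look roughly like this with the AWS CLI (a sketch, assuming the AWS CLI is installed and configured; foo.bar.backup and eu-north-1 are just the example bucket name and region used in this document):

aws s3api create-bucket --bucket foo.bar.backup --region eu-north-1 \
    --create-bucket-configuration LocationConstraint=eu-north-1
aws s3api put-public-access-block --bucket foo.bar.backup \
    --public-access-block-configuration BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
aws s3api put-bucket-versioning --bucket foo.bar.backup --versioning-configuration Status=Enabled
aws s3api put-bucket-encryption --bucket foo.bar.backup --server-side-encryption-configuration \
    '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"},"BucketKeyEnabled":true}]}'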

Create an S3 lifecycle rule to do some cleanup

Click on the new bucket -> Management -> Lifecycle rules -> Create lifecycle rule ->
  Apply to all objects in the bucket
  Transition current versions of objects between storage classes (move files after some time to a cheap storage class like Standard-IA)
  Transition noncurrent versions of objects between storage classes (do cleanup)
  Permanently delete noncurrent versions of objects
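
With the AWS CLI the same lifecycle rule could be sketched like this (the day counts are just example values, adjust them to your needs):

cat > lifecycle.json <<'EOF'
{
    "Rules": [
        {
            "ID": "backup-cleanup",
            "Status": "Enabled",
            "Filter": {},
            "Transitions": [{ "Days": 30, "StorageClass": "STANDARD_IA" }],
            "NoncurrentVersionTransitions": [{ "NoncurrentDays": 30, "StorageClass": "STANDARD_IA" }],
            "NoncurrentVersionExpiration": { "NoncurrentDays": 90 }
        }
    ]
}
EOF
aws s3api put-bucket-lifecycle-configuration --bucket foo.bar.backup --lifecycle-configuration file://lifecycle.json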

Create a policy to access the S3 bucket

Let's assume your S3 bucket was named foo.bar.backup

Go to IAM -> Policies -> Create policy -> JSON
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:DeleteObject",
                "s3:GetObject",
                "s3:PutObject",
                "s3:PutObjectAcl",
                "s3:CreateBucket"
            ],
            "Resource": [
                "arn:aws:s3:::foo.bar.backup/*",
                "arn:aws:s3:::foo.bar.backup"
            ]
        },
        {
            "Effect": "Allow",
            "Action": "s3:ListAllMyBuckets",
            "Resource": "arn:aws:s3:::*"
        }
    ]
}
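
If you script this instead of clicking through the console, creating the policy could look like this (assuming the JSON above was saved as policy.json; the policy name is just an example):

aws iam create-policy --policy-name foo-bar-backup-policy --policy-document file://policy.json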

Create a user to access the S3 bucket via the policy

IAM -> Users -> Create user ->
  No console access
  -> Attach policies directly -> Search your policy
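
The CLI equivalent could look roughly like this (foo-bar-backup-user is just an assumed user name; replace the account id and policy name with yours):

aws iam create-user --user-name foo-bar-backup-user
aws iam attach-user-policy --user-name foo-bar-backup-user \
    --policy-arn arn:aws:iam::123456789012:policy/foo-bar-backup-policy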

Create an access key for the user

IAM -> Users -> (the user you just created and attached the policy to) -> Security credentials -> Access keys -> Create access key -> Other -> copy the access key ID and the secret access key and keep them for later
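
Or on the command line (the returned AccessKeyId and SecretAccessKey are what rclone will need later):

aws iam create-access-key --user-name foo-bar-backup-user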

Tools

Rclone

Rclone is an amazing tool to copy files between different filesystems.

apt-get install rclone

Rclone configuration

Rclone comes with a built-in configuration assistant. Just run this and answer the questions about the storage you want to add:

rclone config

name: foo_bar_rclone # think of your unique name
Type of storage to configure: s3
Choose your S3 provider: AWS
Get AWS credentials from runtime: false
access_key_id: ***       # as provided by AWS
secret_access_key: ***   # as provided by AWS
Region to connect to: eu-north-1 # your favorite AWS region
Endpoint for S3 API:
Location constraint: eu-north-1  # your favorite AWS region
This ACL is used for creating objects: private
The server-side encryption algorithm: AES256
sse_kms_key_id:
The storage class: STANDARD_IA
This command tells you where the generated configuration file is located:
rclone config file

And this would be your generated config

[foo_bar_rclone]
type = s3
provider = AWS
access_key_id = ***
secret_access_key = ***
region = eu-north-1
location_constraint = eu-north-1
acl = private
bucket_acl = private
server_side_encryption = AES256
storage_class = STANDARD_IA
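
Recent rclone versions can also create the same remote non-interactively with rclone config create; a sketch (substitute your real keys):

rclone config create foo_bar_rclone s3 provider=AWS access_key_id=*** secret_access_key=*** \
    region=eu-north-1 location_constraint=eu-north-1 acl=private server_side_encryption=AES256 storage_class=STANDARD_IA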

Rclone commands

If you added an S3 storage with the bucket name com.example.foo and named the remote bar, then you can do this:

rclone ls                                                                  bar:/com.example.foo/
rclone copy    --retries 1 --copy-links                      /tmp/pictures bar:/com.example.foo/pictures
rclone sync    --retries 1 --copy-links                      /tmp/pictures bar:/com.example.foo/pictures
rclone sync -P --retries 1 --copy-links --max-backlog 100000 /tmp/pictures bar:/com.example.foo/pictures

Note that sync will also delete files in the destination folder that are not in the source folder, while copy will not. When you want to follow the progress (-P), a high value for --max-backlog is good to see how long the sync will actually take.
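
To run the backup regularly you could put a sync into cron, for example (hypothetical file, paths and schedule, adjust to your setup):

# /etc/cron.d/rclone-backup: sync /tmp/pictures to S3 every night at 03:00
0 3 * * * root rclone sync --retries 1 --copy-links /tmp/pictures bar:/com.example.foo/pictures >> /var/log/rclone-backup.log 2>&1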

Rclone encryption

After you have created your first Rclone storage you can add a second one that encrypts the file content, file names and folder names before sending them to the first one. As long as you have the Rclone configuration with the password, this is totally transparent for you: just read and write from the second storage. The advantage is that the data cannot be read by the cloud provider, the disadvantage is that you will always need rclone to read the data.

rclone config
  n) New remote
     name> foo_bar_crypt                         # think of a name for the encrypted remote
     Type of storage to configure: Encrypt/Decrypt a remote (crypt)
      remote> bar:/com.example.foo/
     filename_encryption: standard
     directory_name_encryption: true
     Password or pass phrase for encryption: *** # without this password the data cannot be recovered
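
The generated section in the Rclone configuration would then look roughly like this (a sketch; foo_bar_crypt is just the name assumed above, and rclone stores the passwords in obscured form):

[foo_bar_crypt]
type = crypt
remote = bar:/com.example.foo/
filename_encryption = standard
directory_name_encryption = true
password = ***
password2 = ***

You can then use the encrypted remote exactly like the plain one, e.g.

rclone sync --retries 1 --copy-links /tmp/pictures foo_bar_crypt:pictures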