## How many backups do you need?
There is a very easy and simple rule of thumb, the 3-2-1 backup rule:
- 3: Each file should exist 3 times (original + 2 backups)
- 2: Use 2 different types of storage systems (e.g. internal disk + tape / cloud / ...)
- 1: Keep one copy off-site (to survive e.g. natural disasters or problems with your cloud / infrastructure provider)
### Examples
| Where your data is | Where the first backup is | Where the second backup is | Comment |
|---|---|---|---|
| Internal disk | Internal disk | Amazon S3 (region with enough distance to your server) | |
| Amazon S3 | Amazon S3 (other region) | Azure Storage (other region) | At least one of the backups needs to be in a different region |
## Cloud accounts for backups
### AWS
We use Amazon S3 to store the backups.
#### Create S3 Bucket

Create the bucket in the AWS console with the following settings (the equivalent CLI calls are sketched after this list):

- Bucket name (S3 bucket names are globally unique, so it is a good idea to use a common prefix for all your S3 buckets that is unlikely to be taken by somebody else)
- ACLs disabled
- Block all public access
- Bucket Versioning enabled (protects against deletion or modification of files for some time)
- Encryption type: server-side encryption with Amazon S3 managed keys (we encrypt our files before upload anyway, so this is not really relevant)
- Bucket Key enabled
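A minimal sketch of the same setup with the AWS CLI; the bucket name `foo.bar.backup` and the region `eu-north-1` are just the example values used elsewhere on this page:

```sh
# Create the bucket (ACLs are disabled by default for newly created buckets)
aws s3api create-bucket --bucket foo.bar.backup --region eu-north-1 \
  --create-bucket-configuration LocationConstraint=eu-north-1

# Block all public access
aws s3api put-public-access-block --bucket foo.bar.backup \
  --public-access-block-configuration \
  BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true

# Enable versioning
aws s3api put-bucket-versioning --bucket foo.bar.backup \
  --versioning-configuration Status=Enabled

# SSE-S3 encryption with a bucket key
aws s3api put-bucket-encryption --bucket foo.bar.backup \
  --server-side-encryption-configuration \
  '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"},"BucketKeyEnabled":true}]}'
```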
#### Create an S3 lifecycle rule to do some cleanup

Configure the rule as follows (a CLI sketch follows the list):

- Apply to all objects in the bucket
- Transition current versions of objects between storage classes (move files after some time to a cheaper storage class like Standard-IA)
- Transition noncurrent versions of objects between storage classes (do cleanup)
- Permanently delete noncurrent versions of objects
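A sketch of such a rule via the AWS CLI; the rule ID and the day counts are placeholders you should adapt (note that a transition to Standard-IA requires at least 30 days):

```sh
# Move current versions to Standard-IA after 30 days, move noncurrent
# versions to Standard-IA after 30 days, and delete them after 90 days
aws s3api put-bucket-lifecycle-configuration --bucket foo.bar.backup \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "backup-cleanup",
      "Status": "Enabled",
      "Filter": {"Prefix": ""},
      "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
      "NoncurrentVersionTransitions": [{"NoncurrentDays": 30, "StorageClass": "STANDARD_IA"}],
      "NoncurrentVersionExpiration": {"NoncurrentDays": 90}
    }]
  }'
```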
#### Create a policy to access the S3 bucket
Let's assume your S3 bucket was named `foo.bar.backup`:
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:ListBucket",
"s3:DeleteObject",
"s3:GetObject",
"s3:PutObject",
"s3:PutObjectAcl",
"s3:CreateBucket"
],
"Resource": [
"arn:aws:s3:::foo.bar.backup/*",
"arn:aws:s3:::foo.bar.backup"
]
},
{
"Effect": "Allow",
"Action": "s3:ListAllMyBuckets",
"Resource": "arn:aws:s3:::*"
}
]
}
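Assuming you saved the JSON document above as `backup-policy.json` and want to call the policy `backup-s3-access` (both names chosen here for illustration), you could also create it via the CLI:

```sh
# Create the managed policy from the JSON document above
aws iam create-policy --policy-name backup-s3-access \
  --policy-document file://backup-policy.json
```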
#### Create a user to access the S3 bucket via the policy

Create the user as follows (a CLI sketch for this and the next step follows below):

- No console access
- Permissions: "Attach policies directly" -> search for your policy
#### Create an access key for the user
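A minimal sketch of both steps with the AWS CLI; the user name `backup` and the account ID `123456789012` are placeholders:

```sh
# Create the user (no console access is granted by default)
aws iam create-user --user-name backup

# Attach the policy created above (replace the account ID with your own)
aws iam attach-user-policy --user-name backup \
  --policy-arn arn:aws:iam::123456789012:policy/backup-s3-access

# Create the access key; note down AccessKeyId and SecretAccessKey
aws iam create-access-key --user-name backup
```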
## Tools
### Rclone
Rclone is an amazing tool for copying files between different storage systems.
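One common way to install it on Linux or macOS is the official install script (see https://rclone.org/install/ for alternatives):

```sh
# Download and run the official rclone install script
curl https://rclone.org/install.sh | sudo bash
```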
#### Rclone configuration
Rclone comes with a built-in configuration assistant. Just run `rclone config` and answer the questions about the storage you want to add:
```
name: foo_bar_rclone # think of your own unique name
Type of storage to configure: s3
Choose your S3 provider: AWS
Get AWS credentials from runtime: false
access_key_id: *** # as provided by AWS
secret_access_key: *** # as provided by AWS
Region to connect to: eu-north-1 # your favorite AWS region
Endpoint for S3 API:
Location constraint: eu-north-1 # your favorite AWS region
This ACL is used for creating objects: private
The server-side encryption algorithm: AES256
sse_kms_key_id:
The storage class: STANDARD_IA
```
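Afterwards you can check that the remote works; `rclone lsd` lists the top-level directories (for S3: the buckets the credentials can see):

```sh
# List the buckets visible through the new remote
rclone lsd foo_bar_rclone:
```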
#### rclone config file
And this would be the generated config (by default in `~/.config/rclone/rclone.conf`):

```ini
[foo_bar_rclone]
type = s3
provider = AWS
access_key_id = ***
secret_access_key = ***
region = eu-north-1
location_constraint = eu-north-1
acl = private
bucket_acl = private
server_side_encryption = AES256
storage_class = STANDARD_IA
```
#### Rclone commands
If you added an S3 storage with the bucket name `com.example.foo` and named the remote `bar`, then you can do this:
```sh
rclone copy --retries 1 --copy-links /tmp/pictures bar:/com.example.foo/pictures
rclone sync --retries 1 --copy-links /tmp/pictures bar:/com.example.foo/pictures
rclone sync -P --retries 1 --copy-links --max-backlog 100000 /tmp/pictures bar:/com.example.foo/pictures
```
`sync`, in contrast to `copy`, will also delete files in the destination folder that are not in the source folder. When you want to follow the progress (`-P`), a high number for `--max-backlog` is good to see how long the sync will actually take.
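Because `sync` deletes files, it can be worth previewing a run first; rclone's `--dry-run` flag shows what would be copied and deleted without changing anything:

```sh
# Preview the sync without copying or deleting anything
rclone sync --dry-run --retries 1 --copy-links /tmp/pictures bar:/com.example.foo/pictures
```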
#### Rclone encryption
After you have created your first Rclone storage, you can add a second one that encrypts the file content, file names, and folder names before sending them to the first one. As long as you have the Rclone configuration with the password, this is completely transparent: just read from and write to the second storage. The advantage is that the data cannot be read by the cloud provider; the disadvantage is that you will always need Rclone to read the data.
```
n) New remote
Encrypt/Decrypt a remote (crypt)
remote> bar:/com.example.foo/
```
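Assuming you named the crypt remote `bar_crypt` (a name chosen here for illustration), the generated config section would look roughly like this:

```ini
[bar_crypt]
type = crypt
remote = bar:/com.example.foo/
password = *** # obscured form of the password you entered
```

You then use it like any other remote, e.g. `rclone copy /tmp/pictures bar_crypt:/pictures`, and rclone encrypts and decrypts transparently.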