# backup2mdisc
**Now you can enjoy a _self-contained_ backup on each disc, with no chain dependency across your entire multi-terabyte backup set:**
- **Independently decryptable** (and restorable) archives on each M-Disc.
- Automatic ISO creation and optional disc burning in the same script.
- **Fast compression via lz4**.
---
## Purpose:
1. Scans all files in a source directory.
2. Groups them into "chunks" so that each chunk is <= a specified size (default 100GB).
3. Creates a TAR archive of each chunk, compresses it with `lz4`, and encrypts it with GPG (AES256).
4. Each `.tar.lz4.gpg` is fully independent (no other parts/discs needed to restore that chunk).
5. (Optional) Creates ISO images from each encrypted chunk if `--create-iso` is provided.
6. (Optional) Burns each chunk or ISO to M-Disc if `--burn` is provided.
## How It Works
1. **File Collection & Sorting**
- The script uses `find` to list all files in your `SOURCE_DIR` with their sizes.
- It sorts them in ascending order by size so it can pack smaller files first (you can remove `| sort -n` if you prefer a different ordering). A sketch of steps 1 and 2 follows this list.
2. **Chunk Accumulation**
- It iterates over each file, summing up file sizes into a “current chunk.”
- If adding a new file would exceed `CHUNK_SIZE` (default 100GB), it **finalizes** the current chunk (creates `.tar.lz4.gpg`) and starts a new one.
3. **Archive, Compress, Encrypt**
- For each chunk, it creates a `.tar.lz4.gpg` file. Specifically:
1. `tar -cf - -T "$TMP_CHUNK_LIST"` (archives the files listed for that chunk)
2. Pipe into `lz4 -c` for fast compression
3. Pipe into `gpg --batch -c` (symmetric encrypt with AES256, using your passphrase)
- The result is a self-contained file like `chunk_001.tar.lz4.gpg` (this pipeline is sketched after the list).
4. **Checksums & Manifest**
- It calculates the SHA-256 sum of each chunk archive and appends it to a manifest file along with the list of included files.
- That manifest is stored in `$WORK_DIR` (a checksum/manifest sketch also follows the list).
5. **Optional ISO Creation** (`--create-iso`)
- After each chunk is created, the script can build an ISO image containing just that `.tar.lz4.gpg`.
- This step uses `genisoimage` (or `mkisofs`). The resulting file is `chunk_001.iso`, etc. (sketched after this list).
6. **Optional Burning** (`--burn`)
- If you specify `--burn`, the script will pause after creating each chunk/ISO and prompt you to insert a fresh M-Disc.
- On **Linux**, it tries `growisofs`.
- On **macOS**, it tries `hdiutil` (if creating an ISO).
- If it doesn't find these commands, it instructs you to burn manually (see the burn commands sketched after this list).
7. **Repeat**
- The script loops until all files have been placed into chunk(s).
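For concreteness, here is a minimal sketch of steps 1 and 2 (file collection and chunk accumulation). Variable and function names such as `FILE_LIST`, `CHUNK_SIZE_BYTES`, and `finalize_chunk` are illustrative assumptions, not necessarily the script's exact identifiers:

```bash
# Sketch of steps 1-2 (illustrative; names are assumptions).
# Collect "size path" pairs, smallest first. Note: -printf is GNU find;
# on macOS/FreeBSD you would use something like `stat -f '%z %N'` instead.
find "$SOURCE_DIR" -type f -printf '%s %p\n' | sort -n > "$FILE_LIST"

chunk_bytes=0
: > "$TMP_CHUNK_LIST"   # start an empty file list for the current chunk

while read -r size path; do
    # If this file would push the chunk past the limit, finalize the
    # current chunk (the tar | lz4 | gpg pipeline below) and start fresh.
    if (( chunk_bytes + size > CHUNK_SIZE_BYTES )) && (( chunk_bytes > 0 )); then
        finalize_chunk "$TMP_CHUNK_LIST"   # hypothetical helper
        chunk_bytes=0
        : > "$TMP_CHUNK_LIST"
    fi
    printf '%s\n' "$path" >> "$TMP_CHUNK_LIST"
    (( chunk_bytes += size ))
done < "$FILE_LIST"

# Finalize whatever remains in the last (partial) chunk.
(( chunk_bytes > 0 )) && finalize_chunk "$TMP_CHUNK_LIST"
```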
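Step 3's archive/compress/encrypt stage boils down to a single pipe. A minimal sketch, assuming `$TMP_CHUNK_LIST` holds one path per line and `CHUNK_NUM` is a zero-padded counter:

```bash
# Step 3: archive the chunk's files, compress with lz4, encrypt with gpg.
# With --batch, gpg needs a non-interactive passphrase source (e.g.
# --passphrase-file); without it, gpg prompts interactively.
tar -cf - -T "$TMP_CHUNK_LIST" \
  | lz4 -c \
  | gpg --batch --symmetric --cipher-algo AES256 \
        -o "$WORK_DIR/chunk_${CHUNK_NUM}.tar.lz4.gpg"
```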
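Step 4's manifest entry can be produced along these lines (the manifest filename is an assumption):

```bash
# Step 4: record the chunk's SHA-256 plus its file list in the manifest.
# On macOS/FreeBSD, substitute `shasum -a 256` for `sha256sum`.
# Running sha256sum from inside $WORK_DIR keeps the recorded path relative,
# which makes later `sha256sum -c` verification on a mounted disc easy.
MANIFEST="$WORK_DIR/backup_manifest.txt"
( cd "$WORK_DIR" && sha256sum "chunk_${CHUNK_NUM}.tar.lz4.gpg" ) >> "$MANIFEST"
echo "Files in chunk_${CHUNK_NUM}:" >> "$MANIFEST"
cat "$TMP_CHUNK_LIST" >> "$MANIFEST"
```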
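Finally, steps 5 and 6 map onto standard tooling roughly as follows; device paths like `/dev/sr0` are placeholders for your burner:

```bash
# Step 5: wrap the encrypted chunk in an ISO9660 image (mkisofs takes the
# same options).
genisoimage -o "$WORK_DIR/chunk_${CHUNK_NUM}.iso" \
            -V "BACKUP_${CHUNK_NUM}" -r -J \
            "$WORK_DIR/chunk_${CHUNK_NUM}.tar.lz4.gpg"

# Step 6a: burn on Linux (replace /dev/sr0 with your recorder).
growisofs -dvd-compat -Z /dev/sr0="$WORK_DIR/chunk_${CHUNK_NUM}.iso"

# Step 6b: burn on macOS.
hdiutil burn "$WORK_DIR/chunk_${CHUNK_NUM}.iso"
```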
---
## Usage:
```bash
./backup2mdisc.sh /path/to/source /path/to/destination [CHUNK_SIZE] [--create-iso] [--burn]
```
## Examples:
```bash
./backup2mdisc.sh /home/user/data /mnt/backup 100G --create-iso
```
```bash
./backup2mdisc.sh /data /backup 50G --burn
```
## Dependencies:
- `bash`
- `gpg` (for encryption)
- `lz4` (for fast compression)
- `tar`
- `split` (only needed to pre-split single files larger than the chunk size)
- `sha256sum` (or `shasum -a 256` on macOS/FreeBSD)
- `genisoimage` or `mkisofs` (for creating ISOs if `--create-iso`)
- `growisofs` (Linux) or `hdiutil` (macOS) for burning if `--burn`
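If you want to fail fast when a tool is missing, a quick preflight check might look like this (extend the list with `genisoimage`, `growisofs`, etc. to match the options you use):

```bash
# Abort early if any required tool is absent from PATH.
for cmd in tar lz4 gpg; do
    command -v "$cmd" >/dev/null 2>&1 || { echo "Missing: $cmd" >&2; exit 1; }
done
```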
---
## Restoring Your Data
- **Disc is self-contained**: If you have disc #4 containing `chunk_004.tar.lz4.gpg`, you can restore it independently of the others.
- **Decrypt & Extract**:
```bash
gpg --decrypt chunk_004.tar.lz4.gpg | lz4 -d | tar -xvf -
```
This will prompt for the passphrase you used during backup.
- If one disc is lost, you only lose the files in that chunk; all other chunks remain restorable.
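To peek inside a chunk without extracting anything (for example, to confirm a disc holds the files you need), swap tar's `-x` for `-t`:

```bash
# List the archive's contents; nothing is written to disk.
gpg --decrypt chunk_004.tar.lz4.gpg | lz4 -d | tar -tvf -
```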
---
## Why lz4?
- **Speed**: `lz4` is extremely fast at both compression and decompression.
- **Lower compression ratio** than `xz`, but if your priority is speed (and 100GB of disc space is enough), `lz4` is a great choice.
- For maximum compression at the cost of time, you could replace `lz4` with `xz -9`, but expect slower backups and restores.
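A sketch of that substitution, trading speed for ratio (restore is unchanged except `xz -d` replaces `lz4 -d`):

```bash
# Slower but smaller: xz -9 in place of lz4 -c. Note the .tar.xz.gpg extension.
tar -cf - -T "$TMP_CHUNK_LIST" \
  | xz -9 \
  | gpg --batch --symmetric --cipher-algo AES256 \
        -o "$WORK_DIR/chunk_${CHUNK_NUM}.tar.xz.gpg"
```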
---
## Tips & Caveats
1. **Large Files**
- A single file larger than your chunk size (e.g., a 101GB file with a 100GB chunk limit) won't fit, and this script doesn't handle that case gracefully. You'd need to split such a file before archiving (e.g., with `split`; see the sketch at the end of this list) or use a backup tool that supports splitting individual files.
2. **Verification**
- Always verify your discs after burning. Mount them and compare each chunk's SHA-256 against the manifest to confirm data integrity (see the verification sketch at the end of this list).
3. **Incremental or Deduplicated Backups**
- For advanced features (incremental backups, deduplication, partial-chunk checksums), consider specialized backup programs such as Borg, restic, or Duplicati. Be aware, though, that their storage formats typically spread data across the whole repository, so you lose the one-disc, one-restorable-chunk property this script provides.
4. **Cross-Platform**
- On FreeBSD or macOS, you might need to tweak the commands for hashing (`sha256sum` vs. `shasum -a 256`) or ISO creation (`mkisofs` vs. `genisoimage`).
- For burning, Linux uses `growisofs`, macOS uses `hdiutil`, and FreeBSD may require `cdrecord` or another tool.
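For tip 1, a hedged sketch of pre-splitting an oversized file (the 98G part size leaves headroom under a 100GB chunk limit; the `bigfile` paths are illustrative):

```bash
# Split one oversized file into chunk-friendly pieces before backing up.
split -b 98G /data/bigfile /data/bigfile.part.

# After restoring all pieces, reassemble the original file:
cat /data/bigfile.part.* > /data/bigfile
```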
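And for tip 2, assuming the manifest stores hashes in standard `sha256sum` format and the disc is mounted at `/mnt/mdisc` (both assumptions):

```bash
# Pull this chunk's expected hash out of the manifest and verify it in place.
# On macOS/FreeBSD, use `shasum -a 256 -c` instead of `sha256sum -c`.
grep 'chunk_004.tar.lz4.gpg' /path/to/backup_manifest.txt > /tmp/chunk_004.sha256
( cd /mnt/mdisc && sha256sum -c /tmp/chunk_004.sha256 )
```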