backup2mdisc

Now you can enjoy a self-contained backup on each disc, with no chain dependency across your entire multi-TB backup set:

  • Independently decryptable (and restorable) archives on each M-Disc.
  • Automatic ISO creation and optional disc burning in the same script.
  • Fast compression via lz4.

Purpose:

  1. Scans all files in a source directory.
  2. Groups them into "chunks" so that each chunk is <= a specified size (default 100GB).
  3. Creates a TAR archive of each chunk, compresses it with lz4, and encrypts it with GPG (AES256).
  4. Each .tar.lz4.gpg is fully independent (no other parts/discs needed to restore that chunk).
  5. (Optional) Creates ISO images from each encrypted chunk if --create-iso is provided.
  6. (Optional) Burns each chunk or ISO to M-Disc if --burn is provided.

How It Works

  1. File Collection & Sorting

    • The script uses find to list all files in your SOURCE_DIR with their sizes.
    • It sorts them in ascending order by size so it can pack smaller files first (you can remove | sort -n if you prefer a different method).
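    • As a minimal sketch, that listing step could look like this (assuming GNU find; BSD/macOS find lacks -printf, so you'd substitute stat there):

      # Emit "<size_in_bytes> <path>" for every file, smallest first
      find "$SOURCE_DIR" -type f -printf '%s %p\n' | sort -n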
  2. Chunk Accumulation

    • It iterates over each file, summing up file sizes into a "current chunk."
    • If adding a new file would exceed CHUNK_SIZE (default 100GB), it finalizes the current chunk (creates .tar.lz4.gpg) and starts a new one.
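    • In sketch form, that accumulation loop might look like this (the variable names and the finalize_chunk helper are illustrative, not necessarily the script's own; CHUNK_SIZE is assumed to be in bytes here):

      chunk_bytes=0
      : > "$TMP_CHUNK_LIST"                  # empty file list for this chunk
      while read -r size path; do
        if (( chunk_bytes > 0 && chunk_bytes + size > CHUNK_SIZE )); then
          finalize_chunk "$TMP_CHUNK_LIST"   # hypothetical helper: tar + lz4 + gpg
          chunk_bytes=0
          : > "$TMP_CHUNK_LIST"
        fi
        printf '%s\n' "$path" >> "$TMP_CHUNK_LIST"
        (( chunk_bytes += size ))
      done < "$FILE_LIST"                    # sorted "<size> <path>" list from step 1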
  3. Archive, Compress, Encrypt

    • For each chunk, it creates a .tar.lz4.gpg file. Specifically:
      1. tar -cf - -T $TMP_CHUNK_LIST (archive of the files in that chunk)
      2. Pipe into lz4 -c for fast compression
      3. Pipe into gpg --batch -c (symmetric encrypt with AES256, using your passphrase)
    • The result is a self-contained file like chunk_001.tar.lz4.gpg.
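    • Put together, the pipeline looks roughly like this (passphrase handling is simplified for illustration; --passphrase-file with loopback pinentry is one way to script it non-interactively on GnuPG 2.1+, and $PASSFILE is an assumed path):

      tar -cf - -T "$TMP_CHUNK_LIST" \
        | lz4 -c \
        | gpg --batch --pinentry-mode loopback --passphrase-file "$PASSFILE" \
              --symmetric --cipher-algo AES256 -o "chunk_001.tar.lz4.gpg"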
  4. Checksums & Manifest

    • It calculates the SHA-256 sum of each chunk archive and appends it to a manifest file along with the list of included files.
    • That manifest is stored in $WORK_DIR.
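    • For example (the manifest path and layout here are illustrative):

      sha256sum "chunk_001.tar.lz4.gpg" >> "$WORK_DIR/manifest.txt"
      cat "$TMP_CHUNK_LIST"             >> "$WORK_DIR/manifest.txt"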
  5. Optional ISO Creation (--create-iso)

    • After each chunk is created, the script can build an ISO image containing just that .tar.lz4.gpg.
    • This step uses genisoimage (or mkisofs). The resulting file is chunk_001.iso, etc.
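    • Roughly like this (the volume label and staging directory are illustrative; the staging directory holds just that one .tar.lz4.gpg):

      genisoimage -V "BACKUP_CHUNK_001" -r -J -o chunk_001.iso "$STAGING_DIR"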
  6. Optional Burning (--burn)

    • If you specify --burn, the script will pause after creating each chunk/ISO and prompt you to insert a fresh M-Disc.
    • On Linux, it tries growisofs.
    • On macOS, it tries hdiutil (if creating an ISO).
    • If it doesn't find these commands, it'll instruct you to burn manually.
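    • Typical invocations look like this (device path and filenames are examples):

      # Linux: burn a premastered ISO to the first DVD/BD writer
      growisofs -dvd-compat -Z /dev/sr0=chunk_001.iso

      # macOS: burn an ISO image
      hdiutil burn chunk_001.iso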
  7. Repeat

    • The script loops until all files have been placed into chunk(s).

Usage:

./backup2mdisc.sh /path/to/source /path/to/destination [CHUNK_SIZE] [--create-iso] [--burn]

Examples:

./backup2mdisc.sh /home/user/data /mnt/backup 100G --create-iso
./backup2mdisc.sh /data /backup 50G --burn

Dependencies:

  • bash
  • gpg (for encryption)
  • lz4 (for fast compression)
  • tar
  • split (optional; only needed to pre-split files larger than the chunk size, see Tips & Caveats)
  • sha256sum (or 'shasum -a 256' on macOS/FreeBSD)
  • genisoimage or mkisofs (for creating ISOs if --create-iso)
  • growisofs (Linux) or hdiutil (macOS) for burning if --burn

Restoring Your Data

  • Disc is self-contained: If you have disc #4 containing chunk_004.tar.lz4.gpg, you can restore it independently of the others.

  • Decrypt & Extract:

    gpg --decrypt chunk_004.tar.lz4.gpg | lz4 -d | tar -xvf -
    

    This will prompt for the passphrase you used during backup.

  • If one disc is lost, you only lose the files in that chunk; all other chunks remain restorable.


Why lz4?

  • Speed: lz4 is extremely fast at both compression and decompression.
  • Lower compression ratio than xz, but if your priority is speed (and a 100GB disc offers enough space), lz4 is a great choice.
  • For maximum compression at the cost of time, you could replace lz4 with xz -9, but expect slower backups and restores.
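  • As a sketch, that swap mirrors the pipeline from step 3 (chunk name illustrative):

    tar -cf - -T "$TMP_CHUNK_LIST" | xz -9 | gpg --batch -c > chunk_001.tar.xz.gpg
    # ...and to restore:
    gpg --decrypt chunk_001.tar.xz.gpg | xz -d | tar -xvf -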

Tips & Caveats

  1. Large Files

    • A single file larger than your chunk size (e.g., 101GB file with a 100GB chunk limit) won't fit. This script doesn't handle that gracefully. You'd need to split such a file (e.g., with split) before archiving or use a backup tool that supports partial file splitting.
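    • For instance, with GNU coreutils you could pre-split an oversized file and rejoin it after restore (filenames are illustrative):

      split -b 90G bigfile.img bigfile.img.part.
      # ...back up the parts as ordinary files, then after restoring:
      cat bigfile.img.part.* > bigfile.img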
  2. Verification

    • Always verify your discs after burning. Mount them and compare the chunk's SHA-256 with the manifest to ensure data integrity.
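    • For example, after mounting a disc (the mount point and manifest name are illustrative):

      sha256sum /mnt/mdisc/chunk_001.tar.lz4.gpg
      grep chunk_001 "$WORK_DIR/manifest.txt"   # the two hashes should match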
  3. Incremental or Deduplicated Backups

    • For advanced features (incremental, deduplication, partial-chunk checksums), consider specialized backup programs (like Borg, restic, Duplicati). However, they usually produce multi-volume archives that need all volumes to restore.
  4. Cross-Platform

    • On FreeBSD or macOS, you might need to tweak the commands for hashing (sha256sum vs. shasum -a 256) or ISO creation (mkisofs vs. genisoimage).
    • For burning, Linux uses growisofs, macOS uses hdiutil, and FreeBSD may require cdrecord or another tool.
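    • One portable way to pick the hashing command, as a sketch:

      if command -v sha256sum >/dev/null 2>&1; then
        HASH_CMD="sha256sum"
      else
        HASH_CMD="shasum -a 256"   # macOS / FreeBSD fallback
      fi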