diff --git a/README.md b/README.md
index 10622b9..70aa091 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,151 @@
-# regularbm
+## About
 
-Leaves a nicely packaged log
\ No newline at end of file
+`regularbm` provides a robust, automated solution for backing up an Open WebUI instance running in Docker to an Amazon S3 bucket. It includes a Python script for performing the backup and an AWS CloudFormation template for deploying the necessary cloud infrastructure securely.
+
+The script performs the following actions:
+
+1. Exports all chats for every user into individual JSON files.
+2. Exports the application configuration into a JSON file.
+3. Backs up the entire SQLite database file (`webui.db`).
+4. Packages all exported files into a single, compressed archive (`.tar.lz4`).
+5. Uploads the compressed archive to the designated Amazon S3 bucket.
+
+## Features
+
+- **Comprehensive Backup**: Captures user data, chats, and system configuration.
+- **Efficient Archiving**: Uses LZ4 compression for fast packaging.
+- **Secure Cloud Storage**: Uploads backups to a private S3 bucket.
+- **Automated & Scheduled**: Designed to be run automatically on a schedule using `cron`.
+- **Secure by Default**: Leverages IAM Roles for AWS EC2 instances, eliminating the need for long-lived access keys on the server.
+- **Flexible**: Includes documented procedures for both AWS-native servers and external (on-premises or multi-cloud) servers.
+- **Configurable**: Easily set parameters like container name and S3 path via command-line arguments.
+
+## Core Architecture
+
+This solution supports two primary security patterns for granting server backup permissions:
+
+1. **Pattern A: AWS EC2 Instances (Recommended)**: An IAM Role is attached to the EC2 instance via an Instance Profile. The server automatically receives short-lived, temporary credentials, which is the most secure method.
+2. **Pattern B: External/On-Premises Servers**: A dedicated IAM User with a narrowly scoped policy and long-lived static access keys is created. This pattern requires careful credential management.
+
+## Setup and Deployment Guide
+
+### Step 1: Prerequisites & Dependencies
+
+#### System Dependencies (Debian/Ubuntu)
+
+You need `python3`, `docker`, and development libraries installed on the host machine.
+
+```bash
+sudo apt update
+sudo apt install -y python3-venv docker.io liblz4-dev
+```
+
+**Note on Docker Permissions:** The user running this script must have permission to execute `docker` commands. The easiest way is to add the user to the `docker` group:
+
+```bash
+sudo usermod -aG docker $USER
+# IMPORTANT: You must log out and log back in for this change to take effect.
+```
+
+#### AWS Account
+
+You will need an AWS Account with permissions to create S3, IAM, and CloudFormation resources.
+
+### Step 2: Deploying the AWS Infrastructure (CloudFormation)
+
+The cloud infrastructure (S3 buckets, IAM Role) is defined in `cloudformation_template.yaml` for consistent deployment.
+
+1. **Navigate to CloudFormation** in the AWS Console.
+2. Click **Create stack** -> **With new resources (standard)**.
+3. Select **Template is ready** -> **Direct input** and paste the contents of `cloudformation_template.yaml`.
+4. Specify a stack name and provide globally unique names for the S3 buckets.
+5. Acknowledge that CloudFormation will create IAM resources and click **Create stack**.
+6. Wait for `CREATE_COMPLETE` status. Note the `InstanceProfileName` from the **Outputs** tab.
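+
+If you prefer the command line, the same stack can be deployed with the AWS CLI. The sketch below is only illustrative: the stack name is a placeholder, and the `--parameter-overrides` keys must match whatever parameter names your copy of `cloudformation_template.yaml` actually defines.
+
+```bash
+# Deploy (or later update) the stack from the local template file. An IAM capability
+# flag is required because the template creates IAM resources.
+aws cloudformation deploy \
+  --template-file cloudformation_template.yaml \
+  --stack-name regularbm-backup \
+  --capabilities CAPABILITY_NAMED_IAM \
+  --parameter-overrides BackupBucketName=your-backup-bucket   # example key; use your template's parameters
+
+# After CREATE_COMPLETE, read the Outputs (including the Instance Profile name) from the CLI.
+aws cloudformation describe-stacks \
+  --stack-name regularbm-backup \
+  --query "Stacks[0].Outputs"
+```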
+
+### Step 3: Configuring Server Permissions
+
+Choose the appropriate path based on where your server is hosted.
+
+#### For AWS EC2 Instances (Pattern A)
+
+1. **Tag the EC2 Instance:** Apply a tag to the instance with Key `hostname` and a unique value (e.g., `openwebui-prod-01`).
+2. **Attach the IAM Role:** Select the instance, go to **Actions** -> **Security** -> **Modify IAM role**, and attach the Instance Profile created by CloudFormation.
+
+#### For External/On-Premises Servers (Pattern B)
+
+1. **Create a Dedicated IAM Policy:** In the IAM console, create a new policy that allows the `s3:PutObject` action on the desired S3 prefix (e.g., `arn:aws:s3:::your-backup-bucket/your-prefix/*`).
+2. **Create an IAM User:** Create a new user with **Programmatic access**, attaching the policy created above. Save the generated Access Key and Secret Key.
+3. **Configure Credentials on the Server:**
+   - Create the credentials file: `mkdir -p ~/.aws && nano ~/.aws/credentials`.
+   - Add the keys in the `[default]` profile format.
+   - Set strict permissions: `chmod 600 ~/.aws/credentials`.
+
+### Step 4: Preparing the Python Script Environment
+
+Due to modern Linux protections (`externally-managed-environment`), you have two good options for installing the required Python packages (`boto3`, `lz4`):
+
+1. **The OS-Native Way (Simple):** Install the dependencies using `apt`.
+   ```bash
+   sudo apt install python3-boto3 python3-lz4
+   ```
+
+2. **The Virtual Environment Way (Robust):** Create an isolated environment for the script.
+   ```bash
+   cd /path/to/your/script
+   python3 -m venv venv
+   source venv/bin/activate
+   pip install boto3 lz4
+   ```
+
+### Step 5: Usage and Configuration
+
+#### Script Arguments
+
+The `backup_openwebui.py` script is configured via command-line arguments:
+
+- `--s3-bucket` (Required): The name of your S3 bucket.
+- `--s3-region` (Required): The AWS region of your S3 bucket (e.g., `us-east-1`).
+- `--container-name`: The name of your Open WebUI Docker container. Defaults to `open-webui`.
+- `--s3-prefix`: A prefix (folder path) within your S3 bucket. Defaults to `openwebui_backups/`.
+- `--tmp-dir`: A temporary directory for staging files. Defaults to `/tmp`.
+
+#### Manual Execution
+
+Make the script executable (`chmod +x backup_openwebui.py`) and run it manually for testing:
+
+```bash
+# Basic usage
+./backup_openwebui.py --s3-bucket "your-backup-bucket" --s3-region "us-west-2"
+
+# With custom parameters
+./backup_openwebui.py \
+  --s3-bucket "your-backup-bucket" \
+  --s3-region "us-west-2" \
+  --container-name "my-custom-webui" \
+  --s3-prefix "production_backups/"
+```
+
+### Step 6: Automation with Cron
+
+Schedule the script for automated execution using `cron`.
+
+1. Open your crontab: `crontab -e`.
+2. Add a job entry. **Ensure you use absolute paths**. If you used a virtual environment, you must use the Python interpreter from that environment.
+
+   **Example Cron Job (runs daily at 09:00 UTC):**
+   ```crontab
+   0 9 * * * /path/to/script/venv/bin/python /path/to/script/backup_openwebui.py --s3-bucket "your-bucket" --s3-region "us-east-1" --s3-prefix "your-prefix/" >> /var/log/regularbm.log 2>&1
+   ```
+
+   **Explanation of the cron line:**
+   - `0 9 * * *`: The schedule (minute 0, hour 9, every day).
+   - `/path/to/script/.../python`: Absolute path to the Python interpreter. Use `/usr/bin/python3` for the OS-native install or the path to your `venv` interpreter.
+   - `/path/to/script/.../backup_openwebui.py`: Absolute path to the backup script.
+   - `--s3-bucket ...`: Your required script arguments.
+   - `>> /var/log/regularbm.log 2>&1`: **Highly recommended.** Redirects all output and errors to a log file for debugging.
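+
+After the first scheduled run, it is worth confirming that the job actually produced an archive. A minimal check, assuming the AWS CLI is installed on the server and using the same placeholder bucket, prefix, and region as the cron example (note that the narrowly scoped Pattern B credentials allow only `s3:PutObject`, so run the listing with an identity that can read the bucket):
+
+```bash
+# Did the cron job run, and did it log any errors?
+tail -n 50 /var/log/regularbm.log
+
+# Is a fresh .tar.lz4 archive sitting under the expected prefix?
+aws s3 ls "s3://your-bucket/your-prefix/" --region us-east-1
+```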
+
+## Troubleshooting Common Errors
+
+- **`externally-managed-environment`**: Your OS is protecting its Python installation. Follow the instructions in **Step 4** to use `apt` or a virtual environment.
+- **`Unable to locate credentials`**: The script cannot authenticate with AWS. If on EC2, ensure the IAM Instance Profile is attached (Step 3A). If external, ensure the `~/.aws/credentials` file is correctly created and secured (Step 3B).
+- **`AccessDenied` on `s3:PutObject`**: The IAM policy is incorrect. The user/role has authenticated successfully but is not authorized to write to the specified S3 path. Edit the IAM policy and ensure the `Resource` ARN exactly matches the bucket and prefix your script is trying to use.
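+
+For the last two errors, it can help to test authentication and the exact upload path outside the script. The commands below are a minimal, illustrative check with the AWS CLI; the bucket, prefix, and region are placeholders.
+
+```bash
+# Which identity (instance role or IAM user) do the current credentials resolve to?
+aws sts get-caller-identity
+
+# Try a tiny upload to the exact bucket/prefix the script targets; an AccessDenied
+# here confirms a policy problem rather than a script problem.
+echo "regularbm permission test" > /tmp/regularbm-test.txt
+aws s3 cp /tmp/regularbm-test.txt "s3://your-backup-bucket/your-prefix/regularbm-test.txt" --region us-east-1
+```
\ No newline at end of file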