# regularbm

## About
`regularbm` provides a robust, automated solution for backing up an Open WebUI instance running in Docker to an Amazon S3 bucket. It includes a Python script for performing the backup and an AWS CloudFormation template for deploying the necessary cloud infrastructure securely.
The script performs the following actions:
1. Exports all chats for every user into individual JSON files.
2. Exports the application configuration into a JSON file.
3. Backs up the entire SQLite database file (`webui.db`).
4. Packages all exported files into a single, compressed archive (`.tar.lz4`).
5. Uploads the compressed archive to the designated Amazon S3 bucket.
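For reference, the packaging and upload steps correspond roughly to the manual commands below. This is a sketch only: the database path inside the container and the archive naming are assumptions, and the chat/config exports have no simple one-liner.

```bash
# Rough manual equivalent of steps 3-5 (container path and names are assumptions).
STAMP=$(date +%Y%m%d)
docker cp open-webui:/app/backend/data/webui.db ./webui.db
tar -cf - webui.db | lz4 > "backup-$STAMP.tar.lz4"
aws s3 cp "backup-$STAMP.tar.lz4" s3://your-backup-bucket/openwebui_backups/
```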
## Features
- **Comprehensive Backup**: Captures user data, chats, and system configuration.
- **Efficient Archiving**: Uses LZ4 compression for fast packaging.
- **Secure Cloud Storage**: Uploads backups to a private S3 bucket.
- **Automated & Scheduled**: Designed to be run automatically on a schedule using `cron`.
- **Secure by Default**: Leverages IAM Roles for AWS EC2 instances, eliminating the need for long-lived access keys on the server.
- **Flexible**: Includes documented procedures for both AWS-native servers and external (on-premises or multi-cloud) servers.
- **Configurable**: Easily set parameters like container name and S3 path via command-line arguments.
## Core Architecture
This solution supports two primary security patterns for granting server backup permissions:
1. **Pattern A: AWS EC2 Instances (Recommended)**: An IAM Role is attached to the EC2 instance via an Instance Profile. The server automatically receives short-lived, temporary credentials, which is the most secure method.
2. **Pattern B: External/On-Premises Servers**: A dedicated IAM User with a narrowly scoped policy and long-lived static access keys is created. This pattern requires careful credential management.
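To illustrate why Pattern A is preferred: an instance with an attached role fetches short-lived credentials from the instance metadata service, so no secret ever lives on disk. A quick way to see this in action (IMDSv2; the role name returned is whatever your stack created):

```bash
# Request a metadata session token, then list the role the instance can assume.
TOKEN=$(curl -sf -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 300")
curl -sf -H "X-aws-ec2-metadata-token: $TOKEN" \
  "http://169.254.169.254/latest/meta-data/iam/security-credentials/"
```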
## Setup and Deployment Guide
### Step 1: Prerequisites & Dependencies
#### System Dependencies (Debian/Ubuntu)
You need `python3`, `docker`, and development libraries installed on the host machine.
```bash
sudo apt update
sudo apt install -y python3-venv docker.io liblz4-dev
```
**Note on Docker Permissions:** The user running this script must have permission to execute `docker` commands. The easiest way is to add the user to the `docker` group:
```bash
sudo usermod -aG docker $USER
# IMPORTANT: You must log out and log back in for this change to take effect.
```
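After starting a fresh session, verify the change with a harmless Docker command:

```bash
# Should list running containers without sudo; a permission error means
# the new group membership has not taken effect yet.
docker ps
```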
#### AWS Account
You will need an AWS Account with permissions to create S3, IAM, and CloudFormation resources.
### Step 2: Deploying the AWS Infrastructure (CloudFormation)
The cloud infrastructure (S3 buckets, IAM Role) is defined in `cloudformation_template.yaml` for consistent deployment.
1. **Navigate to CloudFormation** in the AWS Console.
2. Click **Create stack** -> **With new resources (standard)**.
3. Select **Template is ready** -> **Direct input** and paste the contents of `cloudformation_template.yaml`.
4. Specify a stack name and provide globally unique names for the S3 buckets.
5. Acknowledge that CloudFormation will create IAM resources and click **Create stack**.
6. Wait for `CREATE_COMPLETE` status. Note the `InstanceProfileName` from the **Outputs** tab.
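If you prefer the AWS CLI, an equivalent deployment looks roughly like this. The stack name and parameter key are placeholders; match `--parameter-overrides` to the parameters actually defined in `cloudformation_template.yaml`.

```bash
aws cloudformation deploy \
  --template-file cloudformation_template.yaml \
  --stack-name regularbm \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides BackupBucketName=your-backup-bucket

# Retrieve the Outputs (including InstanceProfileName) from the CLI:
aws cloudformation describe-stacks --stack-name regularbm \
  --query "Stacks[0].Outputs"
```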
### Step 3: Configuring Server Permissions
Choose the appropriate path based on where your server is hosted.
#### For AWS EC2 Instances (Pattern A)
1. **Tag the EC2 Instance:** Apply a tag to the instance with Key `hostname` and a unique value (e.g., `openwebui-prod-01`).
2. **Attach the IAM Role:** Select the instance, go to **Actions** -> **Security** -> **Modify IAM role**, and attach the Instance Profile created by CloudFormation.
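The same two steps can be done from the AWS CLI (the instance ID and profile name below are placeholders):

```bash
# 1. Tag the instance with a unique hostname value.
aws ec2 create-tags --resources i-0123456789abcdef0 \
  --tags Key=hostname,Value=openwebui-prod-01

# 2. Attach the Instance Profile created by CloudFormation.
aws ec2 associate-iam-instance-profile \
  --instance-id i-0123456789abcdef0 \
  --iam-instance-profile Name=YourInstanceProfileName
```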
#### For External/On-Premises Servers (Pattern B)
1. **Create a Dedicated IAM Policy:** In the IAM console, create a new policy that allows the `s3:PutObject` action on the desired S3 prefix (e.g., `arn:aws:s3:::your-backup-bucket/your-prefix/*`).
2. **Create an IAM User:** Create a new user with **Programmatic access**, attaching the policy created above. Save the generated Access Key and Secret Key.
3. **Configure Credentials on the Server:**
- Create the credentials file: `mkdir -p ~/.aws && nano ~/.aws/credentials`.
- Add the keys in the `[default]` profile format.
- Set strict permissions: `chmod 600 ~/.aws/credentials`.
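A minimal sketch of both artifacts, with the bucket name, prefix, and keys as placeholders (never commit real keys):

```bash
# Policy document for step 1; attach it to the IAM user.
cat > regularbm-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": "s3:PutObject",
    "Resource": "arn:aws:s3:::your-backup-bucket/your-prefix/*"
  }]
}
EOF

# Credentials file for step 3, in the [default] profile format.
mkdir -p ~/.aws
cat > ~/.aws/credentials <<'EOF'
[default]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY
EOF
chmod 600 ~/.aws/credentials
```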
### Step 4: Preparing the Python Script Environment
Due to modern Linux protections (`externally-managed-environment`), you have two good options for installing the required Python packages (`boto3`, `lz4`):
1. **The OS-Native Way (Simple):** Install the dependencies using `apt`.
```bash
sudo apt install python3-boto3 python3-lz4
```
2. **The Virtual Environment Way (Robust):** Create an isolated environment for the script.
```bash
cd /path/to/your/script
python3 -m venv venv
source venv/bin/activate
pip install boto3 lz4
```
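Either way, a quick import check confirms the environment is ready (run it inside the venv if you created one):

```bash
python3 -c "import boto3, lz4.frame; print('boto3', boto3.__version__)"
```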
### Step 5: Usage and Configuration
#### Script Arguments
The `backup_openwebui.py` script is configured via command-line arguments:
- `--s3-bucket` (Required): The name of your S3 bucket.
- `--s3-region` (Required): The AWS region of your S3 bucket (e.g., `us-east-1`).
- `--container-name`: The name of your Open WebUI Docker container. Defaults to `open-webui`.
- `--s3-prefix`: A prefix (folder path) within your S3 bucket. Defaults to `openwebui_backups/`.
- `--tmp-dir`: A temporary directory for staging files. Defaults to `/tmp`.
#### Manual Execution
Make the script executable (`chmod +x backup_openwebui.py`) and run it manually for testing:
```bash
# Basic usage
./backup_openwebui.py --s3-bucket "your-backup-bucket" --s3-region "us-west-2"

# With custom parameters
./backup_openwebui.py \
    --s3-bucket "your-backup-bucket" \
    --s3-region "us-west-2" \
    --container-name "my-custom-webui" \
    --s3-prefix "production_backups/"
```
### Step 6: Automation with Cron
Schedule the script for automated execution using `cron`.
1. Open your crontab: `crontab -e`.
2. Add a job entry. **Ensure you use absolute paths**. If you used a virtual environment, you must use the Python interpreter from that environment.
**Example Cron Job (runs daily at 09:00 UTC):**
```crontab
0 9 * * * /path/to/script/venv/bin/python /path/to/script/backup_openwebui.py --s3-bucket "your-bucket" --s3-region "us-east-1" --s3-prefix "your-prefix/" >> /var/log/regularbm.log 2>&1
```
**Explanation of the cron line:**
- `0 9 * * *`: The schedule (minute 0, hour 9, every day).
- `/path/to/script/.../python`: Absolute path to the Python interpreter. Use `/usr/bin/python3` for the OS-native install or the path to your `venv` interpreter.
- `/path/to/script/.../backup_openwebui.py`: Absolute path to the backup script.
- `--s3-bucket ...`: Your required script arguments.
- `>> /var/log/regularbm.log 2>&1`: **Highly recommended.** Redirects all output and errors to a log file for debugging.
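One practical caveat: `/var/log` is normally writable only by root, so pre-create the log file before the first scheduled run (assuming the cron job runs as your current user):

```bash
sudo touch /var/log/regularbm.log
sudo chown "$USER" /var/log/regularbm.log
```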
## Troubleshooting Common Errors
- **`externally-managed-environment`**: Your OS is protecting its Python installation. Follow the instructions in **Step 4** to use `apt` or a virtual environment.
- **`Unable to locate credentials`**: The script cannot authenticate with AWS. If on EC2, ensure the IAM Instance Profile is attached (Step 3A). If external, ensure the `~/.aws/credentials` file is correctly created and secured (Step 3B).
- **`AccessDenied` on `s3:PutObject`**: The IAM policy is incorrect. The user/role has authenticated successfully but is not authorized to write to the specified S3 path. Edit the IAM policy and ensure the `Resource` ARN exactly matches the bucket and prefix your script is trying to use.
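Two quick diagnostics for the credential and permission errors above (bucket and prefix are placeholders):

```bash
# Which identity, if any, can this server authenticate as?
aws sts get-caller-identity

# Can that identity write to the exact path the script uses?
echo test > /tmp/regularbm-permtest.txt
aws s3 cp /tmp/regularbm-permtest.txt s3://your-backup-bucket/your-prefix/permtest.txt
```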