# 🧹 Filename Sanitizer

A Bash script that renames the files in a directory by removing unsafe characters, resolving name collisions, and normalizing filenames to lowercase.

## 🧩 Features

- **Safe character handling**: Removes every character except ASCII letters, digits, underscores, hyphens, and periods
- **Unicode cleanup**: Removes typographic and special characters such as `’`, `?`, `(`, and `)`, leaving plain ASCII
- **Collision avoidance**: Adds numeric suffixes (`_1`, `_2`) to prevent overwriting
- **Case normalization**: Converts filenames to lowercase
- **Space handling**: Replaces spaces with underscores
- **Truncation**: Limits base filenames to 250 characters to stay under the common 255-character filename limit
- **Dry-run support**: Preview changes before applying them
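
The rules listed above can be sketched as a small pipeline. This is an illustration only, not the code in `rename_safe.sh`: the function names `clean_part` and `sanitize_name` are hypothetical, and splitting the extension at the last dot is an assumption about how the script separates base name and extension.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not from rename_safe.sh): lowercase, replace spaces
# with underscores, then delete every byte that is not an ASCII lowercase
# letter, digit, dot, underscore, or hyphen.
clean_part() {
    printf '%s' "$1" \
        | LC_ALL=C tr '[:upper:]' '[:lower:]' \
        | LC_ALL=C tr ' ' '_' \
        | LC_ALL=C tr -cd 'a-z0-9._-'
}

# Hypothetical helper: sanitize one filename and print the result.
sanitize_name() {
    local name="$1" base ext=""

    # Assumption: split at the last dot, and only when at least one
    # character precedes a dot (so ".Hidden File" keeps no extension).
    if [[ "$name" == ?*.* ]]; then
        base="${name%.*}"
        ext=".${name##*.}"
    else
        base="$name"
    fi

    base="$(clean_part "$base")"
    ext="$(clean_part "$ext")"

    # Truncate only the base name to 250 characters, then re-attach the extension.
    printf '%s%s\n' "${base:0:250}" "$ext"
}
```

For example, `sanitize_name "401(k) Plan.en.vtt"` prints `401k_plan.en.vtt`. Setting `LC_ALL=C` makes `tr` treat the input as raw bytes, so multi-byte characters such as `’` are dropped byte by byte regardless of the user's locale.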

## 🚀 Usage

```bash
# Clone the repo
git clone https://github.com/yourname/yourrepo.git
cd yourrepo

# Make the script executable
chmod +x rename_safe.sh

# Dry run (test mode)
./rename_safe.sh --dry-run /path/to/files

# Actual renaming
./rename_safe.sh /path/to/files
```
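
As a rough idea of what a `--dry-run` flag like the one above typically does, here is a minimal sketch. It is not taken from `rename_safe.sh`; `parse_args`, `do_rename`, `DRY_RUN`, and `TARGET_DIR` are hypothetical names chosen for illustration.

```bash
#!/usr/bin/env bash

# Hypothetical globals (not from rename_safe.sh)
DRY_RUN=0
TARGET_DIR="."

# Recognize --dry-run anywhere on the command line; treat any other
# argument as the directory to process.
parse_args() {
    DRY_RUN=0
    TARGET_DIR="."
    while [[ $# -gt 0 ]]; do
        case "$1" in
            --dry-run) DRY_RUN=1 ;;
            *)         TARGET_DIR="$1" ;;
        esac
        shift
    done
}

# In dry-run mode only report the planned rename; otherwise perform it.
# mv -n refuses to overwrite an existing destination.
do_rename() {
    local src="$1" dst="$2"
    if [[ "$DRY_RUN" -eq 1 ]]; then
        printf '[dry-run] %s -> %s\n' "$src" "$dst"
    else
        mv -n -- "$src" "$dst"
    fi
}
```

Routing every rename through one function keeps the dry-run and real code paths identical up to the final `mv`, so the preview reflects what would actually happen.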

> **Windows users**: Run this script in [WSL](https://learn.microsoft.com/en-us/windows/wsl/) or [Git Bash](https://git-scm.com/downloads). Native Windows CMD and PowerShell are not supported.

## 🧪 Test Script

```bash
# Run validation tests
./test_rename.sh
```

The test script creates files with:

- Special characters (`%`, `’`, `!`, `(`, `)`, `?`)
- Multi-dot extensions (`.en.vtt`)
- Hidden files (`.Hidden File`)
- Long filenames (near the 255-character limit)
- Duplicate filenames
- Mixed-case names
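
A fixture set along these lines could be generated as follows. This is a sketch, not the contents of `test_rename.sh`: the function name `make_fixtures` and the specific filenames are made up for illustration.

```bash
#!/usr/bin/env bash

# Hypothetical helper: populate a directory with awkward filenames.
make_fixtures() {
    local dir="$1" long
    mkdir -p -- "$dir" || return 1

    touch -- \
        "$dir/100% Done!.txt" \
        "$dir/401(k) Plan.en.vtt" \
        "$dir/.Hidden File" \
        "$dir/Duplicate Name" \
        "$dir/Duplicate_Name" \
        "$dir/MiXeD Case.TXT"

    # Long name: 240 "A" characters plus ".txt" stays below 255 bytes.
    long="$(printf 'A%.0s' {1..240})"
    touch -- "$dir/${long}.txt"
}
```

`Duplicate Name` and `Duplicate_Name` differ on disk but would collapse to the same sanitized name, which exercises collision handling even on case-insensitive filesystems.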

## ⚙️ Configuration

Edit `rename_safe.sh` to customize:

- Max filename length: `base_truncated="${sanitized_base:0:250}"`
- Allowed characters: `sed 's/[^a-zA-Z0-9._-]//g'`
- Hidden file support: uncomment the `.??*` loop
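
To illustrate what the `.??*` pattern does, here is a minimal sketch of a loop that visits both regular and hidden entries. The function name `list_targets` is hypothetical and the loop in `rename_safe.sh` may be structured differently.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the names a rename loop would visit.
list_targets() {
    local dir="$1" f
    # "*" skips dotfiles by default; ".??*" matches names that start with a
    # dot followed by at least two more characters, which excludes "." and "..".
    for f in "$dir"/* "$dir"/.??*; do
        [[ -e "$f" ]] || continue   # skip a pattern that matched nothing
        printf '%s\n' "${f##*/}"
    done
}
```

One caveat of this pattern: a hidden file with a single character after the dot, such as `.x`, is not matched by `.??*` and would be skipped.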

## ⚠️ System-Specific Notes

### 📏 Filesystem Limitations

| OS | Max Filename Length | Notes |
|-----|----------------------|-------|
| Linux | 255 bytes | Each UTF-8 encoded character takes 1-4 bytes |
| Windows | 255 characters | The default full-path limit is 260 characters (`MAX_PATH`); paths of up to 32,767 characters are possible when long paths are enabled |
| macOS | 255 characters | HFS+ and APFS both use a 255-character limit |

> ⚠️ **This script truncates base names to 250 characters** to allow room for extensions.
> For extensions longer than five characters (e.g., `.tar.gz`), manual length checking may still be required.
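
Because the Linux limit is counted in bytes rather than characters, a manual length check can look like the sketch below. The helper names `name_bytes` and `fits_limit` are hypothetical and not part of `rename_safe.sh`.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the length of a filename in bytes.
# wc -c counts bytes, so multi-byte UTF-8 characters count more than once.
name_bytes() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# Hypothetical helper: succeed when the name fits the limit (default 255 bytes).
fits_limit() {
    local name="$1" limit="${2:-255}"
    [[ "$(name_bytes "$name")" -le "$limit" ]]
}
```

A 250-character base name followed by `.tar.gz` comes to 257 bytes, so `fits_limit` fails for it, which is the case the note above warns about.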

## 🔧 Cross-Platform Compatibility

This script is compatible with:

- **Linux** (Debian, Ubuntu, Arch, etc.)
- **macOS** (requires GNU coreutils installed via Homebrew)
- **WSL2** (Windows Subsystem for Linux)
- **Git Bash** (on Windows)

### ⚠️ Not Compatible With:

- Native Windows CMD/PowerShell (the script relies on `mv`, `tr`, and `sed`, which are missing or behave differently there)
- Legacy systems with non-Bash shells (e.g., `sh` or `dash`)

### ✅ Cross-Platform Tips

- Save scripts with **LF line endings** (not CRLF)
- Use `dos2unix rename_safe.sh` if editing on Windows
- Keep filenames at or under 255 characters to ensure portability
- Test on the target system before bulk renaming
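
To check whether a script was saved with CRLF line endings, and to fix it when `dos2unix` is not installed, something like the following sketch can be used. The function names `has_crlf` and `strip_crlf` are hypothetical.

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeed when the file contains a carriage return.
has_crlf() {
    LC_ALL=C grep -q $'\r' -- "$1"
}

# Hypothetical helper: remove all carriage returns from the file.
# The result is written back with a redirection (not mv) so the file keeps
# its permissions, including the executable bit.
strip_crlf() {
    local file="$1" tmp
    tmp="$(mktemp)" || return 1
    tr -d '\r' < "$file" > "$tmp" && cat "$tmp" > "$file"
    local status=$?
    rm -f -- "$tmp"
    return "$status"
}
```

Note that `tr -d '\r'` removes every carriage return, not only those at line ends, which is normally fine for shell scripts.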

## 📦 Example

Before:

```
"Let’s Play A Game!.en.vtt"
"401(k) Plan.en.vtt"
"VeryLongFilename_AAAA...AAA.txt"
"Duplicate Name"
```

After:

```
"lets_play_a_game.en.vtt"
"401k_plan.en.vtt"
"verylongfilename_aaaa...aaa.txt"
"duplicate_name"
"duplicate_name_1"
```

Here `duplicate_name_1` comes from a second file whose sanitized name collides with `duplicate_name`.
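
The `_1` suffix in the output comes from collision avoidance. A minimal sketch of that idea follows. The function name `unique_name` is hypothetical, and placing the suffix before the last extension is an assumption rather than confirmed behavior of `rename_safe.sh`.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print a name that does not yet exist in the directory,
# appending _1, _2, ... to the base name until a free one is found.
unique_name() {
    local dir="$1" name="$2" base ext="" candidate n=1

    # Assumption: the suffix goes before the last extension, if there is one.
    if [[ "$name" == ?*.* ]]; then
        base="${name%.*}"
        ext=".${name##*.}"
    else
        base="$name"
    fi

    candidate="$name"
    # -L also catches dangling symlinks, which -e reports as missing.
    while [[ -e "$dir/$candidate" || -L "$dir/$candidate" ]]; do
        candidate="${base}_${n}${ext}"
        n=$((n + 1))
    done
    printf '%s\n' "$candidate"
}
```

If `duplicate_name` already exists in the directory, `unique_name "$dir" duplicate_name` prints `duplicate_name_1`. A name that is still free is returned unchanged.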

## 📝 License

MIT License - see [LICENSE](LICENSE) for details.