# 🧹 Filename Sanitizer

A Bash script that renames the files in a directory by removing unsafe characters, handling name collisions, and enforcing consistent lowercase filenames.

## 🧩 Features

- **Safe character handling**: Removes all non-alphanumeric characters except underscores, hyphens, and periods
- **Unicode cleanup**: Converts special characters like `’`, `？`, `（`, `）` to standard ASCII
- **Collision avoidance**: Adds numeric suffixes (`_1`, `_2`) to prevent overwriting
- **Case normalization**: Converts filenames to lowercase
- **Space handling**: Replaces spaces with underscores
- **Truncation**: Limits base filenames to 250 characters to stay under the common 255-character filesystem limit
- **Dry-run support**: Preview changes before applying them

## 🚀 Usage

```bash
# Clone the repo
git clone https://github.com/yourname/yourrepo.git
cd yourrepo

# Make the script executable
chmod +x rename_safe.sh

# Dry run (test mode)
./rename_safe.sh --dry-run /path/to/files

# Actual renaming
./rename_safe.sh /path/to/files
```

> **Windows Users**: Run this script in [WSL](https://learn.microsoft.com/en-us/windows/wsl/) or [Git Bash](https://git-scm.com/downloads). Native Windows CMD/PowerShell won't work.

## 🧪 Test Script

```bash
# Run validation tests
./test_rename.sh
```

The test script creates files with:

- Special characters (`%`, `’`, `!`, `(`, `)`, `？`)
- Multi-dot extensions (`.en.vtt`)
- Hidden files (`.Hidden File`)
- Long filenames (near the 255-character limit)
- Duplicate filenames
- Mixed-case names

## ⚙️ Configuration

Edit `rename_safe.sh` to customize:

- Max filename length: `base_truncated="${sanitized_base:0:250}"`
- Allowed characters: `sed 's/[^a-zA-Z0-9._-]//g'`
- Hidden file support: uncomment the `.??*` loop

## ⚠️ System-Specific Notes

### 📁 Filesystem Limitations

| OS | Max Filename Length | Notes |
|-----|----------------------|-------|
| Linux | 255 bytes | UTF-8 encoded characters count as 1-4 bytes each |
| Windows | 260 characters (full path) | The 260-character `MAX_PATH` limit applies to the whole path; NTFS limits each filename to 255 UTF-16 code units, and extended-length paths can reach 32,767 characters |
| macOS | 255 characters | HFS+ and APFS both use a 255-character limit |

> ⚠️ **This script truncates base names to 250 characters** to allow room for extensions.
> That leaves room for only five more characters, so with longer extensions (e.g., `.tar.gz`, which is seven) manual length checking may still be required.

## 🔧 Cross-Platform Compatibility

This script is compatible with:

- **Linux** (Debian, Ubuntu, Arch, etc.)
- **macOS** (requires GNU coreutils installed via Homebrew)
- **WSL2** (Windows Subsystem for Linux)
- **Git Bash** (on Windows)

### ⚠️ Not Compatible With:

- Native Windows CMD/PowerShell (due to `mv`, `tr`, `sed` differences)
- Legacy systems with non-Bash shells (e.g., `sh` or `dash`)

### ✅ Cross-Platform Tips

- Save scripts with **LF line endings** (not CRLF)
- Run `dos2unix rename_safe.sh` if the script was edited on Windows
- Avoid filenames longer than 255 characters to ensure portability
- Test on the target system before bulk renaming

## 📦 Example

Before:

```
"Let’s Play A Game!.en.vtt"
"401(k) Plan.en.vtt"
"VeryLongFilename_AAAA...AAA.txt"
"Duplicate Name"
```

After:

```
"lets_play_a_game.en.vtt"
"401k_plan.en.vtt"
"verylongfilename_aaaaa...aaa.txt"
"duplicate_name"
"duplicate_name_1"
```

## 📝 License

MIT License - see [LICENSE](LICENSE) for details
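
## 🧰 Appendix: Sketch of the Sanitization Steps

For readers who want to see how the documented steps could fit together, here is a minimal sketch. It is not the contents of `rename_safe.sh`: the function names `sanitize_name`, `truncate_base`, and `unique_name` are hypothetical, the "base" is assumed to be everything before the first dot, and the suffix is assumed to go before the extension. It only reuses the `sed` expression and the 250-character cut quoted in the Configuration section. The Unicode-to-ASCII conversion is left out, so non-ASCII bytes are simply dropped, and hidden files are not handled.

```bash
#!/usr/bin/env bash
# Illustrative helpers only; the names and the base/extension split are assumptions.

# sanitize_name NAME: lowercase, spaces to underscores, strip disallowed bytes.
sanitize_name() {
  local name
  name="$1"
  # LC_ALL=C makes tr and sed work byte-wise, so only ASCII letters are lowercased.
  name=$(printf '%s' "$name" | LC_ALL=C tr '[:upper:]' '[:lower:]' | LC_ALL=C tr ' ' '_')
  # Same character class as the README's "Allowed characters" setting.
  name=$(printf '%s' "$name" | LC_ALL=C sed 's/[^a-zA-Z0-9._-]//g')
  printf '%s\n' "$name"
}

# truncate_base NAME: cut the part before the first dot to 250 characters.
truncate_base() {
  local name base ext
  name="$1"
  base="${name%%.*}"       # text before the first dot (assumed "base")
  ext="${name#"$base"}"    # everything from the first dot on, or empty
  printf '%s\n' "${base:0:250}${ext}"
}

# unique_name DIR NAME: print NAME, or NAME with _1, _2, ... inserted before
# the extension until no entry with that name exists in DIR.
unique_name() {
  local dir name base ext candidate n
  dir="$1"
  name="$2"
  base="${name%%.*}"
  ext="${name#"$base"}"
  candidate="$name"
  n=1
  while [ -e "$dir/$candidate" ]; do
    candidate="${base}_${n}${ext}"
    n=$((n + 1))
  done
  printf '%s\n' "$candidate"
}

# Example calls using the names from the Example section.
sanitize_name "Let’s Play A Game!.en.vtt"   # prints lets_play_a_game.en.vtt
sanitize_name "401(k) Plan.en.vtt"          # prints 401k_plan.en.vtt
```

Keeping each step in its own function makes it easier to adjust one rule (for instance the allowed character set) without touching the collision logic, and a dry run can print the result of `unique_name` instead of calling `mv`.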