Skip to content

Latest commit

 

History

History
839 lines (641 loc) · 22.8 KB

File metadata and controls

839 lines (641 loc) · 22.8 KB

Building a Modular CLI Application with Bash: Architecture Lessons from CompressKit

Posted: November 4, 2024
Author: CrisisCore-Systems
Tags: Bash, CLI, Architecture, Software Design, Modular Design

Introduction

"Shell scripts don't need architecture."

This is what the senior developer said during code review, dismissing the junior engineer's concerns about the 2,000-line monolithic script that had grown organically over two years. Three months later, that same script caused a production outage that lasted six hours. The bug? A variable collision between two functions written by different developers, neither of whom understood the full scope of the script anymore.

The incident cost the company $200,000 in lost revenue, countless customer complaints, and a scramble to refactor the unmaintainable mess. But it didn't have to be this way.

Meet David, a DevOps engineer at a rapidly growing tech company. When he joined, he inherited a collection of shell scripts that automated their deployment pipeline. Each script started as a "quick solution" to an immediate problem:

"Just add a few lines here..."
"We'll clean it up later..."
"It's only 50 lines, we don't need structure..."

Fast forward six months: The "50-line script" had metastasized into a 3,000-line monster. Functions called other functions in unpredictable ways. Global variables littered the code like landmines. Making a simple change meant reading through hundreds of lines to understand the context. Testing? Forget it—nobody dared to change anything for fear of breaking something else.

David spent three frustrating weeks debugging a subtle race condition that occurred only in production. The root cause? Two functions were modifying the same global variable, and the execution order varied depending on system load. "If this script had been properly structured," David later reflected, "I would have found that bug in minutes, not weeks."

If you've ever thought "shell scripts don't need architecture," you're not alone. Many developers treat shell scripts as quick hacks—throw some commands in a file, make it executable, and call it done. But as scripts grow in complexity, this approach leads to unmaintainable messes: tangled dependencies, duplicated code, mysterious bugs that appear and disappear like ghosts, and a growing fear of touching anything.

CompressKit challenges this notion by demonstrating that Bash scripts can—and should—be architected with the same rigor as any other software project. From matrix rain animations to interactive menus, from color-coded progress indicators to responsive layouts—all implemented in pure Bash with a clean, modular architecture that developers actually enjoy working with.

In this post, we'll explore the architectural principles behind CompressKit and show you how to transform your shell scripts from fragile utilities into maintainable, extensible applications. Whether you're managing a five-file automation suite or a complex CLI tool, these patterns will help you write code that stands the test of time.

The Case for Modular Shell Scripts

Why Modularity Matters

Consider a typical monolithic shell script:

#!/bin/bash
# A 2000-line script that does everything

# Global variables scattered throughout
OUTPUT_DIR="/tmp/output"
QUALITY="medium"

# Functions that depend on globals
function compress_file() {
    # Uses global OUTPUT_DIR
    # Modifies global state
    # Hard to test
}

# More globals, more functions, all tangled together...

This approach creates several problems:

  1. Testing Nightmare: You can't test individual functions without running the entire script
  2. Reusability Issues: Can't reuse functions in other scripts without copy-paste
  3. Maintenance Hell: Changes ripple unpredictably through the codebase
  4. Collaboration Barriers: Multiple developers can't work on different parts simultaneously
  5. Debugging Difficulty: Finding bugs requires understanding the entire script

CompressKit solves these problems through modular architecture.

CompressKit's Architectural Principles

1. Separation of Concerns

CompressKit separates functionality into distinct modules, each with a single, well-defined responsibility:

lib/
├── compress.sh          # PDF compression logic
├── config.sh            # Configuration management
├── ui.sh                # User interface components
├── logger.sh            # Logging functionality
├── security_module.sh   # Security operations
├── premium.sh           # Premium feature management
├── license_verifier.sh  # License validation
└── error_handler.sh     # Error handling

Each module is:

  • Self-contained: Has minimal external dependencies
  • Single-purpose: Does one thing well
  • Testable: Can be tested in isolation
  • Reusable: Can be used in multiple contexts

2. Layered Architecture

CompressKit implements a clear layering strategy:

┌───────────────────────────────────────┐
│     Entry Point Layer                 │  High-level
│  (compresskit, compresskit-pdf)       │  interface
└────────────┬──────────────────────────┘
             ↓
┌───────────────────────────────────────┐
│     Application Layer                 │  Business
│  (compress, config, premium)          │  logic
└────────────┬──────────────────────────┘
             ↓
┌───────────────────────────────────────┐
│     Service Layer                     │  Supporting
│  (ui, logger, error_handler)          │  services
└────────────┬──────────────────────────┘
             ↓
┌───────────────────────────────────────┐
│     Foundation Layer                  │  Core
│  (security, validation, file_ops)     │  utilities
└───────────────────────────────────────┘

Key Principle: Higher layers depend on lower layers, never the reverse. This prevents circular dependencies and keeps the architecture clean.

3. Dependency Injection

Rather than hardcoding dependencies, CompressKit uses explicit sourcing:

#!/bin/bash
# compresskit entry point

# Explicitly declare dependencies
source "./lib/branding.sh"
source "./lib/config.sh"
source "./lib/premium.sh"
source "./lib/compress.sh"

# Now use the sourced functions
show_banner
compress_pdf "$1" "$2" "."

Benefits:

  • Dependencies are explicit and visible
  • Easy to mock dependencies for testing
  • Clear understanding of what each script needs

Building Blocks: Module Design Patterns

Pattern 1: The Service Module

Service modules provide specific functionality without side effects:

#!/bin/bash
# lib/logger.sh - A service module

# Module metadata
readonly LOGGER_VERSION="1.0.0"

# Module configuration
LOGGER_LOG_DIR="${HOME}/.config/compresskit/logs"
LOGGER_LOG_FILE="${LOGGER_LOG_DIR}/compresskit.log"

# Initialization function
init_logger() {
    # Ensure log directory exists
    mkdir -p "$LOGGER_LOG_DIR"
    chmod 700 "$LOGGER_LOG_DIR"
    
    # Rotate logs if needed
    if [ -f "$LOGGER_LOG_FILE" ]; then
        local size=$(stat -f%z "$LOGGER_LOG_FILE" 2>/dev/null || stat -c%s "$LOGGER_LOG_FILE" 2>/dev/null)
        if [ "$size" -gt 1048576 ]; then  # 1MB
            mv "$LOGGER_LOG_FILE" "${LOGGER_LOG_FILE}.old"
        fi
    fi
}

# Public API functions
log_error() {
    local message="$1"
    local timestamp=$(date '+%Y-%m-%d %H:%M:%S')
    echo "[${timestamp}] [ERROR] ${message}" >> "$LOGGER_LOG_FILE"
    echo "[ERROR] ${message}" >&2
}

log_info() {
    local message="$1"
    local timestamp=$(date '+%Y-%m-%d %H:%M:%S')
    echo "[${timestamp}] [INFO] ${message}" >> "$LOGGER_LOG_FILE"
    echo "[INFO] ${message}"
}

# Initialize on module load
init_logger

Key Characteristics:

  • Clear public API
  • Initialization function
  • Configuration at the top
  • Self-documenting code

Pattern 2: The Integration Module

Integration modules coordinate between multiple services:

#!/bin/bash
# lib/core.sh - Integration module

# Load dependencies
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
source "${SCRIPT_DIR}/logger.sh"
source "${SCRIPT_DIR}/config.sh"
source "${SCRIPT_DIR}/error_handler.sh"
source "${SCRIPT_DIR}/compress.sh"

# Initialize system
init_system() {
    log_info "Initializing CompressKit..."
    
    # Check dependencies
    if ! check_dependencies; then
        log_error "Missing required dependencies"
        return 1
    fi
    
    # Load configuration
    if ! load_config; then
        log_warning "Using default configuration"
    fi
    
    log_info "System initialized successfully"
    return 0
}

# Check required dependencies
check_dependencies() {
    local required_commands=("gs" "bash")
    local missing=()
    
    for cmd in "${required_commands[@]}"; do
        if ! command -v "$cmd" &>/dev/null; then
            missing+=("$cmd")
        fi
    done
    
    if [ ${#missing[@]} -gt 0 ]; then
        log_error "Missing commands: ${missing[*]}"
        return 1
    fi
    
    return 0
}

# Orchestrate compression workflow
compress_workflow() {
    local file="$1"
    local quality="$2"
    local output_dir="$3"
    
    # Validate inputs
    if ! validate_compression_params "$file" "$quality" "$output_dir"; then
        return 1
    fi
    
    # Execute compression
    if ! compress_pdf "$file" "$quality" "$output_dir"; then
        handle_compression_error "$file"
        return 1
    fi
    
    # Post-processing
    log_compression_success "$file"
    update_statistics
    
    return 0
}

Key Characteristics:

  • Coordinates multiple modules
  • Implements business workflows
  • Handles cross-cutting concerns
  • Provides high-level APIs

Pattern 3: The Configuration Module

Configuration modules manage application state:

#!/bin/bash
# lib/config.sh - Configuration module

# Default configuration
declare -A CONFIG=(
    ["default_quality"]="medium"
    ["output_dir"]="."
    ["create_backup"]="true"
    ["log_level"]="info"
)

# Configuration file location
CONFIG_DIR="${HOME}/.config/compresskit"
CONFIG_FILE="${CONFIG_DIR}/config"

# Load configuration from file
load_config() {
    if [ ! -f "$CONFIG_FILE" ]; then
        save_default_config
        return 0
    fi
    
    while IFS='=' read -r key value; do
        # Skip comments and empty lines
        [[ "$key" =~ ^#.*$ ]] && continue
        [[ -z "$key" ]] && continue
        
        # Update configuration
        CONFIG["$key"]="$value"
    done < "$CONFIG_FILE"
    
    return 0
}

# Save configuration to file
save_config() {
    mkdir -p "$CONFIG_DIR"
    chmod 700 "$CONFIG_DIR"
    
    # Write configuration
    {
        echo "# CompressKit Configuration"
        echo "# Generated on $(date)"
        echo ""
        
        for key in "${!CONFIG[@]}"; do
            echo "${key}=${CONFIG[$key]}"
        done
    } > "$CONFIG_FILE"
    
    chmod 600 "$CONFIG_FILE"
}

# Get configuration value
get_config() {
    local key="$1"
    local default="$2"
    
    if [ -n "${CONFIG[$key]}" ]; then
        echo "${CONFIG[$key]}"
    else
        echo "$default"
    fi
}

# Set configuration value
set_config() {
    local key="$1"
    local value="$2"
    
    CONFIG["$key"]="$value"
    save_config
}

# Save default configuration
save_default_config() {
    save_config
}

Key Characteristics:

  • Centralized configuration management
  • Persistent storage
  • Default value handling
  • Clean get/set API

Advanced Architectural Patterns

Pattern 4: The Security Wrapper

CompressKit wraps dangerous operations in security checks:

#!/bin/bash
# lib/secure_compress.sh - Security wrapper

source "${SCRIPT_DIR}/security_module.sh"
source "${SCRIPT_DIR}/compress.sh"

# Secure wrapper for PDF compression
secure_compress_pdf() {
    local input_file="$1"
    local quality="$2"
    local output_dir="$3"
    
    # Security validation
    local safe_input
    safe_input=$(safe_path "$input_file") || {
        log_security_incident "path_traversal" "Attempted access: $input_file"
        return 1
    }
    
    local safe_output
    safe_output=$(safe_path "$output_dir") || {
        log_security_incident "path_traversal" "Attempted output: $output_dir"
        return 1
    }
    
    # Quality validation
    if ! validate_input "$quality" "low" "medium" "high" "ultra"; then
        log_security_incident "invalid_input" "Invalid quality: $quality"
        return 1
    fi
    
    # Permission check
    if ! check_file_permissions "$safe_input"; then
        log_error "Insufficient permissions for: $safe_input"
        return 1
    fi
    
    # Call underlying compression function with validated parameters
    compress_pdf "$safe_input" "$quality" "$safe_output"
}

Key Characteristics:

  • Single Responsibility: Security validation
  • Wraps unsafe operations
  • Centralized security policy
  • Comprehensive logging

Pattern 5: The Error Handler

Centralized error handling improves maintainability:

#!/bin/bash
# lib/error_handler.sh - Centralized error handling

# Error codes
readonly ERROR_SUCCESS=0
readonly ERROR_INVALID_INPUT=1
readonly ERROR_FILE_NOT_FOUND=2
readonly ERROR_PERMISSION_DENIED=3
readonly ERROR_DEPENDENCY_MISSING=4
readonly ERROR_COMPRESSION_FAILED=5

# Error context stack
declare -a ERROR_CONTEXT_STACK=()

# Push error context
push_error_context() {
    local context="$1"
    ERROR_CONTEXT_STACK+=("$context")
}

# Pop error context
pop_error_context() {
    unset 'ERROR_CONTEXT_STACK[-1]'
}

# Handle error with context
handle_error() {
    local error_code="$1"
    local error_message="$2"
    
    # Build context string
    local context=""
    if [ ${#ERROR_CONTEXT_STACK[@]} -gt 0 ]; then
        context=" (Context: ${ERROR_CONTEXT_STACK[*]})"
    fi
    
    # Log with context
    log_error "${error_message}${context}"
    
    # Error-specific handling
    case $error_code in
        $ERROR_DEPENDENCY_MISSING)
            suggest_dependency_install
            ;;
        $ERROR_PERMISSION_DENIED)
            suggest_permission_fix
            ;;
        $ERROR_COMPRESSION_FAILED)
            attempt_recovery
            ;;
    esac
    
    return $error_code
}

# Usage example
compress_with_error_handling() {
    local file="$1"
    
    push_error_context "compress_workflow"
    
    if ! compress_pdf "$file" "medium" "."; then
        handle_error $ERROR_COMPRESSION_FAILED "Failed to compress $file"
        pop_error_context
        return 1
    fi
    
    pop_error_context
    return 0
}

Key Characteristics:

  • Consistent error handling
  • Context tracking
  • Recovery mechanisms
  • User-friendly messages

Testing Modular Shell Scripts

Modularity makes testing straightforward:

#!/bin/bash
# tests/test_logger.sh

# Set up test environment
setup_test() {
    export LOGGER_LOG_DIR="/tmp/compresskit-test-$$"
    mkdir -p "$LOGGER_LOG_DIR"
}

# Clean up after tests
teardown_test() {
    rm -rf "/tmp/compresskit-test-$$"
}

# Test log_error function
test_log_error() {
    setup_test
    
    # Source the module
    source "../lib/logger.sh"
    
    # Test
    log_error "Test error message"
    
    # Verify
    if grep -q "Test error message" "$LOGGER_LOG_FILE"; then
        echo "PASS: log_error works correctly"
        teardown_test
        return 0
    else
        echo "FAIL: log_error did not write to log"
        teardown_test
        return 1
    fi
}

# Run tests
test_log_error

Test Organization

CompressKit organizes tests to match the module structure:

tests/
├── test_suite.sh           # Test orchestrator
├── test_compression.sh     # Tests for lib/compress.sh
├── test_security.sh        # Tests for lib/security_module.sh
├── test_config.sh          # Tests for lib/config.sh
└── test_integration.sh     # Integration tests

Performance Considerations

Lazy Loading

Load modules only when needed:

#!/bin/bash
# Entry point with lazy loading

# Load core modules immediately
source "./lib/logger.sh"
source "./lib/config.sh"

# Lazy load function
load_module() {
    local module="$1"
    local module_path="./lib/${module}.sh"
    
    if [ -f "$module_path" ]; then
        source "$module_path"
        return 0
    else
        log_error "Module not found: $module"
        return 1
    fi
}

# Main function
main() {
    local command="$1"
    
    case "$command" in
        compress)
            # Load compression modules only when needed
            load_module "compress"
            load_module "security_module"
            compress_workflow "$2" "$3"
            ;;
        config)
            # Config modules already loaded
            manage_config "$2" "$3"
            ;;
        premium)
            # Load premium modules on demand
            load_module "premium"
            load_module "license_verifier"
            show_premium_features
            ;;
    esac
}

main "$@"

Caching Results

Cache expensive operations:

# Cache dependency checks
declare -A DEPENDENCY_CACHE=()

check_dependency_cached() {
    local cmd="$1"
    
    # Check cache first
    if [ -n "${DEPENDENCY_CACHE[$cmd]}" ]; then
        return "${DEPENDENCY_CACHE[$cmd]}"
    fi
    
    # Perform check
    if command -v "$cmd" &>/dev/null; then
        DEPENDENCY_CACHE[$cmd]=0
        return 0
    else
        DEPENDENCY_CACHE[$cmd]=1
        return 1
    fi
}

Documentation Standards

Good documentation is crucial for modular code:

#!/bin/bash
# lib/compress.sh - PDF Compression Module
#
# This module provides PDF compression functionality using GhostScript.
# It supports multiple quality levels and implements comprehensive
# security validation.
#
# Dependencies:
#   - GhostScript (gs command)
#   - lib/security_module.sh
#   - lib/logger.sh
#
# Public Functions:
#   compress_pdf(file, quality, output_dir) - Compress a PDF file
#   get_compression_ratio(original, compressed) - Calculate compression ratio
#
# Usage:
#   source "./lib/compress.sh"
#   compress_pdf "document.pdf" "medium" "/output"
#
# Author: CrisisCore-Systems
# Version: 1.1.0

# [Module code follows...]

Real-World Benefits

CompressKit's modular architecture provides tangible benefits:

1. Maintainability

  • Before: Changing compression logic required editing 2000-line script
  • After: Edit only lib/compress.sh, ~200 lines

2. Testing

  • Before: Testing required running entire script with specific conditions
  • After: Each module has isolated tests, can run in seconds

3. Reusability

  • Before: Code duplication across multiple scripts
  • After: Shared modules used across compresskit and compresskit-pdf

4. Collaboration

  • Before: Merge conflicts on every change
  • After: Developers work on separate modules independently

5. Extensibility

  • Before: Adding features broke existing functionality
  • After: New modules integrate cleanly without affecting existing code

Migration Strategy

If you have an existing monolithic script, here's how to modularize it:

Step 1: Identify Functional Areas

Analyze your script to identify distinct responsibilities:

  • User interface
  • Business logic
  • Data access
  • Utilities
  • Configuration

Step 2: Extract Pure Functions First

Start with functions that don't depend on global state:

# Before: In monolithic script
calculate_compression_ratio() {
    local original_size="$1"
    local compressed_size="$2"
    echo $(( (original_size - compressed_size) * 100 / original_size ))
}

# After: In lib/utils.sh
calculate_compression_ratio() {
    local original_size="$1"
    local compressed_size="$2"
    
    # Input validation
    if [ "$original_size" -le 0 ]; then
        return 1
    fi
    
    echo $(( (original_size - compressed_size) * 100 / original_size ))
}

Step 3: Create Service Modules

Extract related functions into service modules:

# lib/file_service.sh
get_file_size() { ... }
validate_file_exists() { ... }
copy_file_safely() { ... }

Step 4: Implement Configuration Module

Centralize all configuration:

# Before: Scattered throughout script
OUTPUT_DIR="/tmp/output"
QUALITY="medium"
# ...more globals...

# After: In lib/config.sh
declare -A CONFIG=(
    ["output_dir"]="/tmp/output"
    ["quality"]="medium"
)

Step 5: Refactor Entry Point

Simplify the main script:

#!/bin/bash
# After refactoring

source "./lib/config.sh"
source "./lib/logger.sh"
source "./lib/compress.sh"

main() {
    init_system || exit 1
    compress_workflow "$@"
}

main "$@"

Best Practices Summary

  1. One Module, One Responsibility: Each module should do one thing well

  2. Explicit Dependencies: Use source at the top of files to declare dependencies

  3. Clear APIs: Define public functions clearly, prefix internal functions with _

  4. Consistent Naming: Use consistent naming conventions across modules

  5. Documentation: Every module should have a header comment explaining its purpose

  6. Error Handling: Return error codes, don't exit from library functions

  7. Testing: Write tests for each module

  8. Versioning: Track module versions for compatibility

  9. Security: Implement security at the foundation layer

  10. Performance: Profile and optimize hot paths

Conclusion

Modular architecture transforms shell scripts from throwaway code into maintainable, professional software. CompressKit demonstrates that with proper structure, Bash scripts can rival the organization and quality of applications written in any language.

The patterns and practices presented here are battle-tested and production-ready. By applying them to your own scripts, you can:

  • Reduce maintenance burden
  • Improve code quality
  • Enable better collaboration
  • Facilitate testing
  • Support long-term evolution

Remember: Good architecture is not about the language—it's about the principles.

Start small, refactor incrementally, and watch your shell scripts transform from scripts into software.


Resources:

About the Author:
CrisisCore-Systems advocates for software engineering excellence in all languages and environments. We believe that with proper design, any code can be elegant, maintainable, and professional.