Bash Cheat Sheet

Practical snippets for writing robust shell scripts and maintaining a productive Bash environment.

Versions: Bash 4.4+. declare -A (associative arrays) and some string ops require Bash 4+; macOS ships Bash 3.2 by default - install via Homebrew (brew install bash). bats-core testing examples use bats-core 1.10+.

Last reviewed: May 2026

Script Boilerplate

Strict mode ✅ Use at the top of every script

Always start scripts with this. set -e exits on error, set -u treats unset variables as errors, set -o pipefail propagates failures through pipelines, and set -x prints each command before running it (useful for debugging - remove for production).

Bash

#!/usr/bin/env bash
set -euo pipefail
# set -x   # uncomment for debug tracing
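
To see what pipefail changes, the sketch below runs the same failing pipeline with the option off and then on. The variable names are illustrative only, and the status is read through `if` so the demo behaves the same whether or not set -e is active.

```shell
#!/usr/bin/env bash
# Demo only: compare a pipeline's exit status with pipefail off and on.

set +o pipefail
if false | true; then without_pipefail=0; else without_pipefail=$?; fi

set -o pipefail
if false | true; then with_pipefail=0; else with_pipefail=$?; fi

echo "without pipefail: $without_pipefail"   # 0 - the failure of `false` is hidden
echo "with pipefail:    $with_pipefail"      # 1 - the failure propagates
```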

Run as root, re-exec with sudo if needed

Bash

#!/usr/bin/env bash
[[ "$(whoami)" == root ]] || { sudo "$0" "$@"; exit $?; }

Colour Output Helpers

Drop these into your script or .bashrc for consistent coloured output.

Bash

print_success() {
  local lightcyan='\033[1;36m' nocolor='\033[0m'
  echo -e "${lightcyan}$1${nocolor}"
}
 
print_error() {
  local lightred='\033[1;31m' nocolor='\033[0m'
  echo -e "${lightred}$1${nocolor}"
}
 
print_alert() {
  local yellow='\033[1;33m' nocolor='\033[0m'
  echo -e "${yellow}$1${nocolor}"
}

Logging

Bash has no built-in logger - everything is echo plus convention. The minimum bar in any script you’d run in CI or in production: levels, timestamps, output to stderr for anything that isn’t structured data on stdout, and a single env var to control verbosity.

Leveled log functions

Bash

#!/usr/bin/env bash
# Source this once at the top of your script.
 
# LOG_LEVEL: DEBUG < INFO < WARN < ERROR (default INFO)
LOG_LEVEL="${LOG_LEVEL:-INFO}"
 
# Map level names to numeric weights for comparison.
declare -A _LOG_WEIGHTS=( [DEBUG]=10 [INFO]=20 [WARN]=30 [ERROR]=40 )
 
_log() {
  local level="$1"; shift
  local msg="$*"
  local want="${_LOG_WEIGHTS[$LOG_LEVEL]:-20}"
  local have="${_LOG_WEIGHTS[$level]:-20}"
  (( have < want )) && return 0
 
  # ISO-8601 UTC timestamp, level-padded, write to stderr so stdout stays for data.
  printf '%s  %-5s  %s\n' \
    "$(date -u +'%Y-%m-%dT%H:%M:%SZ')" "$level" "$msg" >&2
}
 
log_debug() { _log DEBUG "$@"; }
log_info()  { _log INFO  "$@"; }
log_warn()  { _log WARN  "$@"; }
log_error() { _log ERROR "$@"; }
 
# Usage
log_info  "Starting deploy to $env"
log_warn  "Soft-delete disabled on vault $vault_name"
log_error "Apply failed with exit code $rc"
LOG_LEVEL=DEBUG ./deploy.sh   # turn on verbose logging

Why stderr?

Anything diagnostic goes to stderr (>&2); stdout is reserved for data the caller wants to consume. This keeps result=$(my_script.sh) clean even when the script logs heavily, and CI runners still surface both streams in their UI.

Bash

# Right - log to stderr, return the secret value on stdout
get_vault_secret() {
  log_info "Fetching secret $1"
  az keyvault secret show --vault-name "$kv" --name "$1" --query value -o tsv
}
 
secret=$(get_vault_secret "db-password")   # secret captured, logs printed live
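
The same separation can be checked without any cloud CLI. This is a minimal sketch: log_info is a one-line stand-in for the leveled logger above, and get_answer is a made-up function.

```shell
#!/usr/bin/env bash
# Self-contained demo; log_info here is a stand-in, get_answer is hypothetical.
log_info() { printf 'INFO  %s\n' "$*" >&2; }

get_answer() {
  log_info "computing answer"   # diagnostic -> stderr
  echo 42                       # data -> stdout
}

result=$(get_answer 2>/dev/null)      # only stdout is captured
logs=$(get_answer 2>&1 >/dev/null)    # capture stderr instead, discard stdout

echo "result=$result"
echo "logs=$logs"
```

The capture in result contains only the data; the log line never reaches it.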

Structured (JSON) logging for log shippers

If a log shipper (Fluent Bit, Vector, Loki agent) is parsing your output, emit JSON lines instead of free text. One line per event keeps it grep-friendly and machine-parseable.

Bash

log_json() {
  local level="$1"; shift
  local msg="$*"
  # Use jq to escape values correctly - never string-concat into JSON.
  # Note: %3N (milliseconds) below is a GNU date extension; drop it on macOS/BSD.
  jq -cn \
    --arg ts    "$(date -u +'%Y-%m-%dT%H:%M:%S.%3NZ')" \
    --arg level "$level" \
    --arg msg   "$msg" \
    --arg host  "$(hostname)" \
    --arg pid   "$$" \
    '{ts: $ts, level: $level, host: $host, pid: ($pid|tonumber), msg: $msg}' >&2
}
 
log_json INFO  "deploy started"
log_json ERROR "apply failed: $rc"

Add context fields with extra --arg pairs and a richer jq template - e.g. correlation_id, env, module. Stick to one event per line.
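
A possible shape for those extra fields, as a sketch only: log_json_ctx, CORRELATION_ID and DEPLOY_ENV are names invented for this example, and the timestamp omits milliseconds so it also works with non-GNU date. It assumes jq is installed.

```shell
#!/usr/bin/env bash
# Sketch: JSON log line with two extra context fields.
# log_json_ctx, CORRELATION_ID and DEPLOY_ENV are hypothetical names.
log_json_ctx() {
  local level="$1"; shift
  jq -cn \
    --arg ts          "$(date -u +'%Y-%m-%dT%H:%M:%SZ')" \
    --arg level       "$level" \
    --arg msg         "$*" \
    --arg cid         "${CORRELATION_ID:-}" \
    --arg environment "${DEPLOY_ENV:-dev}" \
    '{ts: $ts, level: $level, correlation_id: $cid, env: $environment, msg: $msg}' >&2
}

# Usage (one event per line on stderr):
#   CORRELATION_ID=req-123 DEPLOY_ENV=prod log_json_ctx INFO "deploy started"
```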

`trap` for error context

Capture line number and command on unhandled errors so you don’t have to guess where the script died.

Bash

#!/usr/bin/env bash
set -Eeuo pipefail
 
on_err() {
  local exit_code=$?
  log_error "command '$BASH_COMMAND' failed at line $1 (exit=$exit_code)"
  exit "$exit_code"
}
trap 'on_err $LINENO' ERR

set -E is what makes ERR traps inherit into functions and subshells - skip it and the trap won’t fire from inside helpers.
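
To watch the trap fire from inside a helper, the sketch below runs a tiny script in a child bash so the failure does not end the current shell. The helper name fetch and the stub log_error are made up for the demo.

```shell
#!/usr/bin/env bash
# Sketch: ERR trap firing for a failure inside a function (needs set -E).
# `fetch` and the stub `log_error` are hypothetical.
demo_script='
set -Eeuo pipefail
log_error() { printf "ERROR %s\n" "$*" >&2; }
on_err() {
  local exit_code=$?
  log_error "command \"$BASH_COMMAND\" failed at line $1 (exit=$exit_code)"
  exit "$exit_code"
}
trap "on_err \$LINENO" ERR
fetch() { false; }   # fails inside a function
fetch
echo "not reached"
'

trap_status=0
trap_output=$(bash -c "$demo_script" 2>&1) || trap_status=$?

echo "status=$trap_status"
echo "$trap_output"
```

The child exits with the failing command's status and the message names the command that failed; the final echo in the child never runs.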

Sensible defaults checklist

set -Eeuo pipefail at the top of every script.
All logging goes to >&2; stdout is reserved for data.
Timestamp every line in UTC (date -u +'%Y-%m-%dT%H:%M:%SZ').
LOG_LEVEL env var to flip verbose mode without code changes.
One JSON object per line if anything will parse the output.
An ERR trap that prints the failed command + line number.

String Utilities

Bash

title_case_convert() {
  sed 's/.*/\L&/; s/[a-z]*/\u&/g' <<<"$1"
}
 
upper_case_convert() {
  sed -e 's/\(.*\)/\U\1/' <<< "$1"
}
 
lower_case_convert() {
  sed -e 's/\(.*\)/\L\1/' <<< "$1"
}
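
The \L, \U and \u escapes above are GNU sed extensions. Where GNU sed is not available, Bash 4+ case-modification expansions can do the same job. A minimal sketch follows; the *_pe function names are invented here, and title_case_pe collapses runs of whitespace, which the sed version does not.

```shell
#!/usr/bin/env bash
# Sketch: pure-Bash (4.0+) case conversion via parameter expansion.
# The *_pe names are hypothetical alternatives to the sed-based helpers.
upper_case_pe() { printf '%s\n' "${1^^}"; }
lower_case_pe() { printf '%s\n' "${1,,}"; }

title_case_pe() {
  local -a words
  local word result=""
  # Lowercase everything, split on whitespace, then capitalise each first letter.
  read -r -a words <<< "${1,,}"
  for word in "${words[@]}"; do
    result+="${result:+ }${word^}"
  done
  printf '%s\n' "$result"
}

upper_case_pe "hello"          # HELLO
title_case_pe "hello WORLD"    # Hello World
```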

Argument Parsing

Positional arguments with defaults

Bash

#!/usr/bin/env bash
full_name="${1:-Craig Thacker}"
email_address="${2:-user@example.com}"
 
echo "Name:  $full_name"
echo "Email: $email_address"

Named flags with `while` / `case`

Bash

#!/usr/bin/env bash
name=""
env="dev"
debug=false
 
while [[ $# -gt 0 ]]; do
  case $1 in
    --name)    name="$2";  shift 2 ;;
    --env)     env="$2";   shift 2 ;;
    --debug)   debug=true; shift   ;;
    *) echo "Unknown flag: $1" >&2; exit 1 ;;
  esac
done
 
echo "name=$name env=$env debug=$debug"

PATH Management

Remove duplicate entries from PATH

Keeps the first occurrence of each path and discards any subsequent duplicates.

Bash

PATH=$(printf %s "$PATH" | awk -vRS=: -vORS= '!a[$0]++ {if (NR>1) printf(":"); printf("%s", $0)}')
export PATH
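
The same first-occurrence-wins rule can be written without awk, which makes it easy to test on a sample string before touching the real PATH. This is a sketch for Bash 4+; dedupe_path is a made-up name, and unlike the awk one-liner it also drops empty entries.

```shell
#!/usr/bin/env bash
# Sketch: de-duplicate a colon-separated list, keeping the first occurrence.
# dedupe_path is a hypothetical helper; requires Bash 4+ (associative arrays).
dedupe_path() {
  local -A seen=()
  local -a entries
  local entry result=""
  IFS=: read -r -a entries <<< "$1"
  for entry in "${entries[@]}"; do
    # Skip empty entries and anything already emitted.
    [[ -n "$entry" && -z "${seen[$entry]:-}" ]] || continue
    seen[$entry]=1
    result+="${result:+:}$entry"
  done
  printf '%s\n' "$result"
}

dedupe_path "/usr/bin:/bin:/usr/bin:/opt/x:/bin"   # /usr/bin:/bin:/opt/x
# To apply it: PATH=$(dedupe_path "$PATH"); export PATH
```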

Directory Iteration

Run a script in every subdirectory (1 level deep)

Bash

#!/usr/bin/env bash
set -euo pipefail
 
START=$(pwd)
 
while IFS= read -r dir; do
  cd "$dir"
  ./my-script.sh || echo "Script not in $dir, skipping"
  cd "$START"
done < <(find . -maxdepth 1 -mindepth 1 -type d)

Sort and format all Terraform modules in a directory

Runs terraform-sort.sh (from the Terraform Standards) across every terraform-<provider>-* module directory. Git releases are handled per-module in CI, not in bulk scripts.

Bash

#!/usr/bin/env bash
set -euo pipefail
 
back=$(pwd)
provider="${1:-azurerm}"
location="${2:-.}"
sort_script="${TERRAFORM_SORT_SCRIPT:-./terraform-sort.sh}"
# Resolve a relative script path now, because the loop changes directory.
[[ "$sort_script" == /* ]] || sort_script="${back}/${sort_script}"
 
# Read one directory per line so paths containing spaces are not word-split.
while IFS= read -r dir; do
  cd "${dir}"
  echo "Processing: ${dir}"
  if [[ -x "$sort_script" ]]; then
    "$sort_script" --sort-variables --sort-outputs --format-terraform --generate-readme
  else
    # Fallback: fmt only when sort script not present
    terraform fmt -recursive
  fi
  cd "${back}"
done < <(find "${location}" -maxdepth 1 -name "terraform-${provider}-*" -type d | sort)

Terraform Module Sort Helper 🏠 Personal workflow helper

Appends a reusable tfsort function to .bashrc that sorts variable and output blocks, runs terraform fmt, and regenerates README.md via terraform-docs. Git operations are handled separately - see Terraform Standards - Release workflow.

Bash

#!/usr/bin/env bash
 
tfsort() {
  # Requires: terraform, terraform-docs, gawk
  # Usage: tfsort [--readme-header-file HEADER.md]
  local readme_header_file=""
 
  while [[ $# -gt 0 ]]; do
    case "$1" in
      --readme-header-file) readme_header_file="$2"; shift 2 ;;
      *) echo "Unknown option: $1" >&2; return 1 ;;
    esac
  done
 
  # Verify required tools
  for tool in terraform terraform-docs gawk; do
    command -v "$tool" &>/dev/null || { echo "ERROR: $tool not found in PATH" >&2; return 1; }
  done
 
  [[ -f main.tf || -f build.tf ]] \
    || { echo "ERROR: no main.tf or build.tf found - not a module root" >&2; return 1; }
 
  echo ">> terraform fmt"
  terraform fmt -recursive
 
  # Sort variable blocks
  if [[ -f variables.tf ]]; then
    echo ">> sorting variables.tf"
    _tf_sort_blocks variables.tf variable
  fi
 
  # Sort output blocks
  if [[ -f outputs.tf ]]; then
    echo ">> sorting outputs.tf"
    _tf_sort_blocks outputs.tf output
  fi
 
  # Write README header if provided, then inject terraform-docs
  if [[ -n "$readme_header_file" ]]; then
    [[ -f "$readme_header_file" ]] \
      || { echo "ERROR: header file not found: $readme_header_file" >&2; return 1; }
    cat "$readme_header_file" > README.md
    printf '\n\n<!-- BEGIN_TF_DOCS -->\n<!-- END_TF_DOCS -->\n' >> README.md
  fi
 
  if [[ -f .terraform-docs.yml ]]; then
    echo ">> terraform-docs . (using .terraform-docs.yml)"
    terraform-docs .
  else
    [[ -f README.md ]] \
      || printf '<!-- BEGIN_TF_DOCS -->\n<!-- END_TF_DOCS -->\n' > README.md
    echo ">> terraform-docs markdown table --output-mode inject"
    terraform-docs markdown table --output-file README.md --output-mode inject .
  fi
 
  echo "Done."
}
 
# Helper: sort variable or output blocks in a file using gawk brace-depth counting
_tf_sort_blocks() {
  local file="$1" keyword="$2" tmp_dir count=0
 
  tmp_dir="$(mktemp -d)"
 
  gawk -v kw="$keyword" -v tmpdir="$tmp_dir" '
  BEGIN { in_block=0; depth=0; block=""; name="" }
  !in_block && match($0, "^" kw " \"([^\"]+)\" \\{", arr) {
    name=arr[1]; in_block=1; depth=0; block=$0"\n"
    for(i=1;i<=length($0);i++){c=substr($0,i,1);if(c=="{")depth++;else if(c=="}")depth--}
    if(depth==0){sub(/\n$/,"",block);print block>(tmpdir"/"name".tf");close(tmpdir"/"name".tf");in_block=0;block="";name=""}
    next
  }
  in_block {
    block=block $0"\n"
    for(i=1;i<=length($0);i++){c=substr($0,i,1);if(c=="{")depth++;else if(c=="}")depth--}
    if(depth==0){sub(/\n$/,"",block);print block>(tmpdir"/"name".tf");close(tmpdir"/"name".tf");in_block=0;block="";name=""}
  }
  ' "$file"
 
  local output="" first=true
  while IFS= read -r -d $'\0' bf; do
    $first && output="$(cat "$bf")" && first=false \
      || output="$output"$'\n\n'"$(cat "$bf")"
    count=$((count + 1))
  done < <(find "$tmp_dir" -name '*.tf' -print0 | sort -z)
 
  rm -rf "$tmp_dir"
  [[ $count -gt 0 ]] && printf '%s\n' "$output" > "$file" \
    || echo "WARNING: no $keyword blocks found in $file"
}
 
# Append both functions to .bashrc
echo "" >> ~/.bashrc
echo "# Terraform sort helper" >> ~/.bashrc
declare -f _tf_sort_blocks >> ~/.bashrc
declare -f tfsort >> ~/.bashrc
echo "tfsort appended to ~/.bashrc"

See also: Terraform Standards - Module maintenance scripts for the full PowerShell, Bash, and Python implementations. Git - Tags for the tagging commands that trigger a module registry release.

Testing with bats

🛠️ Deeper reference - covers install, test structure, assertions, setup/teardown, command mocking via PATH override, and CI integration. For quick test execution, jump to “Running tests”.

bats-core (Bash Automated Testing System) is the de-facto unit test runner for bash. Tests are plain .bats files that look like bash with @test blocks. Companion libraries (bats-assert, bats-support, bats-file) give you readable assertions.

Install

Bash

# Linux (system-wide)
git clone https://github.com/bats-core/bats-core.git
cd bats-core
sudo ./install.sh /usr/local
 
# macOS
brew install bats-core
 
# Per-project (recommended for CI reproducibility) - vendor as git submodules
mkdir -p test/test_helper
git submodule add https://github.com/bats-core/bats-core.git           test/bats
git submodule add https://github.com/bats-core/bats-support.git        test/test_helper/bats-support
git submodule add https://github.com/bats-core/bats-assert.git         test/test_helper/bats-assert
git submodule add https://github.com/bats-core/bats-file.git           test/test_helper/bats-file

Test file structure

Bash

#!/usr/bin/env bats
# test/string_utils.bats
 
setup() {
  load 'test_helper/bats-support/load'
  load 'test_helper/bats-assert/load'
  load 'test_helper/bats-file/load'
 
  # Source the script under test
  PROJECT_ROOT="$(cd "$(dirname "$BATS_TEST_FILENAME")/.." && pwd)"
  source "$PROJECT_ROOT/lib/string_utils.sh"
}
 
@test "upper_case_convert: lowercase input -> uppercase" {
  run upper_case_convert "hello"
  assert_success
  assert_output "HELLO"
}
 
@test "upper_case_convert: already uppercase is unchanged" {
  run upper_case_convert "FOO"
  assert_output "FOO"
}
 
@test "title_case_convert: handles multiple words" {
  run title_case_convert "hello world"
  assert_output "Hello World"
}
 
@test "non-existent function fails cleanly" {
  run not_a_real_function
  assert_failure
}

Useful assertions

Bash

@test "assertion examples" {
  run echo "hello world"
 
  assert_success                       # exit code 0
  assert_failure                       # non-zero exit
  assert_failure 2                     # specific exit code
 
  assert_output "hello world"          # exact match
  assert_output --partial "hello"      # substring
  assert_output --regexp '^hello.*$'   # regex
 
  refute_output "goodbye"              # negative match
  assert_line --index 0 "hello world"  # specific line in multi-line output
 
  # File assertions (from bats-file)
  assert_file_exist  /etc/hostname
  assert_file_executable ./deploy.sh
  assert_file_contains /etc/hostname "expected"
}

`setup` / `teardown` / `setup_file` / `teardown_file`

Bash

setup_file() {
  # Once per file - heavy setup goes here
  export FIXTURE_DIR="$(mktemp -d)"
  cp -r "$BATS_TEST_DIRNAME"/fixtures/* "$FIXTURE_DIR"
}
 
teardown_file() {
  rm -rf "$FIXTURE_DIR"
}
 
setup() {
  # Per test - light/cheap setup
  cd "$BATS_TEST_TMPDIR"   # fresh temp dir per test, auto-cleaned
}
 
teardown() {
  # Per test - rarely needed; $BATS_TEST_TMPDIR is auto-removed
  :
}

Mocking external commands - PATH override

Bash has no built-in mocking. The standard trick: prepend a temp directory to PATH containing fake executables that print what you want.

Bash

@test "deploy.sh calls terraform apply with -auto-approve" {
  local mock_dir; mock_dir=$(mktemp -d)
 
  # Fake `terraform` that records its args and exits 0
  cat > "$mock_dir/terraform" <<'EOF'
#!/usr/bin/env bash
echo "$@" >> "$MOCK_CALLS"
EOF
  chmod +x "$mock_dir/terraform"
 
  export MOCK_CALLS="$BATS_TEST_TMPDIR/calls.log"
  : > "$MOCK_CALLS"
  PATH="$mock_dir:$PATH" run ./deploy.sh prod
 
  assert_success
  assert_file_contains "$MOCK_CALLS" "apply -auto-approve"
}
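
The PATH-override trick itself does not depend on bats. The sketch below shows it in plain Bash; deploy is a hypothetical stand-in for the script under test, and the subshell keeps the modified PATH from leaking into the caller.

```shell
#!/usr/bin/env bash
# Sketch of the PATH-override trick outside bats.
# `deploy` is a hypothetical stand-in for the code under test.
mock_dir=$(mktemp -d)

# Fake `terraform` that appends its arguments to the file named by $MOCK_CALLS.
cat > "$mock_dir/terraform" <<'EOF'
#!/usr/bin/env bash
echo "$@" >> "$MOCK_CALLS"
EOF
chmod +x "$mock_dir/terraform"

deploy() { terraform apply -auto-approve "-var=env=$1"; }

# Run in a subshell so the PATH change does not leak into the caller.
(
  export MOCK_CALLS="$mock_dir/calls.log"
  PATH="$mock_dir:$PATH"
  deploy prod
)

recorded=$(cat "$mock_dir/calls.log")
rm -rf "$mock_dir"
echo "recorded call: $recorded"
```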

Mocking shell functions - override after sourcing

If the script under test calls a helper function (not an external binary), redefine the function in the test after sourcing.

Bash

@test "deploy aborts if precheck fails" {
  source ./deploy.sh
 
  # Override the precheck function
  run_precheck() { return 1; }
  export -f run_precheck
 
  run deploy_main
  assert_failure
  assert_output --partial "precheck failed"
}

Running tests

Bash

# Run a single file
./test/bats/bin/bats test/string_utils.bats
 
# Run the whole suite
./test/bats/bin/bats test/
 
# Parallel: up to 4 jobs (requires GNU parallel)
./test/bats/bin/bats --jobs 4 test/
 
# JUnit output for CI (the output directory must already exist)
mkdir -p test-results
./test/bats/bin/bats --report-formatter junit -o test-results test/
 
# TAP output (the default when stdout is not a terminal; CI runners often parse this natively)
./test/bats/bin/bats --formatter tap test/

GitHub Actions example

YAML

# .github/workflows/test.yml
- uses: actions/checkout@v4
  with:
    submodules: recursive   # pull in bats + helpers
- name: Run bats tests
  run: mkdir -p test-results && ./test/bats/bin/bats --report-formatter junit -o test-results test/
- uses: dorny/test-reporter@v1
  if: always()
  with:
    name: bats
    path: test-results/*.xml
    reporter: java-junit

Coverage with `kcov`

bats integrates with kcov for line-level coverage. Install kcov (apt install kcov on Ubuntu) then:

Bash

mkdir -p coverage
kcov --include-pattern=lib/ coverage ./test/bats/bin/bats test/
# coverage/index.html now has the report

Sensible defaults checklist

One .bats file per script/module under test, named <thing>.bats.
setup() sources the script under test; never run installation steps inside a test.
Use run <command> then assert_* - direct invocation breaks status capture.
One assertion per behavioural rule; avoid mega-tests that fail for unclear reasons.
Mock external commands via PATH override; never call real terraform/az/kubectl from a unit test.
Run tests from $BATS_TEST_TMPDIR (auto-cleaned; cd into it in setup()) - never write to the working tree.

Process Management & Health Checks

Monitor a process, restart if dead

Bash

#!/usr/bin/env bash
# Ensure a service is always running; restart if it exits
 
SERVICE_PID_FILE="/var/run/my-service.pid"
SERVICE_CMD=(python3 /opt/service/main.py)   # array, so each argument stays intact
LOG_FILE="/var/log/my-service.log"
 
check_and_restart() {
  if [[ -f "$SERVICE_PID_FILE" ]]; then
    # Declare and assign separately so a failing cat is not masked by `local`.
    local pid
    pid=$(cat "$SERVICE_PID_FILE")
    if ! kill -0 "$pid" 2>/dev/null; then
      echo "Process $pid is dead, restarting..." | tee -a "$LOG_FILE"
      rm -f "$SERVICE_PID_FILE"
      start_service
    fi
  else
    start_service
  fi
}
 
start_service() {
  echo "Starting service..." | tee -a "$LOG_FILE"
  "${SERVICE_CMD[@]}" >> "$LOG_FILE" 2>&1 &
  echo $! > "$SERVICE_PID_FILE"
}
 
# Run every 30 seconds
while true; do
  check_and_restart
  sleep 30
done

Health check with retry logic

Bash

#!/usr/bin/env bash
# Check if a service is healthy; retry with exponential backoff
 
health_check() {
  local endpoint="$1"
  local max_retries=5
  local retry=0
  local backoff=1
 
  while (( retry < max_retries )); do
    if curl -sf --max-time 5 "$endpoint" > /dev/null; then
      echo "✓ Health check passed"
      return 0
    fi
        
    echo "⚠ Health check failed (attempt $((retry + 1))/$max_retries), retrying in ${backoff}s..."
    sleep "$backoff"
    backoff=$((backoff * 2))  # exponential backoff
    retry=$((retry + 1))   # ((retry++)) returns status 1 when retry is 0, which trips set -e
  done
    
  echo "✗ Health check failed after $max_retries attempts"
  return 1
}
 
# Usage
health_check "http://localhost:8080/health" || exit 1

Networking & Service Discovery

Wait for a port to be ready

Bash

#!/usr/bin/env bash
# Block until a service is listening on a port
 
wait_for_port() {
  local host="$1"
  local port="$2"
  local timeout="${3:-30}"
  local elapsed=0
    
  echo "Waiting for $host:$port..."
    
  while (( elapsed < timeout )); do
    if nc -z "$host" "$port" 2>/dev/null; then
      echo "✓ $host:$port is ready"
      return 0
    fi
        
    echo "  Not ready yet, retrying... (${elapsed}s/$timeout)"
    sleep 2
    elapsed=$((elapsed + 2))
  done
    
  echo "✗ Timeout waiting for $host:$port"
  return 1
}
 
# Usage
wait_for_port localhost 5432  # PostgreSQL
wait_for_port redis-server 6379  # Redis

Resolve service from DNS

Bash

#!/usr/bin/env bash
# Service discovery - resolve a hostname to IPs
 
get_service_ips() {
  local service_name="$1"
    
  # getent resolves through NSS (typically /etc/hosts and DNS); standard on Linux, absent on macOS
  getent hosts "$service_name" | awk '{print $1}' | sort -u
}
 
# Pick one of the resolved IPs
load_balance_to_service() {
  local service="$1"
  local -a ips
  mapfile -t ips < <(get_service_ips "$service")
    
  if (( ${#ips[@]} == 0 )); then
    # stderr, so the message is not captured as if it were an IP
    echo "Error: No IPs found for $service" >&2
    return 1
  fi
    
  # Random selection, not true round-robin (no state is kept between calls)
  local idx=$((RANDOM % ${#ips[@]}))
  echo "${ips[$idx]}"
}
 
# Usage
SELECTED_IP=$(load_balance_to_service "api-service") || exit 1
curl "http://$SELECTED_IP:8080/health"

Cloud CLI Patterns (Azure / AWS)

Azure CLI with managed identity

Bash

#!/usr/bin/env bash
# Authenticate using managed identity (no secrets in code)
 
# On an Azure resource with a managed identity assigned (for example a VM),
# sign in with that identity explicitly; the CLI does not do this on its own:
az login --identity
 
# Run commands
az vm list-ip-addresses --subscription "my-subscription"
az keyvault secret show --vault-name "my-vault" --name "database-password" --query value -o tsv
 
# Retry pattern for transient failures
retry_az() {
  local cmd=("$@")
  local max_retries=3
  local retry=0
    
  while (( retry < max_retries )); do
    if "${cmd[@]}"; then
      return 0
    fi
        
    echo "Command failed, retrying... ($((retry + 1))/$max_retries)"
    sleep $((2 ** retry))  # exponential backoff
    retry=$((retry + 1))   # ((retry++)) returns status 1 when retry is 0, which trips set -e
  done
    
  echo "Command failed after $max_retries attempts"
  return 1
}
 
# Usage
retry_az az deployment group create \
  --resource-group "my-rg" \
  --template-file "template.json" \
  --parameters "environment=prod"

Batch operations with Azure CLI

Bash

#!/usr/bin/env bash
# Apply operations to multiple resources
 
apply_to_resources() {
  local resource_group="$1"
  local resource_type="$2"
  shift 2
  local update_args=("$@")   # e.g. --set tags.env=prod
 
  # List all resources of the given type, one name per line.
  while IFS= read -r resource; do
    [[ -z "$resource" ]] && continue
    echo "Updating $resource..."
    # Use the generic `az resource update`, which works for any resource type.
    az resource update \
      --resource-group "$resource_group" \
      --resource-type "$resource_type" \
      --name "$resource" \
      "${update_args[@]}"
  done < <(az resource list \
    --resource-group "$resource_group" \
    --resource-type "$resource_type" \
    --query "[].name" -o tsv)
}
 
# Usage: apply_to_resources "my-rg" "Microsoft.Storage/storageAccounts" --set tags.env=prod

Container & Kubernetes Helpers

Kubernetes port-forward with automatic cleanup

Bash

#!/usr/bin/env bash
# Forward a pod's port; clean up on exit
 
port_forward_pod() {
  local pod="$1"
  local local_port="$2"
  local remote_port="${3:-$local_port}"
  local namespace="${4:-default}"
    
  echo "Port-forwarding: localhost:$local_port -> $pod:$remote_port"
  kubectl port-forward -n "$namespace" "pod/$pod" "$local_port:$remote_port" &
  local pf_pid=$!
    
  # Expand the PID into the trap now: the EXIT trap can fire after this
  # function's locals have gone out of scope.
  trap "echo 'Killing port-forward (PID: $pf_pid)'; kill $pf_pid 2>/dev/null || true" EXIT
  trap 'exit 130' INT
  trap 'exit 143' TERM
    
  wait "$pf_pid"
}
 
# Usage
port_forward_pod "api-server-abc123" 8080 8080 "production"

Wait for Kubernetes deployment to be ready

Bash

#!/usr/bin/env bash
# Block until a deployment has all replicas ready
 
wait_for_deployment() {
  local deployment="$1"
  local namespace="${2:-default}"
  local timeout="${3:-300}"
  local elapsed=0
    
  echo "Waiting for deployment/$deployment to be ready..."
    
  while (( elapsed < timeout )); do
    local ready=$(kubectl get deployment "$deployment" -n "$namespace" \
      -o jsonpath='{.status.conditions[?(@.type=="Available")].status}')
        
    if [[ "$ready" == "True" ]]; then
      echo "✓ Deployment ready"
      return 0
    fi
        
    echo "  Waiting... (${elapsed}s/$timeout)"
    sleep 5
    elapsed=$((elapsed + 5))
  done
    
  echo "✗ Deployment did not become ready within $timeout seconds"
  kubectl describe deployment "$deployment" -n "$namespace"
  return 1
}
 
# Usage
wait_for_deployment "api-service" "production" 300

Deployment & Rollback Patterns

Blue-green deployment with health check

Bash

#!/usr/bin/env bash
# Deploy to inactive slot, verify health, then swap
 
blue_green_deploy() {
  local app_name="$1"
  local resource_group="$2"
  local new_version="$3"
  local health_endpoint="$4"
 
  # Pattern: always deploy to the "staging" slot, verify it, then swap into production.
  local staging_slot="staging"
 
  echo "Deploying $new_version to the $staging_slot slot of $app_name..."
 
  # Create the staging slot if it does not already exist (idempotent).
  az webapp deployment slot create \
    --name "$app_name" --resource-group "$resource_group" \
    --slot "$staging_slot" --configuration-source "$app_name" 2>/dev/null || true
 
  # Point the staging slot at the new container image (adjust for your deploy method).
  az webapp config container set \
    --name "$app_name" --resource-group "$resource_group" --slot "$staging_slot" \
    --container-image-name "myregistry.azurecr.io/api:$new_version"
 
  # Health check the staging slot before swapping.
  local staging_url="https://${app_name}-${staging_slot}.azurewebsites.net${health_endpoint}"
  if ! curl -sf --max-time 10 "$staging_url" > /dev/null; then
    echo "✗ Health check failed on $staging_slot, aborting swap"
    return 1
  fi
 
  echo "✓ Health check passed, swapping $staging_slot into production..."
  az webapp deployment slot swap \
    --name "$app_name" --resource-group "$resource_group" \
    --slot "$staging_slot" --target-slot production
 
  echo "✓ Deployment complete"
}
 
# Usage
blue_green_deploy "my-api-app" "my-rg" "v2.1.0" "/health"

Rollback with version pinning

Bash

#!/usr/bin/env bash
# Rollback to previous version if current deployment fails
 
rollback_deployment() {
  local deployment="$1"
  local namespace="${2:-default}"
  local health_check_cmd="$3"
    
  echo "Checking current deployment health..."
    
  if ! eval "$health_check_cmd"; then
    echo "✗ Health check failed, rolling back..."
        
    # Rollback: revert the deployment to its previous revision
    kubectl rollout undo "deployment/$deployment" -n "$namespace"
        
    echo "Waiting for rollback to complete..."
    kubectl rollout status "deployment/$deployment" -n "$namespace"
        
    echo "✓ Rolled back to previous version"
    return 0
  fi
    
  echo "✓ Deployment healthy, no rollback needed"
  return 0
}
 
# Usage
rollback_deployment "api-service" "prod" \
  "curl -sf http://api-service:8080/health > /dev/null"

Quick One-Liners ⚡️

Bash

# Upgrade all outdated pip packages (Linux)
pip3 list -o | awk 'NR>2 {print $1}' | xargs -n1 pip3 install -U
 
# Find all files modified in the last 24 hours
find . -mtime -1 -type f
 
# Count lines in all .tf files (NUL-delimited, so odd filenames are safe)
find . -name "*.tf" -print0 | xargs -0 wc -l
 
# Recursively replace a string in all files (GNU grep, xargs and sed)
grep -rlZ 'old-string' . | xargs -0 -r sed -i 's/old-string/new-string/g'
 
# Show disk usage, sorted by size
du -sh */ | sort -rh | head -20
 
# Watch a command every 2 seconds
watch -n2 'kubectl get pods'

Anti-patterns

🚨 Unquoted variables - rm -rf $DIR with DIR="" expands to rm -rf (and then whatever follows). Always quote: rm -rf "$DIR". The same applies to $@ in argument passing - use "$@", not $@.
🚨 Parsing JSON, YAML, or XML with grep, sed, or awk - these tools operate on lines and break on multi-line values, escaped characters, and whitespace. Use jq for JSON, yq for YAML, xmllint for XML.
⚠️ Logging to stdout inside functions that may be captured - result=$(my_func) captures all stdout including log lines, silently mixing data with diagnostics. Log to stderr (>&2); keep stdout for data only.
⚠️ #!/bin/bash in shebangs - /bin/bash doesn’t exist on all systems (notably NixOS, some containers). Use #!/usr/bin/env bash for portability.
🚨 Skipping set -euo pipefail - without it, commands can fail silently mid-script and execution continues with corrupt state. Every script should start with set -Eeuo pipefail.
⚠️ ls output in scripts - filenames can contain spaces, newlines, and glob characters that break word-splitting. Use glob patterns (for f in *.tf) or find ... -print0 | xargs -0 instead.
⚠️ cat file | grep pattern - unnecessary fork. Use grep pattern file directly. The same applies to cat file | wc -l → wc -l < file.
🔬 Ignoring $? after external commands - with set -e, a failed command normally exits the script, but a command tested in a conditional (if cmd; then) or followed by && or || does not trigger the exit. Handle those failures explicitly: cmd || { log_error "cmd failed"; exit 1; }.
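
The unquoted-variable item is the easiest one to demonstrate. The sketch below counts how many arguments a command actually receives; count_args is a throwaway helper, and the default IFS is assumed.

```shell
#!/usr/bin/env bash
# Demo: how many arguments does a command receive, quoted vs unquoted?
# count_args is a hypothetical helper; default IFS assumed.
count_args() { echo "$#"; }

dir="my reports"
empty=""

unquoted=$(count_args $dir)          # 2 - split on the space
quoted=$(count_args "$dir")          # 1
unquoted_empty=$(count_args $empty)  # 0 - the argument vanishes entirely
quoted_empty=$(count_args "$empty")  # 1 - an empty string is still an argument

echo "unquoted=$unquoted quoted=$quoted"
echo "unquoted_empty=$unquoted_empty quoted_empty=$quoted_empty"
```

The vanishing argument in the empty case is what turns rm -rf $DIR/subdir into a command aimed at the wrong path.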
