Shell Scripting Essentials and Real-World Automation
Core Syntax Fundamentals
Annotations
Adding notes to scripts is crucial for maintenance. A single-line remark is prefixed with #. For multi-line commentary, you can utilize the null command : paired with a quoted block, or a here-doc redirected to a null command.
#!/usr/bin/env bash
# A single-line remark
: '
Multi-line remark approach one
Second line
Third line
'
: <<'remark_block'
Multi-line remark approach two
Second line
remark_block
echo "starts" # inline remark
Variables
Shell variables can hold strings, numbers, or arrays. An assignment must not have spaces around the equals sign.
- Quoting Differences: Single quotes preserve literal values, preventing variable expansion and command substitution. Double quotes allow expansion and substitution.
- Accessing Values: Prefix the variable name with $ or wrap it in ${} to clearly delineate boundaries.
msg='literal string'
user_name="expanded $msg"
host=server01
combined="${user_name}@${host}"
- Scope: By default, a variable is visible only in the current shell and is not inherited by child processes. To pass it to child processes as an environment variable, use the export command.
- Special Parameters: The shell provides built-in variables to handle script context.
echo "Executable: $0"
echo "First argument: $1"
echo "Argument count: $#"
echo "All arguments: $@"
echo "Last exit code: $?"
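The difference between "$@" and "$*" is easiest to see with an argument that contains a space. The following is a minimal sketch; show_args is a hypothetical helper and the argument values are illustrative only.

```shell
#!/usr/bin/env bash
# show_args is a hypothetical helper used only for illustration.
show_args() {
  local arg
  echo "count=$#"
  # "$@" keeps each argument as a separate word, even if it contains spaces.
  for arg in "$@"; do
    echo "arg=[$arg]"
  done
  # "$*" joins all arguments into one string, using the first character of IFS.
  echo "joined=[$*]"
}

show_args "disk one" sdb
```

Called this way, the function reports two arguments, with "disk one" kept intact as a single word.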
Arrays
Arrays can be populated by explicitly listing elements or assigning values to specific indices. Accessing an element requires the index, while ${#arr[@]} yields the total count.
fruits=("apple" "banana" "cherry")
fruits[0]="apricot"
echo "${fruits[1]}" # banana
echo "${#fruits[@]}" # 3
for item in "${fruits[@]}"; do
echo "$item"
done
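A few other array operations tend to come up in practice: appending, listing indices, slicing, and removing elements. The sketch below assumes Bash and reuses the fruits array from the example above.

```shell
#!/usr/bin/env bash
fruits=("apple" "banana" "cherry")

# Append an element without tracking the next index manually.
fruits+=("date")

# ${!fruits[@]} expands to the indices currently in use.
for idx in "${!fruits[@]}"; do
  echo "$idx: ${fruits[$idx]}"
done

# Slice: two elements starting at index 1.
middle=("${fruits[@]:1:2}")
echo "${middle[*]}"   # banana cherry

# Removing an element leaves a gap; the remaining indices are not renumbered.
unset 'fruits[1]'
echo "${#fruits[@]}"  # 3
echo "${!fruits[*]}"  # 0 2 3
```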
Standard I/O
The read builtin captures user input, supporting prompts via -p and timeouts via -t. The echo builtin outputs text, with -e enabling escape sequences like \n. Output can be redirected to files using > (overwrite) or >> (append).
read -r -p "Enter your name: " -t 5 username
echo "Welcome $username"
echo -e "Line one\nLine two"
echo "Log entry" >> activity.log
A common pattern for automating yes/no prompts is piping an affirmative response:
echo -e 'y\n' | mkfs.ext4 /dev/sdx
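The read builtin is also commonly used to process a file line by line. The sketch below is one way to do that; the sample file and its contents are assumptions made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical sample input; a real script would read an existing file.
sample_file=$(mktemp)
printf '%s\n' "first entry" "  second entry  " 'third\entry' > "$sample_file"

line_count=0
last_line=""
# IFS= preserves leading and trailing whitespace; -r keeps backslashes literal.
while IFS= read -r line; do
  line_count=$((line_count + 1))
  echo "${line_count}: [${line}]"
  last_line=$line
done < "$sample_file"

rm -f "$sample_file"
```

Without IFS= and -r, the padded second line would lose its surrounding spaces and the backslash in the third line would be consumed.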
Control Structures
Conditional Execution
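Branching logic uses if, elif, and else, terminated by fi. Tests are enclosed in single brackets [ ] or, in Bash, double brackets [[ ]], which add pattern and regular-expression matching.
# Take the second avg= value in the log (or the only one, if there is just one)
val=$(grep -oP 'avg=\K\d+\.\d+' "$logfile" | head -n 2 | tail -n 1)
if [[ -z "$val" ]]; then
converted=0
elif [[ "$val" =~ ^[0-9]+\.[0-9]+$ ]]; then
# Pass the value with -v instead of splicing it into the awk program text
converted=$(awk -v v="$val" 'BEGIN { print v / 1000 }')
else
echo "Unexpected state"
fi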
Pattern Matching
The case statement provides a cleaner mechanism for multi-branch logic based on string patterns.
case $action in
start) echo "Starting service";;
stop) echo "Stopping service";;
restart) echo "Restarting service";;
*) echo "Unknown action";;
esac
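Each case branch is a glob pattern, so a branch can use character classes, wildcards, and | alternation rather than only fixed strings. A small sketch follows; classify_device and its categories are hypothetical.

```shell
#!/usr/bin/env bash
# classify_device is a hypothetical helper; the name patterns are illustrative.
classify_device() {
  case "$1" in
    nvme[0-9]*)       echo "nvme" ;;
    sd[a-z]|hd[a-z])  echo "sata_or_ide" ;;
    "")               echo "empty" ;;
    *)                echo "unknown" ;;
  esac
}

classify_device nvme0n1   # nvme
classify_device sdb       # sata_or_ide
classify_device loop0     # unknown
```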
Logical Operators
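Logical AND and OR can be expressed in two ways: && and || join separate test commands (and also work inside [[ ]]), while -a and -o combine conditions inside a single [ ]. POSIX marks -a and -o as obsolescent, so && and || are the safer choice.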
if [ "$size" -gt 10 ] && [ "$size" -lt 20 ]; then
echo "Size is within range"
fi
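In Bash, the same range check can be written inside a single [[ ]] or as an arithmetic expression. The sketch below shows both forms; in_range is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# in_range is a hypothetical helper: succeeds when low < value < high.
in_range() {
  local value=$1 low=$2 high=$3
  # Inside [[ ]], && and || can be used directly.
  [[ "$value" -gt "$low" && "$value" -lt "$high" ]]
}

size=15
if in_range "$size" 10 20; then
  echo "Size is within range"
fi

# Equivalent arithmetic form: (( )) succeeds when the expression is non-zero.
if (( size > 10 && size < 20 )); then
  echo "Size is within range (arithmetic)"
fi
```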
Loops
The for loop iterates over lists or parameter expansions. while executes as long as a condition remains true, whereas until executes until a condition becomes true.
# Iterate over a sequence
for i in 1 2 3 4 5; do echo "Iteration $i"; done
# Iterate over script arguments
for disk in "$@"; do smartctl -i /dev/"$disk"; done
# Continuous monitoring
log_file="/var/log/disk_monitor.log"
while true; do
date >> "$log_file"
lsblk >> "$log_file"
sleep 120
done
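# Retry mechanism: stop on success or after 3 failed attempts
# (attempt_task stands for whatever command is being retried)
retries=0
until attempt_task; do
retries=$((retries + 1))
if [ "$retries" -ge 3 ]; then break; fi
done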
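The retry idea can be wrapped in a reusable function that takes the attempt limit, a delay, and the command to run. This is a sketch under assumptions: retry and flaky_task are hypothetical names, and flaky_task only simulates a command that fails twice before succeeding.

```shell
#!/usr/bin/env bash
# retry is a hypothetical helper: retry <max_attempts> <delay_seconds> <command...>
retry() {
  local max_attempts=$1 delay=$2 attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# flaky_task is a stand-in that fails twice and then succeeds.
calls=0
flaky_task() {
  calls=$((calls + 1))
  [ "$calls" -ge 3 ]
}

retry 5 0 flaky_task && echo "Succeeded after $calls calls"
```

Because the command is passed as "$@", arguments containing spaces reach the retried command unchanged.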
Functions
Functions encapsulate reusable logic. They can be defined using the function keyword or simply by providing a name followed by parentheses. Local variables are declared with local to prevent scope leakage.
calculate_sum() {
local param_a=$1
local param_b=$2
echo $((param_a + param_b))
}
result=$(calculate_sum 15 25)
echo "Total: $result"
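Besides printing a result, a function can report success or failure through its exit status, which lets it be used directly in if statements and && / || chains. The sketch below assumes Bash arithmetic; is_power_of_two is a hypothetical helper.

```shell
#!/usr/bin/env bash
# is_power_of_two is a hypothetical helper: it prints nothing and reports its
# answer through the exit status (0 = yes, non-zero = no).
is_power_of_two() {
  local n=$1
  # (( )) returns 0 when the expression is non-zero (true), 1 otherwise.
  (( n > 0 && (n & (n - 1)) == 0 ))
}

if is_power_of_two 4096; then
  echo "4096 is a power of two"
fi

is_power_of_two 3000 || echo "3000 is not a power of two"
```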
File Operations
Shell scripting frequently interacts with the filesystem. Redirections write content, while test operators verify existence and type.
echo "Initial content" > data.txt
echo "Appended content" >> data.txt
cat data.txt
if [ -f "data.txt" ]; then echo "Regular file exists"; fi
if [ -d "/var/log" ]; then echo "Directory exists"; fi
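Other test operators check for emptiness and permissions. The sketch below works in a temporary directory created with mktemp so that it does not touch real files; the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so the sketch does not touch real files.
work_dir=$(mktemp -d)
data_file="$work_dir/data.txt"

: > "$data_file"                       # create (or truncate to) an empty file
[ -e "$data_file" ] && echo "exists"
[ -s "$data_file" ] || echo "exists but is empty"

echo "Initial content" > "$data_file"
[ -s "$data_file" ] && echo "now has content"
[ -r "$data_file" ] && [ -w "$data_file" ] && echo "readable and writable"

line_total=$(wc -l < "$data_file")     # count lines without printing the name
rm -rf "$work_dir"
[ -e "$data_file" ] || echo "cleaned up"
```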
String Manipulation
Parameter expansion provides powerful string processing capabilities without external commands.
text="Hello Shell"
echo "${#text}" # Length: 11
echo "${text:6:5}" # Substring: Shell
echo "${text/Shell/Bash}" # Replace: Hello Bash
echo "${text#Hello }" # Remove prefix: Shell
echo "${text% Shell}" # Remove suffix: Hello
# Split string by delimiter
IFS='_' read -ra parts <<< "one_two_three"
for p in "${parts[@]}"; do echo "$p"; done
# Case conversion
upper="HELLO"; echo "${upper,,}" # hello (requires Bash 4 or later)
lower="hello"; echo "${lower^^}" # HELLO (requires Bash 4 or later)
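# Trim leading and trailing whitespace using parameter expansion only
raw="  padded  "
trimmed="${raw#"${raw%%[![:space:]]*}"}"
trimmed="${trimmed%"${trimmed##*[![:space:]]}"}"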
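The prefix and suffix operators also come in greedy forms (## and %%), which makes them handy for taking paths apart. A brief sketch follows; the path is illustrative only.

```shell
#!/usr/bin/env bash
path="/var/log/fio/run_01.tar.gz"   # illustrative path only

file_name="${path##*/}"       # remove longest prefix ending in /  -> run_01.tar.gz
dir_name="${path%/*}"         # remove shortest suffix from last / -> /var/log/fio
extension="${file_name#*.}"   # remove shortest prefix up to a dot -> tar.gz
base="${file_name%%.*}"       # remove longest suffix from a dot   -> run_01

echo "$dir_name | $file_name | $base | $extension"

# ${var:-default} substitutes a fallback when the variable is unset or empty.
unset report_dir
target="${report_dir:-/tmp/reports}"
echo "$target"
```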
Regular Expressions for Log Parsing
Regular expressions combine ordinary characters and metacharacters to define search patterns. On Linux, grep and sed use basic regular expressions (BRE) by default, while awk, grep -E, and sed -E use extended regular expressions (ERE). Perl-compatible regular expressions (PCRE), available in GNU grep via -P, add features such as lazy quantifiers, lookarounds, and \K (which drops everything matched so far from the output), making them well suited to extracting fields from complex logs.
Extracting Benchmark Metrics from fio
When parsing storage benchmark outputs such as fio, PCRE allows precise extraction of throughput, IOPS, and latency values.
Consider the following log segment:
write: IOPS=3823, BW=478MiB/s (501MB/s)(140GiB/300067msec)
lat (msec): min=2, max=133, avg=66.95, stdev= 1.28
To isolate specific numeric fields and format them as a CSV:
#!/usr/bin/env bash
benchmark_log="fio_output.log"
result_csv="metrics.csv"
io_ops=$(grep -oP 'IOPS=\K\d+' "$benchmark_log")
bw_mbps=$(grep -oP 'BW=.*?\(\K\d+(?=MB/s\))' "$benchmark_log")
# Anchor at the start of the line so that slat/clat lines are not matched as well
avg_latency=$(grep -oP '^\s*lat \(msec\).*avg=\K\d+\.\d+' "$benchmark_log")
echo "IOPS,BW_MBps,Avg_Latency_ms" > "$result_csv"
echo "${io_ops},${bw_mbps},${avg_latency}" >> "$result_csv"
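The -P option is not available in every grep implementation. Where it is missing, the same fields can be pulled out with Bash's own =~ operator, which stores capture groups in BASH_REMATCH. The sketch below is one possible approach rather than a drop-in replacement; it embeds the two sample lines from the log segment above in a temporary file, and the variable names are assumptions.

```shell
#!/usr/bin/env bash
# Sample lines copied from the log segment above.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
write: IOPS=3823, BW=478MiB/s (501MB/s)(140GiB/300067msec)
lat (msec): min=2, max=133, avg=66.95, stdev= 1.28
EOF

# Keeping patterns in variables avoids quoting pitfalls inside [[ ]].
iops_bw_re='IOPS=([0-9]+), BW=[^(]*\(([0-9]+)MB/s\)'
lat_re='^[[:space:]]*lat \(msec\).*avg=([0-9]+\.[0-9]+)'

iops="" bw="" lat=""
while IFS= read -r line; do
  if [[ $line =~ $iops_bw_re ]]; then
    iops=${BASH_REMATCH[1]}
    bw=${BASH_REMATCH[2]}
  elif [[ $line =~ $lat_re ]]; then
    lat=${BASH_REMATCH[1]}
  fi
done < "$sample_log"
rm -f "$sample_log"

echo "${iops},${bw},${lat}"   # 3823,501,66.95
```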
Parsing Mixed Read/Write Workloads
For mixed workloads containing separate read and write metrics, targeted pattern matching ensures the correct data points are captured.
#!/usr/bin/env bash
rw_log="fio_rw.log"
rw_csv="rw_metrics.csv"
r_iops=$(grep -oP 'read: IOPS=\K\d+' "$rw_log")
w_iops=$(grep -oP 'write: IOPS=\K\d+' "$rw_log")
r_bw=$(grep -oP 'read:.*BW=.*?\(\K\d+(?=MB/s\))' "$rw_log")
w_bw=$(grep -oP 'write:.*BW=.*?\(\K\d+(?=MB/s\))' "$rw_log")
echo "Category,IOPS,BW_MBps" > "$rw_csv"
echo "Read,${r_iops},${r_bw}" >> "$rw_csv"
echo "Write,${w_iops},${w_bw}" >> "$rw_csv"
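As an alternative to one grep call per field, a single sed pass can turn each read or write summary line into a CSV row. This is a sketch under assumptions: the two input lines are made-up sample values in the same shape as the fio summary line shown earlier, and the variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sample data; the numbers are invented for illustration.
rw_sample=$(mktemp)
cat > "$rw_sample" <<'EOF'
  read: IOPS=1200, BW=150MiB/s (157MB/s)(44.0GiB/300001msec)
  write: IOPS=800, BW=100MiB/s (105MB/s)(29.3GiB/300001msec)
EOF

rw_csv_text=$(
  echo "Category,IOPS,BW_MBps"
  # Capture the category, the IOPS value, and the MB/s figure in parentheses;
  # -n with the p flag prints only lines where the substitution succeeded.
  sed -nE 's/^[[:space:]]*(read|write): IOPS=([0-9]+), BW=[^(]*\(([0-9]+)MB\/s\).*/\1,\2,\3/p' "$rw_sample"
)
rm -f "$rw_sample"
echo "$rw_csv_text"
```

Like the grep patterns above, this expects plain integers. fio may abbreviate large values (for example with a k suffix) or print decimals, in which case the patterns need to be widened.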
Practical System Administration Scripts
Parallel Disk Initialization
To maximize hardware utilization, storage initialization tasks can be launched concurrently in background subshells using the { ... }& construct. The wait command ensures the script pauses until all background processes complete.
#!/usr/bin/env bash
devices=("nvme0n1" "nvme1n1" "nvme2n1")
for dev in "${devices[@]}"; do
{
# Caution: randwrite against a raw device overwrites its contents
fio --filename="/dev/$dev" --name=init_task --rw=randwrite --bs=4k --numjobs=4 --iodepth=16 --runtime=600
}&
done
wait
echo "All initialization tasks finished"
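A bare wait does not reveal whether any background job failed. One way to track that is to record each PID from $! and wait on them individually. In the sketch below, run_task is a hypothetical stand-in for a long-running job such as fio, and the task names are illustrative.

```shell
#!/usr/bin/env bash
# run_task is a hypothetical stand-in for a long-running job; it fails on
# purpose for the name "bad" so the failure path is visible.
run_task() {
  sleep 1
  [ "$1" != "bad" ]
}

names=("alpha" "bad" "gamma")
pids=()
for name in "${names[@]}"; do
  run_task "$name" &
  pids+=("$!")          # $! holds the PID of the most recent background job
done

failures=0
for i in "${!pids[@]}"; do
  # wait <pid> returns the exit status of that specific job.
  if ! wait "${pids[$i]}"; then
    echo "Task ${names[$i]} failed" >&2
    failures=$((failures + 1))
  fi
done
echo "Failed tasks: $failures"
```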
Automated Filesystem Formatting
When formatting multiple volumes, interactive prompts can block automation. Piping an affirmative string directly into the formatting tool bypasses these interruptions.
#!/usr/bin/env bash
block_devices=("/dev/sdb" "/dev/sdc" "/dev/sdd")
for blk in "${block_devices[@]}"; do
echo -e 'y\n' | mkfs.ext4 "$blk"
done
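Because unattended formatting is destructive, it can help to check each target before running mkfs. The sketch below is a dry run under stated assumptions: is_mounted and format_if_safe are hypothetical helpers, the mount table is assumed to be in /proc/mounts format (Linux), and the optional second argument exists only so the check can be tried against a sample file.

```shell
#!/usr/bin/env bash
# is_mounted is a hypothetical helper. The optional second argument (a mount
# table in /proc/mounts format) is an assumption added for safe experimentation.
is_mounted() {
  local device=$1 mounts_file=${2:-/proc/mounts}
  # The first field of each line is the device; compare it exactly.
  awk -v dev="$device" '$1 == dev { found = 1 } END { exit !found }' "$mounts_file"
}

# format_if_safe is also hypothetical; it only prints what it would do.
format_if_safe() {
  local device=$1
  if [ ! -b "$device" ]; then
    echo "Skipping $device: not a block device" >&2
    return 1
  fi
  if is_mounted "$device"; then
    echo "Skipping $device: currently mounted" >&2
    return 1
  fi
  echo "Would format $device"   # dry run; the real mkfs call would go here
}

for blk in /dev/sdb /dev/sdc; do
  format_if_safe "$blk" || echo "Left $blk untouched"
done
```

Note that this only checks whole-device entries; a disk whose partitions are mounted would need an additional check.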
Querying Disk Interface Speeds
Retrieving interface specifications across multiple storage devices is streamlined by iterating through device identifiers and applying diagnostic tools like smartctl.
#!/usr/bin/env bash
for drv in /dev/sd[a-z]; do
if [ -b "$drv" ]; then
echo "Checking $drv"
smartctl -a "$drv" | grep "SATA Version"
fi
done