Using Conditional Logic, Loops, and String Functions in awk
Conditionals
# conditional expression, yields one of two values:
condition ? valueA : valueB
# if / else if / else statement, runs exactly one of several branches:
if (cond1) {
    stmt1
} else if (cond2) {
    stmt2
} else {
    stmt3
}
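The difference between the two forms is easiest to see side by side. The following is a minimal sketch, assuming a POSIX shell and awk; the function name label_numbers and the sample numbers are made up for illustration.

```shell
# label_numbers is a hypothetical helper; it reads one number per line.
label_numbers() {
    awk '{
        size = ($1 > 5) ? "big" : "small"   # ?: is an expression and yields a value
        if ($1 % 2 == 0)                    # if/else chooses which statement runs
            parity = "even"
        else
            parity = "odd"
        print $1, size, parity
    }'
}

printf '3\n10\n' | label_numbers
```

The conditional expression fits where a value is needed, such as an assignment or a printf argument; the if statement fits where whole actions differ.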
Looping Constructs
While loop:
while (cond)
    statement
Do-while loop:
do
    statement
while (cond)
For loop:
for (init; test; step)
    statement
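One practical difference between while and do-while is that a do-while body always runs at least once. A small sketch of that behavior, assuming a POSIX shell and awk; loop_counts is a hypothetical name.

```shell
# loop_counts is a hypothetical helper that counts how often each body runs.
loop_counts() {
    awk 'BEGIN {
        n = 0; i = 5
        while (i < 5) { n++; i++ }       # test fails immediately: body never runs
        m = 0; j = 5
        do { m++; j++ } while (j < 5)    # body runs once, then the test fails
        print n, m
    }'
}

loop_counts
```

Here the while loop leaves its counter at 0, while the do-while loop increments its counter once before the condition is checked.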
Filter Lines by Field Value Range
Print the entries from /etc/passwd whose third colon-separated field (the numeric UID) lies strictly between 50 and 100, so that 50 and 100 themselves are excluded:
BEGIN { FS=":" }
($3 > 50 && $3 < 100) { print }
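The same range filter can take its bounds from the command line instead of hard-coding them. This is a sketch under assumptions: uid_between is a hypothetical function name, the passwd-style records are invented, and -F and -v are the standard awk options for setting the field separator and assigning variables.

```shell
# uid_between LO HI: print records whose UID is strictly between LO and HI.
# The function name and the sample records below are illustrative only.
uid_between() {
    awk -F: -v lo="$1" -v hi="$2" '$3 > lo && $3 < hi'
}

printf '%s\n' \
    'root:x:0:0:root:/root:/bin/sh' \
    'alice:x:50:50::/home/alice:/bin/sh' \
    'bob:x:75:75::/home/bob:/bin/sh' \
    'carol:x:100:100::/home/carol:/bin/sh' | uid_between 50 100
```

Only the record with UID 75 is printed; the records with UID 50 and UID 100 sit on the excluded bounds.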
Classify Users by UID
File uid_class.awk:
BEGIN { FS=":" }
{
    if ($3 < 50)
        printf "%-20s %-20s %-10d\n", "UID<50", $1, $3
    else if ($3 <= 100)    # reached only when $3 >= 50, so 50 and 100 land here
        printf "%-20s %-20s %-10d\n", "50<=UID<=100", $1, $3
    else
        printf "%-20s %-20s %-10d\n", "UID>100", $1, $3
}
Run against a passwd-format file (here a file named passwd in the current directory):
awk -f uid_class.awk passwd
Compute and Filter Student Averages
Input scores.txt:
Allen 80 90 96 98
Mike 93 98 92 91
Zhang 78 76 87 92
Jerry 86 89 68 92
Han 85 95 75 90
Li 78 88 98 100
Print header and averages:
BEGIN {
    printf "%-20s %-20s %-20s %-20s %-20s %-20s\n",
        "Name","Chinese","English","Math","Physical","Average"
}
{
    total = $2 + $3 + $4 + $5
    mean = total / 4
    printf "%-20s %-20d %-20d %-20d %-20d %.2f\n", $1, $2, $3, $4, $5, mean
}
Add a condition to print only the students whose mean exceeds 90:
BEGIN {
    printf "%-20s %-20s %-20s %-20s %-20s %-20s\n",
        "Name","Chinese","English","Math","Physical","Average"
}
{
    total = $2 + $3 + $4 + $5
    mean = total / 4
    if (mean > 90)
        printf "%-20s %-20d %-20d %-20d %-20d %.2f\n", $1, $2, $3, $4, $5, mean
}
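When the number of score columns may vary, the sum can be built with a loop over fields 2 through NF instead of naming each field. A minimal sketch, assuming a POSIX shell and awk; high_means is a hypothetical name, and the input is the first three rows of scores.txt.

```shell
# high_means: print name and mean for rows whose mean exceeds 90.
# Hypothetical helper; every field after the first is treated as a score.
high_means() {
    awk 'NF > 1 {
        total = 0
        for (i = 2; i <= NF; i++)    # fields 2..NF hold the scores
            total += $i
        mean = total / (NF - 1)
        if (mean > 90)
            printf "%s %.2f\n", $1, mean
    }'
}

printf '%s\n' 'Allen 80 90 96 98' 'Mike 93 98 92 91' 'Zhang 78 76 87 92' | high_means
```

The NF > 1 pattern skips lines that carry no scores, which avoids a division by zero.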
Summing 1 to 100 Using Different Loops
While version – file sum_while.awk:
BEGIN {
    idx = 1
    total = 0
    while (idx <= 100) {
        total += idx
        idx++
    }
    print total
}
Do-while version – file sum_dowhile.awk:
BEGIN {
    idx = 1
    total = 0
    do {
        total += idx
        idx++
    } while (idx <= 100)
    print total
}
For version – file sum_for.awk:
BEGIN {
    total = 0
    for (idx = 1; idx <= 100; idx++)
        total += idx
    print total
}
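All three versions print 5050. A quick way to check a loop like this is to compare it with the closed form n(n+1)/2 and to pass the upper limit in with -v. A sketch, assuming a POSIX shell and awk; sum_to is a hypothetical name.

```shell
# sum_to N: print the loop-computed sum 1..N next to the closed-form value.
# Hypothetical helper; N is passed to awk through the standard -v option.
sum_to() {
    awk -v n="$1" 'BEGIN {
        total = 0
        for (idx = 1; idx <= n; idx++)
            total += idx
        print total, n * (n + 1) / 2
    }'
}

sum_to 100
```

If the two printed numbers differ, the loop bounds are off by one somewhere.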
String Function Examples
Field length per line – file field_len.awk:
BEGIN { FS=":" }
{
    pos = 1
    while (pos <= NF) {
        sep = (pos == NF) ? "" : ":"
        printf "%d%s", length($pos), sep
        pos++
    }
    print ""
}
Usage:
awk -f field_len.awk passwd
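The same per-field length report can be written with a for loop and tried on a single record without touching a real file. A sketch under assumptions: field_lengths is a hypothetical name and the record is invented. Unlike field_len.awk, this variant prints nothing for an empty input line.

```shell
# field_lengths: print the length of each colon-separated field, joined by ":".
# Hypothetical helper; the separator is set with the standard -F option.
field_lengths() {
    awk -F: '{
        for (pos = 1; pos <= NF; pos++)
            printf "%d%s", length($pos), (pos < NF ? ":" : "\n")
    }'
}

echo 'root:x:0:0:root:/root:/bin/sh' | field_lengths
```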
Find substring position:
BEGIN {
    txt = "I have a dream"
    printf "%d\n", index(txt, "ea")    # prints 12, the 1-based position of "ea"
}
Convert to lowercase:
BEGIN {
    txt = "Hadoop is a bigdata Framework"
    print tolower(txt)
}
Convert to uppercase:
BEGIN {
    txt = "Hadoop is a bigdata Framework"
    print toupper(txt)
}
Split string into array:
BEGIN {
    txt = "Hadoop Kafka Spark Storm HDFS YARN Zookeeper"
    n = split(txt, parts, " ")    # n is the number of pieces (7 here)
    for (j = 1; j <= n; j++)
        print parts[j]
}
Locate first digit:
BEGIN {
    txt = "Transaction 2345 Start:Select * from master"
    print match(txt, /[0-9]/)    # prints 13, the 1-based position of the first digit
}
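Besides returning the position, match also sets the built-in variables RSTART and RLENGTH, which combine with substr to pull out the matched text. A minimal sketch, assuming a POSIX shell and awk; first_number is a hypothetical name.

```shell
# first_number: print start, length and text of the first run of digits per line.
# Hypothetical helper; RSTART and RLENGTH are set by awk's match().
first_number() {
    awk '{
        if (match($0, /[0-9]+/))
            print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
        else
            print "no digits"
    }'
}

echo 'Transaction 2345 Start:Select * from master' | first_number
```

match returns 0 when nothing matches, so its return value can be used directly as the if condition.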
Extract substring:
BEGIN {
    txt = "transaction start"
    print substr(txt, 4, 5)    # 5 characters starting at position 4: "nsact"
}
Replace first numeric sequence:
BEGIN {
    txt = "Transaction 243 Start,Event ID:9002"
    reps = sub(/[0-9]+/, "$", txt)    # modifies txt in place, returns the replacement count
    print reps    # 1
    print txt     # Transaction $ Start,Event ID:9002
}
Replace all numeric sequences:
BEGIN {
    txt = "Transaction 243 Start,Event ID:9002"
    reps = gsub(/[0-9]+/, "$", txt)    # replaces every match, returns the replacement count
    print reps    # 2
    print txt     # Transaction $ Start,Event ID:$
}
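In the replacement string of sub and gsub, an unescaped & stands for the text that was matched, which makes it possible to decorate matches rather than overwrite them. A sketch, assuming a POSIX shell and awk; bracket_numbers is a hypothetical name.

```shell
# bracket_numbers: wrap every run of digits in square brackets.
# Hypothetical helper; with two arguments, gsub operates on the whole record $0.
bracket_numbers() {
    awk '{ n = gsub(/[0-9]+/, "[&]"); print n ": " $0 }'
}

echo 'Transaction 243 Start,Event ID:9002' | bracket_numbers
```

To insert a literal ampersand instead, escape it in the replacement string.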
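Arrays filled by split are indexed from 1; for (k in ...) visits the indices in an unspecified order:
BEGIN {
    txt = "Hadoop Kafka Spark Storm HDFS YARN Zookeeper"
    split(txt, elems, " ")
    print elems[1]      # first element: Hadoop
    for (k in elems)    # order of k depends on the awk implementation
        print elems[k]
}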
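Because for (k in array) does not promise any particular order, a counted loop over the value returned by split is the usual way to walk the pieces in sequence. A sketch, assuming a POSIX shell and awk; ordered_parts is a hypothetical name, and the colon separator is chosen only to show a non-default third argument.

```shell
# ordered_parts: split each line on ":" and print index=value pairs in order.
# Hypothetical helper; split returns the number of elements it stored.
ordered_parts() {
    awk '{
        n = split($0, elems, ":")
        for (k = 1; k <= n; k++)
            printf "%d=%s\n", k, elems[k]
    }'
}

echo 'Hadoop:Kafka:Spark' | ordered_parts
```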