Essential Linux Command-Line Utilities for File Inspection and Manipulation
File Content Viewing and Concatenation
The cat command (short for concatenate) writes the contents of one or more files to standard output, reading each file sequentially in the order given. With a single file, the effect is to display that file in the terminal. Since Linux treats files as collections of attributes and content, cat specifically targets the content portion.
Common usage includes:
cat source_code.c   # Displays the full content
Useful options for controlling output include:
-n: Number all output lines.
-b: Number non-blank output lines.
-s: Suppress repeated empty output lines.
The tac command performs the inverse operation, reading the file from the last line to the first and printing it to the terminal.
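The options above are easiest to compare on a small file. The following is a minimal sketch, assuming GNU coreutils; the file name sample.txt and the variable names are hypothetical and exist only for illustration.

```shell
# Build a throwaway sample file containing a run of three blank lines
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf 'alpha\n\n\n\nbeta\ngamma\n' > "$sample"

cat -n "$sample"   # numbers every line, including the blank ones
cat -b "$sample"   # numbers only the non-blank lines
cat -s "$sample"   # squeezes the run of blank lines down to a single one

# Count the lines left after squeezing, and read the first line tac emits
squeezed_lines=$(cat -s "$sample" | wc -l)
reversed_first=$(tac "$sample" | head -n 1)
echo "lines after -s: $squeezed_lines, first line from tac: $reversed_first"

rm -r "$tmpdir"
```

The sample has six lines, so squeezing leaves four, and tac starts with the original last line.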
Paged File Viewing
When dealing with large files, dumping the entire content to the screen at once is impractical. The more command shows the data one screen at a time. Press Space to advance a full screen or Enter to advance a single line, and type /pattern to search forward. Backward movement is limited: traditional implementations offer none, and where it is available it generally does not work on piped input.
A more advanced alternative is the less command. It provides robust navigation, allowing users to scroll both up and down using arrow keys. It also supports search pattern highlighting and is often preferred for analyzing system logs where context in both directions is necessary.
Extracting File Segments
To inspect specific parts of a file without opening the entire document, head and tail are utilized.
Head Command
This command outputs the beginning of a file. By default, it prints the first 10 lines.
head -n 20 server.log   # Prints the first 20 lines
Tail Command
Conversely, tail outputs the ending portion of a file, defaulting to the last 10 lines. It is frequently used to monitor the latest entries in log files.
tail -n 15 server.log   # Prints the last 15 lines
To extract a specific range of lines from the middle of a file, you can combine these commands using a pipe (|). For example, to get lines 7991 to 8000:
head -n 8000 huge_data.txt | tail -n 10
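The same idea generalizes to any range: keep everything up to the end line, then keep the last (end - start + 1) lines of that. The sketch below assumes GNU coreutils and sed; the generated file and the variable names start, end, range_pipe, and range_sed are hypothetical. It also shows a sed form as one possible alternative, not as the method the text prescribes.

```shell
# Generate a file in which line N contains the number N
tmpdir=$(mktemp -d)
data="$tmpdir/huge_data.txt"
seq 1 10000 > "$data"

start=7991
end=8000

# head keeps lines 1..end; tail keeps the last (end - start + 1) of those
range_pipe=$(head -n "$end" "$data" | tail -n "$((end - start + 1))")

# sed alternative: print the range, then quit so the rest is never read
range_sed=$(sed -n "${start},${end}p;${end}q" "$data")

first_line=$(echo "$range_pipe" | head -n 1)
last_line=$(echo "$range_pipe" | tail -n 1)
echo "range runs from line $first_line to line $last_line"

rm -r "$tmpdir"
```

Note that the pipe form only yields the intended range when the file has at least as many lines as the end value; on a shorter file, tail simply returns the last lines that exist.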
Time and Date Management
Accurate timekeeping is critical for system logging and debugging. The date command displays or sets the system time.
Unix Timestamps
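Computers often store time as the number of seconds elapsed since the Unix Epoch (00:00:00 UTC on January 1, 1970). Because a timestamp is a single integer that does not depend on the time zone, it is convenient for sorting and for time-range calculations. The system clock can be adjusted, however, so successive readings are not guaranteed to increase.
date +%s              # Displays the current timestamp
date -d @1625097600   # Converts a timestamp back to a readable format (GNU date)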
You can also format the date string specifically:
date "+%Y-%m-%d %H:%M:%S"
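Timestamps and format strings are often used together. The following sketch assumes GNU date (the -d option is not portable to BSD date) and uses -u so the result does not depend on the local time zone; the variable names are hypothetical.

```shell
ts=1625097600

# Convert the epoch value to a readable UTC string
readable=$(date -u -d "@$ts" "+%Y-%m-%d %H:%M:%S")
echo "$readable"

# Round trip: parse the string back into an epoch value
back=$(date -u -d "$readable" +%s)
echo "$back"

# Time-range arithmetic is plain integer subtraction on timestamps
start=$(date +%s)
end=$((start + 90))
elapsed=$((end - start))
echo "elapsed: $elapsed seconds"
```

With these inputs the readable form is 2021-07-01 00:00:00, and parsing it back returns the original value.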
The cal command displays a simple calendar. Running cal shows the current month, while cal -y displays the entire year for the current date.
File Searching and Text Filtering
Find Command
The find utility searches a directory tree for files that match criteria such as name, size, or modification time. Unlike index-based tools such as locate, which consult a prebuilt database, find walks the file system each time it runs, so its results are current but searches over large trees can be slow.
find /var/log -name "*.log"   # Searches for log files in /var/log
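The criteria mentioned above can be combined in a single invocation. This is a minimal sketch on a throwaway directory tree, assuming GNU findutils; the directory layout and variable names are hypothetical, and -maxdepth is an extension that is not part of POSIX find.

```shell
# Create a small tree to search
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/app/old"
printf 'x\n' > "$tmpdir/app/service.log"
printf 'y\n' > "$tmpdir/app/old/archive.log"
printf 'z\n' > "$tmpdir/app/notes.txt"

# By name: regular files ending in .log, at any depth
find "$tmpdir" -type f -name "*.log"

# By modification time: .log files changed within the last 24 hours
recent=$(find "$tmpdir" -type f -name "*.log" -mtime -1 | wc -l)

# By size: regular files smaller than 10 bytes (the c suffix means bytes)
small=$(find "$tmpdir" -type f -size -10c | wc -l)

# Limit the depth so subdirectories are not searched
top_only=$(find "$tmpdir/app" -maxdepth 1 -type f -name "*.log" | wc -l)

echo "recent: $recent, small: $small, top level only: $top_only"
rm -r "$tmpdir"
```

Quoting the pattern in -name matters: without the quotes, the shell may expand *.log against the current directory before find ever sees it.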
Grep Command
grep filters text line by line. It searches for a specified pattern and prints the matching lines.
grep "error" syslog.txt        # Finds lines containing "error"
grep -n "500" access.log       # Shows line numbers along with matches
grep -v "comment" config.cfg   # Inverts the match to show lines without "comment"
grep -i "warning" alert.log    # Case-insensitive search
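These options can be combined. The sketch below runs them against a small hypothetical log file; the file contents and variable names are invented for illustration, and the -c option (count matching lines), which the text does not cover, is used so the results can be checked.

```shell
tmpdir=$(mktemp -d)
log="$tmpdir/app.log"
cat > "$log" <<'EOF'
INFO service started
error: disk quota exceeded
WARNING retrying request
Error: connection refused
INFO request served with status 500
EOF

# -i and -n together: case-insensitive matches with their line numbers
grep -in "error" "$log"

error_count=$(grep -ic "error" "$log")            # lines matching in any case
non_info=$(grep -vc "INFO" "$log")                # lines that do not contain INFO
first_hit=$(grep -in "error" "$log" | head -n 1)  # first match with line number

echo "errors: $error_count, non-INFO lines: $non_info"
rm -r "$tmpdir"
```

Here the case-insensitive search matches lines 2 and 4, while a plain grep "error" would match only line 2.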
Archiving and Compression
Managing disk space and transferring groups of files efficiently requires archiving.
Zip and Unzip
These commands handle the compression and decompression of files and directories using the widely supported .zip format.
zip -r project.zip ./project_dir    # Recursively compresses a directory
unzip archive.zip -d /target/path   # Decompresses to a specific path
Tar Command
The tar (tape archive) utility is standard on Unix-like systems for combining multiple files into a single archive file, often compressed with gzip or bzip2.
tar -czvf backup.tar.gz /data              # Creates (c), uses gzip (z), verbose (v), specifies file (f)
tar -xzvf backup.tar.gz -C /restore_path   # Extracts (x) to a specific directory (C)
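A full round trip makes the flags easier to follow. The following is a sketch on temporary directories, assuming a tar with gzip support; all paths and variable names are hypothetical. It adds the t flag, which lists an archive's contents without extracting, and uses -C when creating the archive so that relative paths are stored.

```shell
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/data" "$tmpdir/restore"
echo "first" > "$tmpdir/data/a.txt"
echo "second" > "$tmpdir/data/b.txt"

# Create: -C switches into $tmpdir first, so entries are stored as data/...
tar -czf "$tmpdir/backup.tar.gz" -C "$tmpdir" data

# List (t) the archive contents without extracting anything
tar -tzf "$tmpdir/backup.tar.gz"
entries=$(tar -tzf "$tmpdir/backup.tar.gz" | wc -l)

# Extract (x) into a separate directory
tar -xzf "$tmpdir/backup.tar.gz" -C "$tmpdir/restore"
restored=$(cat "$tmpdir/restore/data/a.txt")

echo "entries: $entries, restored content: $restored"
rm -r "$tmpdir"
```

Listing an archive before extracting it is a cheap way to confirm where its files will land.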
System Utilities
BC Command
bc is a command-line calculator with arbitrary-precision arithmetic. It handles decimal fractions, which the shell's built-in integer arithmetic cannot.
echo "10.5 * 4" | bc
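Division behaves differently from multiplication in bc because the number of decimal places is governed by the scale variable, which defaults to 0 when bc is started without the -l option. A short sketch, assuming bc is installed; the variable names are hypothetical.

```shell
# Shell arithmetic is integer-only: the fractional part is discarded
shell_div=$((10 / 3))

# bc with the default scale of 0 also truncates the quotient
bc_default=$(echo "10 / 3" | bc)

# Setting scale requests that many digits after the decimal point
bc_scaled=$(echo "scale=4; 10 / 3" | bc)

# Arbitrary precision: this value does not fit in a 64-bit integer
big=$(echo "2 ^ 100" | bc)

echo "$shell_div $bc_default $bc_scaled $big"
```

With scale=4 the quotient is 3.3333, and 2 ^ 100 is printed in full as 1267650600228229401496703205376.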
Uname Command
To retrieve system information, use uname.
uname -r   # Displays the kernel release version
uname -m   # Displays the machine hardware name (e.g., x86_64)
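Scripts commonly capture this output to branch on the platform. The sketch below maps the hardware name to a label; the pkg_arch variable and its values are hypothetical and stand in for whatever naming a given project uses.

```shell
kernel=$(uname -r)
arch=$(uname -m)

# Map the hardware name reported by uname to a hypothetical package label
case "$arch" in
  x86_64|amd64)   pkg_arch="amd64" ;;
  aarch64|arm64)  pkg_arch="arm64" ;;
  *)              pkg_arch="unknown" ;;
esac

echo "kernel $kernel on $arch (package arch: $pkg_arch)"
```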
Input and Output Redirection
In Linux, standard input, output, and error are treated as file streams. This allows for powerful manipulation of where data comes from and goes.
Output Redirection (> and >>)
The > operator redirects standard output to a file, overwriting the file if it exists or creating it if it does not.
echo "System initialization complete" > status.log
To append data to an existing file without deleting its current content, use >>.
echo "New user logged in" >> status.log
Input Redirection (<)
The < operator takes input from a file rather than the keyboard.
cat < status.log
This concept reinforces the "everything is a file" philosophy in Linux. Even hardware devices are represented as files in the /dev directory. For example, writing to a terminal device file sends text to that specific screen:
echo "Hello User" > /dev/pts/1
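The redirection operators above can be exercised together in a short session. This is a sketch with hypothetical file names and a hypothetical report function; it also redirects standard error with 2>, which uses file descriptor 2, and discards it by writing to the /dev/null device file.

```shell
tmpdir=$(mktemp -d)
log="$tmpdir/status.log"

echo "System initialization complete" > "$log"   # creates the file
echo "New user logged in" >> "$log"              # appends a second line
line_count=$(wc -l < "$log")                     # wc reads the file via stdin

echo "Restarted" > "$log"                        # truncates, then writes
after_overwrite=$(cat < "$log")

# A function that writes to both output streams
report() {
  echo "result"        # standard output (file descriptor 1)
  echo "warning" >&2   # standard error (file descriptor 2)
}

# Send each stream to its own file
report > "$tmpdir/out.txt" 2> "$tmpdir/err.txt"
out_text=$(cat "$tmpdir/out.txt")
err_text=$(cat "$tmpdir/err.txt")

# Discard standard error entirely; only "result" is printed
report 2> /dev/null

rm -r "$tmpdir"
```

After the overwrite, the file holds only the single line Restarted, which is why > should be used with care on files that matter.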
Generating Test Data
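To create a large file for testing log analysis or I/O performance, a simple loop can be combined with redirection. Placing the redirection after done opens the file once for the whole loop rather than once per line. The ((count++)) syntax requires Bash: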
count=1; while [ $count -le 50000 ]; do echo "Log entry number $count"; ((count++)); done > heavy_log.txt
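It is worth checking a generated file before using it. The sketch below produces the same content with seq as one possible shorter alternative, assuming GNU seq (the -f option applies a printf-style format to each number), and then verifies the result with wc, head, and tail. The temporary path and variable names are hypothetical.

```shell
tmpdir=$(mktemp -d)
out="$tmpdir/heavy_log.txt"

# seq supplies the counter; -f wraps each number in the log message
seq -f "Log entry number %g" 1 50000 > "$out"

# Verify the line count and the first and last entries
lines=$(wc -l < "$out")
first=$(head -n 1 "$out")
last=$(tail -n 1 "$out")
echo "$lines lines, from \"$first\" to \"$last\""

rm -r "$tmpdir"
```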