Fading Coder

Essential Linux Command-Line Utilities for File Inspection and Manipulation

File Content Viewing and Concatenation

The cat command writes the contents of a file to standard output. It reads the file sequentially and, when given several file names, concatenates them in the order listed. Since Linux treats a file as a set of attributes (metadata) plus content, cat targets only the content portion.

Common usage includes:

cat source_code.c      # Displays the full content

Useful options for controlling output include:

  • -n: Number all output lines.
  • -b: Number non-blank output lines.
  • -s: Suppress repeated empty output lines.

The tac command (cat spelled backward) prints a file's lines in reverse order, from the last line to the first.
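
The following is a minimal sketch of these options and of tac, run against a throwaway file. The file names sample.txt and extra.txt and their contents are assumptions made for illustration.

```shell
# sample.txt and extra.txt are hypothetical files created only for this demonstration.
workdir=$(mktemp -d)
printf 'alpha\n\n\n\nbeta\n' > "$workdir/sample.txt"

# -s squeezes the run of three blank lines down to a single blank line.
squeezed=$(cat -s "$workdir/sample.txt")
printf '%s\n' "$squeezed"

# -b numbers only the lines that contain text.
cat -b "$workdir/sample.txt"

# Given two files, cat writes them one after the other.
printf 'gamma\n' > "$workdir/extra.txt"
combined_last=$(cat "$workdir/sample.txt" "$workdir/extra.txt" | tail -n 1)

# tac prints the same lines in reverse order.
reversed=$(printf 'one\ntwo\nthree\n' | tac)
printf '%s\n' "$reversed"
```

Capturing the output in variables is only a convenience for checking the result; in day-to-day use the commands are run directly at the prompt.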

Paged File Viewing

When dealing with large files, dumping the entire content to the screen at once is impractical. The more command shows data one screen at a time: Space advances a full screen, Enter advances one line, and /pattern searches forward for a string. Backward movement is limited; some implementations can step back with b when reading a regular file, but not when reading from a pipe.

A more advanced alternative is the less command. It provides robust navigation, allowing users to scroll both up and down using arrow keys. It also supports search pattern highlighting and is often preferred for analyzing system logs where context in both directions is necessary.

Extracting File Segments

To inspect specific parts of a file without opening the entire document, head and tail are utilized.

Head Command
This command outputs the beginning of a file. By default, it prints the first 10 lines.

head -n 20 server.log   # Prints the first 20 lines

Tail Command
Conversely, tail outputs the end of a file, defaulting to the last 10 lines. It is frequently used to check the latest entries in log files; with the -f option it keeps running and prints new lines as they are appended.

tail -n 15 server.log   # Prints the last 15 lines

To extract a specific range of lines from the middle of a file, you can combine these commands using a pipe (|). For example, to get lines 7991 to 8000:

head -n 8000 huge_data.txt | tail -n 10
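
As a quick check of the line arithmetic, the sketch below builds a numbered file with seq and compares the head/tail pipeline against sed, which can print a line range directly. The use of sed is an alternative introduced here for comparison, and the generated file is only a stand-in for huge_data.txt.

```shell
# Generate a stand-in file whose line N contains the number N.
workdir=$(mktemp -d)
seq 1 10000 > "$workdir/huge_data.txt"

# Keep the first 8000 lines, then the last 10 of those: lines 7991 to 8000.
via_pipe=$(head -n 8000 "$workdir/huge_data.txt" | tail -n 10)

# sed -n suppresses default output; the p command prints only the given range.
via_sed=$(sed -n '7991,8000p' "$workdir/huge_data.txt")

first_line=$(printf '%s\n' "$via_pipe" | head -n 1)
last_line=$(printf '%s\n' "$via_pipe" | tail -n 1)
echo "range starts at $first_line and ends at $last_line"
```

The general rule is that lines M through N come from head -n N piped into tail -n (N - M + 1).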

Time and Date Management

Accurate timekeeping is critical for system logging and debugging. The date command displays or sets the system time.

Unix Timestamps
Computers often store time as the number of seconds elapsed since the Unix Epoch (00:00:00 UTC on January 1, 1970). Because a timestamp is a single integer that grows as time passes, it is convenient for sorting and time-range calculations.

date +%s               # Displays the current timestamp
date -d @1625097600    # Converts a timestamp back to a readable format (GNU date)

You can also format the date string specifically:

date "+%Y-%m-%d %H:%M:%S"
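
The pieces above can be combined into a round trip between timestamps and readable dates. This sketch assumes GNU date, since the -d option is a GNU extension, and adds -u so the results are in UTC and do not depend on the local time zone. The variable names are illustrative.

```shell
# Assumes GNU date: -d is a GNU extension; -u keeps everything in UTC.
ts=1625097600

# Timestamp to formatted string.
readable=$(date -u -d "@$ts" "+%Y-%m-%d %H:%M:%S")
echo "$readable"

# Formatted string back to a timestamp.
back=$(date -u -d "2021-07-01 00:00:00" +%s)
echo "$back"

# Time-range arithmetic is plain integer math: one day is 86400 seconds.
next_day=$(date -u -d "@$((ts + 86400))" "+%Y-%m-%d")
echo "$next_day"
```

Without -u, the same timestamp is rendered in the system's local time zone, so the printed hour may differ from machine to machine.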

The cal command displays a simple calendar. Running cal shows the current month, while cal -y displays the entire current year.

File Searching and Text Filtering

Find Command
The find utility searches a directory tree for files based on criteria such as name, size, or modification time. Unlike locate, which looks names up in a prebuilt database, find walks the file system at run time, so its results are current but large trees take longer to search.

find /var/log -name "*.log"  # Searches for log files in /var/log
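
To illustrate searching by more than one criterion, here is a small sketch that runs find against a temporary directory. The directory layout and file names are hypothetical, and -type f and -mtime are standard find tests that the text above does not show.

```shell
# All paths below live in a temporary directory and are hypothetical.
workdir=$(mktemp -d)
mkdir -p "$workdir/logs/old"
touch "$workdir/logs/app.log" "$workdir/logs/old/db.log" "$workdir/logs/notes.txt"

# Match by name, restricted to regular files (-type f), searching recursively.
log_count=$(find "$workdir" -type f -name "*.log" | wc -l)

# Match by modification time: -mtime -1 means modified less than one day ago.
recent_count=$(find "$workdir" -type f -mtime -1 | wc -l)

echo "log files: $log_count, recently modified: $recent_count"
```

Quoting the pattern matters: without the quotes the shell may expand *.log itself before find ever sees it.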

Grep Command
grep filters text line by line. It searches for a specified pattern and prints the matching lines.

grep "error" syslog.txt           # Finds lines containing "error"
grep -n "500" access.log          # Shows line numbers along with matches
grep -v "comment" config.cfg      # Inverts match to show lines without "comment"
grep -i "warning" alert.log       # Case-insensitive search
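
These options can be combined in a single invocation. The sketch below uses a made-up alert.log with four lines; the -c option, which prints a count of matching lines instead of the lines themselves, is a standard grep option not covered above.

```shell
# alert.log here is a hypothetical four-line file created for the demonstration.
workdir=$(mktemp -d)
cat > "$workdir/alert.log" <<'EOF'
INFO service started
Warning: disk at 91%
ERROR write failed
warning: retrying
EOF

# -i ignores case and -n prefixes each match with its line number.
matches=$(grep -in "warning" "$workdir/alert.log")
printf '%s\n' "$matches"

# -v inverts the match and -c counts: how many lines do not mention a warning?
other_count=$(grep -vic "warning" "$workdir/alert.log")
echo "lines without a warning: $other_count"
```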

Archiving and Compression

Managing disk space and transferring groups of files efficiently requires archiving.

Zip and Unzip
These commands handle the compression and decompression of files and directories using the widely supported .zip format.

zip -r project.zip ./project_dir    # Recursively compresses a directory
unzip archive.zip -d /target/path   # Decompresses to a specific path

Tar Command
The tar (tape archive) utility is standard on Unix-like systems for combining multiple files into a single archive file, often compressed with gzip or bzip2.

tar -czvf backup.tar.gz /data    # Creates (c), uses gzip (z), verbose (v), specifies file (f)
tar -xzvf backup.tar.gz -C /restore_path   # Extracts (x) to a specific directory (C)
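
A full round trip (create, list, extract) can be sketched as follows. The directory and file names are hypothetical, and -C is also used when creating the archive so that it stores the relative path data/ rather than an absolute one. The t mode, which lists an archive's contents without extracting, is an addition to the modes shown above.

```shell
# Hypothetical source and restore directories inside a temporary location.
workdir=$(mktemp -d)
mkdir -p "$workdir/data" "$workdir/restore"
echo "hello" > "$workdir/data/a.txt"

# Create a gzip-compressed archive; -C changes directory before adding "data".
tar -czf "$workdir/backup.tar.gz" -C "$workdir" data

# List (t) the archive's contents without extracting anything.
listing=$(tar -tzf "$workdir/backup.tar.gz")
printf '%s\n' "$listing"

# Extract (x) into the restore directory and read the file back.
tar -xzf "$workdir/backup.tar.gz" -C "$workdir/restore"
restored=$(cat "$workdir/restore/data/a.txt")
echo "$restored"
```

Listing before extracting is a cheap way to confirm what an archive will write and where.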

System Utilities

BC Command
bc acts as a command-line calculator supporting arbitrary-precision arithmetic, including decimal fractions, which shell integer arithmetic does not handle natively.

echo "10.5 * 4" | bc
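
Division behaves differently from multiplication because bc keeps zero decimal places by default. The sketch below, which assumes bc is installed, shows the scale variable that controls this, along with an integer far larger than a 64-bit shell variable can hold.

```shell
# Requires bc to be installed; it is not part of every minimal system.
# With the default scale of 0, division truncates to an integer.
truncated=$(echo "10 / 3" | bc)

# Setting scale keeps that many digits after the decimal point.
quotient=$(echo "scale=4; 10 / 3" | bc)

# Integer results are not limited to 64 bits.
big=$(echo "2^100" | bc)

echo "$truncated $quotient $big"
```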

Uname Command
To retrieve system information, use uname.

uname -r    # Displays the kernel release version
uname -m    # Displays the machine hardware name (e.g., x86_64)

Input and Output Redirection

In Linux, standard input, output, and error are treated as file streams. This allows for powerful manipulation of where data comes from and goes.

Output Redirection (> and >>)
The > operator redirects standard output to a file, overwriting the file if it exists or creating it if it does not.

echo "System initialization complete" > status.log

To append data to an existing file without deleting its current content, use >>.

echo "New user logged in" >> status.log

Input Redirection (<)
The < operator takes input from a file rather than the keyboard.

cat < status.log
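
The three operators can be seen side by side in the sketch below, which works in a temporary directory. It also uses 2>, the standard-error counterpart of >, which is not covered above. The file names are hypothetical.

```shell
# Hypothetical files inside a temporary directory.
workdir=$(mktemp -d)
log="$workdir/status.log"

echo "first"  > "$log"     # creates the file
echo "second" > "$log"     # overwrites: "first" is gone
echo "third"  >> "$log"    # appends: the file now has two lines

# Input redirection: wc reads the file on standard input.
line_count=$(wc -l < "$log")
first_line=$(head -n 1 < "$log")
echo "$line_count lines, first is: $first_line"

# Standard output and standard error can go to different files.
# ls fails on a missing path, so only the error file receives text.
ls "$workdir/does_not_exist" > "$workdir/out.txt" 2> "$workdir/err.txt" || true
```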

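This concept reinforces the "everything is a file" philosophy in Linux. Even hardware devices are represented as files in the /dev directory. For example, writing to a terminal device file sends text to that specific screen, provided the terminal exists and you have permission to write to it (the tty command prints the device path of the current terminal):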

echo "Hello User" > /dev/pts/1
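
A device file that exists on every Linux system, and is therefore easier to experiment with than a specific terminal, is /dev/null. The following sketch shows that it accepts the same redirection operators as a regular file; the variable names are illustrative.

```shell
# /dev/null is a character device that discards whatever is written to it
# and returns end-of-file immediately when read.
echo "this text is discarded" > /dev/null
bytes_read=$(wc -c < /dev/null)

# The -c test confirms that the path is a character device, not a regular file.
if [ -c /dev/null ]; then
    device_kind="character device"
else
    device_kind="something else"
fi

echo "read $bytes_read bytes; /dev/null is a $device_kind"
```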

Generating Test Data

To create a large file for testing log analysis or I/O performance, a simple loop can be combined with redirection:

count=1; while [ $count -le 50000 ]; do echo "Log entry number $count"; ((count++)); done > heavy_log.txt
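
The same kind of file can also be produced without an explicit loop. The sketch below is an alternative using seq and sed rather than the loop above, followed by a check with wc and tail that the file has the expected shape. It writes to a temporary directory so it does not overwrite an existing heavy_log.txt.

```shell
# Write the test file into a temporary directory.
workdir=$(mktemp -d)

# seq generates the counter; sed prefixes each number with the log text.
seq 1 50000 | sed 's/^/Log entry number /' > "$workdir/heavy_log.txt"

# Verify the result: total line count and the final entry.
total_lines=$(wc -l < "$workdir/heavy_log.txt")
last_line=$(tail -n 1 "$workdir/heavy_log.txt")
echo "$total_lines lines, ending with: $last_line"
```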