Understanding Linux File I/O: System Interfaces, File Descriptors, and Redirection

File Operations in C

In C, file operations cover both a file's contents and its attributes (metadata such as size, permissions, and timestamps). A file must be opened before it can be accessed, which loads its attributes, and its data on demand, into memory. This follows from the von Neumann architecture: the CPU works on memory rather than directly on the disk, so every read or write passes through memory. It is therefore useful to distinguish opened files, which have in-memory state, from files that exist only on disk.

File operations are performed by processes. A file is accessed only when a running process executes code that calls functions such as fopen or write. Learning file operations is therefore largely about understanding the relationship between processes and open files.

Writing to Files in C

To write data to a file in C, open it with fopen in a write mode, write through the returned FILE pointer, and close it with fclose. For example, to create a file and write 20 lines to it:

#include <stdio.h>

int main() {
    FILE* file_handle = fopen("example.txt", "w");
    if (file_handle == NULL) {
        perror("fopen");
        return 1;
    }
    for (int i = 0; i < 20; i++) {
        fprintf(file_handle, "Line %d\n", i);
    }
    fclose(file_handle);
    return 0;
}

If the filename contains no directory component, fopen resolves it relative to the current working directory of the process, so the file is created there.

Current Working Directory

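The current path refers to the working directory of the process. A process inherits it from its parent, so a program started from a shell begins in the shell's current directory, which is not necessarily the directory containing the executable. It can be changed with functions such as chdir. For instance, the following program prints its process ID and then stays alive so that its working directory can be inspected: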

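#include <stdio.h>
#include <unistd.h>

int main(void) {
    printf("Process ID: %d\n", (int)getpid());
    fflush(stdout); // Make sure the PID is visible even if stdout is not a terminal
    while (1) {
        sleep(1); // Stay alive without burning CPU so the process can be inspected
    }
    return 0;
}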

Run this and, from another terminal, inspect /proc/<pid>/cwd (for example with ls -l /proc/<pid>/cwd). It is a symbolic link to the current working directory (cwd) of the process.

File Operation Modes

fopen accepts several mode strings:

  • r: Read-only; the file must already exist.
  • r+: Read-write; the file must already exist.
  • w: Write-only; creates the file if it is missing and truncates it otherwise.
  • w+: Read-write; creates the file if it is missing and truncates it otherwise.
  • a: Append; creates the file if it is missing and writes all data at the end without overwriting.
  • a+: Read and append; creates the file if it is missing, and writes always go to the end.

For example, using append mode:

FILE* append_file = fopen("log.txt", "a");
if (append_file != NULL) {
    fprintf(append_file, "New entry\n");
    fclose(append_file);
}

This adds text to log.txt without clearing existing content.

Reading Files and Implementing cat

To read a file line by line, use fgets:

#include <stdio.h>

int main() {
    FILE* input_file = fopen("log.txt", "r");
    if (input_file == NULL) {
        perror("fopen");
        return 1;
    }
    char line_buffer[100];
    while (fgets(line_buffer, sizeof(line_buffer), input_file) != NULL) {
        printf("%s", line_buffer);
    }
    fclose(input_file);
    return 0;
}

To mimic the cat command, read a file specified as a command-line argument:

#include <stdio.h>

int main(int argc, char* argv[]) {
    if (argc != 2) {
        fprintf(stderr, "Usage: %s <filename>\n", argv[0]);
        return 1;
    }
    FILE* target_file = fopen(argv[1], "r");
    if (target_file == NULL) {
        perror("fopen");
        return 1;
    }
    char text_chunk[80];
    while (fgets(text_chunk, sizeof(text_chunk), target_file) != NULL) {
        printf("%s", text_chunk);
    }
    fclose(target_file);
    return 0;
}

File System Interfaces

In Linux, files encompass not just disk files but also devices like keyboards and printers. Programming languages provide their own file operation interfaces, but on Linux these ultimately rely on the kernel's system calls.

System Calls and Encapsulation

System calls are the low-level interfaces through which a program asks the operating system kernel for services such as file and device access. Higher-level languages and libraries wrap these calls to simplify usage and to stay portable across platforms. For example, printf in C formats its output into a buffer and, on Linux, eventually hands it to the write system call on file descriptor 1.

Opening Files with open

The open system call is used to open files. Its prototype is:

#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>

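int open(const char* pathname, int flags);
int open(const char* pathname, int flags, mode_t mode);

  • pathname: File path.
  • flags: Options like O_RDONLY (read-only), O_WRONLY (write-only), O_CREAT (create if missing).
  • mode: Permissions for a newly created file (e.g., 0666). It is only consulted when the call creates the file (for example with O_CREAT), and the process umask may clear some of the requested bits.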

Flags are combined with the bitwise OR operator (|). For instance, to open a file for writing, creating it if necessary:

int file_desc = open("data.txt", O_WRONLY | O_CREAT, 0644);
if (file_desc < 0) {
    perror("open");
    return 1;
}

Closing, Writing, and Reading Files

Use close to close a file descriptor:

#include <unistd.h>
close(file_desc);

To write data:

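char message[] = "Hello, file!\n";
// sizeof(message) - 1 skips the terminating '\0'; write returns the byte count or -1
ssize_t bytes_written = write(file_desc, message, sizeof(message) - 1);
if (bytes_written < 0) perror("write");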

To read data:

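char read_buffer[50];
ssize_t bytes_read = read(file_desc, read_buffer, sizeof(read_buffer));
if (bytes_read > 0) {
    // Process the first bytes_read bytes; read does not add a terminating '\0'
} else if (bytes_read < 0) {
    perror("read");
}
// bytes_read == 0 means end of file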

Additional flags:

  • O_TRUNC: Truncate file to zero length when opening.
  • O_APPEND: Append data to the end of the file.

For example, to open a file for appending, creating it if it does not exist (without O_CREAT the mode argument is ignored):

int append_fd = open("log.txt", O_WRONLY | O_CREAT | O_APPEND, 0644);

File Descriptors (fd)

File descriptors are small non-negative integers, returned by open, that identify the open files of a process. Three are conventionally already open when a program starts:

  • 0: Standard input (stdin)
  • 1: Standard output (stdout)
  • 2: Standard error (stderr)

Newly opened files therefore typically receive descriptors starting from 3, because 0-2 are already in use.

Underlying Mechanism

The kernel represents each open file with a struct file object. Each process has a files_struct that holds an array of pointers to these objects (the file descriptor table). A file descriptor is simply an index into that array, which is how the kernel maps a descriptor used by a process to an open file.

Everything is a File in Linux

Linux exposes hardware devices as files: an open device is also represented by a struct file, which refers to a set of function pointers implementing read, write, and other operations for that particular device. This abstraction allows uniform access via file descriptors.

fd Allocation Rule

File descriptors are allocated by finding the smallest unused index in the array. For example, if descriptor 0 is closed, the next file opened receives fd 0.

Redirection

Redirection changes where input comes from or where output goes by modifying file descriptor mappings.

Using dup2 for Redirection

The dup2 system call duplicates a file descriptor:

int dup2(int oldfd, int newfd);

It makes newfd refer to the same open file as oldfd, closing newfd first if it was already open. For output redirection to a file:

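int output_fd = open("output.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644); // O_TRUNC empties an existing file, like the shell's >
if (output_fd < 0 || dup2(output_fd, STDOUT_FILENO) < 0) { // STDOUT_FILENO is 1
    perror("redirect");
    return 1;
}
close(output_fd); // fd 1 now refers to output.txt, so the extra descriptor is not needed
printf("This goes to output.txt\n");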

Append and Input Redirection

For append redirection, use O_APPEND:

int append_fd = open("log.txt", O_WRONLY | O_CREAT | O_APPEND, 0644); // O_CREAT so the mode argument takes effect
dup2(append_fd, STDOUT_FILENO);
close(append_fd);

For input redirection, use O_RDONLY:

int input_fd = open("input.txt", O_RDONLY);
dup2(input_fd, STDIN_FILENO); // STDIN_FILENO is 0
close(input_fd);
char user_input[100];
fgets(user_input, sizeof(user_input), stdin); // Reads from input.txt
