Linux Basic I/O: File Descriptors, dup2 Redirection, and the Everything-Is-A-File Model
File Descriptors: The Array Index Underpinning I/O
Disk Files vs In-Memory Open Files
Files stored persistently on storage are called disk files. When a process opens a file, the kernel builds an in-memory representation of it (its metadata, plus cached data as it is accessed), and the file becomes an in-memory open file. This relationship mirrors that of programs on disk vs running processes in memory.
Since any system can have many processes, each opening multiple files, the operating system must manage all open files efficiently. Following the kernel's common "describe first, organize later" management pattern:
- The kernel creates a struct file for every open file, which stores the file's state (such as the current offset and access mode) and points to its metadata and cached content.
- Conceptually, all struct file instances are linked into a global doubly linked list, so management becomes simple operations on the list.
To map which open files belong to which process, we need a mapping between processes and open files. When a process is created, the kernel sets up its task_struct (process control block), which includes a pointer to a files_struct structure. This structure holds an array of struct file * pointers, where each entry points to the struct file of an open file. The index of this array is the file descriptor (fd).
To perform I/O, the kernel uses the fd to index into this array, get the pointer to the struct file, and access the file's data. When you open a new file, the kernel adds the new struct file pointer to the first available empty slot in the array, and returns the index of that slot to the process.
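The lookup and allocation just described can be modelled in a few lines of user-space C. This is only a sketch of the idea, not kernel code: the names toy_file, toy_files, toy_fd_install, toy_fd_lookup, and toy_fd_release are hypothetical, and the table size is fixed for simplicity.

```c
#include <stddef.h>

#define TOY_MAX_FDS 8   /* hypothetical fixed size, chosen only for this sketch */

/* Stand-in for the kernel's struct file: just enough state to show the idea. */
struct toy_file {
    const char *path;
    long        offset;
};

/* Stand-in for files_struct: one slot per descriptor, NULL means "unused". */
struct toy_files {
    struct toy_file *fd_array[TOY_MAX_FDS];
};

/* Install f in the lowest unused slot and return its index, or -1 if full. */
int toy_fd_install(struct toy_files *files, struct toy_file *f)
{
    for (int fd = 0; fd < TOY_MAX_FDS; fd++) {
        if (files->fd_array[fd] == NULL) {
            files->fd_array[fd] = f;
            return fd;
        }
    }
    return -1;
}

/* Translate a descriptor back into its open file, or NULL if fd is not open. */
struct toy_file *toy_fd_lookup(const struct toy_files *files, int fd)
{
    if (fd < 0 || fd >= TOY_MAX_FDS)
        return NULL;
    return files->fd_array[fd];
}

/* Free a slot so a later install can reuse it. Returns 0, or -1 on a bad fd. */
int toy_fd_release(struct toy_files *files, int fd)
{
    if (toy_fd_lookup(files, fd) == NULL)
        return -1;
    files->fd_array[fd] = NULL;
    return 0;
}
```

In this model a descriptor is nothing more than an index: installing three files yields 0, 1, 2, and releasing slot 0 makes the next install return 0 again.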
Note: Writes to a file are first buffered in memory, and flushed to disk at a later time for performance.
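When a program cannot wait for that later flush, it can ask for one explicitly with fsync. The helper below is a minimal sketch; the name write_durably is hypothetical, and error handling is reduced to a single success/failure result.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Write text to path and wait until the kernel reports the data as flushed.
 * Returns 0 on success, -1 on any failure. */
int write_durably(const char *path, const char *text)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    size_t len = strlen(text);
    ssize_t n = write(fd, text, len);                /* buffered in memory first */
    int ok = (n == (ssize_t)len) && fsync(fd) == 0;  /* force the flush now */

    if (close(fd) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```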
File Descriptor Allocation Rule
Let's test the allocation rule with a simple example:
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int main() {
    int f1 = open("./first.log", O_WRONLY | O_CREAT, 0644);
    int f2 = open("./second.log", O_WRONLY | O_CREAT, 0644);
    int f3 = open("./third.log", O_WRONLY | O_CREAT, 0644);
    int f4 = open("./fourth.log", O_WRONLY | O_CREAT, 0644);
    printf("%d\n", f1);
    printf("%d\n", f2);
    printf("%d\n", f3);
    printf("%d\n", f4);
    close(f1);
    close(f2);
    close(f3);
    close(f4);
    return 0;
}
When started from a shell, this program prints 3, 4, 5, 6: descriptors 0, 1, and 2 are already in use, so allocation starts at 3. What happens if we close the default 0 and 2 descriptors before opening new files?
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int main() {
    close(STDIN_FILENO);
    close(STDERR_FILENO);
    int f1 = open("./first.log", O_WRONLY | O_CREAT, 0644);
    int f2 = open("./second.log", O_WRONLY | O_CREAT, 0644);
    int f3 = open("./third.log", O_WRONLY | O_CREAT, 0644);
    int f4 = open("./fourth.log", O_WRONLY | O_CREAT, 0644);
    printf("%d\n", f1);
    printf("%d\n", f2);
    printf("%d\n", f3);
    printf("%d\n", f4);
    close(f1);
    close(f2);
    close(f3);
    close(f4);
    return 0;
}
Now the output is 0, 2, 3, 4: the two freed slots are reused first, and the remaining files continue from 3. If we close 1 instead:
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int main() {
    close(STDOUT_FILENO);
    int f1 = open("./first.log", O_WRONLY | O_CREAT, 0644);
    int f2 = open("./second.log", O_WRONLY | O_CREAT, 0644);
    int f3 = open("./third.log", O_WRONLY | O_CREAT, 0644);
    int f4 = open("./fourth.log", O_WRONLY | O_CREAT, 0644);
    printf("%d\n", f1);
    printf("%d\n", f2);
    printf("%d\n", f3);
    printf("%d\n", f4);
    close(f1);
    close(f2);
    close(f3);
    close(f4);
    return 0;
}
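No output appears on screen! first.log was given descriptor 1, so everything printf sends to standard output is now aimed at that file instead of the display. (Because this program closes f1 before stdout's buffer is flushed, the numbers will most likely not reach first.log either.) This confirms the allocation rule: a new file descriptor is always the smallest unused index in the fd array.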
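The same rule can be checked without touching the standard descriptors at all. The sketch below uses a hypothetical helper, closed_fd_is_reused, and assumes that no other thread opens or closes files while it runs.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if a descriptor number that was just closed is handed out again
 * by the next open(), 0 if not, and -1 if a call fails. */
int closed_fd_is_reused(void)
{
    int a = open("/dev/null", O_RDONLY);
    int b = open("/dev/null", O_RDONLY);
    if (a < 0 || b < 0) {
        if (a >= 0) close(a);
        if (b >= 0) close(b);
        return -1;
    }

    close(a);                              /* slot a is now the lowest free index */
    int c = open("/dev/null", O_RDONLY);   /* should land in slot a again */
    int reused = (c == a);

    if (c >= 0) close(c);
    close(b);
    return c < 0 ? -1 : reused;
}
```

Since a was the lowest free slot when it was allocated and b sits above it, closing a makes it the lowest free slot again, which is why the third open is expected to return the same number.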
By convention, every Linux process starts with three open file descriptors:
- 0: Standard input, defaults to the keyboard device
- 1: Standard output, defaults to the display
- 2: Standard error, defaults to the display
This is an operating-system convention, not a feature of any programming language: a process inherits these descriptors from its parent (typically the shell), and every user-space library builds on them. We can verify this with simple tests:
#include <stdio.h>
#include <unistd.h>
#include <string.h>
int main() {
    const char *msg = "Hello standard output\n";
    write(STDOUT_FILENO, msg, strlen(msg));
    return 0;
}
The same call works for standard error:
#include <stdio.h>
#include <unistd.h>
#include <string.h>
int main() {
    const char *msg = "Hello standard error\n";
    write(STDERR_FILENO, msg, strlen(msg));
    return 0;
}
And standard input can be read directly with read:
#include <stdio.h>
#include <unistd.h>
#include <string.h>
int main() {
    char buf[1024];
    ssize_t n = read(STDIN_FILENO, buf, sizeof(buf) - 1);
    if (n > 0) {
        buf[n] = '\0';
        printf("Received input: %s", buf);
    }
    return 0;
}
Redirection
Output Redirection
The earlier test where we closed standard output and got no output on screen is actually a simple demonstration of output redirection. Start with a baseline program that opens a file but still writes to descriptor 1:
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#define TARGET_FILE "output.log"
int main() {
    int fd = open(TARGET_FILE, O_CREAT | O_WRONLY | O_TRUNC, 0666);
    if (fd < 0) {
        perror("open failed");
        return 1;
    }
    const char *line = "hello linux\n";
    for (int i = 0; i < 5; i++) {
        write(STDOUT_FILENO, line, strlen(line));
    }
    close(fd);
    return 0;
}
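This code prints 5 lines to the screen and leaves output.log empty, because the file was opened on descriptor 3 and never written to. If we modify it to close stdout before opening the file: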
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#define TARGET_FILE "output.log"
int main() {
    close(STDOUT_FILENO);
    int fd = open(TARGET_FILE, O_CREAT | O_WRONLY | O_TRUNC, 0666);
    if (fd < 0) {
        perror("open failed");
        return 1;
    }
    const char *line = "hello linux\n";
    for (int i = 0; i < 5; i++) {
        write(STDOUT_FILENO, line, strlen(line));
    }
    close(fd);
    return 0;
}
All 5 lines are now written to output.log instead of the display. This is output redirection. The core idea is: by changing what a file descriptor entry points to, we change where the I/O goes. When we close stdout, entry 1 becomes free, so the new file is allocated index 1. All code that writes to fd 1 (standard output) now writes to our new file.
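The trick only works if the new file really lands on descriptor 1, so it is worth checking. The sketch below runs the close-then-open sequence in a child process, which keeps the caller's own standard output untouched. The helper name redirect_by_close_then_open and the child's exit codes are assumptions made for this example.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if the child confirmed that the new file landed on descriptor 1
 * and the write succeeded, -1 otherwise. */
int redirect_by_close_then_open(const char *path, const char *text)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                        /* child process */
        close(STDOUT_FILENO);              /* free slot 1 */
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd != STDOUT_FILENO)           /* open failed, or a lower slot was free */
            _exit(1);
        size_t len = strlen(text);
        _exit(write(STDOUT_FILENO, text, len) == (ssize_t)len ? 0 : 2);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The fd != STDOUT_FILENO check is the important part: if descriptor 0 happened to be closed as well, open would return 0 and the "redirection" would silently not happen.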
Input Redirection
Input redirection follows exactly the same pattern: we redirect what would normally come from standard input (keyboard) to come from another file:
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int main() {
    close(STDIN_FILENO);
    int fd = open("input.txt", O_RDONLY);
    if (fd < 0) {
        perror("open failed");
        exit(1);
    }
    char buf[64];
    ssize_t n = read(STDIN_FILENO, buf, sizeof(buf) - 1);
    if (n > 0) {
        buf[n] = '\0';
        printf("Read from file: %s\n", buf);
    }
    close(fd);
    return 0;
}
By closing stdin (0), we make 0 available, so the new file gets fd 0, and all reads from 0 now come from the file.
Append Redirection
Append redirection is just output redirection where we open the file with the O_APPEND flag, so new writes are added to the end of the file instead of overwriting existing content:
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int main() {
    close(STDOUT_FILENO);
    // O_CREAT makes this work even if output.log does not exist yet
    int fd = open("output.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) {
        perror("open failed");
        exit(1);
    }
    const char *line = "new appended line\n";
    write(STDOUT_FILENO, line, strlen(line));
    close(fd);
    return 0;
}
The dup2 System Call
Manually closing descriptors to get the right index works, but it's error-prone. Linux provides the dup2 system call to directly perform the redirection by copying the file pointer from one descriptor to another.
- Function signature: int dup2(int oldfd, int newfd);
- Behavior: Copies the struct file * pointer from fd_array[oldfd] to fd_array[newfd], overwriting the existing value at newfd. If newfd was already open, it is silently closed first.
- Return value: Returns newfd on success, -1 on error (with errno set).
Edge cases:
- If oldfd is not a valid open file descriptor, the call fails, and newfd is not modified.
- If oldfd is valid and oldfd == newfd, dup2 does nothing and returns newfd immediately.
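The return value and both edge cases can be checked directly. The following is a small sketch with a hypothetical function name, dup2_behaves_as_described; it works on descriptors opened on /dev/null so that the standard descriptors are left alone.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if dup2 behaves as described above, 0 if not, -1 on setup failure. */
int dup2_behaves_as_described(void)
{
    int fd    = open("/dev/null", O_WRONLY);
    int other = open("/dev/null", O_WRONLY);
    if (fd < 0 || other < 0) {
        if (fd >= 0) close(fd);
        if (other >= 0) close(other);
        return -1;
    }

    /* Normal case: other is closed, repointed at fd's open file, and returned. */
    int copied = dup2(fd, other);

    /* Same descriptor on both sides: nothing changes, newfd comes back. */
    int same = dup2(fd, fd);

    /* Invalid oldfd: the call fails with EBADF and newfd stays open. */
    errno = 0;
    int bad = dup2(-1, other);
    int bad_errno = errno;
    int intact = fcntl(other, F_GETFD) != -1;

    close(other);
    close(fd);
    return copied == other && same == fd &&
           bad == -1 && bad_errno == EBADF && intact;
}
```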
Example of output redirection with dup2:
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
int main() {
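    // O_TRUNC discards old content, matching the shell's > operator
    int fd = open("./output.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open failed");
        return 1;
    }
    if (dup2(fd, STDOUT_FILENO) < 0) {
        perror("dup2 failed");
        return 1;
    }
    close(fd); // Clean up the original descriptor, no longer needed
    printf("This text is redirected to the file\n");
    printf("No output appears on the screen\n");
    return 0;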
}
After the copy, both fd and the target descriptor (1 in this case) point to the same struct file, so we can safely close the original fd to avoid descriptor leaks.
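One observable consequence of two descriptors sharing a struct file is that they also share the file offset. The sketch below uses dup, the sibling of dup2 that picks the lowest free slot itself, and compares it with opening the same path a second time. The helper name dup_shares_offset is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if a dup'ed descriptor sees the offset advance while a separately
 * opened descriptor does not, 0 if not, and -1 on failure. */
int dup_shares_offset(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    int copy  = dup(fd);                 /* new descriptor, same open file */
    int other = open(path, O_WRONLY);    /* same path, but a separate open file */
    int result = -1;

    if (copy >= 0 && other >= 0 && write(fd, "abc", 3) == 3) {
        off_t via_copy  = lseek(copy, 0, SEEK_CUR);    /* shared offset */
        off_t via_other = lseek(other, 0, SEEK_CUR);   /* independent offset */
        result = (via_copy == 3 && via_other == 0);
    }

    if (other >= 0) close(other);
    if (copy >= 0) close(copy);
    close(fd);
    return result;
}
```

Closing fd while copy is still open leaves the underlying open file alive, which is why the original descriptor can be closed right after dup2 without breaking the redirection.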
Append and input redirection with dup2 follow the same pattern:
// Append redirection example
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#define OUTPUT_FILE "output.log"
int main() {
    int fd = open(OUTPUT_FILE, O_CREAT | O_WRONLY | O_APPEND, 0666);
    if (fd < 0) {
        perror("open failed");
        return 1;
    }
    dup2(fd, STDOUT_FILENO);
    close(fd);
    const char *msg = "appended hello linux\n";
    for (int i = 0; i < 5; i++) {
        write(STDOUT_FILENO, msg, strlen(msg));
    }
    return 0;
}
// Input redirection example
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#define INPUT_FILE "input.log"
int main() {
    int fd = open(INPUT_FILE, O_RDONLY);
    if (fd < 0) {
        perror("open failed");
        return 1;
    }
    dup2(fd, STDIN_FILENO);
    close(fd);
    char buf[64];
    ssize_t n = read(STDIN_FILENO, buf, sizeof(buf) - 1);
    if (n > 0) {
        buf[n] = '\0';
        printf("Read content: %s\n", buf);
    }
    return 0;
}
Redirection for C Standard Library Functions
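C standard I/O functions like printf and fprintf also follow redirection done via dup2, because stdout is a FILE stream wrapped around descriptor 1: whenever its buffer is flushed, the library calls write on that descriptor, and the kernel resolves it through the file descriptor table: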
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#define OUTPUT_FILE "log.txt"
int main() {
    int fd = open(OUTPUT_FILE, O_CREAT | O_WRONLY | O_APPEND, 0666);
    if (fd < 0) {
        perror("open failed");
        return 1;
    }
    dup2(fd, STDOUT_FILENO);
    close(fd);
    printf("fd: %d\n", fd); // fd is already closed; this only prints its old number
    printf("hello from printf\n");
    fprintf(stdout, "hello from fprintf\n");
    return 0;
}
All output from the C library functions is correctly redirected to the file.
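One detail to watch when mixing stdio with descriptor-level redirection is buffering: text that printf has accepted but not yet flushed goes wherever descriptor 1 points at flush time. The sketch below flushes around each switch and restores the original target afterwards. The helper name printf_to_file is an assumption for this example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Send text through printf into path, then point stdout back at its old target.
 * Returns 0 on success, -1 on failure. */
int printf_to_file(const char *path, const char *text)
{
    fflush(stdout);                        /* flush text meant for the old target */

    int saved = dup(STDOUT_FILENO);        /* remember where fd 1 pointed */
    if (saved < 0)
        return -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0 || dup2(fd, STDOUT_FILENO) < 0) {
        if (fd >= 0) close(fd);
        close(saved);
        return -1;
    }
    close(fd);

    printf("%s", text);                    /* lands in stdout's user-space buffer */
    fflush(stdout);                        /* now written to fd 1, i.e. the file */

    int rc = dup2(saved, STDOUT_FILENO) < 0 ? -1 : 0;   /* undo the redirection */
    close(saved);
    return rc;
}
```

Without the second fflush, the buffered text could be written only after descriptor 1 has been restored, and would then show up on the original target instead of in the file.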
Shell Redirection Operators
The common shell redirection operators > (overwrite output), >> (append output), < (input), << (here document) are all implemented by rewriting the file descriptor table, typically with dup2. Before executing a user command, the shell opens the target in the child process and copies its descriptor onto 0, 1, or 2; the command then starts with the modified table already in place.
Understanding Linux's Everything-Is-A-File Model
A key property of the Linux design is that redirection state (the mapping of file descriptors to open files) is preserved across execve (process replacement), except for descriptors marked close-on-exec. This is how the shell can implement redirection for user programs.
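Putting fork, dup2, and exec together gives roughly what a shell does for `cmd > outfile`. This is a simplified sketch rather than any real shell's code; the function name run_with_stdout_to and the exit code 127 for setup failures are choices made for this example.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with its standard output sent to outfile.
 * Returns the command's exit status, or -1 if it could not be run. */
int run_with_stdout_to(const char *outfile, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                        /* child: set up, then replace itself */
        int fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            _exit(127);
        if (dup2(fd, STDOUT_FILENO) < 0)
            _exit(127);
        if (fd != STDOUT_FILENO)
            close(fd);
        execvp(argv[0], argv);             /* fd table survives the exec */
        _exit(127);                        /* only reached if exec failed */
    }

    int status;                            /* parent: its own fd 1 is unchanged */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The executed program never learns that redirection happened: it simply writes to descriptor 1, which already points at the file when its main starts.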
To understand why everything is a file in Linux, we need to look at how the kernel abstracts different types of resources and devices:
- Any resource or device that can be read from or written to is abstracted as an open file. When you open the device, the kernel creates a struct file just like it does for a regular disk file.
- Each struct file includes a pointer to a struct file_operations, a structure that holds function pointers to device-specific implementations of core operations like read, write, open, and release.
When a process calls read(fd, buf, size):
- The kernel looks up struct file * from the fd array entry
- Follows the pointer to the file_operations table for that file
- Calls the device-specific read function stored in the table via the function pointer
This means that regardless of whether the target is a regular disk file, keyboard, display, network card, or block device, the process uses the exact same read/write/open/close interface to interact with it. The kernel handles dispatching to the correct device-specific implementation via the function pointers.
This abstraction is analogous to object-oriented polymorphism: struct file acts as a base class, and every device/file type implements its own operations based on the base interface. All resources look identical to the calling process, hence the "everything is a file" design. This abstraction layer is called the Virtual File System (VFS), which unifies access to all types of file systems and devices under a single interface.
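The function-pointer dispatch described above can be imitated in ordinary user-space C. The sketch below is a toy model, not kernel code: demo_file, demo_file_ops, demo_read, and the two "drivers" are hypothetical names invented for this example.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

struct demo_file;

/* Stand-in for struct file_operations: one function pointer per operation. */
struct demo_file_ops {
    ssize_t (*read)(struct demo_file *f, char *buf, size_t len);
};

/* Stand-in for struct file: per-open state plus a pointer to its operations. */
struct demo_file {
    const struct demo_file_ops *ops;
    const char *data;   /* backing bytes, used only by the "regular file" */
    size_t pos;         /* current read offset */
};

/* "Regular file": hands out bytes from data until they run out. */
static ssize_t mem_read(struct demo_file *f, char *buf, size_t len)
{
    size_t left = strlen(f->data) - f->pos;
    if (len > left)
        len = left;
    memcpy(buf, f->data + f->pos, len);
    f->pos += len;
    return (ssize_t)len;
}

/* "Device" in the spirit of /dev/zero: always fills the buffer with zeros. */
static ssize_t zero_read(struct demo_file *f, char *buf, size_t len)
{
    (void)f;
    memset(buf, 0, len);
    return (ssize_t)len;
}

const struct demo_file_ops mem_ops  = { .read = mem_read };
const struct demo_file_ops zero_ops = { .read = zero_read };

/* The single entry point callers use, whatever sits behind the file. */
ssize_t demo_read(struct demo_file *f, char *buf, size_t len)
{
    if (f == NULL || f->ops == NULL || f->ops->read == NULL)
        return -1;
    return f->ops->read(f, buf, len);
}
```

A caller only ever uses demo_read; which implementation runs is decided by the ops pointer stored in the file object, which is the same arrangement that lets read(fd, ...) work unchanged across files, terminals, and devices.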