Practical File Operations in Python

Tech May 17 15

Python's file operations are straightforward: the built-in open() function returns a file object, and the access mode passed to open() determines which operations are permitted on it.

Available access modes include: r, w, a, r+, w+, a+, rb, wb, ab, r+b, w+b, a+b. The default mode is r, which opens the file read-only as text.
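One way to see what each mode permits is to open a scratch file in every text mode and ask the file object itself. This is a minimal sketch that assumes a throwaway file in a temporary directory; describe_modes is a hypothetical helper name, while readable() and writable() are standard file-object methods.

```python
import os
import tempfile


def describe_modes(modes=("r", "w", "a", "r+", "w+", "a+")):
    """Return {mode: (readable, writable)} for each text mode (hypothetical helper)."""
    result = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt")
        # Create the file first so that 'r' and 'r+' have something to open.
        with open(path, "w", encoding="utf-8") as f:
            f.write("seed")
        for mode in modes:
            with open(path, mode, encoding="utf-8") as f:
                result[mode] = (f.readable(), f.writable())
    return result


capabilities = describe_modes()
for mode, (can_read, can_write) in capabilities.items():
    print(f"{mode:>2}: readable={can_read}, writable={can_write}")
```

The binary variants (rb, wb, and so on) grant the same permissions; they differ only in returning and accepting bytes instead of str.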

Read-Only Modes (`r`, `rb`)

data_file = open('document.txt', mode='r', encoding='utf-8')
data_content = data_file.read()
print(data_content)
data_file.close()

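The encoding parameter specifies how the file's bytes are decoded into text. UTF-8 is the most common encoding. If encoding is omitted, Python may fall back to a platform-dependent default, so it is safer to pass it explicitly.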

rb mode reads data as raw bytes and does not accept an encoding parameter; passing one raises a ValueError.

binary_file = open('document.txt', mode='rb')
binary_data = binary_file.read()
print(binary_data)
binary_file.close()
# Output (example): b'\xef\xbb\xbfSample Content...'
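The `\xef\xbb\xbf` prefix in the sample output is a UTF-8 byte order mark (BOM). The sketch below, which uses made-up sample data in a temporary directory, suggests how the choice of codec affects it: the standard utf-8 codec keeps the BOM as a leading `\ufeff` character, while utf-8-sig strips it.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "document.txt")
    # Write a UTF-8 BOM followed by ASCII text (illustrative data only).
    with open(path, "wb") as f:
        f.write(b"\xef\xbb\xbfSample Content")

    # Plain utf-8 decodes the BOM into a leading U+FEFF character.
    with open(path, "r", encoding="utf-8") as f:
        with_bom = f.read()

    # utf-8-sig removes the BOM if one is present.
    with open(path, "r", encoding="utf-8-sig") as f:
        without_bom = f.read()

print(repr(with_bom))
print(repr(without_bom))
```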

Binary read mode (rb) is essential for non-text files like images, audio, or video where data cannot be directly represented as text. It is also fundamental for streaming and file transfer operations.
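For transfer-style work, binary data is usually read in fixed-size chunks rather than all at once. The following is a minimal sketch of that idea; copy_in_chunks, the file names, and the 4096-byte chunk size are assumptions chosen for illustration.

```python
import os
import tempfile


def copy_in_chunks(src_path, dst_path, chunk_size=4096):
    """Copy a file in binary mode, chunk by chunk; return the byte count."""
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # An empty bytes object signals end of file.
                break
            dst.write(chunk)
            copied += len(chunk)
    return copied


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "source.bin")
    target = os.path.join(tmp, "target.bin")
    payload = bytes(range(256)) * 40  # 10240 bytes of sample data
    with open(source, "wb") as f:
        f.write(payload)

    bytes_copied = copy_in_chunks(source, target)
    with open(target, "rb") as f:
        copy_matches = f.read() == payload

print(bytes_copied, copy_matches)
```

Only one chunk is held in memory at a time, so the same loop works for files far larger than available RAM.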

File Paths:

Absolute Path: The full path from the root directory to the file.
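Relative Path: The path relative to the current working directory of the running process (e.g., './data.txt', '../config.txt'), which is not necessarily the directory containing the script. Using relative paths improves portability, as the project can be moved without breaking file references, provided the program is started from the expected directory.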
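A short sketch with pathlib can make the distinction concrete. The file name data.txt is hypothetical and does not need to exist; the point is only how the relative path is anchored.

```python
from pathlib import Path

relative = Path("data.txt")  # hypothetical file name
print(relative.is_absolute())

# absolute() anchors the relative path at the current working directory.
absolute = relative.absolute()
print(absolute)

resolves_against_cwd = absolute == Path.cwd() / "data.txt"
print(resolves_against_cwd)
```

To anchor a path at the script's own location instead, a common approach is to build it from `Path(__file__).resolve().parent`, assuming the code runs from a file rather than an interactive session.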

Methods for Reading Files:

read(): Reads the entire file content into memory. Can cause memory issues with large files.

file_handle = open('sample.txt', mode='r', encoding='utf-8')
full_text = file_handle.read()
print(full_text)
file_handle.close()

read(size): Reads a specified number of characters (or bytes in rb mode). Subsequent reads continue from the current file position.

file_handle = open('sample.txt', mode='r', encoding='utf-8')
chunk_one = file_handle.read(4)  # Reads 4 characters
chunk_two = file_handle.read(4)  # Reads the next 4 characters
print(chunk_one, chunk_two)
file_handle.close()

readline(): Reads a single line from the file, including the trailing newline character (\n). The strip() method is often used to remove it, along with any other leading or trailing whitespace.

file_handle = open('sample.txt', mode='r', encoding='utf-8')
line_one = file_handle.readline()
line_two = file_handle.readline()
print(line_one.strip())
print(line_two.strip())
file_handle.close()

readlines(): Reads all lines into a list, with each line as an element. This also loads the entire file into memory.

file_handle = open('sample.txt', mode='r', encoding='utf-8')
lines_list = file_handle.readlines()
for single_line in lines_list:
    print(single_line.strip())
file_handle.close()

Iterating over the file object: The most memory-efficient method for reading large files, processing one line at a time.

file_handle = open('sample.txt', mode='r', encoding='utf-8')
for each_line in file_handle:
    print(each_line.strip())
file_handle.close()

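Always close the file object with close() once you are done with it, or open the file in a with statement, which closes it automatically even if an error occurs.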
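The with statement is the usual way to guarantee that a file is closed. Here is a minimal sketch, assuming a small sample file created in a temporary directory for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("first\nsecond\n")

    # The file is closed when the with block exits, even on an exception.
    with open(path, "r", encoding="utf-8") as file_handle:
        lines = [line.strip() for line in file_handle]

    closed_after_block = file_handle.closed

print(lines)
print(closed_after_block)
```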

Write-Only Modes (`w`, `wb`)

In write mode (w), if the file does not exist, it is created. If it exists, its content is erased before writing.

output_file = open('output.txt', mode='w', encoding='utf-8')
output_file.write('Initial Content')
output_file.flush()  # Pushes Python's buffered data to the operating system (close() also does this)
output_file.close()

In wb (binary write) mode, strings must be encoded to bytes before writing.

bin_output = open('data.bin', mode='wb')
bin_output.write('Binary Data'.encode('utf-8'))
bin_output.close()

Append Modes (`a`, `ab`)

Append mode adds new data to the end of an existing file without overwriting previous content.

log_file = open('app.log', mode='a', encoding='utf-8')
log_file.write('New log entry\n')
log_file.close()
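To confirm that earlier content survives, the file can be appended to twice and then read back. This sketch assumes a scratch log file in a temporary directory; append_entry is a hypothetical helper name.

```python
import os
import tempfile


def append_entry(path, message):
    """Append one line to the file, creating it if it does not exist."""
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(message + "\n")


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "app.log")
    append_entry(log_path, "first entry")
    append_entry(log_path, "second entry")  # Does not overwrite the first

    with open(log_path, "r", encoding="utf-8") as f:
        log_lines = [line.strip() for line in f]

print(log_lines)
```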

Read and Write (`r+`)

This mode allows both reading and writing. The file must already exist, the cursor starts at the beginning of the file, and the result depends on the order of reads and writes.

Recommended workflow: Read first, then write.

file_rw = open('test.txt', mode='r+', encoding='utf-8')
existing = file_rw.read()  # Cursor moves to end after read
file_rw.write('Appended Text')  # Write occurs at the end
print(existing)
file_rw.close()

Note: Writing before reading in r+ mode will overwrite content from the cursor's starting position.
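The overwrite behaviour described in the note can be observed with a short experiment. This sketch assumes a scratch file containing only ASCII text, so each character occupies exactly one byte.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("abcdef")

    with open(path, "r+", encoding="utf-8") as file_rw:
        file_rw.write("XY")  # Cursor starts at 0, so 'ab' is overwritten
        file_rw.seek(0)      # Go back to the start before reading
        overwritten = file_rw.read()

print(overwritten)
```

The rest of the file is left untouched: r+ overwrites in place and never inserts.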

Write and Read (`w+`)

This mode truncates (clears) the file upon opening, then allows writing and subsequent reading. The initial read after opening yields nothing.

file_wr = open('test_wplus.txt', mode='w+', encoding='utf-8')
file_wr.write('Sample')
file_wr.seek(0)  # Move cursor to start before reading
content = file_wr.read()  # Now content can be read
print(content)
file_wr.close()

Append and Read (`a+`)

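This mode opens the file for appending and reading. The cursor is positioned at the end of the file, so a direct read() call returns an empty string unless the cursor is moved first. On most platforms, writes are appended to the end of the file regardless of the cursor position.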

file_ar = open('test_aplus.txt', mode='a+', encoding='utf-8')
file_ar.write('Appended Line\n')
file_ar.seek(0)  # Move to beginning to read
old_content = file_ar.read()
print(old_content)
file_ar.close()
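In a+ mode, moving the cursor affects where reads happen but, on most platforms, not where writes land. The sketch below assumes a scratch file with one existing line and illustrates this: the write is issued after seek(0) yet still ends up at the end.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test_aplus.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("old\n")

    with open(path, "a+", encoding="utf-8") as file_ar:
        file_ar.seek(0)         # Cursor at the start...
        file_ar.write("new\n")  # ...but the write is still appended
        file_ar.seek(0)         # Rewind again to read everything
        combined = file_ar.read()

print(repr(combined))
```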

File Pointer and Truncation

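seek(offset, whence=0): Moves the file pointer. whence is 0 (start), 1 (current), or 2 (end). In binary mode, offset is a byte count. In text mode, only seek(0), seek(0, 2), and offsets previously returned by tell() are reliable.
tell(): Returns the current file pointer position: a byte offset in binary mode, an opaque number in text mode.
truncate(size=None): Resizes the file to size bytes. If size is omitted, the file is cut at the current position.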
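The three methods are easiest to follow in binary mode, where every offset is a plain byte count. This is a minimal sketch, assuming a ten-byte scratch file created for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.dat")
    with open(path, "wb") as f:
        f.write(b"0123456789")

    with open(path, "rb+") as f:
        f.seek(-3, 2)                   # 3 bytes before the end (whence=2)
        tail = f.read()                 # Reads the last 3 bytes
        position_after_read = f.tell()  # Pointer is now at the end of the file
        f.seek(4)                       # Absolute offset from the start
        new_size = f.truncate()         # Cut the file at the current position

    with open(path, "rb") as f:
        remaining = f.read()

print(tail, position_after_read, new_size, remaining)
```

Seeking relative to the end or to the current position with a nonzero offset is only supported in binary mode, which is why the sketch opens the file with rb+.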

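Important: In r+ text mode, reads are buffered, so a write issued right after a read may not land where the read appeared to stop (after a full read(), it lands at the end of the file). To write or truncate at a specific point, move the cursor with seek() first.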

with open('example.dat', mode='r+', encoding='utf-8') as f:
    f.seek(10)  # Move to offset 10 (assumes the first 10 bytes are single-byte characters)
    current_pos = f.tell()  # Confirms where the cursor is: 10
    f.truncate()  # Discards everything after the current position
    f.write('Replacement')  # Writes starting at the truncation point

File Modification Patterns

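Content cannot be inserted into or removed from the middle of a file in place; existing bytes can only be overwritten, or the file truncated. The standard approach is therefore to read from the source file, modify the content, and write the result to a new temporary file. Finally, replace the original file with the new one.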

Method 1: In-Memory Replacement (for small files)

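import os
with open('source.txt', 'r', encoding='utf-8') as src, open('temp.txt', 'w', encoding='utf-8') as dst:
    original_text = src.read()  # Loads the whole file into memory
    updated_text = original_text.replace('old_word', 'new_word')
    dst.write(updated_text)
os.replace('temp.txt', 'source.txt')  # Swaps in the new file (atomic on POSIX when both paths are on the same filesystem)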

Method 2: Line-by-Line Processing (for large files)

import os
with open('source.txt', 'r', encoding='utf-8') as src, open('temp.txt', 'w', encoding='utf-8') as dst:
    for line in src:  # Only one line is held in memory at a time
        new_line = line.replace('old_word', 'new_word')
        dst.write(new_line)
os.replace('temp.txt', 'source.txt')
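Both methods use a fixed temporary name, which could collide with an existing temp.txt and is left behind if the rewrite fails. A slightly more defensive variant is sketched below; replace_in_file is a hypothetical helper, and it assumes the temporary file can be created in the same directory as the target so that os.replace() stays on one filesystem.

```python
import os
import tempfile


def replace_in_file(path, old, new, encoding="utf-8"):
    """Rewrite path line by line, replacing old with new (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp() creates a uniquely named file and returns an open descriptor.
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding=encoding) as dst, \
                open(path, "r", encoding=encoding) as src:
            for line in src:
                dst.write(line.replace(old, new))
        os.replace(temp_path, path)
    except BaseException:
        os.unlink(temp_path)  # Clean up the temporary file on failure
        raise


with tempfile.TemporaryDirectory() as tmp:
    source_path = os.path.join(tmp, "source.txt")
    with open(source_path, "w", encoding="utf-8") as f:
        f.write("old_word here\nkeep this\nold_word again\n")

    replace_in_file(source_path, "old_word", "new_word")

    with open(source_path, "r", encoding="utf-8") as f:
        rewritten = f.read()
    leftover_files = os.listdir(tmp)

print(rewritten)
```

If anything goes wrong before the swap, the original file is untouched and the temporary file is removed.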

Tags: Python File Handling I/O

Copyright © fadingcoder.top