Fading Coder

One Final Commit for the Last Sprint


Java I/O Operations: Streams, Formatting, and NIO.2 File Handling


Most I/O operations demonstrated so far use unbuffered streams, where each read or write request is handled directly by the underlying OS. This can make a program much less efficient, since each such request often triggers disk access, network activity, or some other relatively expensive operation.

To reduce this overhead, Java provides buffered stream implementations. Buffered input streams read data from a memory area known as a buffer, calling the native input API only when the buffer is empty. Similarly, buffered output streams write data to a buffer, calling the native output API only when the buffer is full.

Programs can convert unbuffered streams to buffered variants using the decorator pattern, passing the unbuffered instance to buffered stream constructors. The following demonstrates adapting character stream constructors for buffered I/O:

reader = new BufferedReader(new FileReader("source.txt"));
writer = new BufferedWriter(new FileWriter("output.txt"));

Four concrete classes provide buffering capabilities: BufferedInputStream and BufferedOutputStream handle byte-oriented streams, while BufferedReader and BufferedWriter handle character-oriented streams.
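As a minimal sketch of the wrapping described above, the following copies text line by line through BufferedReader and BufferedWriter. The class name BufferedCopySketch and the method copyLines are illustrative assumptions, and in-memory readers and writers stand in for the files so the example runs anywhere.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

class BufferedCopySketch {
    // Copies text line by line through buffered wrappers; returns the number of lines copied.
    static int copyLines(Reader source, Writer target) throws IOException {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(source);
             BufferedWriter writer = new BufferedWriter(target)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(line);
                writer.newLine(); // platform line separator
                count++;
            }
        } // closing the writer flushes whatever is still in its buffer
        return count;
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        int copied = copyLines(new StringReader("first\nsecond\n"), out);
        System.out.println(copied + " lines copied");
    }
}
```

Passing new FileReader("source.txt") and new FileWriter("output.txt") instead of the in-memory objects would give the file-based variant from the text.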

Flushing Buffered Streams

It often makes sense to write out a buffer at critical points without waiting for it to fill. This is known as flushing the buffer. Certain buffered output classes support autoflush, enabled through an optional constructor argument. When autoflush is enabled, specific events trigger an automatic flush; for instance, a PrintWriter with autoflush enabled flushes on every println or format invocation.

To flush manually, call the flush method. It is available on every output stream, but it has no visible effect unless the stream is buffered.
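To make the difference between manual flushing and autoflush visible, here is a small sketch that counts how many bytes have reached the underlying sink. FlushSketch and its method names are assumptions made for illustration; a ByteArrayOutputStream plays the role of the destination.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintWriter;

class FlushSketch {
    // Bytes that have reached the sink before and after an explicit flush().
    static int[] manualFlush() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintWriter writer = new PrintWriter(sink);       // autoflush disabled
        writer.print("hello");
        int before = sink.size();                         // text is still held in the writer's buffer
        writer.flush();
        int after = sink.size();                          // now the bytes have been pushed to the sink
        return new int[] { before, after };
    }

    // Bytes that have reached the sink right after println with autoflush enabled.
    static int autoFlush() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintWriter writer = new PrintWriter(sink, true); // autoflush enabled
        writer.println("hello");                          // println triggers the flush
        return sink.size();
    }

    public static void main(String[] args) {
        int[] sizes = manualFlush();
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
        System.out.println("after autoflushed println: " + autoFlush());
    }
}
```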

Parsing and Formatting Data

I/O programming frequently involves translating between machine-readable data and human-readable formats. Java provides two primary APIs for these transformations: the Scanner API decomposes input into discrete tokens associated with data types, while formatting APIs assemble data into structured presentations.

Scanning Input

Scanner instances excel at splitting formatted input into tokens and translating individual tokens according to their data types.

Tokenizing Input

By default, scanners employ whitespace delimiters (spaces, tabs, line terminators—see Character.isWhitespace for complete listings). Consider TokenParser, which extracts words from data.txt and emits them line-by-line:

import java.io.*;
import java.util.Scanner;

public class TokenParser {
    public static void main(String[] args) throws IOException {
        Scanner scanner = null;
        try {
            scanner = new Scanner(new BufferedReader(new FileReader("data.txt")));
            while (scanner.hasNext()) {
                System.out.println(scanner.next());
            }
        } finally {
            if (scanner != null) {
                scanner.close();
            }
        }
    }
}

Note the explicit close invocation once the Scanner is no longer needed. Although a scanner is not a stream, closing it indicates that the program is done with its underlying stream and releases that stream.

To customize delimiters, invoke useDelimiter() with a regular expression. For comma-separated values potentially followed by whitespace:

scanner.useDelimiter(",\\s*");
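A short sketch of that delimiter in use, scanning a String rather than a file. DelimiterSketch and splitCsvLine are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class DelimiterSketch {
    // Splits one line on commas, swallowing any whitespace that follows each comma.
    static List<String> splitCsvLine(String line) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(line)) {
            scanner.useDelimiter(",\\s*");
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(splitCsvLine("apple, banana,cherry")); // prints [apple, banana, cherry]
    }
}
```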

Translating Tokens

While TokenParser treats all input as String values, Scanner supports all Java primitive types (except char), plus BigInteger and BigDecimal. Numeric parsing accommodates locale-specific separators; under the US locale, "32,767" parses correctly as an integer.

Locale considerations prove crucial since thousand separators and decimal symbols vary by region. Without specifying Locale.US, the following example behaves unpredictably across different regional settings—a concern primarily when processing data from disparate geographic sources.

NumericAggregator demonstrates accumulating double-precision values:

import java.io.*;
import java.util.*;

public class NumericAggregator {
    public static void main(String[] args) throws IOException {
        Scanner scanner = null;
        double total = 0.0;
        try {
            scanner = new Scanner(new BufferedReader(new FileReader("numbers.txt")));
            scanner.useLocale(Locale.US);
            while (scanner.hasNext()) {
                if (scanner.hasNextDouble()) {
                    total += scanner.nextDouble();
                } else {
                    scanner.next();
                }
            }
        } finally {
            if (scanner != null) {
                scanner.close();
            }
        }
        System.out.println(total);
    }
}

Given input:

8.5
32,767
3.14159
1,000,000.1

Output: 1032778.74159
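The locale dependence mentioned earlier can be seen by parsing the same kind of token under two locales. This is a sketch under the assumption that the runtime's locale data uses "." for grouping and "," for decimals in Locale.GERMANY; LocaleParseSketch and parse are illustrative names.

```java
import java.util.Locale;
import java.util.Scanner;

class LocaleParseSketch {
    // Parses the first token as a double under the given locale,
    // or returns NaN when the token is not a valid number in that locale.
    static double parse(String text, Locale locale) {
        try (Scanner scanner = new Scanner(text)) {
            scanner.useLocale(locale);
            return scanner.hasNextDouble() ? scanner.nextDouble() : Double.NaN;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("1,234.5", Locale.US));      // US grouping and decimal symbols
        System.out.println(parse("1.234,5", Locale.GERMANY)); // German grouping and decimal symbols
        System.out.println(parse("1.234,5", Locale.US));      // not a number under the US locale
    }
}
```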

Formatting Output

Stream classes implementing formatting capabilities include PrintWriter (character streams) and PrintStream (byte streams).

Note: System.out and System.err represent the primary PrintStream instances most applications require. For custom formatted output streams, instantiate PrintWriter rather than PrintStream.

Beyond standard write methods, both classes implement identical methods for converting internal data to formatted output. Two formatting levels exist:

  • print and println format individual values conventionally
  • format handles arbitrary value quantities via format strings offering precise control

Simple Formatting Methods

print and println invoke appropriate toString conversions before output. SquareRootDemo illustrates this:

public class SquareRootDemo {
    public static void main(String[] args) {
        int value = 2;
        double result = Math.sqrt(value);
        
        System.out.print("Square root of ");
        System.out.print(value);
        System.out.print(" equals ");
        System.out.print(result);
        System.out.println(".");
        
        value = 5;
        result = Math.sqrt(value);
        System.out.println("Square root of " + value + " equals " + result + ".");
    }
}

Output:

Square root of 2 equals 1.4142135623730951.
Square root of 5 equals 2.23606797749979.

The variables value and result are each formatted twice: first by an overload of print, and then by the conversion code the compiler generates for string concatenation, which performs the same toString-style conversion.

Advanced Formatting

The format method accepts a format string containing embedded format specifiers. Static text remains unchanged; only specifiers trigger transformation.

Format specifiers begin with % and terminate with conversion characters indicating output types. FormattedOutputDemo illustrates:

public class FormattedOutputDemo {
    public static void main(String[] args) {
        int value = 2;
        double result = Math.sqrt(value);
        System.out.format("Square root of %d is %f.%n", value, result);
    }
}

Output:

Square root of 2 is 1.414214.

Common conversions include:

  • d: Decimal integer
  • f: Floating-point decimal
  • n: Platform-specific line separator
  • x: Hexadecimal integer
  • s: String conversion
  • tB: Locale-specific month name

Note: Except for %% and %n, each specifier requires a corresponding argument; mismatches throw exceptions.
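The mismatch behavior can be sketched with String.format, which uses the same formatter as the format method. FormatMismatchSketch and failure are names invented for this example.

```java
import java.util.IllegalFormatConversionException;
import java.util.MissingFormatArgumentException;

class FormatMismatchSketch {
    // Returns the simple name of the exception String.format throws, or "none" if it succeeds.
    static String failure(String format, Object... args) {
        try {
            String.format(format, args);
            return "none";
        } catch (MissingFormatArgumentException | IllegalFormatConversionException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(failure("%d and %d", 1));  // second specifier has no argument
        System.out.println(failure("%d", "two"));     // %d cannot convert a String
        System.out.println(failure("100%% done%n"));  // %% and %n consume no arguments
    }
}
```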

While \n always produces \u000A, prefer %n for platform-appropriate line separators.

Additional specifier elements include precision (floating-point digits or maximum string width), width (minimum output width), flags (alignment, padding, separators), and argument indices. PrecisionDemo demonstrates:

public class PrecisionDemo {
    public static void main(String[] args) {
        System.out.format("%f, %1$+020.10f %n", Math.PI);
    }
}

Output:

3.141593, +00000003.1415926536
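A few more combinations of flags, width, precision, and argument indices, sketched with String.format so the results can be inspected as strings. The class FormatFlagsSketch and its helpers are assumptions for illustration; Locale.US is passed explicitly so the separators do not depend on the default locale.

```java
import java.util.Locale;

class FormatFlagsSketch {
    // %-10s left-justifies in 10 columns, %5d right-justifies in 5 columns,
    // %08.3f zero-pads to width 8 with 3 digits after the decimal point.
    static String row(String name, int quantity, double price) {
        return String.format(Locale.US, "%-10s|%5d|%08.3f", name, quantity, price);
    }

    // Argument indices (1$, 2$) pick arguments by position instead of by order of appearance.
    static String swapped(String first, String second) {
        return String.format("%2$s %1$s", first, second);
    }

    // The ',' flag inserts locale-specific grouping separators.
    static String grouped(int n) {
        return String.format(Locale.US, "%,d", n);
    }

    public static void main(String[] args) {
        System.out.println(row("abc", 42, 3.14159));   // prints abc       |   42|0003.142
        System.out.println(swapped("world", "hello")); // prints hello world
        System.out.println(grouped(1234567));          // prints 1,234,567
    }
}
```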

Console and Standard I/O

Applications frequently execute within command-line environments, requiring interaction through standard streams or console facilities.

Standard Streams

Standard streams represent OS-level features providing default input (keyboard) and output (display) channels. Java exposes three standard streams: System.in (standard input), System.out (standard output), and System.err (standard error). These objects initialize automatically without explicit opening.

Despite their character-oriented appearance, the standard streams are byte streams for historical reasons. System.out and System.err are defined as PrintStream objects; although PrintStream is technically a byte stream, it uses an internal character stream object to emulate many character stream features. System.in, by contrast, is a byte stream with no character stream features.

To utilize standard input as a character stream, wrap System.in with InputStreamReader:

InputStreamReader consoleReader = new InputStreamReader(System.in);
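In practice the InputStreamReader is usually wrapped once more in a BufferedReader to read whole lines. The sketch below assumes UTF-8 input and uses the hypothetical names StdinLinesSketch and countLines; main feeds it an in-memory stream so the example does not wait for keyboard input.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class StdinLinesSketch {
    // Counts the lines available on a byte stream by reading it as characters.
    static int countLines(InputStream in) throws IOException {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8));
        int count = 0;
        while (reader.readLine() != null) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Pass System.in here to read real standard input instead.
        InputStream sample = new ByteArrayInputStream(
                "one\ntwo\nthree".getBytes(StandardCharsets.UTF_8));
        System.out.println(countLines(sample) + " lines");
    }
}
```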

Console Operations

The Console class offers enhanced command-line interaction beyond standard streams, particularly for secure password entry. System.console() retrieves the singleton console instance, returning null if unavailable (non-interactive environments or unsupported platforms).

The readPassword method facilitates secure credential entry by suppressing echo and returning character arrays rather than String objects, enabling memory clearing after use.

SecureLogin demonstrates console operations:

import java.io.*;
import java.util.Arrays;

public class SecureLogin {
    public static void main(String[] args) throws IOException {
        Console console = System.console();
        if (console == null) {
            System.err.println("Console unavailable");
            System.exit(1);
        }
        
        String username = console.readLine("Username: ");
        char[] currentPassword = console.readPassword("Current password: ");
        
        if (authenticate(username, currentPassword)) {
            boolean mismatch;
            do {
                char[] newPass1 = console.readPassword("New password: ");
                char[] newPass2 = console.readPassword("Confirm password: ");
                mismatch = !Arrays.equals(newPass1, newPass2);
                
                if (mismatch) {
                    console.format("Passwords mismatch. Retry.%n");
                } else {
                    updateCredentials(username, newPass1);
                    console.format("Password updated for %s.%n", username);
                }
                Arrays.fill(newPass1, ' ');
                Arrays.fill(newPass2, ' ');
            } while (mismatch);
        }
        Arrays.fill(currentPassword, ' ');
    }
    
    static boolean authenticate(String user, char[] pass) {
        return true; // Placeholder
    }
    
    static void updateCredentials(String user, char[] pass) {
        // Placeholder
    }
}

Binary Data Streams

Data streams support binary I/O for primitive types (boolean, char, byte, short, int, long, float, double) and String values. All data stream classes implement DataInput or DataOutput interfaces. Primary implementations include DataInputStream and DataOutputStream.

InvoiceProcessor demonstrates writing and reading data records containing price, quantity, and description fields:

static final String DATA_FILE = "invoice.dat";

static final double[] PRICE_LIST = { 19.99, 9.99, 15.99, 3.99, 4.99 };
static final int[] QUANTITY_LIST = { 12, 8, 13, 29, 50 };
static final String[] ITEM_LIST = {
    "Java T-shirt",
    "Java Mug", 
    "Juggling Dolls",
    "Java Pin",
    "Key Chain"
};

Writing records:

try (DataOutputStream output = new DataOutputStream(
        new BufferedOutputStream(new FileOutputStream(DATA_FILE)))) {
    for (int i = 0; i < PRICE_LIST.length; i++) {
        output.writeDouble(PRICE_LIST[i]);
        output.writeInt(QUANTITY_LIST[i]);
        output.writeUTF(ITEM_LIST[i]);
    }
} // try-with-resources flushes and closes the stream even if a write fails

Reading records:

double price;
int quantity;
String description;
double grandTotal = 0.0;

try (DataInputStream input = new DataInputStream(
        new BufferedInputStream(new FileInputStream(DATA_FILE)))) {
    // Fields must be read in the same order and with the same types as they were written
    while (true) {
        price = input.readDouble();
        quantity = input.readInt();
        description = input.readUTF();
        System.out.format("Item: %d units of %s @ $%.2f%n", 
            quantity, description, price);
        grandTotal += quantity * price;
    }
} catch (EOFException e) {
    // End of stream reached: data streams signal end-of-file with an exception
}
System.out.format("Grand total: $%.2f%n", grandTotal);

Note that writeUTF employs modified UTF-8 encoding, using single bytes for common Western characters.

Caution: Using floating-point for monetary values introduces precision errors. BigDecimal provides proper decimal arithmetic but requires object streams for serialization.
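The precision concern is easy to reproduce. A minimal sketch, with MoneySketch as an illustrative name, comparing binary floating-point addition with BigDecimal addition:

```java
import java.math.BigDecimal;

class MoneySketch {
    // Binary floating point cannot represent 0.1 or 0.2 exactly.
    static double doubleSum() {
        return 0.1 + 0.2;
    }

    // BigDecimal built from decimal strings keeps the exact decimal values.
    static BigDecimal decimalSum() {
        return new BigDecimal("0.1").add(new BigDecimal("0.2"));
    }

    public static void main(String[] args) {
        System.out.println(doubleSum());  // prints 0.30000000000000004
        System.out.println(decimalSum()); // prints 0.3
    }
}
```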

Object Serialization

Object streams extend data stream capabilities to object I/O. Classes implementing Serializable support serialization. Primary classes include ObjectInputStream and ObjectOutputStream, implementing ObjectInput and ObjectOutput (subinterfaces of DataInput/DataOutput).

This enables a single stream to contain a mixture of primitive and object values, for example BigDecimal prices alongside a Calendar timestamp.
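A minimal sketch of such a mixed stream follows. It is not the article's original listing: the class name MixedStreamSketch, the field choice, and the use of an in-memory byte array instead of a file are all assumptions made to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.math.BigDecimal;
import java.util.Calendar;

class MixedStreamSketch {
    // Writes one primitive and two objects to the same object stream.
    static byte[] write(int quantity, BigDecimal price, Calendar stamp) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeInt(quantity);  // primitive, via the DataOutput methods
            out.writeObject(price);  // BigDecimal implements Serializable
            out.writeObject(stamp);  // so does Calendar
        }
        return bytes.toByteArray();
    }

    // Reads the values back in the same order they were written.
    static Object[] read(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            int quantity = in.readInt();
            BigDecimal price = (BigDecimal) in.readObject();
            Calendar stamp = (Calendar) in.readObject();
            return new Object[] { quantity, price, stamp };
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = write(12, new BigDecimal("19.99"), Calendar.getInstance());
        Object[] values = read(data);
        System.out.println(values[0] + " units @ " + values[1]);
    }
}
```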

Complex object graphs serialize automatically: writeObject traverses the network of references and writes every reachable object. When reading back, readObject reconstructs the entire graph, preserving the original reference relationships.

Duplicate references within a single stream serialize once; subsequent writes store only references. Thus:

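// Any Serializable instance works here; a plain java.lang.Object is not
// Serializable and would cause writeObject to throw NotSerializableException.
Object obj = new java.util.ArrayList<String>();
out.writeObject(obj);
out.writeObject(obj);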

Reads as:

Object ref1 = in.readObject();
Object ref2 = in.readObject();

Both variables reference identical objects. However, writing the same object to different streams creates distinct copies.
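Both behaviors can be checked with an in-memory round trip. In this sketch the class SharedReferenceSketch and its methods are illustrative names, and an ArrayList stands in for an arbitrary Serializable object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Arrays;

class SharedReferenceSketch {
    // Writes the same object twice to ONE stream and reports whether both reads yield one instance.
    static boolean sameWithinOneStream() throws IOException, ClassNotFoundException {
        ArrayList<String> shared = new ArrayList<>(Arrays.asList("a", "b"));
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(shared);
            out.writeObject(shared); // second write stores only a back-reference
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            Object ref1 = in.readObject();
            Object ref2 = in.readObject();
            return ref1 == ref2;
        }
    }

    // Serializes and deserializes through a fresh stream, producing an independent copy.
    static Object roundTrip(Object value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<String> shared = new ArrayList<>(Arrays.asList("a", "b"));
        System.out.println(sameWithinOneStream());                   // prints true
        System.out.println(roundTrip(shared) == roundTrip(shared));  // prints false
    }
}
```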

NIO.2 File I/O

The java.nio.file package and java.nio.file.attribute subpackage provide comprehensive file I/O support introduced in JDK 7. While extensive, developers primarily interact with Path and Files classes.

Path Fundamentals

File systems organize files hierarchically, with root nodes (e.g., / on Unix, C:\ on Windows) containing files and directories. Paths identify file locations, using system-specific separators (/ vs \).

Paths may be absolute (complete from root) or relative (requiring resolution against another path). Symbolic links (symlinks) represent special files referencing other files, generally transparent to applications except during deletion or renaming.

The Path Class

Path provides programmatic file system path representation, encapsulating file names and directory sequences. Instances reflect underlying platforms—Unix syntax on Solaris, Windows syntax on Windows.

Paths.get() factory methods create instances:

Path p1 = Paths.get("/var/log/app.log");
Path p2 = Paths.get(System.getProperty("user.home"), "docs", "report.txt");
Path p3 = Paths.get(URI.create("file:///etc/config.txt"));

Path manipulation methods include:

  • getFileName(): Terminal element
  • getParent(): Parent directory
  • getRoot(): Root component
  • subpath(int, int): Subsequence between indices
  • resolve(Path): Combine with partial path
  • relativize(Path): Construct relative path between two locations
  • normalize(): Remove redundant elements (., ..)
  • toAbsolutePath(): Convert to absolute form
  • toRealPath(): Resolve to actual file location (following symlinks, checking existence)
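Several of the methods listed above can be tried without touching the file system, since they operate purely on the path's syntax. A brief sketch using relative paths so that it behaves the same on any platform; PathSketch and renameLast are hypothetical names.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PathSketch {
    // Replaces the last element of a path, e.g. docs/report.txt becomes docs/summary.txt.
    static Path renameLast(Path file, String newName) {
        Path parent = file.getParent(); // null when the path has a single element
        return parent == null ? Paths.get(newName) : parent.resolve(newName);
    }

    public static void main(String[] args) {
        Path messy = Paths.get("docs", "drafts", "..", "report.txt");

        System.out.println(messy.getFileName());   // last element: report.txt
        System.out.println(messy.subpath(0, 2));   // first two elements: docs and drafts
        System.out.println(messy.normalize());     // the "drafts/.." pair is removed

        Path from = Paths.get("docs", "a");
        Path to = Paths.get("docs", "b", "c");
        System.out.println(from.relativize(to));   // route from docs/a to docs/b/c

        System.out.println(renameLast(Paths.get("docs", "report.txt"), "summary.txt"));
    }
}
```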

File Operations

The Files class provides static methods for file manipulation:

Existence and Accessibility:

boolean exists = Files.exists(path);
boolean readable = Files.isReadable(path);
boolean sameFile = Files.isSameFile(path1, path2);

Deletion:

Files.delete(path); // Throws NoSuchFileException if absent
Files.deleteIfExists(path); // Silent if absent

Copying and Moving:

Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
Files.move(source, target, StandardCopyOption.ATOMIC_MOVE);
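The existence, copy, move, and delete calls can be exercised together in a temporary directory. The following is a sketch with assumed names (FileOpsSketch, run) and assumed file names; it cleans up after itself.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class FileOpsSketch {
    // Returns { copy exists after copy, source is gone after move, second delete removed something }.
    static boolean[] run() throws IOException {
        Path dir = Files.createTempDirectory("fileops");
        Path source = Files.write(dir.resolve("source.txt"),
                "data".getBytes(StandardCharsets.UTF_8));
        Path copy = dir.resolve("copy.txt");
        Path moved = dir.resolve("moved.txt");

        Files.copy(source, copy, StandardCopyOption.REPLACE_EXISTING);
        boolean copyExists = Files.exists(copy);

        Files.move(source, moved, StandardCopyOption.REPLACE_EXISTING);
        boolean sourceGone = Files.notExists(source);

        Files.delete(copy);                                 // would throw if the file were absent
        boolean deletedAgain = Files.deleteIfExists(copy);  // false: nothing left to delete

        // Clean up the remaining file and the directory itself.
        Files.delete(moved);
        Files.delete(dir);
        return new boolean[] { copyExists, sourceGone, deletedAgain };
    }

    public static void main(String[] args) throws IOException {
        boolean[] result = run();
        System.out.println(result[0] + " " + result[1] + " " + result[2]); // prints true true false
    }
}
```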

Metadata Management: Retrieve attributes individually:

long size = Files.size(path);
FileTime modified = Files.getLastModifiedTime(path);
UserPrincipal owner = Files.getOwner(path);

Or bulk-read via views:

BasicFileAttributes attrs = Files.readAttributes(path, BasicFileAttributes.class);
System.out.println("Created: " + attrs.creationTime());
System.out.println("Modified: " + attrs.lastModifiedTime());
System.out.println("Size: " + attrs.size());

View types include BasicFileAttributeView, PosixFileAttributeView, DosFileAttributeView, AclFileAttributeView, and UserDefinedFileAttributeView.
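As a quick check that a bulk read returns consistent values, the sketch below writes a temporary file and reads its basic attributes in one call. AttributesSketch and sizeViaBulkRead are names assumed for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

class AttributesSketch {
    // Writes the bytes to a temporary file and returns the size reported by a bulk attribute read.
    static long sizeViaBulkRead(byte[] content) throws IOException {
        Path file = Files.createTempFile("attrs", ".bin");
        try {
            Files.write(file, content);
            // One call fetches size, timestamps, and file type together.
            BasicFileAttributes attrs = Files.readAttributes(file, BasicFileAttributes.class);
            if (!attrs.isRegularFile() || attrs.isDirectory()) {
                throw new IllegalStateException("expected a regular file");
            }
            return attrs.size();
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sizeViaBulkRead(new byte[42])); // prints 42
    }
}
```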

Reading and Writing

Small Files:

byte[] content = Files.readAllBytes(path);
List<String> lines = Files.readAllLines(path, StandardCharsets.UTF_8);
Files.write(path, content);
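For text, Files.write also accepts a collection of lines, which pairs naturally with readAllLines. A small round-trip sketch through a temporary file; SmallFileSketch and roundTrip are illustrative names.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class SmallFileSketch {
    // Writes the lines to a temporary file and reads them back.
    static List<String> roundTrip(List<String> lines) throws IOException {
        Path file = Files.createTempFile("small", ".txt");
        try {
            Files.write(file, lines, StandardCharsets.UTF_8);       // one line terminator per element
            return Files.readAllLines(file, StandardCharsets.UTF_8); // terminators are stripped again
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip(Arrays.asList("alpha", "beta"))); // prints [alpha, beta]
    }
}
```

Both methods hold the whole file in memory, which is why the text reserves them for small files.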

Buffered I/O:

try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
    String line;
    while ((line = reader.readLine()) != null) {
        System.out.println(line);
    }
}

Stream I/O:

try (InputStream in = Files.newInputStream(path);
     BufferedReader reader = new BufferedReader(new InputStreamReader(in))) {
    // ...
}

Channel I/O:

try (SeekableByteChannel channel = Files.newByteChannel(path, StandardOpenOption.READ)) {
    ByteBuffer buffer = ByteBuffer.allocate(1024);
    while (channel.read(buffer) > 0) {
        buffer.flip();
        // process buffer
        buffer.clear();
    }
}

Random Access Files

SeekableByteChannel enables non-sequential file access:

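try (SeekableByteChannel channel = Files.newByteChannel(path, 
        StandardOpenOption.READ, StandardOpenOption.WRITE)) {
    // Placeholder buffers for illustration
    ByteBuffer buffer = ByteBuffer.allocate(16);
    ByteBuffer otherBuffer = ByteBuffer.wrap(new byte[] { 1, 2, 3 });

    channel.position(100);      // Seek to offset 100
    channel.read(buffer);       // Read from the current position
    channel.position(0);        // Seek back to the start
    channel.write(otherBuffer); // Overwrite the first bytes
}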
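A runnable variant of the same idea, sketched against a temporary file with known contents so the effect of each seek is visible. RandomAccessSketch, run, and the sample data are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class RandomAccessSketch {
    // Reads 3 bytes at offset 5, then overwrites the first 2 bytes.
    // Returns { the bytes read, the final file content }.
    static String[] run() throws IOException {
        Path file = Files.createTempFile("random", ".txt");
        try {
            Files.write(file, "0123456789".getBytes(StandardCharsets.US_ASCII));
            String middle;
            try (SeekableByteChannel channel = Files.newByteChannel(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                ByteBuffer buffer = ByteBuffer.allocate(3);
                channel.position(5);                       // seek to offset 5
                while (buffer.hasRemaining() && channel.read(buffer) > 0) {
                    // keep reading until the buffer is full or the file ends
                }
                buffer.flip();                             // switch the buffer from filling to draining
                middle = StandardCharsets.US_ASCII.decode(buffer).toString();

                channel.position(0);                       // seek back to the start
                ByteBuffer patch = ByteBuffer.wrap("AB".getBytes(StandardCharsets.US_ASCII));
                while (patch.hasRemaining()) {
                    channel.write(patch);                  // overwrite in place, no truncation
                }
            }
            String all = new String(Files.readAllBytes(file), StandardCharsets.US_ASCII);
            return new String[] { middle, all };
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        String[] result = run();
        System.out.println(result[0]); // prints 567
        System.out.println(result[1]); // prints AB23456789
    }
}
```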

Directory Operations

Creating:

Files.createDirectory(path); // Single level
Files.createDirectories(path); // Create parent dirs as needed
Path tempDir = Files.createTempDirectory("prefix");

Listing:

try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
    for (Path entry : stream) {
        System.out.println(entry.getFileName());
    }
}

Glob Filtering:

try (DirectoryStream<Path> stream = 
        Files.newDirectoryStream(dir, "*.{java,class}")) {
    // Only .java and .class files
}

Custom Filters:

DirectoryStream.Filter<Path> filter = entry -> Files.isDirectory(entry);
try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, filter)) {
    // Only directories
}
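The glob filter can be tried end to end in a temporary directory. In this sketch the class GlobSketch, the method name, and the three file names are assumptions; the results are sorted because a DirectoryStream does not guarantee iteration order.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class GlobSketch {
    // Creates three files and returns the sorted names matching *.{java,class}.
    static List<String> sourceAndClassFiles() throws IOException {
        Path dir = Files.createTempDirectory("glob");
        List<String> names = new ArrayList<>();
        try {
            for (String name : new String[] { "A.java", "A.class", "notes.txt" }) {
                Files.createFile(dir.resolve(name));
            }
            try (DirectoryStream<Path> stream =
                    Files.newDirectoryStream(dir, "*.{java,class}")) {
                for (Path entry : stream) {
                    names.add(entry.getFileName().toString());
                }
            }
            Collections.sort(names);
        } finally {
            // Remove the sample files, then the directory itself.
            try (DirectoryStream<Path> all = Files.newDirectoryStream(dir)) {
                for (Path entry : all) {
                    Files.delete(entry);
                }
            }
            Files.delete(dir);
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sourceAndClassFiles()); // prints [A.class, A.java]
    }
}
```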
