Home > Tech > Content

Manipulating Excel Files with Apache POI in Java

Tech May 9 4

Handling srpeadsheet data programmatically in Java is most efficiently achieved using the Apache POI library. To support both legacy .xls and modern .xlsx formats, include the following Maven dependency in your project configuration:

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>5.2.3</version>
</dependency>

Workbook API Architecture

Apache POI separates Excel handling based on file versions and memory requirements:

HSSFWorkbook: Targets older Excel 97-2003 files (.xls).
XSSFWorkbook: Targets Excel 2007+ files (.xlsx).
SXSSFWorkbook: A low-memory streaming implementation of XSSF. Standard XSSF structures consume excessive heap space and will throw errors when processing datasets exceeding approximate 65,000 rows.
WorkbookFactory: An abstract factory that inspects the file signature and automatically returns the appropriate Workbook instance, eliminating manual format detection.

Reading Spreadsheet Data

Extraction follows a hierarchical traversal: load the workbook, select a sheet, iterate through row indices, and extract cell values. Always verify cell existence and utilize safe type conversion to avoid runtime casting exceptions.

Data Extraction Implementation

import org.apache.poi.ss.usermodel.*;
import org.apache.poi.openxml4j.exceptions.InvalidFormatException;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ExcelDataSource {
    public static void main(String[] args) {
        String targetPath = "C:/data/inventory.xlsx";
        List<String> parsedRecords = new ArrayList<>();

        try (FileInputStream stream = new FileInputStream(new File(targetPath))) {
            Workbook spreadsheet = WorkbookFactory.create(stream);
            Sheet activeSheet = spreadsheet.getSheetAt(0);
            DataFormatter valueFormatter = new DataFormatter();

            int startRow = activeSheet.getFirstRowNum();
            int endRow = activeSheet.getLastRowNum();

            for (int r = startRow; r <= endRow; r++) {
                Row currentLine = activeSheet.getRow(r);
                if (currentLine != null) {
                    int startCol = currentLine.getFirstCellNum();
                    int endCol = currentLine.getLastCellNum();

                    for (int c = startCol; c < endCol; c++) {
                        Cell targetCell = currentLine.getCell(c, Row.MissingCellPolicy.RETURN_BLANK_AS_NULL);
                        if (targetCell != null) {
                            // Automatically handles numeric, date, and boolean types
                            String cellText = valueFormatter.formatCellValue(targetCell);
                            parsedRecords.add(cellText);
                        }
                    }
                }
            }
            parsedRecords.forEach(System.out::println);
        } catch (IOException | InvalidFormatException e) {
            System.err.println("Spreadsheet read operation failed: " + e.getMessage());
        }
    }
}

Writing Spreadsheet Data

Document generation requires instantiating workbook and sheet objects, mapping data to specific coordinates, and flushing the in-memory structure to a file output stream. Proper stream closure is critical to prevent file corruption or handle leaks.

Data Export Implementation

import org.apache.poi.ss.usermodel.*;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class ExcelDataSink {
    public static void main(String[] args) {
        String outputPath = "C:/data/report.xlsx";

        try (Workbook targetBook = new XSSFWorkbook();
             FileOutputStream outputStream = new FileOutputStream(new File(outputPath))) {

            Sheet outputSheet = targetBook.createSheet("ProcessedReport");
            Row insertRow = outputSheet.createRow(2);
            Cell dataCell = insertRow.createCell(3);

            dataCell.setCellValue("Automated Report Entry");

            targetBook.write(outputStream);
        } catch (IOException e) {
            System.err.println("Export stream error: " + e.getMessage());
        }
    }
}

Development Best Practices

Dimension Calculation: Avoid getPhysicalNumberOfRows() and getPhysicalNumberOfCells() for boundary detection. These methods ignore empty slots, resulting in truncated loops. Use getLastRowNum() and getLastCellNum() combined with null checks to safely handle sparse datasets.
Type Safety: Directly invoking getStringCellValue() on numeric or boolean cells triggers an IllegalStateException. Implement DataFormatter or a custom type-switching mechanism to guarantee stable extraction across mixed cell formats.
Resource Lifecycle: Always manage Workbook and OutputStream instances via try-with-resources or explicit finally blocks. Unclosed streams frequently cause FileLock exceptions in subsequent operations.

Back to List

Prev: Core Data Structures in Python: Lists, Tuples, and Dictionaries

Next: Deploying BIND DNS for Bidirectional Lookup Resolution

Fading Coder

Manipulating Excel Files with Apache POI in Java

Workbook API Architecture

Reading Spreadsheet Data

Data Extraction Implementation

Writing Spreadsheet Data

Data Export Implementation

Development Best Practices

Related Articles

Understanding Strong and Weak References in Java

Implement Image Upload Functionality for Django Integrated TinyMCE Editor

SBUS Signal Analysis and Communication Implementation Using STM32 with Fus Remote Controller

Leave a Comment

Copyright © fadingcoder.top

Fading Coder

Manipulating Excel Files with Apache POI in Java

Workbook API Architecture

Reading Spreadsheet Data

Data Extraction Implementation

Writing Spreadsheet Data

Data Export Implementation

Development Best Practices

Related Articles

Understanding Strong and Weak References in Java

Implement Image Upload Functionality for Django Integrated TinyMCE Editor

SBUS Signal Analysis and Communication Implementation Using STM32 with Fus Remote Controller

Leave a CommentCancel Reply

Copyright © fadingcoder.top

Leave a Comment