Implementing File Compression and Extraction in Java

Tech May 15 2

Extracting ZIP Files

To decompress a ZIP archive, we use the ZipInputStream class to read the entries sequentially. The following implementation derives a destination directory from the archive name and removes any pre-existing directory of that name before extraction begins. This cleanup uses FileUtils from Apache Commons IO because the standard File.delete() method cannot remove a non-empty directory.

import java.io.*;
import java.util.zip.*;
import org.apache.commons.io.FileUtils;

public class ZipHandler {
    public static void extractArchive(String sourcePath) {
        File sourceFile = new File(sourcePath);
        String baseName = sourceFile.getName().substring(0, sourceFile.getName().lastIndexOf("."));
        File destinationDir = new File(sourceFile.getParent(), baseName);

        if (destinationDir.exists()) {
            try {
                FileUtils.deleteDirectory(destinationDir);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        destinationDir.mkdirs();

        try (ZipInputStream zipIn = new ZipInputStream(new FileInputStream(sourceFile))) {
            ZipEntry entry;
            while ((entry = zipIn.getNextEntry()) != null) {
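                File newFile = new File(destinationDir, entry.getName());
                // Guard against path traversal ("Zip Slip"): reject entries that resolve outside the destination
                if (!newFile.getCanonicalPath().startsWith(destinationDir.getCanonicalPath() + File.separator)) {
                    throw new IOException("Entry is outside the target directory: " + entry.getName());
                }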
                if (entry.isDirectory()) {
                    newFile.mkdirs();
                } else {
                    // Ensure parent directories exist; FileOutputStream creates the file itself
                    newFile.getParentFile().mkdirs();
                    try (FileOutputStream fos = new FileOutputStream(newFile)) {
                        byte[] buffer = new byte[8192];
                        int length;
                        // read() returns -1 at the end of the current entry
                        while ((length = zipIn.read(buffer)) != -1) {
                            fos.write(buffer, 0, length);
                        }
                    }
                }
                zipIn.closeEntry();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Each file and directory within the ZIP archive is represented as a ZipEntry. By iterating through these entries, we reconstruct the directory structure on the local file system and copy each file's contents through a byte buffer.
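
Because entry names come from the archive itself, a crafted name such as ../outside.txt could resolve outside the destination directory, a problem commonly called "Zip Slip". The following is a minimal sketch of a guard, assuming a hypothetical helper named SafePaths.resolveEntry that is not part of the code above:

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

public class SafePaths {
    // Hypothetical helper: resolves an archive entry name against the destination
    // directory and rejects names that would escape it.
    public static Path resolveEntry(Path destinationDir, String entryName) throws IOException {
        Path base = destinationDir.toAbsolutePath().normalize();
        // normalize() collapses "." and ".." segments so the prefix check is meaningful
        Path target = base.resolve(entryName).normalize();
        if (!target.startsWith(base)) {
            throw new IOException("Entry is outside the target directory: " + entryName);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path dest = Paths.get("extracted");
        System.out.println(resolveEntry(dest, "docs/readme.txt"));
        try {
            resolveEntry(dest, "../outside.txt");
        } catch (IOException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Path.startsWith compares whole path elements rather than raw characters, so a sibling directory such as extracted-old is not mistaken for a child of extracted.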

Extracting RAR Files

Handling RAR archives requires a third-party library such as Junrar, because the JDK has no built-in RAR support. The logic is similar to ZIP extraction: we prepare a target directory, read the archive's file headers, and write the data. Headers are not necessarily returned in hierarchical order, so we sort them by name, which places each directory ahead of its contents. Creating the parent directory before writing each file covers any directory that has no header of its own.

import com.github.junrar.Archive;
import com.github.junrar.rarfile.FileHeader;
import org.apache.commons.io.FileUtils;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Comparator;
import java.util.List;

public class RarHandler {
    public static void extractRar(String sourcePath) {
        File sourceFile = new File(sourcePath);
        String baseName = sourceFile.getName().substring(0, sourceFile.getName().lastIndexOf("."));
        File destinationDir = new File(sourceFile.getParent(), baseName);

        if (destinationDir.exists()) {
            try {
                FileUtils.deleteDirectory(destinationDir);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        destinationDir.mkdirs();

        try (Archive archive = new Archive(new FileInputStream(sourceFile))) {
            List<FileHeader> headers = archive.getFileHeaders();
            // Sort entries to handle directories before files
            headers.sort(Comparator.comparing(FileHeader::getFileName));

            for (FileHeader header : headers) {
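                // Assumption: entry names may use backslash separators; normalize them for non-Windows systems
                File newFile = new File(destinationDir, header.getFileName().replace('\\', '/'));
                // Reject entries that would resolve outside the destination directory
                if (!newFile.getCanonicalPath().startsWith(destinationDir.getCanonicalPath() + File.separator)) {
                    throw new IOException("Entry is outside the target directory: " + header.getFileName());
                }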
                if (header.isDirectory()) {
                    newFile.mkdirs();
                } else {
                    new File(newFile.getParent()).mkdirs();
                    try (InputStream is = archive.getInputStream(header)) {
                        FileUtils.copyInputStreamToFile(is, newFile);
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

In this snippet, each FileHeader describes one entry in the RAR file. FileUtils.copyInputStreamToFile copies the entry's data from the archive input stream to disk.
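
Sorting by name puts a directory ahead of its contents because a parent path is a prefix of every path beneath it, and a prefix sorts before any longer string that starts with it. The sketch below shows this with made-up entry names. The normalize helper is a hypothetical addition; it assumes names may arrive with backslash separators, which is worth verifying against the Junrar version in use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class EntryOrdering {
    // Hypothetical helper: convert backslash separators to forward slashes
    static String normalize(String entryName) {
        return entryName.replace('\\', '/');
    }

    // Returns the normalized names in lexicographic order
    static List<String> sortedEntries(List<String> names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            result.add(normalize(name));
        }
        result.sort(Comparator.naturalOrder());
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "docs\\guide\\intro.txt", "docs\\guide", "readme.txt", "docs");
        // prints [docs, docs/guide, docs/guide/intro.txt, readme.txt]
        System.out.println(sortedEntries(names));
    }
}
```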

Compressing Files into ZIP Format

Creating a ZIP archive uses ZipOutputStream, a filter stream that compresses data as it is written. The process iterates over the source files, creates a ZipEntry for each, and writes the file's bytes into the stream.

import java.io.*;
import java.nio.file.Files;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipCreator {
    public static void compressDirectory(File sourceDir) {
        File zipFile = new File(sourceDir.getParent(), sourceDir.getName() + ".zip");
        File[] files = sourceDir.listFiles();

        if (files == null) return;

        try (ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(zipFile))) {
            for (File file : files) {
                if (file.isFile()) {
                    ZipEntry entry = new ZipEntry(file.getName());
                    zos.putNextEntry(entry);
                    // Stream the file into the archive instead of loading it fully into memory
                    Files.copy(file.toPath(), zos);
                    zos.closeEntry();
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

This method adds every regular file located directly in the source directory to the root of the created ZIP file; subdirectories are skipped. The putNextEntry method signals the start of a new file entry in the archive, and closeEntry finalizes it before moving to the next.
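
To include nested files as well, the directory tree can be walked and each file stored under its path relative to the source directory. The following is a minimal sketch under that assumption; RecursiveZipCreator and compressTree are hypothetical names, not part of the code above.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class RecursiveZipCreator {
    // Hypothetical variant: zips every regular file under sourceDir, keeping relative paths
    public static void compressTree(Path sourceDir, Path zipFile) throws IOException {
        List<Path> files;
        // Collect the file list first so the walk is closed before writing starts
        try (Stream<Path> walk = Files.walk(sourceDir)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(zipFile);
             ZipOutputStream zos = new ZipOutputStream(out)) {
            for (Path file : files) {
                // ZIP entry names use forward slashes regardless of platform
                String name = sourceDir.relativize(file).toString().replace('\\', '/');
                zos.putNextEntry(new ZipEntry(name));
                Files.copy(file, zos);
                zos.closeEntry();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java RecursiveZipCreator <sourceDir> <zipFile>
        compressTree(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

The sketch expects the target ZIP file to live outside the source directory, so that an existing archive is not picked up as one of its own entries. Empty directories are not recorded, since only regular files are added.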
