Fading Coder

Implementing CSV File Processing with FlatFileItemReader and FlatFileItemWriter

Reading Delimited CSV Files

A batch job can read football player data from a CSV file with the following structure:

ID,lastName,firstName,position,birthYear,debutYear
AbduKa00,Abdul-Jabbar,Karim,rb,1974,1996
AbduRa00,Abdullah,Rabih,rb,1975,1999
AberWa00,Abercrombie,Walter,rb,1959,1982
AbraDa00,Abramowicz,Danny,wr,1945,1967
AdamBo00,Adams,Bob,te,1946,1969
AdamCh00,Adams,Charlie,wr,1979,2003

This data maps to a Player class:

public class Player {
    private String id;
    private String surname;
    private String givenName;
    private String role;
    private int yearOfBirth;
    private int firstYear;

    // Getters and setters omitted for brevity

    @Override
    public String toString() {
        return "PLAYER:ID=" + id + ",Surname=" + surname +
               ",Given Name=" + givenName + ",Role=" + role +
               ",Birth Year=" + yearOfBirth + ",First Year=" + firstYear;
    }
}

Define a FieldSetMapper to convert each line into a Player object:

public class PlayerMapper implements FieldSetMapper<Player> {
    @Override
    public Player mapFieldSet(FieldSet fields) {
        if (fields == null) return null;
        Player player = new Player();
        // Field indexes are zero-based and follow the column order of the CSV header.
        player.setId(fields.readString(0));          // ID
        player.setSurname(fields.readString(1));     // lastName
        player.setGivenName(fields.readString(2));   // firstName
        player.setRole(fields.readString(3));        // position
        player.setYearOfBirth(fields.readInt(4));    // birthYear
        player.setFirstYear(fields.readInt(5));      // debutYear
        return player;
    }
}

Configure a FlatFileItemReader to process the file:

@Bean
public FlatFileItemReader<Player> csvReader() {
    FlatFileItemReader<Player> reader = new FlatFileItemReader<>();
    reader.setResource(new FileSystemResource("players.csv"));
    DefaultLineMapper<Player> mapper = new DefaultLineMapper<>();
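    // DelimitedLineTokenizer splits each line on commas by default.
    mapper.setLineTokenizer(new DelimitedLineTokenizer());
    mapper.setFieldSetMapper(new PlayerMapper());
    reader.setLineMapper(mapper);
    // Skip the header row so it is not parsed as a player.
    reader.setLinesToSkip(1);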
    return reader;
}

Test the reader:

@Test
public void testReader() throws Exception {
    FlatFileItemReader<Player> reader = csvReader();
    reader.open(new ExecutionContext());
    Player player;
    while ((player = reader.read()) != null) {
        System.out.println(player);
    }
    reader.close();
}
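To see what the tokenizer and mapper contribute for a single line, the two steps can be imitated in plain Java with no Spring Batch on the classpath. This is only an illustrative sketch: PlayerLineSketch and ParsedPlayer are hypothetical names, and the naive split on commas does not handle quoted fields.

```java
import java.util.Arrays;

// Plain-Java sketch (no Spring Batch): tokenizes one CSV line, then maps it to an object.
final class PlayerLineSketch {

    // Hypothetical stand-in for the article's Player class.
    static final class ParsedPlayer {
        final String id;
        final String surname;
        final String givenName;
        final String role;
        final int yearOfBirth;
        final int firstYear;

        ParsedPlayer(String[] tokens) {
            this.id = tokens[0];
            this.surname = tokens[1];
            this.givenName = tokens[2];
            this.role = tokens[3];
            this.yearOfBirth = Integer.parseInt(tokens[4]);
            this.firstYear = Integer.parseInt(tokens[5]);
        }
    }

    // Step 1 (tokenize): split on commas; the -1 limit keeps trailing empty fields.
    static String[] tokenize(String line) {
        return line.split(",", -1);
    }

    // Step 2 (map): convert the positional tokens into a typed object.
    static ParsedPlayer map(String[] tokens) {
        if (tokens.length != 6) {
            throw new IllegalArgumentException("Expected 6 fields but got " + tokens.length);
        }
        return new ParsedPlayer(tokens);
    }

    public static void main(String[] args) {
        String[] tokens = tokenize("AbduKa00,Abdul-Jabbar,Karim,rb,1974,1996");
        ParsedPlayer player = map(tokens);
        System.out.println(Arrays.toString(tokens));
        System.out.println(player.surname + " debuted in " + player.firstYear);
    }
}
```

Keeping the two steps separate mirrors the division of labor in the reader configuration: the tokenizer only knows about the line layout, and the mapper only knows about the target class.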

Mapping Fields by Name

To improve readability, specify column names in the tokenizer:

DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
tokenizer.setNames("id", "surname", "givenName", "role", "yearOfBirth", "firstYear");

Pass this tokenizer to the DefaultLineMapper in place of the default one, then update the mapper to read fields by name:

public class NamedPlayerMapper implements FieldSetMapper<Player> {
    @Override
    public Player mapFieldSet(FieldSet fields) {
        if (fields == null) return null;
        Player player = new Player();
        player.setId(fields.readString("id"));
        player.setSurname(fields.readString("surname"));
        player.setGivenName(fields.readString("givenName"));
        player.setRole(fields.readString("role"));
        player.setYearOfBirth(fields.readInt("yearOfBirth"));
        player.setFirstYear(fields.readInt("firstYear"));
        return player;
    }
}

Reading Fixed-Length Files

In a fixed-length file, each field occupies a fixed number of characters and no delimiter separates the fields. Example data:

UK21341EAH4121131.11customer1
UK21341EAH4221232.11customer2
UK21341EAH4321333.11customer3
UK21341EAH4421434.11customer4
UK21341EAH4521535.11customer5

Fields: ISIN (12 chars), quantity (3 chars), price (5 chars), customer (9 chars).
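The widths above translate into 1-based, inclusive column ranges: 1-12, 13-15, 16-20, and 21-29. A minimal plain-Java sketch of that slicing, assuming every line is at least 29 characters long, might look like the following. FixedWidthSketch and its methods are hypothetical names used only for illustration.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java sketch (no Spring Batch): slices a fixed-length record by column ranges.
final class FixedWidthSketch {

    private static final int RECORD_LENGTH = 29; // 12 + 3 + 5 + 9

    // Cuts a 1-based, inclusive column range out of a line.
    static String slice(String line, int from, int to) {
        return line.substring(from - 1, to);
    }

    // Splits one record into named fields, preserving the column order.
    static Map<String, String> tokenize(String line) {
        if (line.length() < RECORD_LENGTH) {
            throw new IllegalArgumentException("Line is shorter than " + RECORD_LENGTH + " characters");
        }
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("isin", slice(line, 1, 12));
        fields.put("quantity", slice(line, 13, 15));
        fields.put("price", slice(line, 16, 20));
        fields.put("customer", slice(line, 21, 29));
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> fields = tokenize("UK21341EAH4121131.11customer1");
        // prints {isin=UK21341EAH41, quantity=211, price=31.11, customer=customer1}
        System.out.println(fields);

        // The raw strings can then be converted to the target types.
        int quantity = Integer.parseInt(fields.get("quantity"));
        BigDecimal price = new BigDecimal(fields.get("price"));
        System.out.println(quantity + " x " + price);
    }
}
```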

Define a Watch class:

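import java.math.BigDecimal;

public class Watch {
    private String isin;
    private int quantity;
    private BigDecimal price;
    private String customer;
    // Getters and setters omitted for brevity.
    // Add a toString() if the test output should be human-readable.
}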

Create a mapper:

public class WatchMapper implements FieldSetMapper<Watch> {
    @Override
    public Watch mapFieldSet(FieldSet fields) {
        if (fields == null) return null;
        Watch watch = new Watch();
        watch.setIsin(fields.readString(0));
        watch.setQuantity(fields.readInt(1));
        watch.setPrice(fields.readBigDecimal(2));
        watch.setCustomer(fields.readString(3));
        return watch;
    }
}

Configure the reader with FixedLengthTokenizer:

@Bean
public FlatFileItemReader<Watch> fixedLengthReader() {
    FlatFileItemReader<Watch> reader = new FlatFileItemReader<>();
    reader.setResource(new FileSystemResource("watches.txt"));
    DefaultLineMapper<Watch> mapper = new DefaultLineMapper<>();
    FixedLengthTokenizer tokenizer = new FixedLengthTokenizer();
    tokenizer.setNames("isin", "quantity", "price", "customer");
    // Ranges are 1-based and inclusive: widths of 12, 3, 5, and 9 characters.
    tokenizer.setColumns(new Range(1, 12),
                         new Range(13, 15),
                         new Range(16, 20),
                         new Range(21, 29));
    mapper.setLineTokenizer(tokenizer);
    mapper.setFieldSetMapper(new WatchMapper());
    reader.setLineMapper(mapper);
    return reader;
}

Test the fixed-length reader:

@Test
public void testFixedLengthReader() throws Exception {
    FlatFileItemReader<Watch> reader = fixedLengthReader();
    reader.open(new ExecutionContext());
    Watch watch;
    while ((watch = reader.read()) != null) {
        System.out.println(watch);
    }
    reader.close();
}

Writing Delimited CSV Files

To write CSV output, pair a FlatFileItemWriter with a DelimitedLineAggregator, which joins the extracted field values of each item with a delimiter.

Configure a writer:

@Bean
public FlatFileItemWriter<Watch> csvWriter() throws Exception {
    BeanWrapperFieldExtractor<Watch> extractor = new BeanWrapperFieldExtractor<>();
    extractor.setNames("isin", "quantity", "price", "customer");
    extractor.afterPropertiesSet();

    DelimitedLineAggregator<Watch> aggregator = new DelimitedLineAggregator<>();
    aggregator.setDelimiter(",");
    aggregator.setFieldExtractor(extractor);

    FlatFileItemWriter<Watch> writer = new FlatFileItemWriter<>();
    writer.setLineAggregator(aggregator);
    writer.setResource(new FileSystemResource("output.csv"));
    writer.setEncoding("UTF-8");
    // Append to output.csv if it already exists instead of replacing it.
    writer.setAppendAllowed(true);
    writer.setLineSeparator("\n");
    return writer;
}

Write data:

@Test
public void testWriter() throws Exception {
    FlatFileItemWriter<Watch> writer = csvWriter();
    writer.open(new ExecutionContext());
    List<Watch> items = fetchData(); // Assume this retrieves data
    Chunk<Watch> chunk = new Chunk<>(items);
    writer.write(chunk);
    writer.close();
}
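As a rough illustration of the resulting line, the plain-Java sketch below joins the four Watch properties with a comma, which is in essence what a delimited aggregator does with the values handed over by the field extractor. DelimitedJoinSketch and aggregate are hypothetical names. The sketch does no quoting or escaping, so a value that itself contains a comma would produce an ambiguous line.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.stream.Collectors;

// Plain-Java sketch (no Spring Batch): joins field values into one delimited line.
final class DelimitedJoinSketch {

    // Converts each value to its string form and joins them with the delimiter.
    static String aggregate(String delimiter, Object... fields) {
        return Arrays.stream(fields)
                .map(String::valueOf)
                .collect(Collectors.joining(delimiter));
    }

    public static void main(String[] args) {
        // Values in the same order as the extractor names: isin, quantity, price, customer.
        String line = aggregate(",", "UK21341EAH41", 211, new BigDecimal("31.11"), "customer1");
        // prints UK21341EAH41,211,31.11,customer1
        System.out.println(line);
    }
}
```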

Writing Fixed-Length Files

Use FormatterLineAggregator for fixed-length output. It builds each line from a format string written in java.util.Formatter syntax.

Configure the writer:

@Bean
public FlatFileItemWriter<Watch> fixedLengthWriter() throws Exception {
    BeanWrapperFieldExtractor<Watch> extractor = new BeanWrapperFieldExtractor<>();
    extractor.setNames("isin", "quantity", "price", "customer");
    extractor.afterPropertiesSet();

    FormatterLineAggregator<Watch> aggregator = new FormatterLineAggregator<>();
    aggregator.setFormat("%-12s%-3s%-5s%-9s");
    aggregator.setFieldExtractor(extractor);

    FlatFileItemWriter<Watch> writer = new FlatFileItemWriter<>();
    writer.setLineAggregator(aggregator);
    writer.setResource(new FileSystemResource("output.txt"));
    writer.setEncoding("UTF-8");
    writer.setAppendAllowed(true);
    return writer;
}

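Adjust field widths by modifying the format string, e.g., "%-14s%-5s%-7s%-11s" for wider columns. Each %-Ns specifier left-aligns its value and pads it with spaces to N characters; it does not truncate values longer than N.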
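Because the format string uses java.util.Formatter syntax, its effect can be checked with String.format alone. The sketch below is a plain-Java illustration with hypothetical names (FixedWidthFormatSketch, formatLine). It shows that short values are padded to the column width while an over-long value pushes the line past the expected 29 characters.

```java
import java.math.BigDecimal;

// Plain-Java sketch (no Spring Batch): applies the article's format string directly.
final class FixedWidthFormatSketch {

    static final String FORMAT = "%-12s%-3s%-5s%-9s";

    // Formats the four field values into one fixed-length line.
    static String formatLine(Object... fields) {
        return String.format(FORMAT, fields);
    }

    public static void main(String[] args) {
        // Values that exactly fill their columns reproduce the sample record.
        String exact = formatLine("UK21341EAH41", 211, new BigDecimal("31.11"), "customer1");
        System.out.println("[" + exact + "] length=" + exact.length());   // length=29

        // Shorter values are padded on the right with spaces.
        String padded = formatLine("UK21341EAH41", 7, new BigDecimal("5.5"), "bob");
        System.out.println("[" + padded + "] length=" + padded.length()); // length=29

        // A four-digit quantity is not truncated, so the line grows to 30 characters.
        String tooLong = formatLine("UK21341EAH41", 1000, new BigDecimal("31.11"), "customer1");
        System.out.println("[" + tooLong + "] length=" + tooLong.length()); // length=30
    }
}
```

If over-long lines must be rejected rather than written, FormatterLineAggregator also exposes setMinimumLength and setMaximumLength; check the behavior against the Spring Batch version in use.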

Test the writer:

@Test
public void testFixedLengthWriter() throws Exception {
    FlatFileItemWriter<Watch> writer = fixedLengthWriter();
    writer.open(new ExecutionContext());
    List<Watch> items = fetchData();
    Chunk<Watch> chunk = new Chunk<>(items);
    writer.write(chunk);
    writer.close();
}
