Generic CSV File Reader in Java

CSV is a widely used, portable format for storing and transferring data. It is used across many domains and industries because it is easy to read and interpret, and just as easy for a program to parse. A CSV file can be treated as tabular data with a separator between values, and the most commonly used separator is the comma. When the data itself may contain commas, people often prepare the data with some other separator, e.g. a semicolon. The choice depends on the data you want to store. In the end, as long as the separator is known, the program or CSV file reader application that reads the file will be able to process it.

Most of the time, when a Java program reads a file, it first builds a list of strings, then splits each one, and then populates the objects. Here we will see a Java CSV reader that returns fully read and processed data as instances of your domain object or data transfer object.

Software used in this example

  • Java 8
  • Eclipse

Below is the data that we have in our CSV, which our Java CSV file reader will read, process and print.

First of all, we need to define the domain object that will be mapped to the above CSV data.
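The original data and class listing are not reproduced here, so the sketch below uses a hypothetical `Employee` class. The field names (`id`, `name`, `city`) and the sample rows in the comment are assumptions made purely for illustration.

```java
// Hypothetical domain object; field names are assumptions, not the original listing.
// It could map made-up rows such as:
//   1,John,London
//   2,Priya,Pune
class Employee {
    private int id;
    private String name;
    private String city;

    // A no-argument constructor lets a reader create instances reflectively.
    Employee() { }

    int getId() { return id; }
    void setId(int id) { this.id = id; }

    String getName() { return name; }
    void setName(String name) { this.name = name; }

    String getCity() { return city; }
    void setCity(String city) { this.city = city; }

    @Override
    public String toString() {
        return "Employee{id=" + id + ", name='" + name + "', city='" + city + "'}";
    }
}
```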

Let's define our CSV reader class with some required class variables.

You can see that I have added the generic type parameter <T> to the class declaration. This makes sure that each CSV reader is associated with its domain object, and we will use that domain class to map our data. The class also has 2 constructor variants that accept different parameters such as the separator, file path and header. One important parameter that it accepts is the class of the domain object.
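A skeleton along those lines might look like the following. This is only a sketch: the class name `CsvReader`, the parameter order and the defaults (comma separator, no header row) are assumptions, not the original listing.

```java
// Skeleton only; names and defaults are assumptions for illustration.
class CsvReader<T> {
    private final String filePath;
    private final Class<T> type;      // domain class used for mapping
    private final String separator;
    private final boolean hasHeader;

    // Convenience variant: assumes a comma separator and no header row.
    CsvReader(String filePath, Class<T> type) {
        this(filePath, type, ",", false);
    }

    // Full variant: the caller chooses the separator and header handling.
    CsvReader(String filePath, Class<T> type, String separator, boolean hasHeader) {
        this.filePath = filePath;
        this.type = type;
        this.separator = separator;
        this.hasHeader = hasHeader;
    }

    String getFilePath() { return filePath; }
    Class<T> getType() { return type; }
    String getSeparator() { return separator; }
    boolean hasHeader() { return hasHeader; }
}
```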

We have to ask for this class because, even if we specify the domain object as the type argument, that information is not available at runtime due to Java type erasure.
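A small demonstration of why the class token is needed. The class and method names here (`ErasureDemo`, `sameRuntimeClass`, `newInstance`) are made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Both lists share one runtime class because the type argument is erased.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    // "new T()" does not compile, so the Class<T> token creates the instance instead.
    static <T> T newInstance(Class<T> type) {
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No usable no-argument constructor on " + type, e);
        }
    }
}
```

Passing `Employee.class` (or any domain class with a no-argument constructor) to such a method is what lets a generic reader build objects of the right type.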

The class defines an init method in which we collect the field information of our mapped domain class and read the data from the CSV file.
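Collecting the field information could be done with reflection, roughly as sketched below. `FieldCollector` and `Person` are hypothetical names. Note that the Javadoc for `getDeclaredFields()` does not promise any particular order, which is one reason an explicit ordering is useful.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class FieldCollector {
    // Returns the names of the instance fields declared directly on the given class.
    static List<String> fieldNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            // Skip static and compiler-generated fields; they are not CSV columns.
            if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            names.add(field.getName());
        }
        return names;
    }
}

// Hypothetical domain class used only to exercise the collector.
class Person {
    private int id;
    private String name;
}
```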

In the read methods, the CSV is read and each line is split to form a list of our domain objects.

If an ordering of variables is provided, it is used to map each CSV value to the field named at that position. Otherwise, the CSV values are assigned to the fields in the order in which they were collected from the domain class.
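The mapping of one line to one object might be sketched as follows. `RowMapper`, `City` and the small `convert` helper are hypothetical, and only `String`, `int` and `double` fields are handled here; the original code may well support more.

```java
import java.lang.reflect.Field;
import java.util.List;
import java.util.regex.Pattern;

class RowMapper {
    // Splits one line and assigns values[i] to the field named order.get(i).
    static <T> T map(String line, String separator, Class<T> type, List<String> order) {
        // Pattern.quote keeps separators such as "|" from being treated as regex syntax.
        String[] values = line.split(Pattern.quote(separator), -1);
        try {
            T target = type.getDeclaredConstructor().newInstance();
            for (int i = 0; i < order.size() && i < values.length; i++) {
                Field field = type.getDeclaredField(order.get(i));
                field.setAccessible(true);
                field.set(target, convert(values[i].trim(), field.getType()));
            }
            return target;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot map line: " + line, e);
        }
    }

    // Converts the raw text to the field type; unsupported types are rejected.
    private static Object convert(String raw, Class<?> fieldType) {
        if (fieldType == String.class) return raw;
        if (fieldType == int.class || fieldType == Integer.class) return Integer.valueOf(raw);
        if (fieldType == double.class || fieldType == Double.class) return Double.valueOf(raw);
        throw new IllegalArgumentException("Unsupported field type: " + fieldType);
    }
}

// Hypothetical domain class; its declaration order differs from the CSV column order.
class City {
    int population;
    String name;
}
```

For a line such as `Pune|3124458` with the ordering `name, population`, the first value goes to `name` and the second to `population`, regardless of how the fields are declared in the class.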

This simply gives you the list of domain objects mapped to the CSV file. What if you want to process them before they are passed to any other flow or application? For that, we can attach our own custom processor to the reader.

We will define a generic interface to keep the implementations consistent. This interface has only one method, which accepts the generic object as input and returns the processed object.

As a sample, we can write a processor that converts the names to upper case.
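The interface and the upper-case processor could look something like this. The names `CsvProcessor`, `Customer` and `UpperCaseNameProcessor` are assumptions for this sketch, not the names used in the original code.

```java
import java.util.Locale;

// Hypothetical single-method interface: takes an object, returns the processed object.
interface CsvProcessor<T> {
    T process(T input);
}

// Hypothetical domain class with just the field the processor touches.
class Customer {
    String name;

    Customer(String name) {
        this.name = name;
    }
}

class UpperCaseNameProcessor implements CsvProcessor<Customer> {
    @Override
    public Customer process(Customer input) {
        // Guard against rows where the name column was empty or missing.
        if (input.name != null) {
            input.name = input.name.toUpperCase(Locale.ROOT);
        }
        return input;
    }
}
```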

You can put all your transformation business logic here. This gives you a logical separation between reading and processing.

Now, how do we call it? Check it out below.
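The original calling code is not reproduced here, so this is a compact, self-contained stand-in rather than the actual reader. `SimpleCsvReader`, `Student` and `CsvDemo` are hypothetical, the separator is fixed to a comma, only `String` and `int` fields are supported, and `java.util.function.UnaryOperator` stands in for the processor interface to keep the block short.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.reflect.Field;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;
import java.util.regex.Pattern;

// Hypothetical domain class.
class Student {
    int roll;
    String name;
}

// Compact stand-in for the reader: domain class, ordering and processor are passed in.
class SimpleCsvReader<T> {
    private final Path file;
    private final Class<T> type;
    private final List<String> order;
    private final UnaryOperator<T> processor;

    SimpleCsvReader(Path file, Class<T> type, List<String> order, UnaryOperator<T> processor) {
        this.file = file;
        this.type = type;
        this.order = order;
        this.processor = processor;
    }

    List<T> read() {
        List<T> result = new ArrayList<>();
        try {
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                if (line.trim().isEmpty()) continue;   // skip blank lines
                String[] values = line.split(Pattern.quote(","), -1);
                T item = type.getDeclaredConstructor().newInstance();
                for (int i = 0; i < order.size() && i < values.length; i++) {
                    Field field = type.getDeclaredField(order.get(i));
                    field.setAccessible(true);
                    String raw = values[i].trim();
                    field.set(item, field.getType() == int.class ? (Object) Integer.valueOf(raw) : raw);
                }
                result.add(processor.apply(item));     // processing happens per row
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return result;
    }
}

class CsvDemo {
    // Writes a small temporary CSV, reads it back and returns the processed students.
    static List<Student> run() {
        try {
            Path file = Files.createTempFile("students", ".csv");
            Files.write(file, Arrays.asList("1,asha", "2,ravi"), StandardCharsets.UTF_8);
            SimpleCsvReader<Student> reader = new SimpleCsvReader<Student>(
                    file, Student.class, Arrays.asList("roll", "name"),
                    s -> { s.name = s.name.toUpperCase(Locale.ROOT); return s; });
            List<Student> students = reader.read();
            Files.delete(file);
            return students;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```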

You can see above how I have specified the domain object, ordering and processor. It's that simple 🙂

You can download this code from Git. Please let me know if you find any issues in the code or have any suggestions.

 
