Accessing and Paginating CSV Files with DataValve
- Install Maven
- Installing DataValve Into Maven
While DataValve is mostly used with database driven back ends, this tutorial shows you how DataValve can turn a comma delimited file into a paginated list of objects that the user can page through. We will then use this data provider in a console application, a Swing application and a JSF web page using the DataValve data client interfaces.
We will start with a console application and then demonstrate how the same data provider can be used in other applications, including web applications. First we create a new Maven application and add the DataValve dependencies. Next we create our comma delimited data provider, define a row mapper class for it, and hook it up to our demo data. We then attach the provider to a dataset so we can easily iterate through the data, and finally reuse the provider in different clients.
- Create a new Maven application in your IDE and add datavalve-dataset as a dependency in your pom.xml file.

    <dependencies>
        <dependency>
            <groupId>org.fluttercode.datavalve</groupId>
            <artifactId>datavalve-dataset</artifactId>
            <version>0.9.0.CR2</version>
        </dependency>
    </dependencies>
- Create a new class called Person in the package of your choice. I used org.fluttercode.tutorials.datavalve.csvreader. We will only be using one package in this demonstration.
- The Person class will be our model, which will be populated from the comma delimited file.

    import java.util.Date;

    public class Person {

        private Integer id;
        private String firstName;
        private String middleName;
        private String lastName;
        private Date dob;
        private String email;
        private String address;
        private String city;
        private String zip;

        public Person() {
        }

        public Person(Integer id, String firstName, String lastName,
                String middleName, Date dob, String email, String address,
                String city, String zip) {
            super();
            this.id = id;
            this.firstName = firstName;
            this.middleName = middleName;
            this.lastName = lastName;
            this.email = email;
            this.dob = dob;
            this.address = address;
            this.city = city;
            this.zip = zip;
        }

        // Display name built from the first and last names
        public String getName() {
            return firstName + " " + lastName;
        }

        // Address, city and zip joined into a single display string
        public String getAddressText() {
            return address + "," + city + "," + zip;
        }

        @Override
        public String toString() {
            return String.format("Person [id=%d, name=%s, dob=%s,address=%s, email=%s",
                    id, getName(), dob, getAddressText(), email);
        }

        // .... Getters and Setters Omitted ....
    }
- In order to convert the csv data to an object, we need to create a ColumnarRowMapper instance. This takes a row of data, already converted to an array of string values, builds the model object from it and returns it to the caller. Create a new class called PersonRowMapper.
- The only other element we need is some test data to run it on, which you can download from here. Download this file and place it in the package folder with your source code. Depending on your IDE, you may need to refresh the folder in the package explorer so it can pick up the new file. The example file has 100 rows of data in it.
- Create a new class called ProviderFactory which will create our provider and initialize it with the CSV file url.

    import java.net.URL;

    public class ProviderFactory {

        public static CommaDelimitedProvider<Person> createProvider() {
            // Locate data.csv relative to this class
            URL url = ProviderFactory.class.getResource("data.csv");
            CommaDelimitedProvider<Person> provider;
            provider = new CommaDelimitedProvider<Person>(url.getFile());
            provider.setRowMapper(new PersonRowMapper());
            return provider;
        }
    }
We use a URL to reference the csv file stored alongside the class, from which we get the file path that we can pass to our CommaDelimitedProvider. This relies on the data.csv file being in the same location as this class. Alternatively, you can put the file elsewhere and just create a regular File object for it rather than use the URL.
- For the console part of this demo, create a Main class and add a main method containing the code to get an instance of the provider and perform a simple iteration through the data.

    import java.util.List;

    public class Main {

        public static void main(String[] args) {
            CommaDelimitedProvider<Person> provider;
            provider = ProviderFactory.createProvider();
            // Fetch every row in one call using a default paginator
            List<Person> results = provider.fetchResults(new DefaultPaginator());
            for (Person p : results) {
                System.out.println(p);
            }
        }
    }
In this example we are reading all the results at once in memory and displaying the whole list.
The PersonRowMapper class created in the earlier step is implemented as follows:

    import java.text.ParseException;
    import java.text.SimpleDateFormat;
    import java.util.Date;

    public class PersonRowMapper implements ColumnarRowMapper<Person> {

        private static SimpleDateFormat converter = new SimpleDateFormat("MM/dd/yyyy");

        public Person mapRow(String[] values) {
            Date dob = null;
            try {
                dob = converter.parse(values[5]);
            } catch (ParseException e) {
                // leave dob as null if the date cannot be parsed
            }
            return new Person(DataConverter.getInteger(values[0]), values[1],
                    values[2], values[3], dob, values[4], values[6],
                    values[7], values[8]);
        }
    }

This row mapper implements the ColumnarRowMapper interface, whose single method returns an entity object built from the array of column values passed in.
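To make the mapping step concrete without depending on DataValve, here is a minimal plain-Java sketch of splitting one line and parsing its date column. The RowMapperSketch class, its helper methods and the sample row are hypothetical and are not part of DataValve; the column position and the MM/dd/yyyy format follow the PersonRowMapper above. It assumes simple rows with no quoted fields.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical stand-alone sketch; not part of the DataValve API.
class RowMapperSketch {

    // Parses a MM/dd/yyyy date and returns null when the value is malformed,
    // mirroring how the tutorial's mapper leaves dob unset on failure.
    static Date parseDob(String value) {
        // A new formatter per call avoids sharing SimpleDateFormat across threads.
        SimpleDateFormat format = new SimpleDateFormat("MM/dd/yyyy");
        format.setLenient(false);
        try {
            return format.parse(value);
        } catch (ParseException e) {
            return null;
        }
    }

    // Splits a simple comma delimited line (no quoted fields) into columns.
    static String[] split(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        // Made-up sample row laid out like the columns the mapper reads.
        String[] values = split("1,JANE,DOE,ANN,jane@example.com,02/29/2000,1 MAIN ST,MACON,31201");
        System.out.println("columns = " + values.length);
        System.out.println("dob parsed = " + (parseDob(values[5]) != null));
        System.out.println("bad dob parsed = " + (parseDob("not a date") != null));
    }
}
```

Returning null for an unparseable date keeps one bad value from aborting the whole read, at the cost of silently dropping the field; logging the failure would be a reasonable alternative.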
Paginating the results
We can control which results and how many are displayed by using a paginator to control the flow of data. This can be useful if you have a huge file and it would be impractical to read all the results into memory in one go. Datasets have paginators built in and use a reference to a provider to fetch the actual data.
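The idea of fetching only one page from a large file can be sketched in plain Java. This is an illustration under stated assumptions, not DataValve's implementation: PageReaderSketch and readPage are hypothetical names, and the firstResult and maxRows parameters simply borrow the paginator terminology used in this tutorial.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of page-at-a-time reading; not DataValve's implementation.
class PageReaderSketch {

    // Skips firstResult lines, then collects at most maxRows lines.
    // Reading stops as soon as the page is full, so later lines are never loaded.
    static List<String> readPage(BufferedReader reader, int firstResult, int maxRows) {
        List<String> page = new ArrayList<String>();
        try {
            String line;
            int index = 0;
            while (page.size() < maxRows && (line = reader.readLine()) != null) {
                if (index >= firstResult) {
                    page.add(line);
                }
                index++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return page;
    }

    public static void main(String[] args) {
        // An in-memory string stands in for the csv file.
        String data = "row0\nrow1\nrow2\nrow3\nrow4\n";
        BufferedReader reader = new BufferedReader(new StringReader(data));
        System.out.println(readPage(reader, 2, 2)); // prints [row2, row3]
    }
}
```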
- In the main method, instead of using the provider directly, create a dataset and pass it a reference to our data provider.

    public static void main(String[] args) {
        CommaDelimitedProvider<Person> provider;
        provider = ProviderFactory.createProvider();
        CommaDelimitedDataset<Person> ds = new CommaDelimitedDataset<Person>(provider);
        ds.setMaxRows(10);
        List<Person> results = ds.getResultList();
        for (Person p : results) {
            System.out.println(p);
        }
    }
We have set the maximum number of rows to 10, so the results returned contain at most 10 rows of data. We can use this mechanism to control the number of results fetched when they are paginated.
- The dataset classes implement the Iterable interface, which lets us iterate over the entire set of data. By setting the page size, we can control the batch size used when data is fetched while it is being iterated over. This allows us to control the flow of data, either to reduce the in-memory footprint of the source data, or to reduce latency in an otherwise slow data source, as the application can process one page of data while the provider is loading the next page.

    public static void main(String[] args) {
        CommaDelimitedProvider<Person> provider;
        provider = ProviderFactory.createProvider();
        CommaDelimitedDataset<Person> ds = new CommaDelimitedDataset<Person>(provider);
        ds.setMaxRows(5);
        for (Person p : ds) {
            System.out.println(p);
        }
    }
- You can see this process in action by extending the comma delimited data provider as an anonymous class and adding logging to the post fetch method. We do this by modifying the ProviderFactory.createProvider() method:

    public static CommaDelimitedProvider<Person> createProvider() {
        URL url = Main.class.getResource("data.csv");
        CommaDelimitedProvider<Person> provider =
                new CommaDelimitedProvider<Person>(url.getFile()) {
            @Override
            protected List<Person> doPostFetchResults(List<Person> results,
                    Paginator paginator) {
                // Log the range of rows requested for this page
                int end = paginator.getFirstResult() + paginator.getMaxRows();
                System.out.println("Fetching results from "
                        + paginator.getFirstResult() + " to " + end);
                return results;
            }
        };
        provider.setRowMapper(new PersonRowMapper());
        return provider;
    }
- If you run this now with the max rows set to 5, you will see in the log that we iterate through all the records, but we only fetch results every 5 rows since the max rows value is set to 5.
    Fetching results from 0 to 5
    Person [id=6825, name=JUANITA LAMBERT, address=139 MANNING HWY,CLYO,76604, email=mbeasley@everyma1l.biz
    Person [id=5740, name=GREG CABRERA, address=736 GENESSEE BLVD,CORDELE,17433, email=cholder@b1zmail.biz
    Person [id=8599, name=ALISSA WISE, address=205 ALICE RD,CAMILLA,14855, email=theyweb@eyec0de.net
    Person [id=9282, name=SHARON WINTERS, address=955 COHEN PIKE,TYRONE,811, email=jlogan@hotma1l.com
    Person [id=2150, name=KRISTY FRANKS, address=1471 ALEXIS PKWY,BALDWIN,85, email=jgates3@somema1l.com
    Fetching results from 5 to 10
    Person [id=9927, name=JEFF RICE, address=104 DUNDEE PKWY,HOGANSVILLE,3741, email=diedlots@b1zmail.org
    Person [id=7972, name=TAMARA BRYANT, address=1382 WOGAN BLVD,CITY OF CALHOUN,43790, email=hotworn@everyma1l.us
    Person [id=5824, name=ALISHA YANG, address=716 HOGANS DR,HARDING,58932, email=foundwrong@hotma1l.net
    Person [id=3402, name=JASON NGUYEN, address=527 MICHAEL CRES,FORT STEWART,14664, email=haveothers@ma1l2u.com
    Person [id=3620, name=LINDSEY CABRERA, address=1420 LAZELERE HTS,FORT STEWART,21650, email=askeddreams@b1zmail.com
    Fetching results from 10 to 15
    Person [id=3511, name=ANTHONY MATHEWS, address=1325 OKEY LN,THUNDERBOLT,63656, email=cfarrell@everyma1l.co.uk
    Person [id=572, name=JARED FORD, address=722 EUCLID RD,FORSYTH,42014, email=ortrying@b1zmail.co.uk
    Person [id=9720, name=AUTUMN WILLIAMS, address=260 HILL PARK,NORCROSS,58355, email=roomwhere@hotma1l.net
    Person [id=3447, name=MARION BROWN, address=739 MIDDLE PATH,MACON,9944, email=hwagner@b1zmail.co.uk
    Person [id=9356, name=HOPE HAYNES, address=1023 COOPERRIDERS CRES,STATENVILLE,34247, email=sacrificeit@eyec0de.com
    Fetching results from 15 to 20
    Person [id=2259, name=JEANNIE RANDOLPH, address=672 EDISON PATH,CENTERVILLE,48580, email=wornto@eyec0de.net
    Person [id=8264, name=RACHAEL CONLEY, address=1223 STEVENS CT,CARROLLTON,40084, email=smokewhite20@eyec0de.net
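The batch-wise iteration described in this section can be sketched independently of DataValve. The class below is a hypothetical illustration, not how the dataset classes are actually written: an in-memory list stands in for the csv file, maxRows plays the role of the page size, and fetchCount records how many page fetches a full iteration triggers.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch of batch-wise iteration; DataValve's dataset classes may differ.
class BatchIterableSketch implements Iterable<Integer> {

    private final List<Integer> source; // stands in for the slow data source
    private final int maxRows;          // page size
    int fetchCount = 0;                 // number of page fetches performed

    BatchIterableSketch(List<Integer> source, int maxRows) {
        if (maxRows <= 0) {
            throw new IllegalArgumentException("maxRows must be positive");
        }
        this.source = source;
        this.maxRows = maxRows;
    }

    // Fetches one page of at most maxRows items starting at firstResult.
    private List<Integer> fetchPage(int firstResult) {
        fetchCount++;
        int end = Math.min(firstResult + maxRows, source.size());
        return new ArrayList<Integer>(source.subList(firstResult, end));
    }

    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private List<Integer> page = new ArrayList<Integer>();
            private int nextFirstResult = 0;
            private int indexInPage = 0;

            public boolean hasNext() {
                if (indexInPage < page.size()) {
                    return true;
                }
                if (nextFirstResult >= source.size()) {
                    return false;
                }
                // Current page is exhausted, so load the next one on demand.
                page = fetchPage(nextFirstResult);
                nextFirstResult += page.size();
                indexInPage = 0;
                return !page.isEmpty();
            }

            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return page.get(indexInPage++);
            }

            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<Integer>();
        for (int i = 0; i < 12; i++) {
            rows.add(i);
        }
        BatchIterableSketch ds = new BatchIterableSketch(rows, 5);
        int seen = 0;
        for (Integer row : ds) {
            seen++;
        }
        // prints "12 rows in 3 fetches"
        System.out.println(seen + " rows in " + ds.fetchCount + " fetches");
    }
}
```

Only one page is held by the iterator at a time, which is the memory benefit the text describes; overlapping the next fetch with processing would need a background thread and is left out of this sketch.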
Creating a Swing Client
DataValve provides an interface for data access that can be used by different clients. Let's look at using our provider with a Swing JTable.
- Start by creating a new class that will be the Swing frame that contains the table and the scroll pane.
    public class CsvTableFrame extends JFrame {

        private JTable table;
        private JScrollPane pane;

        public CsvTableFrame(CommaDelimitedProvider<Person> provider) {
            initControls();
            initModel(provider);
        }

        //here we will create the table model and attach the provider
        private void initModel(CommaDelimitedProvider<Person> provider) {
        }

        //construct the gui table and scrollable panel
        private void initControls() {
            setTitle("CSV Data");
            setSize(400, 400);
            setDefaultCloseOperation(EXIT_ON_CLOSE);
            setVisible(true);
            table = new JTable();
            pane = new JScrollPane(table);
            table.setAutoResizeMode(JTable.AUTO_RESIZE_OFF);
            getContentPane().add(pane);
        }
    }
This sets up a frame containing a scrollable table.
- Now we need to implement the initModel method, which will create a ProviderTableModel and attach the provider passed in. The table model class enables us to present our data to the Swing table in a way it understands. The ProviderTableModel subclass implements a method which supplies column values to the model from the data supplied by the provider. The columns are defined in the latter half of the method, and their order determines which value is supplied to the table for each column.

    //here we will create the table model and attach the provider
    private void initModel(CommaDelimitedProvider<Person> provider) {
        ProviderTableModel<Person> model = new ProviderTableModel<Person>(provider) {
            @Override
            protected Object getColumnValue(Person person, int column) {
                switch (column) {
                case 0:
                    return person.getId();
                case 1:
                    return person.getName();
                case 2:
                    return person.getEmail();
                case 3:
                    return person.getAddressText();
                default:
                    throw new RuntimeException(
                            "Unexpected column for person object " + column);
                }
            }
        };
        //add the columns to the model
        model.addColumn("Id");
        model.addColumn("Name");
        model.addColumn("Email");
        model.addColumn("Address");
        //assign this model to the table
        table.setModel(model);
    }
- The last piece we need to change is the main method, where we will create our provider and then pass it into the creation of our Swing frame.

    public static void main(String[] args) {
        CommaDelimitedProvider<Person> provider;
        provider = ProviderFactory.createProvider();
        new CsvTableFrame(provider);
    }

Creating the CsvTableFrame initializes and shows the Swing window.
- If we run this now, we will get a Swing window which contains a table with our data in it.
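For readers unfamiliar with Swing table models, the following sketch shows the general contract a JTable expects, using only the standard AbstractTableModel class. It is a hypothetical list-backed model for illustration and says nothing about how ProviderTableModel is implemented internally; the class name and sample rows are made up.

```java
import java.util.Arrays;
import java.util.List;
import javax.swing.table.AbstractTableModel;

// Hypothetical list-backed model; ProviderTableModel's internals are not shown in this tutorial.
class ListTableModelSketch extends AbstractTableModel {

    private static final long serialVersionUID = 1L;
    private final String[] columns = {"Id", "Name"};
    private final List<String[]> rows;

    ListTableModelSketch(List<String[]> rows) {
        this.rows = rows;
    }

    // The table asks the model how many rows and columns exist...
    public int getRowCount() {
        return rows.size();
    }

    public int getColumnCount() {
        return columns.length;
    }

    @Override
    public String getColumnName(int column) {
        return columns[column];
    }

    // ...and then requests individual cell values by row and column index.
    public Object getValueAt(int row, int column) {
        return rows.get(row)[column];
    }

    public static void main(String[] args) {
        ListTableModelSketch model = new ListTableModelSketch(Arrays.asList(
                new String[] {"1", "JANE DOE"},
                new String[] {"2", "JOHN ROE"}));
        // prints "Name = JANE DOE"
        System.out.println(model.getColumnName(1) + " = " + model.getValueAt(0, 1));
    }
}
```

Because the table only ever asks for cells by index, a model is free to load rows lazily behind getValueAt, which is what makes provider-backed models practical for large files.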
The model controls the flow of the data and even includes built-in caching and look-ahead loading so no matter how big your CSV dataset is, there is no long delay while the data is loaded and converted to Java objects. In this case, we pass only the provider to the model and the model is responsible for how much data is fetched in each batch.
If you look in the log as you scroll down the list, you will see that it is loading in the data as you scroll. If you go to the end of the list, and start slowly scrolling back up, you won’t see any more loading messages until you get to the top of the list. This is because the values are cached, but if you have a large dataset, the least recently used items (i.e. those at the top of the table) are removed from the cache and therefore need re-loading when you go back to the start of the list.
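The least recently used eviction described above can be sketched with the standard LinkedHashMap class. This is only an illustration of the caching policy; the tutorial does not show how the table model's cache is actually built, and LruRowCache is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU row cache; the tutorial does not describe the model's actual cache.
class LruRowCache<V> extends LinkedHashMap<Integer, V> {

    private static final long serialVersionUID = 1L;
    private final int capacity;

    LruRowCache(int capacity) {
        // The third argument (true) orders entries by access rather than insertion.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each put; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<Integer, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruRowCache<String> cache = new LruRowCache<String>(3);
        for (int row = 0; row < 5; row++) {
            cache.put(row, "row " + row);
        }
        System.out.println(cache.containsKey(0)); // prints false (evicted)
        System.out.println(cache.containsKey(4)); // prints true
    }
}
```

With a capacity smaller than the table, rows near the top are evicted by the time you reach the bottom, which matches the reloading you see when scrolling back to the start.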
Creating a JSF Client
The DataValve API allows us to re-use providers with many different clients which we’ll demonstrate with JSF.
- Start by creating a new JSF web application, or use the Knappsack Archetypes for Maven to get started quickly. Use the basic archetype so there is no existing application in there or alternative Person objects to clash with.
- Add the datavalve-dataset API and the datavalve-faces dependencies to the project.

    <dependency>
        <groupId>org.fluttercode.datavalve</groupId>
        <artifactId>datavalve-dataset</artifactId>
        <version>0.9.0.CR2</version>
    </dependency>
    <dependency>
        <groupId>org.fluttercode.datavalve</groupId>
        <artifactId>datavalve-faces</artifactId>
        <version>0.9.0.CR2</version>
    </dependency>
- For convenience, copy the ProviderFactory.java, Person.java and PersonRowMapper.java classes over to the new project.
- Create a new class called CsvDatasetBean that will be our backing bean, holding the dataset that the JSF page works against.

    @Named("csvDataset")
    @RequestScoped
    public class CsvDatasetBean {

        CommaDelimitedDataset<Person> dataset =
                new CommaDelimitedDataset<Person>(ProviderFactory.createProvider());

        public CsvDatasetBean() {
            //initialize the dataset settings
            dataset.setMaxRows(10);
        }

        public CommaDelimitedDataset<Person> getDataset() {
            return dataset;
        }
    }
- Now that we have created the backing bean, open home.xhtml and replace the hello world text with the following:

    <?xml version="1.0" encoding="UTF-8"?>
    <ui:composition xmlns="http://www.w3.org/1999/xhtml"
        xmlns:ui="http://java.sun.com/jsf/facelets"
        xmlns:f="http://java.sun.com/jsf/core"
        xmlns:h="http://java.sun.com/jsf/html"
        xmlns:dv="http://java.sun.com/jsf/composite/datavalve"
        template="/WEB-INF/templates/template.xhtml">
        <ui:define name="content">
            <h:dataTable value="#{csvDataset.dataset.resultList}" var="v_person">
                <h:column>
                    <f:facet name="header">ID</f:facet>
                    <h:outputText value="#{v_person.id}" />
                </h:column>
                <h:column>
                    <f:facet name="header">Name</f:facet>
                    <h:outputText value="#{v_person.name}" />
                </h:column>
                <h:column>
                    <f:facet name="header">Email</f:facet>
                    <h:outputText value="#{v_person.email}" />
                </h:column>
            </h:dataTable>
            <h:form>
                <dv:simplePaginator paginator="#{csvDataset.dataset}" />
            </h:form>
        </ui:define>
    </ui:composition>
This page contains a table that takes the results from #{csvDataset.dataset.resultList} and displays the id, name and email fields. The last item, wrapped in a form, is the datavalve-faces default paginator, which allows you to page through the data. It is provided as part of datavalve-faces, and its namespace is added at the top of the page. With JSF 2.0 including support for AJAX, you can have AJAX enabled pagination by setting the attributes on the component.
Summary
This tutorial has shown how to consume csv files in a way that is re-usable across different client applications using DataValve. If you change the implementation of ProviderFactory.createProvider to return a different type of provider (e.g. JDBC, ORM, Hibernate), then as long as it returns the same kind of Person object, your code will run unchanged in the clients you create. Given the simplicity of the DataValve interfaces, it should also be straightforward to create providers for other file types, whether they are text or binary based.