  /*
  	Milyn - Copyright (C) 2006 - 2010
  
  	This library is free software; you can redistribute it and/or
  	modify it under the terms of the GNU Lesser General Public
  	License (version 2.1) as published by the Free Software
  	Foundation.
  
  	This library is distributed in the hope that it will be useful,
 	but WITHOUT ANY WARRANTY; without even the implied warranty of
 	MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 
 	See the GNU Lesser General Public License for more details:
 	http://www.gnu.org/licenses/lgpl.txt
 */
 
 package org.milyn.csv;
 
 
 import java.io.Reader;
 import java.util.List;
 import java.util.Map;


CSV Reader.

This CSV Reader can be plugged into Smooks, for example, to convert a CSV-based message stream into a stream of SAX events to be consumed by the DOMBuilder.

Configuration

To maintain a single binding instance in memory:
 <?xml version="1.0"?>
 <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:csv="http://www.milyn.org/xsd/smooks/csv-1.2.xsd">

     <csv:reader fields="" separator="" quote="" skipLines="" rootElementName="" recordElementName="">
         <csv:singleBinding beanId="" class="" />
     </csv:reader>

 </smooks-resource-list>

To maintain a java.util.List of binding instances in memory:

 <?xml version="1.0"?>
 <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:csv="http://www.milyn.org/xsd/smooks/csv-1.2.xsd">

     <csv:reader fields="" separator="" quote="" skipLines="" rootElementName="" recordElementName="">
         <csv:listBinding beanId="" class="" />
     </csv:reader>

 </smooks-resource-list>

To maintain a java.util.Map of binding instances in memory:

 <?xml version="1.0"?>
 <smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd" xmlns:csv="http://www.milyn.org/xsd/smooks/csv-1.2.xsd">

     <csv:reader fields="" separator="" quote="" skipLines="" rootElementName="" recordElementName="">
         <csv:mapBinding beanId="" class="" keyField="" />
     </csv:reader>

 </smooks-resource-list>
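
The three binding modes differ only in what is kept in memory under the configured beanId. The following is a minimal plain-Java sketch of that difference, not Smooks code: it assumes records have already been split into field maps, and the class BindingModesSketch and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Smooks.
class BindingModesSketch {

    // singleBinding: each new record replaces the previous one,
    // so only one instance is held at any time.
    static Map<String, String> single(List<Map<String, String>> records) {
        Map<String, String> current = null;
        for (Map<String, String> record : records) {
            current = record;
        }
        return current;
    }

    // listBinding: every record is appended to a java.util.List.
    static List<Map<String, String>> list(List<Map<String, String>> records) {
        return new ArrayList<>(records);
    }

    // mapBinding: every record is stored in a java.util.Map,
    // keyed by the value of the field named by keyField.
    static Map<String, Map<String, String>> map(List<Map<String, String>> records, String keyField) {
        Map<String, Map<String, String>> byKey = new LinkedHashMap<>();
        for (Map<String, String> record : records) {
            byKey.put(record.get(keyField), record);
        }
        return byKey;
    }
}
```

With a mapBinding, the keyField must be one of the configured field names; the reader rejects the configuration otherwise.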

Strict parsing

Strict parsing was the only option until Smooks 1.2.x: lines that did not comply with the configured fields (that is, lines containing fewer tokens than the number expected) were skipped and a WARN log statement was issued. You can now choose to have those lines parsed as well by setting strict="false" on the config.
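
The effect of the strict flag can be sketched in plain Java as follows. This is a simplified illustration that mirrors the length check in the parse method further down; the class StrictParsingSketch and the method accept are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the strict/non-strict decision.
class StrictParsingSketch {

    // Returns the records that would be passed on, given the number
    // of (non-ignored) fields expected by the configuration.
    static List<String[]> accept(List<String[]> records, int expectedCount, boolean strict) {
        List<String[]> accepted = new ArrayList<>();
        for (String[] record : records) {
            if (record.length < expectedCount && strict) {
                // Strict: a short line is skipped (the reader logs a WARN).
                continue;
            }
            // Non-strict: a short line is kept and parsed as far as it goes.
            accepted.add(record);
        }
        return accepted;
    }
}
```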

String manipulation functions

String manipulation functions can be defined per field. These functions are executed before the data is converted into SAX events. The functions are defined after the field name, separated from it by a question mark. A field definition with string functions could look like this: firstname?trim,lastname?right_trim,gender?upper_case. See the Smooks manual for a list of all available functions.
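
A minimal sketch of how a single field definition such as firstname?trim splits into a name and a function, following the same indexOf('?') approach as buildFields further down. The class FieldDefinitionSketch is hypothetical, and its apply method handles only the three function names mentioned above, as a stand-in for the real function executor.

```java
import java.util.Locale;

// Hypothetical illustration; the real reader delegates to StringFunctionExecutor.
class FieldDefinitionSketch {
    final String name;
    final String function; // null when no function is defined

    FieldDefinitionSketch(String definition) {
        String trimmed = definition.trim();
        int q = trimmed.indexOf('?');
        if (q >= 0) {
            name = trimmed.substring(0, q);
            String fn = trimmed.substring(q + 1);
            function = fn.isEmpty() ? null : fn;
        } else {
            name = trimmed;
            function = null;
        }
    }

    // Applies the function to a raw CSV value before it is emitted.
    String apply(String value) {
        if (function == null) {
            return value;
        }
        switch (function) {
            case "trim":
                return value.trim();
            case "right_trim":
                return value.replaceAll("\\s+$", "");
            case "upper_case":
                return value.toUpperCase(Locale.ROOT);
            default:
                throw new IllegalArgumentException("Unknown function: " + function);
        }
    }
}
```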

Ignoring Fields

To ignore a field in a CSV record set, just insert the string "$ignore$" for that field in the fields attribute.
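
The parseIgnoreFieldDirective method further down also reads an optional suffix after the token: a number of columns to skip, or "+" to skip the rest of the line. A minimal plain-Java sketch of that column-skipping logic, with the hypothetical class name IgnoreFieldsSketch and a simple map standing in for the generated elements:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of how ignore tokens shift the column index.
class IgnoreFieldsSketch {
    static final String IGNORE_FIELD = "$ignore$";

    // 1 for "$ignore$", N for "$ignore$N", and
    // Integer.MAX_VALUE (rest of the line) for "$ignore$+".
    static int columnsToSkip(String field) {
        String op = field.substring(IGNORE_FIELD.length());
        if (op.isEmpty()) {
            return 1;
        }
        if ("+".equals(op)) {
            return Integer.MAX_VALUE;
        }
        return Integer.parseInt(op);
    }

    // Maps the non-ignored field names onto the values of one CSV record.
    static Map<String, String> bind(String[] fields, String[] record) {
        Map<String, String> out = new LinkedHashMap<>();
        int column = 0;
        for (String field : fields) {
            if (field.startsWith(IGNORE_FIELD)) {
                int toSkip = columnsToSkip(field);
                if (toSkip == Integer.MAX_VALUE) {
                    break; // nothing further on this line is bound
                }
                column += toSkip;
                continue;
            }
            if (column < record.length) {
                out.put(field, record[column]);
            }
            column++;
        }
        return out;
    }
}
```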

Simple Java Bindings

A simple Java binding can be configured on the reader configuration. This allows quick binding configuration where the CSV records map cleanly to the target bean. For more complex bindings, use the Java Binding Framework.

Example Usage

So the following configuration could be used to parse a CSV stream into a stream of SAX events:
 <csv:reader fields="name,address,$ignore$,item,quantity" />

Within Smooks, the "Acme-Order-List" message (parsed by this reader) will generate a stream of SAX events equivalent to the following:

 <csv-set>
  <csv-record number="1">
   <name>Tom Fennelly</name>
   <address>Ireland</address>
   <item>V1234</item>
   <quantity>3</quantity>
  </csv-record>
  <csv-record number="2">
   <name>Joe Bloggs</name>
   <address>England</address>
   <item>D9123</item>
   <quantity>7</quantity>
  </csv-record>
 </csv-set>

Other profile based transformations can then be used to transform the CSV records in accordance with the requirements of the consuming entities.
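
As a rough illustration of what "a stream of SAX events" means here, the sketch below pushes CSV records into a standard org.xml.sax.ContentHandler using the default element names shown above. It is a simplified stand-in for the reader, not its implementation: the class CsvSaxSketch is hypothetical, it handles only the plain "$ignore$" token, and it uses empty strings for namespace URIs and "CDATA" as the attribute type.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical illustration of turning CSV records into SAX events.
class CsvSaxSketch {

    static void emit(ContentHandler handler, String[] fields, String[][] records) throws SAXException {
        AttributesImpl none = new AttributesImpl();
        handler.startDocument();
        handler.startElement("", "csv-set", "csv-set", none);
        int number = 0;
        for (String[] record : records) {
            number++; // first record is number "1"
            AttributesImpl attrs = new AttributesImpl();
            attrs.addAttribute("", "number", "number", "CDATA", Integer.toString(number));
            handler.startElement("", "csv-record", "csv-record", attrs);
            for (int i = 0; i < fields.length && i < record.length; i++) {
                if ("$ignore$".equals(fields[i])) {
                    continue; // ignored column: no element is emitted
                }
                handler.startElement("", fields[i], fields[i], none);
                handler.characters(record[i].toCharArray(), 0, record[i].length());
                handler.endElement("", fields[i], fields[i]);
            }
            handler.endElement("", "csv-record", "csv-record");
        }
        handler.endElement("", "csv-set", "csv-set");
        handler.endDocument();
    }
}
```

Any ContentHandler can consume these events, for example a DefaultHandler subclass that collects them or a handler that builds a DOM.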

Author(s):
tfennelly
public class CSVReader implements SmooksXMLReader, VisitorAppender {
	private static Log logger = LogFactory.getLog(CSVReader.class);
    private static Attributes EMPTY_ATTRIBS = new AttributesImpl();
    private static final String IGNORE_FIELD = "$ignore$";
    private static char[] INDENT_LF = new char[] {'\n'};
    private static char[] INDENT_1  = new char[] {'\t'};
    private static char[] INDENT_2  = new char[] {'\t', '\t'};
    private static String RECORD_NUMBER_ATTR = "number";
    private static String RECORD_TRUNCATED_ATTR = "truncated";
    private ContentHandler contentHandler;
    private ExecutionContext execContext;
    @ConfigParam(name = "fields")
    private String[] csvFields;
    private Field[] fields;
    @ConfigParam(defaultVal = ",")
    private char separator;
    @ConfigParam(name = "quote-char", defaultVal = "\"")
    private char quoteChar;
    @ConfigParam(name = "skip-line-count", defaultVal = "0")
    private int skipLines;
    @ConfigParam(defaultVal = "UTF-8")
    private Charset encoding;
    @ConfigParam(defaultVal="csv-set")
    private String rootElementName;
    @ConfigParam(defaultVal="csv-record")
    private String recordElementName;
    @ConfigParam(defaultVal="false")
    private boolean indent;
    @ConfigParam(defaultVal="true")
    private boolean strict;
	@ConfigParam(defaultVal = "false")
	private boolean validateHeader;
    private String bindBeanId;
    private Class<?> bindBeanClass;
    private CSVBindingType bindingType;
    private String bindMapKeyField;
    private static final String RECORD_BEAN = "csvRecordBean";
	@Initialize
	public void initialize() {
		buildFields();
	}
    public void addVisitors(VisitorConfigMap visitorMap) {
        if(bindBeanId != null && bindBeanClass != null) {
            Bean bean;
            if(bindingType == CSVBindingType.LIST) {
                Bean listBean = new Bean(ArrayList.class, bindBeanId, "$document");
                bean = listBean.newBean(bindBeanClass, recordElementName);
                listBean.bindTo(bean);
                addFieldBindings(bean);
                listBean.addVisitors(visitorMap);
            } else if(bindingType == CSVBindingType.MAP) {
                if(bindMapKeyField == null) {
                    throw new SmooksConfigurationException("CSV 'MAP' Binding must specify a 'keyField' property on the binding configuration.");
                }
                assertValidFieldName(bindMapKeyField);
                Bean mapBean = new Bean(LinkedHashMap.class, bindBeanId, "$document");
                Bean recordBean = new Bean(bindBeanClass, RECORD_BEAN, recordElementName);
                MapBindingWiringVisitor wiringVisitor = new MapBindingWiringVisitor(bindMapKeyField, bindBeanId);
                addFieldBindings(recordBean);
                mapBean.addVisitors(visitorMap);
                recordBean.addVisitors(visitorMap);
                visitorMap.addVisitor(wiringVisitor, recordElementName, null, false);
            } else {
                bean = new Bean(bindBeanClass, bindBeanId, recordElementName);
                addFieldBindings(bean);
                bean.addVisitors(visitorMap);
            }
        }
    }
    private void addFieldBindings(Bean bean) {
        for(Field field : fields) {
            if(!field.ignore()) {
                bean.bindTo(field.getName(), recordElementName + "/" + field.getName());
            }
        }
    }
	private void buildFields() {
		// Parse input fields to extract names and lengths
        Field[] fields = new Field[this.csvFields.length];
    	for(int i = 0; i < this.csvFields.length; i++) {
    		// Extract information about the field
            String fieldInfos = this.csvFields[i].trim();
            String fieldName = fieldInfos;
            StringFunctionExecutor stringFunctionExecutor = null;
            if(fieldInfos.indexOf('?') >= 0) {
                fieldName = fieldInfos.substring(0, fieldInfos.indexOf('?'));
                String functionDefinition = fieldInfos.substring(fieldInfos.indexOf('?')+1);
                if(functionDefinition.length() != 0) {
                    stringFunctionExecutor = StringFunctionExecutor.getInstance(functionDefinition);
                }
            }
            fields[i] = new Field(fieldName, stringFunctionExecutor);
    	}
    	this.fields = fields;
	}
    /* (non-Javadoc)
	 * @see org.milyn.xml.SmooksXMLReader#setExecutionContext(org.milyn.container.ExecutionContext)
	 */
	public void setExecutionContext(ExecutionContext request) {
		this.execContext = request;
	}
	/* (non-Javadoc)
	 * @see org.xml.sax.XMLReader#parse(org.xml.sax.InputSource)
	 */
	public void parse(InputSource csvInputSource) throws IOException, SAXException {
        if(contentHandler == null) {
            throw new IllegalStateException("'contentHandler' not set.  Cannot parse CSV stream.");
        }
        if(execContext == null) {
            throw new IllegalStateException("'execContext' not set.  Cannot parse CSV stream.");
        }
        try {
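			Reader csvStreamReader;
			au.com.bytecode.opencsv.CSVReader csvLineReader;
	        String[] csvRecord;
			// Get a reader for the CSV source...
	        csvStreamReader = csvInputSource.getCharacterStream();
	        if(csvStreamReader == null) {
	            csvStreamReader = new InputStreamReader(csvInputSource.getByteStream(), encoding);
	        }
	        // Create the CSV line reader...
	        csvLineReader = new au.com.bytecode.opencsv.CSVReader(csvStreamReader, separator, quoteChar, skipLines);
			if (validateHeader) {
				validateHeader(csvLineReader);
			}
	        // Start the document and add the root "csv-set" element...
	        contentHandler.startDocument();
	        contentHandler.startElement(XMLConstants.NULL_NS_URI, rootElementName, StringUtils.EMPTY, EMPTY_ATTRIBS);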
	        // Output each of the CVS line entries...
	        int lineNumber = 0;
	        int expectedCount = getExpectedColumnsCount();
	        while ((csvRecord = csvLineReader.readNext()) != null) {
	        	lineNumber++; // First line is line "1"
	        	if(csvRecord.length < expectedCount && strict) {
	        		logger.warn("[CORRUPT-CSV] CSV line #" + lineNumber + " invalid [" + Arrays.asList(csvRecord) + "].  The line should contain number of items at least as in CSV config file " + csvFields.length + " fields [" + Arrays.asList(csvFields) + "], but contains " + csvRecord.length + " fields.  Ignoring!!");
	        		continue;
	        	}
	            if(indent) {
	                contentHandler.characters(INDENT_LF, 0, 1);
	                contentHandler.characters(INDENT_1, 0, 1);
	            }
	            AttributesImpl attrs = new AttributesImpl();
	            // If we reached here it means that this line has to be in the sax stream
	            // hence we first add the record number attribute on the csv-record element
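	            attrs.addAttribute(XMLConstants.NULL_NS_URI, RECORD_NUMBER_ATTR, RECORD_NUMBER_ATTR, "xs:int", Integer.toString(lineNumber));
	            // if this line is truncated, we add the truncated attribute onto the csv-record element
	            if (csvRecord.length < expectedCount) {
	            	attrs.addAttribute(XMLConstants.NULL_NS_URI, RECORD_TRUNCATED_ATTR, RECORD_TRUNCATED_ATTR, "xs:boolean", Boolean.TRUE.toString());
	            }
	            contentHandler.startElement(XMLConstants.NULL_NS_URI, recordElementName, StringUtils.EMPTY, attrs);
	        	int recordIt = 0;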
	            for(Field field : fields) {
	                String fieldName = field.getName();
	                if(field.ignore()) {
	                	int toSkip = parseIgnoreFieldDirective(fieldName);
	                	if(toSkip == Integer.MAX_VALUE){
	                		break;
	                	}
	                	recordIt += toSkip;
	                	continue;
	                }
	                if(indent) {
	                    contentHandler.characters(INDENT_LF, 0, 1);
	                    contentHandler.characters(INDENT_2, 0, 2);
	                }
	                // Don't insert the element if the csv record does not contain it!!
	                if (recordIt < csvRecord.length) {
	                	String value = csvRecord[recordIt];
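	                    contentHandler.startElement(XMLConstants.NULL_NS_URI, fieldName, StringUtils.EMPTY, EMPTY_ATTRIBS);
	                    StringFunctionExecutor stringFunctionExecutor = field.getStringFunctionExecutor();
	                    if(stringFunctionExecutor != null) {
	                    	value = stringFunctionExecutor.execute(value);
	                    }
	                    contentHandler.characters(value.toCharArray(), 0, value.length());
	                    contentHandler.endElement(XMLConstants.NULL_NS_URI, fieldName, StringUtils.EMPTY);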
	                }
	                if(indent) {
	                }
	                recordIt++;
	            }
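	            if(indent) {
	                contentHandler.characters(INDENT_LF, 0, 1);
	                contentHandler.characters(INDENT_1, 0, 1);
	            }
	            contentHandler.endElement(null, recordElementName, StringUtils.EMPTY);
	        }
	        if(indent) {
	            contentHandler.characters(INDENT_LF, 0, 1);
	        }
	        // Close out the "csv-set" root element and end the document..
	        contentHandler.endElement(XMLConstants.NULL_NS_URI, rootElementName, StringUtils.EMPTY);
	        contentHandler.endDocument();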
        } finally {
        	// These properties need to be reset for every execution (e.g. when reader is pooled).
        	contentHandler = null;
        	execContext = null;
        }
	}
	private void validateHeader(final au.com.bytecode.opencsv.CSVReader reader) throws IOException {
		String[] headers = reader.readNext();
		if (headers == null) {
			throw new CSVHeaderValidationException(getFieldNames(fields));
		}
		if (validateHeader(fields, headers)) {
			return;
		}
		throw new CSVHeaderValidationException(getFieldNames(fields), headers);
	}
	private String[] getFieldNames(final Field[] fields) {
		if (fields == null) {
			return new String[] {};
		}
		String[] names = new String[fields.length];
		int n = 0;
		for (Field field : fields) {
			if (!field.ignore()) {
				names[n] = field.getName();
			}
			n++;
		}
		return names;
	}
	private boolean validateHeader(final Field[] fields, final String[] headers) {
		if (fields.length != headers.length) {
			return false;
		}
		int n = 0;
		for (Field field : fields) {
			if (!field.ignore()) {
				if (headers.length <= n) {
					return false;
				}
				String header = headers[n];
				if (header == null) {
					header = "";
				}
				String name = field.getName();
				if (name == null) {
					name = "";
				}
				if (!name.equals(header)) {
					return false;
				}
			}
			n++;
		}
		return true;
	}
	private int parseIgnoreFieldDirective(String field) {
        String op = field.substring(IGNORE_FIELD.length());
        int toSkip = 0;
        if (op.length() == 0) {
            toSkip = 1;
        } else if ("+".equals(op)) {
            toSkip = Integer.MAX_VALUE;
        } else {
            toSkip = Integer.parseInt(op);
        }
        return toSkip;
    }
    private int getExpectedColumnsCount() {
        int count = 0;
        for (Field field : fields) {
            if (!field.ignore()) {
                count++;
            }
        }
        return count;
    }
    public void setContentHandler(ContentHandler contentHandler) {
        this.contentHandler = contentHandler;
    }
    public ContentHandler getContentHandler() {
        return contentHandler;
    }
    private void assertValidFieldName(String fieldName) {
        for(Field field : fields) {
            if(field.getName().equals(fieldName)) {
                return;
            }
        }
        String fieldNames = "";
        for(Field field : fields) {
        	if(!field.ignore()) {
                if(fieldNames.length() > 0) {
                    fieldNames += ", ";
                }
                fieldNames += field.getName();
            }
        }
        throw new SmooksConfigurationException("Invalid field name '" + fieldName + "'.  Valid names: [" + fieldNames + "].");
    }

    
    // The following methods are currently unimplemented...
    public void parse(String systemId) throws IOException, SAXException {
        throw new UnsupportedOperationException("Operation not supported by this reader.");
    }
    public boolean getFeature(String name) throws SAXNotRecognizedException,
            SAXNotSupportedException {
        return false;
    }
    public void setFeature(String name, boolean value)
            throws SAXNotRecognizedException, SAXNotSupportedException {
    }
    public DTDHandler getDTDHandler() {
        return null;
    }
    public void setDTDHandler(DTDHandler arg0) {
    }
    public EntityResolver getEntityResolver() {
        return null;
    }
    public void setEntityResolver(EntityResolver arg0) {
    }
    public ErrorHandler getErrorHandler() {
        return null;
    }
    public void setErrorHandler(ErrorHandler arg0) {
    }
    public Object getProperty(String name) throws SAXNotRecognizedException,
            SAXNotSupportedException {
        return null;
    }
    public void setProperty(String name, Object value)
            throws SAXNotRecognizedException, SAXNotSupportedException {
    }
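    private class Field {
    	private final String name;
    	private final StringFunctionExecutor stringFunctionExecutor;
    	private final boolean ignore;
        public Field(String name, StringFunctionExecutor stringFunctionExecutor) {
			this.name = name;
			this.stringFunctionExecutor = stringFunctionExecutor;
			// Ignore tokens may carry a suffix (see parseIgnoreFieldDirective), so match on the prefix.
			this.ignore = name.startsWith(IGNORE_FIELD);
		}
		public String getName() {
			return name;
		}
		public boolean ignore() {
			return ignore;
		}
		public StringFunctionExecutor getStringFunctionExecutor() {
			return stringFunctionExecutor;
		}
		public String toString() {
			ToStringBuilder builder = new ToStringBuilder(this);
			builder.append("name", name)
				   .append("stringFunctionExecutor", stringFunctionExecutor);
			return builder.toString();
		}
    }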
    private class MapBindingWiringVisitor implements DOMVisitAfter, SAXVisitAfter, Consumer {
        private MVELExpressionEvaluator keyExtractor = new MVELExpressionEvaluator();
        private String mapBindingKey;
        private MapBindingWiringVisitor(String bindKeyField, String mapBindingKey) {
            keyExtractor.setExpression(RECORD_BEAN + "." + bindKeyField);
            this.mapBindingKey = mapBindingKey;
        }
        public void visitAfter(Element element, ExecutionContext executionContext) throws SmooksException {
            wireObject(executionContext);
        }
        public void visitAfter(SAXElement element, ExecutionContext executionContext) throws SmooksException, IOException {
            wireObject(executionContext);
        }
		private void wireObject(ExecutionContext executionContext) {
            BeanContext beanContext = executionContext.getBeanContext();
            Map<String, Object> beanMap = beanContext.getBeanMap();
            Object key = keyExtractor.getValue(beanMap);
            @SuppressWarnings("unchecked") //TODO: Optimize to use the BeanId object
            Map<Object, Object> map = (Map<Object, Object>) beanContext.getBean(mapBindingKey);
            Object record = beanContext.getBean(RECORD_BEAN);
            map.put(key, record);
        }
        public boolean consumes(Object object) {
            if(keyExtractor.getExpression().indexOf(object.toString()) != -1) {
                return true;
            }
            return false;
        }
    }
}