wlfxb - a library for creating and processing of TCF data streams. Copyright (C) University of Tübingen. This file is part of wlfxb. This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details. You should have received a copy of the GNU Lesser General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.
 
 package eu.clarin.weblicht.wlfxb.io;
 
 
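 import java.io.InputStream;
 import java.io.OutputStream;
 import java.util.EnumSet;
 import java.util.List;
 import java.util.NoSuchElementException;
 import javax.xml.bind.JAXBContext;
 import javax.xml.bind.JAXBException;
 import javax.xml.bind.Marshaller;
 import javax.xml.bind.Unmarshaller;
 import javax.xml.namespace.QName;
 import javax.xml.stream.XMLEventReader;
 import javax.xml.stream.XMLEventWriter;
 import javax.xml.stream.XMLInputFactory;
 import javax.xml.stream.XMLOutputFactory;
 import javax.xml.stream.XMLStreamConstants;
 import javax.xml.stream.XMLStreamException;
 import javax.xml.stream.events.XMLEvent;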
Class LexiconStreamed represents TCF Lexicon annotations, which hold linguistic information on a list of words. The class is used for accessing selected annotation layers of a Lexicon and, optionally, adding new annotation layers to it. Only the annotation layers specified in the constructor are loaded into memory. To load all annotation layers into memory, use the eu.clarin.weblicht.wlfxb.xb.WLData class instead.

Author(s):
Yana Panchenko
 
 public class LexiconStreamed extends LexiconStored {
 
     private EnumSet<LexiconLayerTag> layersToRead;
     private EnumSet<LexiconLayerTag> readSucceeded = EnumSet.noneOf(LexiconLayerTag.class);
     private XMLEventReader xmlEventReader;
     private XMLEventWriter xmlEventWriter;
     private XmlReaderWriter xmlReaderWriter;
     private static final int LAYER_INDENT_RELATIVE = 1;
     private boolean closed = false;

    
Creates a LexiconStreamed from the given TCF input stream and specified annotation layers.

Parameters:
inputStream the underlying input stream with linguistic annotations in TCF format.
layersToRead the annotation layers of Lexicon that should be read into this LexiconStreamed.
Throws:
WLFormatException if an error in input format or an I/O error occurs.
 
     public LexiconStreamed(InputStream inputStream,
             EnumSet<LexiconLayerTag> layersToRead)
             throws WLFormatException {
         super("unknown");
         this.layersToRead = layersToRead;
         try {
             initializeReaderAndWriter(inputStream, null, false);
             process();
         } catch (WLFormatException e) {
             xmlReaderWriter.close();
             throw e;
         }
     }

    
Creates a LexiconStreamed from the given TCF input stream, specified annotation layers and the output stream.

Parameters:
inputStream the underlying input stream with linguistic annotations in TCF format.
layersToRead the annotation layers of Lexicon that should be read into this LexiconStreamed.
outputStream the underlying output stream into which the annotations from the input stream and any newly created annotations will be written (in TCF format).
Throws:
WLFormatException if an error in input format or an I/O error occurs.
    public LexiconStreamed(InputStream inputStream,
            EnumSet<LexiconLayerTag> layersToRead, OutputStream outputStream)
            throws WLFormatException {
        super("unknown");
        this.layersToRead = layersToRead;
        try {
            initializeReaderAndWriter(inputStream, outputStream, false);
            process();
        } catch (WLFormatException e) {
            xmlReaderWriter.close();
            throw e;
        }
    }

    
Creates a LexiconStreamed from the given TCF input stream, specified annotation layers and the output stream, optionally writing the output as an XML fragment.

Parameters:
inputStream the underlying input stream with linguistic annotations in TCF format.
layersToRead the annotation layers of Lexicon that should be read into this LexiconStreamed.
outputStream the underlying output stream into which the annotations from the input stream and any newly created annotations will be written (in TCF format).
outputAsXmlFragment true if the output should not contain XML headers, false otherwise.
Throws:
WLFormatException if an error in input format or an I/O error occurs.
    public LexiconStreamed(InputStream inputStream,
            EnumSet<LexiconLayerTag> layersToRead, OutputStream outputStream,
            boolean outputAsXmlFragment)
            throws WLFormatException {
        super("unknown");
        this.layersToRead = layersToRead;
        try {
            initializeReaderAndWriter(inputStream, outputStream, outputAsXmlFragment);
            process();
        } catch (WLFormatException e) {
            xmlReaderWriter.close();
            throw e;
        }
    }

    
Creates a LexiconStreamed from the given TCF input stream, specified annotation layers, output stream and meta data.

Parameters:
inputStream the underlying input stream with linguistic annotations in TCF format.
layersToRead the annotation layers of Lexicon that should be read into this LexiconStreamed.
outputStream the underlying output stream into which the annotations from the input stream and any newly created annotations will be written (in TCF format).
metaDataToAdd meta data to be added to the output TCF.
Throws:
WLFormatException if an error in input format or an I/O error occurs.
    public LexiconStreamed(InputStream inputStream,
            EnumSet<LexiconLayerTag> layersToRead, OutputStream outputStream,
            List<MetaDataItem> metaDataToAdd)
            throws WLFormatException {
        super("unknown");
        this.layersToRead = layersToRead;
        try {
            initializeReaderAndWriter(inputStream, outputStream, false);
            addMetadata(metaDataToAdd);
            process();
        } catch (WLFormatException e) {
            xmlReaderWriter.close();
            throw e;
        }
    }
    private void initializeReaderAndWriter(InputStream inputStream, OutputStream outputStream, boolean outputAsXmlFragment) throws WLFormatException {
        if (inputStream != null) {
            try {
                XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
                xmlEventReader = xmlInputFactory.createXMLEventReader(inputStream, "UTF-8");
            } catch (XMLStreamException e) {
                throw new WLFormatException(e.getMessage(), e);
            }
        }
        if (outputStream != null) {
            try {
                XMLOutputFactory xmlOutputFactory = XMLOutputFactory.newInstance();
                xmlEventWriter = new IndentingXMLEventWriter(xmlOutputFactory.createXMLEventWriter(outputStream, "UTF-8"));
            } catch (XMLStreamException e) {
                throw new WLFormatException(e.getMessage(), e);
            }
        }
        xmlReaderWriter.setOutputAsXmlFragment(outputAsXmlFragment);
    }
    private void addMetadata(List<MetaDataItem> metaDataToAdd) throws WLFormatException {
        try {
            marshall(metaDataToAdd);
            // rewrite metadata end element
            XMLEvent event = xmlEventReader.nextEvent();
            xmlReaderWriter.add(event);
        } catch (XMLStreamException e) {
            throw new WLFormatException(e.getMessage(), e);
        } catch (NoSuchElementException e) {
            throw new WLFormatException(e.getMessage(), e);
        }
    }
    private void process() throws WLFormatException {
        try {
            // process Lexicon start element
            XMLEvent event = xmlEventReader.nextEvent();
            super.lang = event.asStartElement().getAttributeByName(new QName("lang")).getValue();
            // add processed Lexicon start back
            xmlReaderWriter.add(event);
            // read layers requested stopping before Lexicon end element
            processLayers();
            super.connectLayers();
            // if no writing requested finish reading the document
            if (xmlEventWriter == null) {
                xmlReaderWriter.readWriteToTheEnd();
            }
        } catch (XMLStreamException e) {
            throw new WLFormatException(e.getMessage(), e);
        } catch (NoSuchElementException e) {
            throw new WLFormatException(e.getMessage(), e);
        }
        // every requested layer must have been found in the input
        if (layersToRead.size() != readSucceeded.size()) {
            layersToRead.removeAll(readSucceeded);
            throw new WLFormatException("Following layers could not be read: " + layersToRead.toString());
        }
    }
    private void processLayers() throws WLFormatException {
        boolean textCorpusEnd = false;
        XMLEvent peekedEvent;
        try {
            peekedEvent = xmlEventReader.peek();
            while (!textCorpusEnd && peekedEvent != null) {
                if (peekedEvent.getEventType() == XMLStreamConstants.END_ELEMENT
                        && peekedEvent.asEndElement().getName().getLocalPart().equals(LexiconStored.XML_NAME)) {
                    // root end tag reached: stop without consuming it
                    textCorpusEnd = true;
                } else if (peekedEvent.getEventType() == XMLStreamConstants.START_ELEMENT) {
                    processLayer();
                    peekedEvent = xmlEventReader.peek();
                } else {
                    // any other event is passed through to the output unchanged
                    XMLEvent readEvent = xmlReaderWriter.readEvent();
                    xmlReaderWriter.add(readEvent);
                    peekedEvent = xmlEventReader.peek();
                }
            }
        } catch (XMLStreamException e) {
            throw new WLFormatException(e.getMessage(), e);
        }
        if (!textCorpusEnd) {
            throw new WLFormatException(LexiconStored.XML_NAME + " end tag not found");
        }
    }
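The peek-then-dispatch loop in processLayers() can be tried out with plain StAX, without wlfxb. The following is only a minimal sketch under stated assumptions: PeekLoopSketch and listLayers are hypothetical names, the XmlReaderWriter helper is replaced by direct XMLEventReader and XMLEventWriter calls, and a depth counter stands in for processLayer() consuming each layer as a whole. It copies every event to the writer and stops without consuming the end tag of the root element.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical demo class; not part of wlfxb.
final class PeekLoopSketch {

    /**
     * Copies the document to an in-memory writer up to (not including) the
     * end tag of rootName and returns the names of the root's direct children.
     */
    static List<String> listLayers(String xml, String rootName) {
        List<String> layers = new ArrayList<>();
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            XMLEventWriter writer = XMLOutputFactory.newInstance()
                    .createXMLEventWriter(new StringWriter());
            // Copy everything up to and including the root start element.
            XMLEvent event = reader.nextEvent();
            while (!event.isStartElement()) {
                writer.add(event);
                event = reader.nextEvent();
            }
            writer.add(event);
            int depth = 0; // nesting depth below the root element
            XMLEvent peeked = reader.peek();
            while (peeked != null) {
                if (depth == 0
                        && peeked.getEventType() == XMLStreamConstants.END_ELEMENT
                        && peeked.asEndElement().getName().getLocalPart().equals(rootName)) {
                    return layers; // root end tag is left unread
                }
                event = reader.nextEvent();
                if (event.isStartElement()) {
                    if (depth == 0) {
                        layers.add(event.asStartElement().getName().getLocalPart());
                    }
                    depth++;
                } else if (event.isEndElement()) {
                    depth--;
                }
                writer.add(event); // pass the event through unchanged
                peeked = reader.peek();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e.getMessage(), e);
        }
        throw new IllegalStateException(rootName + " end tag not found");
    }
}
```

Calling PeekLoopSketch.listLayers with a small Lexicon document and the root name "Lexicon" returns the local names of the layer elements in document order. Peeking rather than reading is what lets the loop stop before the root end tag, so that further layers can still be written ahead of it.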
    private void processLayer() throws WLFormatException {
        XMLEvent peekedEvent;
        try {
            peekedEvent = xmlEventReader.peek();
            // now we assume that this event is start of a Lexicon layer
            String tagName = peekedEvent.asStartElement().getName().getLocalPart();
            LexiconLayerTag layerTag = LexiconLayerTag.getFromXmlName(tagName);
            if (layerTag == null) { // unknown layer, just add it to output
                xmlReaderWriter.readWriteElement(tagName);
            } else if (this.layersToRead.contains(layerTag)) { // known layer, and is requested for reading
                // add it to the output, but store its data
                readLayerData(layerTag);
            } else { // known layer, and is not requested for reading
                // just add it to the output
                xmlReaderWriter.readWriteElement(tagName);
            }
        } catch (XMLStreamException e) {
            throw new WLFormatException(e.getMessage(), e);
        }
    }
    private void readLayerData(LexiconLayerTag layerTag) throws WLFormatException {
        JAXBContext context;
        Unmarshaller unmarshaller;
        try {
            context = JAXBContext.newInstance(layerTag.getLayerClass());
            unmarshaller = context.createUnmarshaller();
            LexiconLayerStoredAbstract layer = (LexiconLayerStoredAbstract) unmarshaller.unmarshal(xmlEventReader);
            super.layersInOrder[layerTag.ordinal()] = layer;
            // write the layer that was just read straight back to the output
            marshall(super.layersInOrder[layerTag.ordinal()]);
        } catch (JAXBException e) {
            throw new WLFormatException(e.getMessage(), e);
        }
        readSucceeded.add(layerTag);
    }
    private void marshall(LexiconLayer layer) throws WLFormatException {
        if (xmlEventWriter == null) {
            return;
        }
        JAXBContext context;
        try {
            context = JAXBContext.newInstance(layer.getClass());
            Marshaller marshaller = context.createMarshaller();
            marshaller.setProperty(Marshaller.JAXB_FRAGMENT, true);
            marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
            marshaller.marshal(layer, xmlEventWriter);
        } catch (JAXBException e) {
            throw new WLFormatException(e.getMessage(), e);
        } catch (XMLStreamException e) {
            throw new WLFormatException(e.getMessage(), e);
        }
    }
    private void marshall(List<MetaDataItem> metaDataToAdd) throws WLFormatException {
        if (xmlEventWriter == null) {
            return;
        }
        JAXBContext context;
        try {
            context = JAXBContext.newInstance(MetaDataItem.class);
            Marshaller marshaller = context.createMarshaller();
            marshaller.setProperty(Marshaller.JAXB_FRAGMENT, true);
            marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
            for (MetaDataItem mdi : metaDataToAdd) {
                marshaller.marshal(mdi, xmlEventWriter);
            }
        } catch (JAXBException e) {
            throw new WLFormatException(e.getMessage(), e);
        } catch (XMLStreamException e) {
            throw new WLFormatException(e.getMessage(), e);
        }
    }

    
Closes the input and output streams associated with this object and releases any associated system resources. Before the streams are closed, all in-memory annotations of the LexiconStreamed and the unprocessed part of the input stream are written to the output stream. It is therefore important to call close(), so that all in-memory annotations are saved to the output stream. Once the LexiconStreamed has been closed, adding further annotations has no effect on the output stream.

Throws:
WLFormatException if an error in input format or an I/O error occurs.
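    public void close() throws WLFormatException {
        if (closed) {
            return;
        }
        closed = true;
        try {
            // layers read from the input were already written back when they were read
            boolean[] layersRead = new boolean[super.layersInOrder.length];
            for (LexiconLayerTag layerRead : readSucceeded) {
                layersRead[layerRead.ordinal()] = true;
            }
            // write only the layers that were added after reading
            for (int i = 0; i < super.layersInOrder.length; i++) {
                if (super.layersInOrder[i] != null && !layersRead[i] // && !super.layersInOrder[i].isEmpty()
                        ) {
                    marshall(super.layersInOrder[i]);
                }
            }
        } finally {
            xmlReaderWriter.readWriteToTheEnd();
        }
    }
}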
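The layer bookkeeping used by process() and close() can be illustrated in isolation. This is a minimal sketch, assuming a hypothetical three-value Tag enum in place of LexiconLayerTag and a plain Object[] in place of layersInOrder; LayerBookkeepingSketch, missing and toWriteOnClose are illustrative names, not wlfxb API. Unlike process(), which calls removeAll on layersToRead itself, the sketch works on a copy so the requested set stays intact.

```java
import java.util.EnumSet;

// Hypothetical demo class; not part of wlfxb.
final class LayerBookkeepingSketch {

    /** Hypothetical stand-in for LexiconLayerTag. */
    enum Tag { LEMMAS, POSTAGS, FREQUENCIES }

    /** Returns the requested tags that were not read; empty if all were read. */
    static EnumSet<Tag> missing(EnumSet<Tag> requested, EnumSet<Tag> readSucceeded) {
        EnumSet<Tag> result = EnumSet.copyOf(requested);
        result.removeAll(readSucceeded);
        return result;
    }

    /**
     * Marks the slots that still have to be written on close: a layer is
     * written only if it is present and was not read from the input, because
     * layers read from the input were written back right after reading.
     */
    static boolean[] toWriteOnClose(Object[] layersInOrder, EnumSet<Tag> readSucceeded) {
        boolean[] layersRead = new boolean[layersInOrder.length];
        for (Tag tag : readSucceeded) {
            layersRead[tag.ordinal()] = true; // slots are indexed by enum ordinal
        }
        boolean[] write = new boolean[layersInOrder.length];
        for (int i = 0; i < layersInOrder.length; i++) {
            write[i] = layersInOrder[i] != null && !layersRead[i];
        }
        return write;
    }
}
```

With LEMMAS and FREQUENCIES requested and only LEMMAS read, missing returns a set containing FREQUENCIES, which corresponds to the case where process() throws a WLFormatException. Indexing the array by ordinal keeps the output in the enum's declaration order regardless of the order in which layers were added.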