wlfxb - a library for creating and processing of TCF data streams. Copyright (C) Yana Panchenko. This file is part of wlfxb. This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details. You should have received a copy of the GNU Lesser General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.
 
 package eu.clarin.weblicht.wlfxb.tc.api;
 
 import java.util.List;

Interface TextCorpus represents TCF TextCorpus annotations and corresponds to the TCF TextCorpus specification. These annotations represent linguistic annotations on written connected text. The annotations are divided into annotation layers, where each layer represents a specific linguistic aspect. For example, a TextCorpus can contain TokensLayer, PosTagsLayer, ConstituentParsingLayer, etc. In a TextCorpus, annotations from any layer usually annotate (directly or indirectly) Token annotations from TokensLayer. An exception is TextLayer, which is independent of any other layer. See also: TCF Format description.

Author(s):
Yana Panchenko
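
To make the layer model described above concrete, here is a minimal, self-contained sketch. It does not use wlfxb: the names LayerModelSketch, MiniCorpus, MiniToken and MiniPosTag are hypothetical stand-ins, chosen only to illustrate how annotations in one layer point at Token annotations while the text layer stands on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in types for illustration; they are NOT part of wlfxb.
final class LayerModelSketch {

    /** A token annotation: an identifier plus the token's surface string. */
    static final class MiniToken {
        final String id;
        final String string;

        MiniToken(String id, String string) {
            this.id = id;
            this.string = string;
        }
    }

    /** A part-of-speech annotation that refers to a token by its identifier. */
    static final class MiniPosTag {
        final String tokenId;
        final String tag;

        MiniPosTag(String tokenId, String tag) {
            this.tokenId = tokenId;
            this.tag = tag;
        }
    }

    /** A tiny corpus with a text layer, a tokens layer and a POS layer. */
    static final class MiniCorpus {
        final String language;
        String text; // text layer: independent of all other layers
        final List<MiniToken> tokens = new ArrayList<>();
        final List<MiniPosTag> posTags = new ArrayList<>();

        MiniCorpus(String language) {
            this.language = language;
        }

        MiniToken addToken(String string) {
            MiniToken token = new MiniToken("t" + (tokens.size() + 1), string);
            tokens.add(token);
            return token;
        }

        MiniPosTag addPosTag(String tag, MiniToken token) {
            MiniPosTag posTag = new MiniPosTag(token.id, tag);
            posTags.add(posTag);
            return posTag;
        }

        /** Resolves a POS annotation back to the token it annotates, or null. */
        MiniToken tokenOf(MiniPosTag posTag) {
            for (MiniToken token : tokens) {
                if (token.id.equals(posTag.tokenId)) {
                    return token;
                }
            }
            return null;
        }
    }

    static MiniCorpus buildExample() {
        MiniCorpus corpus = new MiniCorpus("en");
        corpus.text = "Dogs bark";
        MiniToken dogs = corpus.addToken("Dogs");
        MiniToken bark = corpus.addToken("bark");
        corpus.addPosTag("NNS", dogs);
        corpus.addPosTag("VBP", bark);
        return corpus;
    }

    public static void main(String[] args) {
        MiniCorpus corpus = buildExample();
        for (MiniPosTag posTag : corpus.posTags) {
            System.out.println(corpus.tokenOf(posTag).string + "/" + posTag.tag);
        }
    }
}
```

The point of the sketch is the direction of the references: the POS layer stores token identifiers rather than copies of the tokens, so several layers can annotate the same tokens independently.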
 
 public interface TextCorpus {

    
Gets the language of the text/tokens in this TextCorpus.

Returns:
language of TextCorpus.
 
     public String getLanguage();

    
Gets all annotation layers of this TextCorpus.

Returns:
annotation layers.
 
     public List<TextCorpusLayer> getLayers();

    
Gets text layer of this TextCorpus.

Returns:
annotation layer containing text.
 
     public TextLayer getTextLayer();

    
Creates empty TextLayer in this TextCorpus.

Returns:
annotation layer that has been created.
 
     public TextLayer createTextLayer();

    
Gets tokens layer of this TextCorpus.

Returns:
annotation layer containing tokens.
 
     public TokensLayer getTokensLayer();

    
Creates empty TokensLayer in this TextCorpus.

Returns:
annotation layer that has been created.
 
     public TokensLayer createTokensLayer();

    
Creates empty TokensLayer in this TextCorpus.

Parameters:
hasCharOffsets true if the Token objects in this TokensLayer will carry their character offsets in the text, false otherwise.
Returns:
annotation layer that has been created.
 
     public TokensLayer createTokensLayer(boolean hasCharOffsets);
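
As an illustration of what a character offset for a token means, the following sketch splits a string on whitespace and records where each token starts and ends. It is not wlfxb code: TokenOffsetsSketch and OffsetToken are hypothetical names, and the half-open [start, end) convention is an assumption made for this example, not something the interface above specifies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of token character offsets; not wlfxb code.
final class TokenOffsetsSketch {

    /** A token string with its assumed [start, end) character span in the text. */
    static final class OffsetToken {
        final String string;
        final int start; // index of the first character, inclusive
        final int end;   // index after the last character, exclusive

        OffsetToken(String string, int start, int end) {
            this.string = string;
            this.start = start;
            this.end = end;
        }
    }

    /** Splits on whitespace and records where each token starts and ends. */
    static List<OffsetToken> tokenize(String text) {
        List<OffsetToken> result = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            // Skip any whitespace between tokens.
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume the token up to the next whitespace character.
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i > start) {
                result.add(new OffsetToken(text.substring(start, i), start, i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (OffsetToken token : tokenize("Dogs  bark loudly")) {
            System.out.println(token.string + " [" + token.start + ", " + token.end + ")");
        }
    }
}
```

With offsets stored this way, the original substring can be recovered with text.substring(start, end), which is what makes the offsets useful for aligning tokens with the text layer.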

    
Gets lemmas layer of this TextCorpus.

Returns:
layer containing lemma annotations on Token objects from TokensLayer.
    public LemmasLayer getLemmasLayer();

    
Creates empty LemmasLayer in this TextCorpus.

Returns:
annotation layer that has been created.
    public LemmasLayer createLemmasLayer();

    
Gets part-of-speech layer of this TextCorpus.

Returns:
layer containing part-of-speech annotations on Token objects from TokensLayer.
    public PosTagsLayer getPosTagsLayer();

    
Creates empty PosTagsLayer with the given tagset in this TextCorpus.

Parameters:
tagset of the part-of-speech annotations.
Returns:
annotation layer that has been created.
    public PosTagsLayer createPosTagsLayer(String tagset);

    
Gets sentences layer of this TextCorpus.

Returns:
layer containing sentence boundary annotations on Token objects from TokensLayer.
    public SentencesLayer getSentencesLayer();

    
Creates empty SentencesLayer in this TextCorpus.

Returns:
annotation layer that has been created.
    public SentencesLayer createSentencesLayer();

    
Creates empty SentencesLayer in this TextCorpus.

Parameters:
hasCharOffsets true if the Sentence objects in this SentencesLayer will carry their character offsets in the text, false otherwise.
Returns:
annotation layer that has been created.
    public SentencesLayer createSentencesLayer(boolean hasCharOffsets);

    
Gets constituent parsing layer of this TextCorpus.

Returns:
layer containing constituent parsing annotations on Token objects from TokensLayer.
    public ConstituentParsingLayer getConstituentParsingLayer();

    
Creates empty ConstituentParsingLayer with the given tagset in this TextCorpus.

Parameters:
tagset of the parsing annotations.
Returns:
annotation layer that has been created.
    public ConstituentParsingLayer createConstituentParsingLayer(String tagset);

    
Gets dependency parsing layer of this TextCorpus.

Returns:
layer containing dependency parsing annotations on Token objects from TokensLayer.
    public DependencyParsingLayer getDependencyParsingLayer();

    
Creates empty DependencyParsingLayer in this TextCorpus.

Parameters:
multipleGovernorsPossible true if a dependent can be governed by more than 1 governor, false otherwise.
emptyTokensPossible true if dependency annotations can contain empty tokens.
Returns:
annotation layer that has been created.
    public DependencyParsingLayer createDependencyParsingLayer(
            boolean multipleGovernorsPossible, boolean emptyTokensPossible);

    
Creates empty DependencyParsingLayer with the given tagset in this TextCorpus.

Parameters:
tagset of the functions between dependent and governor.
multipleGovernorsPossible true if a dependent can be governed by more than 1 governor, false otherwise.
emptyTokensPossible true if dependency annotations can contain empty tokens.
Returns:
annotation layer that has been created.
    public DependencyParsingLayer createDependencyParsingLayer(String tagset,
            boolean multipleGovernorsPossible, boolean emptyTokensPossible);
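
The multipleGovernorsPossible parameter distinguishes analyses in which every dependent has at most one governor from those in which a dependent may have several. The sketch below shows one way to check which case a set of dependency edges falls into. GovernorCheckSketch and Edge are hypothetical names, not wlfxb types, and the check assumes each (dependent, governor) pair is listed only once; nothing here claims that wlfxb performs such a check itself.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; these types are not part of wlfxb.
final class GovernorCheckSketch {

    /** One dependency edge: a dependent token governed by a governor token. */
    static final class Edge {
        final String dependentId;
        final String governorId;

        Edge(String dependentId, String governorId) {
            this.dependentId = dependentId;
            this.governorId = governorId;
        }
    }

    /**
     * Returns true if some dependent appears under more than one governor.
     * Assumes each (dependent, governor) pair occurs at most once in the list.
     */
    static boolean hasMultipleGovernors(List<Edge> edges) {
        Set<String> seenDependents = new HashSet<>();
        for (Edge edge : edges) {
            // add() returns false when this dependent was already recorded.
            if (!seenDependents.add(edge.dependentId)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Edge> singleGovernor = Arrays.asList(new Edge("t1", "t2"), new Edge("t3", "t2"));
        List<Edge> sharedDependent = Arrays.asList(new Edge("t1", "t2"), new Edge("t1", "t3"));
        System.out.println(hasMultipleGovernors(singleGovernor));
        System.out.println(hasMultipleGovernors(sharedDependent));
    }
}
```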

    
Gets morphology layer of this TextCorpus.

Returns:
layer containing morphological analysis annotations on Token objects from TokensLayer.
    public MorphologyLayer getMorphologyLayer();
    
    
Creates empty MorphologyLayer in this TextCorpus.

Returns:
annotation layer that has been created.
    public MorphologyLayer createMorphologyLayer();

    
Creates empty MorphologyLayer in this TextCorpus.

Parameters:
hasSegmentation true if morphology annotations contain segmentation analysis.
Returns:
annotation layer that has been created.
    public MorphologyLayer createMorphologyLayer(boolean hasSegmentation);

    
Creates empty MorphologyLayer in this TextCorpus.

Parameters:
hasSegmentation true if morphology annotations contain segmentation analysis.
hasCharOffsets true if the MorphologyAnalysis objects in this layer will carry character offsets for the segmentation within the token, false otherwise.
Returns:
annotation layer that has been created.
    public MorphologyLayer createMorphologyLayer(boolean hasSegmentation, boolean hasCharOffsets);

    
Gets named entities layer of this TextCorpus.

Returns:
layer containing named entity annotations on Token objects from TokensLayer.
    public NamedEntitiesLayer getNamedEntitiesLayer();

    
Creates empty NamedEntitiesLayer with the given tagset for named entity types in this TextCorpus.

Parameters:
entitiesType tagset of the named entity annotations.
Returns:
annotation layer that has been created.
    public NamedEntitiesLayer createNamedEntitiesLayer(String entitiesType);

    
Gets references layer of this TextCorpus.

Returns:
layer containing reference/coreference annotations on Token objects from TokensLayer.
    public ReferencesLayer getReferencesLayer();

    
Creates empty ReferencesLayer in this TextCorpus, ready to be filled in with the references data.

Parameters:
typetagset tagset for the mention type values of the references (should be null if no types are defined)
reltagset tagset for relation values between the references (should be null if no relations are defined)
externalReferencesSource name of external source (should be null if entities from the external source are not referenced)
Returns:
annotation layer that has been created.
    public ReferencesLayer createReferencesLayer(String typetagset, String reltagset, String externalReferencesSource);
    @SuppressWarnings("deprecation")
    @SuppressWarnings("deprecation")
    public RelationsLayer createRelationsLayer(String type);

    
Gets matches layer of this TextCorpus.

Returns:
layer containing match annotations on Token objects from TokensLayer.
    public MatchesLayer getMatchesLayer();

    
Creates empty MatchesLayer in this TextCorpus, ready to be filled in with the corpus match annotations.

Parameters:
queryLanguage language of the query used to extract corpus matches from a corpus.
queryString the query used to extract corpus matches from a corpus.
Returns:
annotation layer that has been created.
    public MatchesLayer createMatchesLayer(String queryLanguage, String queryString);

    
Gets word splitting layer of this TextCorpus.

Returns:
layer containing split annotations (e.g. hyphenation) on Token objects from TokensLayer.
    public WordSplittingLayer getWordSplittingLayer();

    
Creates empty WordSplittingLayer with the given type of the splitting in this TextCorpus.

Parameters:
type of the splitting, e.g. hyphenation.
Returns:
annotation layer that has been created.
    public WordSplittingLayer createWordSplittingLayer(String type);

    
Gets phonetics layer of this TextCorpus.

Returns:
layer containing phonetic transcriptions of Token objects from TokensLayer.
    public PhoneticsLayer getPhoneticsLayer();

    
Creates empty PhoneticsLayer with the given alphabet for phonetic transcriptions in this TextCorpus.

Parameters:
alphabet of the phonetic transcription annotations.
Returns:
annotation layer that has been created.
    public PhoneticsLayer createPhotenicsLayer(String alphabet);

    
Gets geo layer of this TextCorpus.

Returns:
layer containing geographical location annotations on Token objects from TokensLayer.
    public GeoLayer getGeoLayer();

    
Creates empty GeoLayer in this TextCorpus.

Parameters:
source of the geographical coordinates.
coordFormat format of the geographical coordinates.
Returns:
annotation layer that has been created.
    public GeoLayer createGeoLayer(String source, GeoLongLatFormat coordFormat);

    
Creates empty GeoLayer in this TextCorpus.

Parameters:
source of the geographical coordinates.
coordFormat format of the geographical coordinates.
conitentFormat format of the continent (in case no continent is specified should be null).
countryFormat format of the country (in case no country is specified should be null).
capitalFormat format of the capital (in case no capital is specified should be null).
Returns:
annotation layer that has been created.
    public GeoLayer createGeoLayer(String source, GeoLongLatFormat coordFormat,
            GeoContinentFormat conitentFormat, GeoCountryFormat countryFormat, GeoCapitalFormat capitalFormat);

    
Gets orthography layer of this TextCorpus.

Returns:
layer containing correct orthographic spellings of misspelled Token objects from TokensLayer.
    public OrthographyLayer getOrthographyLayer();

    
Creates empty OrthographyLayer in this TextCorpus.

Returns:
annotation layer that has been created.
    public OrthographyLayer createOrthographyLayer();

    
Gets text structure layer of this TextCorpus.

Returns:
layer containing original text structure (such as paragraphs, lines, pages, etc.), anchored on Token objects from TokensLayer.
    public TextStructureLayer getTextStructureLayer();

    
Creates empty TextStructureLayer in this TextCorpus.

Returns:
annotation layer that has been created.
    public TextStructureLayer createTextStructureLayer();

    
Gets synonymy layer of this TextCorpus.

Returns:
layer containing synonyms of Lemma objects from LemmasLayer.
    public LexicalSemanticsLayer getSynonymyLayer();

    
Creates empty synonymy layer in this TextCorpus.

Returns:
annotation layer that has been created.
    public LexicalSemanticsLayer createSynonymyLayer();

    
Gets antonymy layer of this TextCorpus.

Returns:
layer containing antonyms of Lemma objects from LemmasLayer.
    public LexicalSemanticsLayer getAntonymyLayer();

    
Creates empty antonymy layer in this TextCorpus.

Returns:
annotation layer that has been created.
    public LexicalSemanticsLayer createAntonymyLayer();

    
Gets hyponymy layer of this TextCorpus.

Returns:
layer containing hyponyms of Lemma objects from LemmasLayer.
    public LexicalSemanticsLayer getHyponymyLayer();

    
Creates empty hyponymy layer in this TextCorpus.

Returns:
annotation layer that has been created.
    public LexicalSemanticsLayer createHyponymyLayer();

    
Gets hyperonymy layer of this TextCorpus.

Returns:
layer containing hyperonyms of Lemma objects from LemmasLayer.
    public LexicalSemanticsLayer getHyperonymyLayer();

    
Creates empty hyperonymy layer in this TextCorpus.

Returns:
annotation layer that has been created.
    public LexicalSemanticsLayer createHyperonymyLayer();

    
Gets discourse connectives layer of this TextCorpus.

Returns:
layer containing discourse connectives annotations on Token objects from TokensLayer.
    public DiscourseConnectivesLayer getDiscourseConnectivesLayer();

    
Creates empty DiscourseConnectivesLayer in this TextCorpus.

Returns:
annotation layer that has been created.
    public DiscourseConnectivesLayer createDiscourseConnectivesLayer();

    
Creates empty DiscourseConnectivesLayer in this TextCorpus.

Parameters:
typeTagset tagset used to label semantic types of the connectives
Returns:
annotation layer that has been created.
    public DiscourseConnectivesLayer createDiscourseConnectivesLayer(String typeTagset);
    
    
Gets word senses layer of this TextCorpus.

Returns:
layer containing word sense annotations on Token objects from TokensLayer.
    public WordSensesLayer getWordSensesLayer();
    
    
Creates empty WordSensesLayer in this TextCorpus.

Parameters:
source from where the word senses are taken
Returns:
annotation layer that has been created.
    public WordSensesLayer createWordSensesLayer(String source);