* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
import static org.apache.stanbol.enhancer.engines.dbpspotlight.utils.XMLParser.getElementsByTagName;
* TODO (Note by rwesten 2012-08-22)
* Added functionality here to extract DBpedia
* Ontology types for Annotations. This is mainly to
* choose the best dc:type for the fise:TextAnnotations
* created for an Annotation.
* This is based on the assumption that the most generic
* dbpedia type is always the last one in the returned list.
* In addition, "DBpedia:TopicalConcept" is ignored, first
* because it seems not to be used by dbpedia.org and second
* because it is always listed last (even after schema
* and freebase types) and would therefore be considered
* the most generic dbpedia type.
* I do not like this solution and would like to find
* a better one.
Introduced this to ignore the "TopicalConcept" type.
//TODO: change this to a list with the parsed types
// Processing of XML results should be done during parsing
//NOTE rwesten: changed this to embed a SurfaceForm so that I
// can reuse code for creating fise:TextAnnotations
// make the returned types referenceable
.format("[uri=%s, support=%d, types=%s, surfaceForm=\"%s\", similarityScore=%s, percentageOfSecondRank=%s]",
xmlDoc An XML document containing annotations.
//set the type of the surface form
//set the last type in the list - the most general one - as type
//for the surface form
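//NOTE: a minimal sketch of the type-selection heuristic described in the
// comments above: walk the type list from the end, skip
// "DBpedia:TopicalConcept" and return the first remaining DBpedia type.
// The class name DbpediaTypeSelector, the method name and the "DBpedia:"
// prefix check are assumptions made for illustration; they are not taken
// from the original class.

```java
import java.util.List;

/** Hypothetical helper illustrating the heuristic; not part of the original code. */
final class DbpediaTypeSelector {

    // Assumed prefix used to recognise DBpedia ontology types in the list.
    static final String DBPEDIA_PREFIX = "DBpedia:";

    // The type that the TODO note says must be ignored.
    static final String IGNORED_TYPE = "DBpedia:TopicalConcept";

    private DbpediaTypeSelector() {
        // static helper, no instances
    }

    /**
     * Returns the last DBpedia type in the list, skipping the ignored type,
     * or null if the list contains no other DBpedia type.
     * This relies on the assumption stated above that the most generic
     * DBpedia type comes last; null elements are not handled.
     */
    static String mostGenericDbpediaType(List<String> types) {
        for (int i = types.size() - 1; i >= 0; i--) {
            String type = types.get(i).trim();
            if (type.startsWith(DBPEDIA_PREFIX) && !type.equals(IGNORED_TYPE)) {
                return type;
            }
        }
        return null;
    }
}
```

// With the list ["DBpedia:Politician", "DBpedia:Person", "DBpedia:Agent",
// "Schema:Person", "DBpedia:TopicalConcept"] this sketch returns
// "DBpedia:Agent". Returning null when no DBpedia type remains leaves the
// fallback dc:type decision to the caller.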