DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER Copyright 2010 IBM. All rights reserved. Use is subject to license terms. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0. You can also obtain a copy of the License at http://odftoolkit.org/docs/license.txt Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
 
 package org.odftoolkit.simple.common;
 
This is a subclass of DefaultElementVisitor that extracts the display text from an ODF element. For example, to get all of the text content in a slide's notes, call getOdfElement() to get the ODF element of the notes, then pass it to newOdfTextExtractor to create a TextExtractor. Finally, call getText(), which returns all of the text content as a string. An easier way is to pass the ODF element directly to the static method TextExtractor.getText(OdfElement).

If you pass the content root, which you can get from Document.getContentRoot(), as the parameter, the whole document content is returned without any tag information.

This extractor implements part of the ODF white space handling. The visit() methods for text:p, text:h, text:s, text:tab and text:linebreak are overridden to process white space according to the ODF specification.
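The override-per-element-type design described above can be illustrated with a minimal, self-contained visitor sketch. Every class name here (Elem, Para, Run, Tab, Visitor, SketchExtractor) is hypothetical and stands in for the ODF Toolkit types; the choice to end a paragraph with '\r' is an assumption based on the NewLineChar constant in this file, not a confirmed detail of the library.

```java
import java.util.List;

// Hypothetical stand-ins for ODF element classes; none of these names come from ODF Toolkit.
abstract class Elem {
    abstract void accept(Visitor v);
}

class Para extends Elem {
    final List<Elem> children;
    Para(List<Elem> children) { this.children = children; }
    @Override void accept(Visitor v) { v.visit(this); }
}

class Run extends Elem {
    final String text;
    Run(String text) { this.text = text; }
    @Override void accept(Visitor v) { v.visit(this); }
}

class Tab extends Elem {
    @Override void accept(Visitor v) { v.visit(this); }
}

// Default visitor: every visit() is a no-op, so subclasses override only what they need.
class Visitor {
    void visit(Para p) { }
    void visit(Run r) { }
    void visit(Tab t) { }
}

// Extractor that overrides visit() per element type to emit text and white space.
class SketchExtractor extends Visitor {
    private static final char NEW_LINE = '\r'; // assumed to match NewLineChar
    private final StringBuilder out = new StringBuilder();

    @Override void visit(Para p) {
        for (Elem child : p.children) {
            child.accept(this); // double dispatch selects the matching overload
        }
        out.append(NEW_LINE); // assumption: a paragraph ends with a new line
    }

    @Override void visit(Run r) { out.append(r.text); }

    @Override void visit(Tab t) { out.append('\t'); }

    static String extract(Elem root) {
        SketchExtractor extractor = new SketchExtractor();
        root.accept(extractor);
        return extractor.out.toString();
    }
}
```

Because the default visitor ignores unknown element types, a subclass changes the handling of one element by overriding a single visit() overload and leaves the rest untouched.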

 
 public class TextExtractor extends DefaultElementVisitor {
 
 	protected static final char NewLineChar = '\r';
 	protected static final char TabChar = '\t';
 	protected final ExtractorStringBuilder mTextBuilder;
This class provides string builder functions to the extractor. It automatically handles the last NewLineChar.

Since:
0.3.5
 
 	protected static class ExtractorStringBuilder {
 		private boolean lastAppendNewLine;
 
 			 = new StringBuilder();
 			 = false;
 		}

Append a string

Parameters:
str - the string
 
 		public void append(String str) {
 		}

Append a character

Parameters:
ch - the character
 
 		public void append(char ch) {
 		}

Append a new line character at the end
		public void appendLine() {
		}

Return the string value.

If the last character is a new line character that was appended with appendLine(), that character is removed.

		public String toString() {
			}
			return .toString();
		}
	}
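The method bodies of ExtractorStringBuilder are not visible in this listing, so the following is only a sketch of the documented behavior: a builder that remembers whether its last append came from appendLine() and drops that trailing new line in toString(). The class name TrailingNewlineBuilder and the field name builder are hypothetical, and resetting the flag on ordinary appends is an assumption.

```java
// Hypothetical sketch; class and field names are assumptions, not the library's.
class TrailingNewlineBuilder {
    private static final char NEW_LINE = '\r'; // same value as NewLineChar in the listing
    private final StringBuilder builder = new StringBuilder();
    private boolean lastAppendNewLine = false;

    void append(String str) {
        builder.append(str);
        lastAppendNewLine = false; // assumed: ordinary text clears the flag
    }

    void append(char ch) {
        builder.append(ch);
        lastAppendNewLine = false; // assumed: ordinary characters clear the flag
    }

    void appendLine() {
        builder.append(NEW_LINE);
        lastAppendNewLine = true;
    }

    @Override
    public String toString() {
        // Return a copy without the trailing new line rather than mutating the
        // builder, so repeated toString() calls give the same result.
        if (lastAppendNewLine) {
            return builder.substring(0, builder.length() - 1);
        }
        return builder.toString();
    }
}
```

In this sketch a new line added through append(char) is kept, since only appendLine() sets the flag, which matches the wording of the toString() description.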

Return the text content of an element as a String.

Parameters:
ele - the ODF element
Returns:
the text content of the element
	public static synchronized String getText(OdfElement ele) {
		TextExtractor extractor = newOdfTextExtractor(ele);
		return extractor.getText();
	}

Create a TextExtractor instance for the specified ODF element, whose text content can be extracted with getText().

Parameters:
element - the ODF element whose text will be extracted.
Returns:
an instance of TextExtractor
	public static TextExtractor newOdfTextExtractor(OdfElement element) {
		return new TextExtractor(element);
	}

Return the text content of the specified ODF element as a string.

Returns:
the text content as a string
	public String getText() {
	}

Default constructor
	protected TextExtractor() {
	}

Constructor with an ODF element as parameter

Parameters:
element - the ODF element whose text will be extracted.
	protected TextExtractor(OdfElement element) {
		 = element;
	}

End users need not care about this method unless they want to override the text content handling strategy of OdfElement.

	public void visit(OdfElement element) {
		}
	}

End users need not care about this method unless they want to override the text content handling strategy of text:p.

	public void visit(TextPElement ele) {
	}

End users need not care about this method unless they want to override the text content handling strategy of text:h.

	public void visit(TextHElement ele) {
	}

End users need not care about this method unless they want to override the text content handling strategy of text:s.

	public void visit(TextSElement ele) {
		Integer count = ele.getTextCAttribute();
		if (count == null) {
			count = 1;
		}
		for (int i = 0; i < count; i++) {
		}
	}
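The loop body above is not visible in this listing. As a hedged illustration of the text:s handling it suggests, the sketch below expands a space count into that many space characters and falls back to one space when the count is absent. The class SpaceRun and the method expand are hypothetical names, and appending a plain ' ' per iteration is an assumption.

```java
// Hypothetical helper; mirrors the null check and loop in visit(TextSElement).
class SpaceRun {
    // count stands in for the optional text:c attribute; null means it is absent.
    static String expand(Integer count) {
        int n = (count == null) ? 1 : count;
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(' '); // assumed: one space character per iteration
        }
        return sb.toString();
    }
}
```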

End users need not care about this method unless they want to override the text content handling strategy of text:tab.

	public void visit(TextTabElement ele) {
	}

End users need not care about this method unless they want to override the text content handling strategy of text:linebreak.

	public void visit(TextLineBreakElement ele) {
	}

Append the text content of this element to the string buffer.

Parameters:
ele - the ODF element whose text will be appended.
	protected void appendElementText(OdfElement ele) {
		Node node = ele.getFirstChild();
		while (node != null) {
			if (node.getNodeType() == Node.TEXT_NODE) {
				mTextBuilder.append(node.getNodeValue());
			} else if (node.getNodeType() == Node.ELEMENT_NODE) {
				OdfElement element = (OdfElement) node;
				element.accept(this);
			}
			node = node.getNextSibling();
		}
	}
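The child walk in appendElementText can be tried without ODF Toolkit on a plain W3C DOM tree. The sketch below uses only JDK classes; DomTextWalk and textOf are hypothetical names, and it recurses directly where the method above dispatches through accept(this) so that per-element visit() overrides can run.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; it only mirrors the traversal order of appendElementText.
class DomTextWalk {
    // Append text nodes in document order and recurse into child elements.
    static void appendElementText(Element ele, StringBuilder out) {
        Node node = ele.getFirstChild();
        while (node != null) {
            if (node.getNodeType() == Node.TEXT_NODE) {
                out.append(node.getNodeValue());
            } else if (node.getNodeType() == Node.ELEMENT_NODE) {
                appendElementText((Element) node, out);
            }
            node = node.getNextSibling();
        }
    }

    // Parse an XML string and return its text content without any tags.
    static String textOf(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            StringBuilder out = new StringBuilder();
            appendElementText(root, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(textOf("<p>Hello <b>big</b> world</p>"));
    }
}
```

Other node types, such as comments, are skipped by both branches, so only character data reaches the output.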