Posted to java-commits@lucene.apache.org by us...@apache.org on 2009/08/26 00:09:47 UTC
svn commit: r807831 - /lucene/java/trunk/src/java/org/apache/lucene/analysis/package.html
Author: uschindler
Date: Tue Aug 25 22:09:46 2009
New Revision: 807831
URL: http://svn.apache.org/viewvc?rev=807831&view=rev
Log:
More AttributeSource usage tips. addAttribute/getAttribute often leads to misunderstandings (even among the core developers) :-)
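For readers unfamiliar with the distinction the log message refers to, here is a minimal, self-contained sketch of the three lookup behaviours described in the modified documentation. MiniAttributeSource and MiniTermAttribute are hypothetical stand-ins written for this illustration; they are not Lucene classes and do not reproduce Lucene's implementation. The generic method signatures are a convenience of the sketch (the examples in the diff cast the returned value).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified model of an attribute registry. NOT Lucene's AttributeSource.
class MiniAttributeSource {
    // At most one instance is stored per attribute type.
    private final Map<Class<?>, Object> attributes = new LinkedHashMap<>();

    // Returns the already registered instance of this type, or creates,
    // registers, and returns a new one. Safe to call repeatedly.
    <T> T addAttribute(Class<T> type) {
        Object existing = attributes.get(type);
        if (existing != null) {
            return type.cast(existing);
        }
        try {
            T created = type.getDeclaredConstructor().newInstance();
            attributes.put(type, created);
            return created;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + type.getName(), e);
        }
    }

    // Returns the registered instance, or throws if the type was never added.
    <T> T getAttribute(Class<T> type) {
        Object existing = attributes.get(type);
        if (existing == null) {
            throw new IllegalArgumentException("No attribute of type " + type.getName());
        }
        return type.cast(existing);
    }

    // Lets a consumer test for an attribute without creating it.
    boolean hasAttribute(Class<?> type) {
        return attributes.containsKey(type);
    }
}

// Hypothetical attribute type used only for this illustration.
class MiniTermAttribute {
    String term = "";
}

class AttributeSourceDemo {
    public static void main(String[] args) {
        MiniAttributeSource source = new MiniAttributeSource();

        // Nothing has been registered yet, so a consumer could skip work here.
        System.out.println("has before add: " + source.hasAttribute(MiniTermAttribute.class));

        // Two calls with the same type hand back the same instance.
        MiniTermAttribute first = source.addAttribute(MiniTermAttribute.class);
        MiniTermAttribute second = source.addAttribute(MiniTermAttribute.class);
        System.out.println("same instance: " + (first == second));
    }
}
```

In this model a consumer that calls addAttribute() never hits the exception path, which is the reason the commit switches the documentation examples away from getAttribute().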
Modified:
lucene/java/trunk/src/java/org/apache/lucene/analysis/package.html
Modified: lucene/java/trunk/src/java/org/apache/lucene/analysis/package.html
URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/analysis/package.html?rev=807831&r1=807830&r2=807831&view=diff
==============================================================================
--- lucene/java/trunk/src/java/org/apache/lucene/analysis/package.html (original)
+++ lucene/java/trunk/src/java/org/apache/lucene/analysis/package.html Tue Aug 25 22:09:46 2009
@@ -327,7 +327,11 @@
result. This is especially important to know for addAttribute(). The method takes the <b>type</b> (<code>Class</code>)
of an Attribute as an argument and returns an <b>instance</b>. If an Attribute of the same type was previously added, then
the already existing instance is returned, otherwise a new instance is created and returned. Therefore TokenStreams/-Filters
-can safely call addAttribute() with the same Attribute type multiple times.
+can safely call addAttribute() with the same Attribute type multiple times. Even consumers of TokenStreams should
+normally call addAttribute() instead of getAttribute(), because addAttribute() does not fail if the TokenStream lacks
+the Attribute, whereas getAttribute() throws an IllegalArgumentException if the Attribute is missing. More advanced
+code can check with hasAttribute() whether a TokenStream has the Attribute and conditionally skip the related
+processing for extra performance.
</li></ol>
<h3>Example</h3>
In this example we will create a WhiteSpaceTokenizer and use a LengthFilter to suppress all words that only
@@ -352,7 +356,9 @@
TokenStream stream = analyzer.tokenStream("field", new StringReader(text));
// get the TermAttribute from the TokenStream
- TermAttribute termAtt = (TermAttribute) stream.getAttribute(TermAttribute.class);
+ TermAttribute termAtt = (TermAttribute) stream.addAttribute(TermAttribute.class);
+
+ stream.reset();
// print all tokens until stream is exhausted
while (stream.incrementToken()) {
@@ -573,15 +579,20 @@
TokenStream stream = analyzer.tokenStream("field", new StringReader(text));
// get the TermAttribute from the TokenStream
- TermAttribute termAtt = (TermAttribute) stream.getAttribute(TermAttribute.class);
+ TermAttribute termAtt = (TermAttribute) stream.addAttribute(TermAttribute.class);
// get the PartOfSpeechAttribute from the TokenStream
- PartOfSpeechAttribute posAtt = (PartOfSpeechAttribute) stream.getAttribute(PartOfSpeechAttribute.class);
+ PartOfSpeechAttribute posAtt = (PartOfSpeechAttribute) stream.addAttribute(PartOfSpeechAttribute.class);
+ stream.reset();
+
// print all tokens until stream is exhausted
while (stream.incrementToken()) {
System.out.println(termAtt.term() + ": " + posAtt.getPartOfSpeech());
}
+
+ stream.end();
+ stream.close();
}
</pre>
The change that was made is to get the PartOfSpeechAttribute from the TokenStream and print out its contents in