Posted to dev@lucene.apache.org by Steven A Rowe <sa...@syr.edu> on 2012/04/09 21:41:39 UTC
RE: svn commit: r1311373 - in /lucene/dev/branches/lucene3969:
lucene/test-framework/src/java/org/apache/lucene/analysis/
modules/analysis/common/src/java/org/apache/lucene/analysis/shingle/
modules/analysis/common/src/test/org/apache/lucene/analysis/core
On 4/9/2012 at 3:06 PM, mikemccand@apache.org wrote:
> LUCENE-3969: [...] tentatively add posLen to ShingleFilter
> [...]
> +++ lucene/dev/branches/lucene3969/modules/analysis/common/src/java/org/
> +++ apache/lucene/analysis/shingle/ShingleFilter.java Mon Apr 9 19:05:47 2012
> [...]
> @@ -319,6 +321,8 @@ public final class ShingleFilter extends
> noShingleOutput = false;
> }
> offsetAtt.setOffset(offsetAtt.startOffset(), nextToken.offsetAtt.endOffset());
> + // nocommit is this right!? i'm just guessing...
> + posLenAtt.setPositionLength(builtGramSize);
> isOutputHere = true;
> gramSize.advance();
> tokenAvailable = true;
+1 - looks right to me.
builtGramSize is the position length of the output shingle: missing positions (e.g. from removed stop words) are represented as "filler" tokens, so they are counted in builtGramSize just like real tokens.
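To illustrate the reasoning, here is a minimal standalone sketch. It is not the ShingleFilter code and does not use the Lucene API; the class, the Token and Shingle types, and the "_" filler string are all hypothetical names chosen for the example. It assumes a simplified model in which a position increment of n > 1 means n - 1 positions were removed before the token, and each removed position becomes one filler slot. Under that assumption, the number of slots in a shingle equals the number of positions it spans.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of shingling over a token stream with position gaps.
// All names here are hypothetical; this is not Lucene's ShingleFilter.
final class ShinglePositionLengthSketch {

    // Assumed placeholder text for a missing position.
    static final String FILLER = "_";

    // A token with its position increment relative to the previous token.
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    // An output shingle and the number of positions it spans.
    static final class Shingle {
        final String text;
        final int positionLength;

        Shingle(String text, int positionLength) {
            this.text = text;
            this.positionLength = positionLength;
        }
    }

    // Expand the stream into one slot per position, inserting a filler
    // for every position skipped by an increment greater than 1.
    static List<String> fillGaps(List<Token> tokens) {
        List<String> slots = new ArrayList<>();
        for (Token t : tokens) {
            for (int i = 1; i < t.positionIncrement; i++) {
                slots.add(FILLER);
            }
            slots.add(t.term);
        }
        return slots;
    }

    // Build all shingles of exactly gramSize slots. Because fillers occupy
    // slots, the slot count is also the position length of the shingle.
    static List<Shingle> shingles(List<Token> tokens, int gramSize) {
        List<String> slots = fillGaps(tokens);
        List<Shingle> out = new ArrayList<>();
        for (int start = 0; start + gramSize <= slots.size(); start++) {
            String text = String.join(" ", slots.subList(start, start + gramSize));
            out.add(new Shingle(text, gramSize));
        }
        return out;
    }

    public static void main(String[] args) {
        // "please divide this sentence" with the stop word "this" removed:
        // "sentence" carries a position increment of 2.
        List<Token> tokens = Arrays.asList(
                new Token("please", 1),
                new Token("divide", 1),
                new Token("sentence", 2));
        for (Shingle s : shingles(tokens, 3)) {
            System.out.println(s.text + " (posLen=" + s.positionLength + ")");
        }
    }
}
```

In this model the trigram "divide _ sentence" contains only two real terms but spans three positions, which is why the count that includes fillers is the right value for the position length.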
Steve