Posted to dev@lucene.apache.org by "Steven Rowe (JIRA)" <ji...@apache.org> on 2008/09/23 16:55:44 UTC

[jira] Commented: (LUCENE-1380) Patch for ShingleFilter.enablePositions (or PositionFilter)

    [ https://issues.apache.org/jira/browse/LUCENE-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633756#action_12633756 ] 

Steven Rowe commented on LUCENE-1380:
-------------------------------------

A couple of comments on the PositionFilter patch:

# The javadocs should be more explicit; for example, they should state that positionIncrement defaults to zero.
# I think there ought to be a constructor that takes a positionIncrement, perhaps instead of the setter.
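# You don't handle the case where the filter is reused for more than one document. Once the stream is exhausted (reusableToken is null), firstTokenPositioned stays true, so the first token of the next document would also get its position increment overwritten. There should be an else clause on the outer null check, after this block, that resets firstTokenPositioned to false: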
{code:java}
if(null != reusableToken){
  if(firstTokenPositioned){
    reusableToken.setPositionIncrement(positionIncrement);
  }else{
    firstTokenPositioned = true;
  }
}
{code}
# You should provide a standalone test for the PositionFilter, in addition to the ShingleFilterTest tests.
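
To illustrate points 2 and 3, here is a minimal, self-contained sketch of the behaviour I have in mind. It is not the patch and does not use the Lucene API: SketchToken, PositionFilterSketch and setInput are hypothetical stand-ins I made up for illustration, and only positionIncrement, firstTokenPositioned and reusableToken come from the patch.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a Lucene Token; holds only what the sketch needs. */
final class SketchToken {
    final String term;
    int positionIncrement = 1; // assumed default increment for an ordinary token

    SketchToken(String term) {
        this.term = term;
    }
}

/** Simplified, hypothetical model of the filter under review (not the real class). */
final class PositionFilterSketch {
    private final int positionIncrement;
    private boolean firstTokenPositioned = false;
    private Iterator<SketchToken> input = Collections.emptyIterator();

    /** Point 2: the increment is supplied at construction time instead of via a setter. */
    PositionFilterSketch(int positionIncrement) {
        this.positionIncrement = positionIncrement;
    }

    /** Hypothetical hook standing in for reusing the filter on the next document. */
    void setInput(List<SketchToken> tokens) {
        this.input = tokens.iterator();
    }

    /** Returns the next token, or null at the end of the current stream. */
    SketchToken next() {
        SketchToken reusableToken = input.hasNext() ? input.next() : null;
        if (null != reusableToken) {
            if (firstTokenPositioned) {
                // Every token after the first gets the configured increment.
                reusableToken.positionIncrement = positionIncrement;
            } else {
                // The first token keeps its original increment.
                firstTokenPositioned = true;
            }
        } else {
            // Point 3: end of stream, so reset the flag for the next document.
            firstTokenPositioned = false;
        }
        return reusableToken;
    }
}
```

With the reset in place, the first token of a second document keeps its own increment, provided the first stream was consumed until next() returned null. A standalone test (point 4) could check exactly that.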

> Patch for ShingleFilter.enablePositions (or PositionFilter)
> -----------------------------------------------------------
>
>                 Key: LUCENE-1380
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1380
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/analyzers
>            Reporter: Mck SembWever
>            Priority: Trivial
>         Attachments: LUCENE-1380-PositionFilter.patch, LUCENE-1380.patch, LUCENE-1380.patch
>
>
> Make it possible for *all* words and shingles to be placed at the same position, that is for _all_ shingles (and unigrams if included) to be treated as synonyms of each other.
> Today the shingles generated are synonyms only to the first term in the shingle.
> For example the query "abcd efgh ijkl" results in:
>    ("abcd" "abcd efgh" "abcd efgh ijkl") ("efgh" "efgh ijkl") ("ijkl")
> where "abcd efgh" and "abcd efgh ijkl" are synonyms of "abcd", and "efgh ijkl" is a synonym of "efgh".
> There exists no way today to alter which token a particular shingle is a synonym for.
> This patch takes the first step in making it possible to make all shingles (and unigrams if included) synonyms of each other.
> See http://comments.gmane.org/gmane.comp.jakarta.lucene.user/34746 for mailing list thread.
