Posted to solr-dev@lucene.apache.org by "Yonik Seeley (JIRA)" <ji...@apache.org> on 2007/05/11 22:15:15 UTC

[jira] Commented: (SOLR-234) TrimFilter should update the start and end offsets

    [ https://issues.apache.org/jira/browse/SOLR-234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495147 ] 

Yonik Seeley commented on SOLR-234:
-----------------------------------

Updating the offsets does seem like the right thing to do.

I imagine using toCharArray() will be slower than using charAt(), given that it allocates a new array and that the number of charAt() calls will be low in the average case, because there will usually be only a small amount of whitespace.
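
To make the charAt() approach concrete, here is a minimal sketch of how the trimmed bounds and the adjusted offsets could be computed without allocating a new array. It is not the attached patch: the class name TrimOffsets and the method trimmedBounds are hypothetical, and it assumes each character of the token text corresponds to exactly one character of the original input, so that trimming n characters from one end moves that offset by n. Whitespace is taken to mean any character <= ' ', which is the rule String.trim() uses.

```java
// Hypothetical helper for illustration only; not the SOLR-234 patch.
final class TrimOffsets {
    private TrimOffsets() {}

    /**
     * Scans inward from both ends with charAt(), so no char[] copy is made.
     * Returns { start, end, newStartOffset, newEndOffset }, where
     * [start, end) is the trimmed range within the token text.
     * Assumes a one-to-one mapping between token chars and input chars.
     */
    static int[] trimmedBounds(CharSequence text, int startOffset, int endOffset) {
        int start = 0;
        int end = text.length();

        // Skip leading whitespace (same "<= ' '" rule as String.trim()).
        while (start < end && text.charAt(start) <= ' ') {
            start++;
        }
        // Skip trailing whitespace.
        while (end > start && text.charAt(end - 1) <= ' ') {
            end--;
        }

        // Shift each offset by the number of characters removed on that side.
        int newStartOffset = startOffset + start;
        int newEndOffset = endOffset - (text.length() - end);
        return new int[] { start, end, newStartOffset, newEndOffset };
    }
}
```

In the common case only a handful of charAt() calls are made, since the loops stop at the first non-whitespace character on each side.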

Isn't it annoying that Java never seems to let you do things as efficiently as the class lib itself...

Another issue here is that the position increment isn't maintained.
And yet another future issue is that payloads aren't maintained (that's in a newer version of Lucene).
I'll bring up the latter issue on the lucene list since I think it's a bit of a design flaw.

> TrimFilter should update the start and end offsets
> --------------------------------------------------
>
>                 Key: SOLR-234
>                 URL: https://issues.apache.org/jira/browse/SOLR-234
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Ryan McKinley
>            Priority: Minor
>         Attachments: SOLR-234-TrimFilterOffsets.patch
>
>
> As implemented, the TrimFilter only trims the text.  It does not update the startOffset and endOffset.
> see:
> http://www.nabble.com/TrimFilter----t.startOffset%28%29%2C-t.endOffset%28%29-tf3728875.html

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.