Posted to solr-dev@lucene.apache.org by "Tricia Williams (JIRA)" <ji...@apache.org> on 2008/04/03 18:06:24 UTC
[jira] Created: (SOLR-532) WordDelimiterFilter ignores payloads
WordDelimiterFilter ignores payloads
------------------------------------
Key: SOLR-532
URL: https://issues.apache.org/jira/browse/SOLR-532
Project: Solr
Issue Type: Bug
Reporter: Tricia Williams
Priority: Minor
When a WordDelimiterFilter ingests a token stream and creates a new token (newTok), it appears to copy most of the old token's attributes but not the payload. I believe this is a bug. My proposed fix is for the WordDelimiterFilter to use the Token clone() method to create an exact copy and then modify the appropriate attributes (offsets and term text).
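To make the failure mode concrete, here is a minimal, dependency-free sketch. SimpleToken, subTokenByConstruction and subTokenByClone are hypothetical names invented for illustration; they are not Lucene's Token class or Solr's WordDelimiterFilter code. The sketch only contrasts copying attributes one by one (where a forgotten attribute such as the payload is silently dropped) with cloning first and then overwriting the term text and offsets.

```java
import java.util.Arrays;

public class PayloadCloneSketch {

    /** Hypothetical stand-in for a token; NOT Lucene's Token class. */
    static class SimpleToken implements Cloneable {
        char[] term;
        int startOffset;
        int endOffset;
        String type;
        byte[] payload; // optional per-token metadata; null when absent

        SimpleToken(char[] term, int startOffset, int endOffset, String type) {
            this.term = term;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
            this.type = type;
        }

        @Override
        public SimpleToken clone() {
            try {
                // super.clone() copies every field, including ones added later.
                SimpleToken copy = (SimpleToken) super.clone();
                copy.term = term.clone();
                if (payload != null) {
                    copy.payload = payload.clone();
                }
                return copy;
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e);
            }
        }
    }

    /** Attribute-by-attribute copy: the payload is never carried over. */
    static SimpleToken subTokenByConstruction(SimpleToken orig, int start, int end) {
        return new SimpleToken(Arrays.copyOfRange(orig.term, start, end),
                orig.startOffset + start, orig.startOffset + end, orig.type);
    }

    /** Clone first, then overwrite only the term text and the offsets. */
    static SimpleToken subTokenByClone(SimpleToken orig, int start, int end) {
        SimpleToken copy = orig.clone();
        copy.term = Arrays.copyOfRange(orig.term, start, end);
        copy.startOffset = orig.startOffset + start;
        copy.endOffset = orig.startOffset + end;
        return copy;
    }

    public static void main(String[] args) {
        SimpleToken orig = new SimpleToken("Wi-Fi".toCharArray(), 10, 15, "word");
        orig.payload = new byte[] {42};

        SimpleToken built = subTokenByConstruction(orig, 0, 2);
        SimpleToken cloned = subTokenByClone(orig, 0, 2);

        System.out.println(new String(built.term) + " payload=" + Arrays.toString(built.payload));
        System.out.println(new String(cloned.term) + " payload=" + Arrays.toString(cloned.payload));
    }
}
```

In this sketch the constructed sub-token ends up with a null payload, while the cloned one keeps the original's payload bytes. The clone-then-modify shape also has the advantage that any attribute added to the token later is carried along without touching the splitting code.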
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Assigned: (SOLR-532) WordDelimiterFilter ignores payloads
Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grant Ingersoll reassigned SOLR-532:
------------------------------------
Assignee: Grant Ingersoll
[jira] Resolved: (SOLR-532) WordDelimiterFilter ignores payloads
Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grant Ingersoll resolved SOLR-532.
----------------------------------
Resolution: Fixed
Fix Version/s: 1.4
[jira] Commented: (SOLR-532) WordDelimiterFilter ignores payloads
Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641404#action_12641404 ]
Grant Ingersoll commented on SOLR-532:
--------------------------------------
I consolidated this down to take advantage of Lucene's new clone method:
Index: src/java/org/apache/solr/analysis/WordDelimiterFilter.java
===================================================================
--- src/java/org/apache/solr/analysis/WordDelimiterFilter.java (revision 706648)
+++ src/java/org/apache/solr/analysis/WordDelimiterFilter.java (working copy)
@@ -236,11 +236,7 @@
       startOff += start;
     }
-    Token newTok = new Token(startOff,
-            endOff,
-            orig.type());
-    newTok.setTermBuffer(orig.termBuffer(), start, (end - start));
-    return newTok;
+    return (Token)orig.clone(orig.termBuffer(), start, (end - start), startOff, endOff);
   }
I will likely commit today or tomorrow. Let me know if this works for you, Tricia. The tests pass for me.
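For readers without the Lucene source at hand, the single call in the patch can be read roughly as "clone everything, then replace the term text and the offsets". The sketch below is an assumption-laden illustration, not Lucene's implementation: MiniToken and its fields are hypothetical, and only the argument order (buffer, term offset, term length, start offset, end offset) is taken from the call shown in the diff.

```java
import java.util.Arrays;

public class CloneOverloadSketch {

    /** Hypothetical stand-in; mirrors only the argument order used in the patch. */
    static class MiniToken implements Cloneable {
        char[] termBuffer;
        int termLength;
        int startOffset;
        int endOffset;
        String type = "word";
        int positionIncrement = 1;
        byte[] payload; // null when the token carries no payload

        MiniToken(String text, int startOffset, int endOffset) {
            this.termBuffer = text.toCharArray();
            this.termLength = text.length();
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }

        /** Copies every attribute, then swaps in the new term slice and offsets. */
        MiniToken clone(char[] newTermBuffer, int newTermOffset, int newTermLength,
                        int newStartOffset, int newEndOffset) {
            try {
                MiniToken copy = (MiniToken) super.clone(); // type, increment, payload reference
                copy.termBuffer = Arrays.copyOfRange(
                        newTermBuffer, newTermOffset, newTermOffset + newTermLength);
                copy.termLength = newTermLength;
                copy.startOffset = newStartOffset;
                copy.endOffset = newEndOffset;
                if (payload != null) {
                    copy.payload = payload.clone(); // avoid sharing the byte array
                }
                return copy;
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e);
            }
        }

        String term() {
            return new String(termBuffer, 0, termLength);
        }
    }

    public static void main(String[] args) {
        MiniToken orig = new MiniToken("PowerShot", 100, 109);
        orig.type = "product";
        orig.payload = new byte[] {7};

        // Same shape as the patched return statement: a sub-word from start to end.
        int start = 5;
        int end = 9;
        MiniToken sub = orig.clone(orig.termBuffer, start, end - start,
                orig.startOffset + start, orig.startOffset + end);

        System.out.println(sub.term() + " [" + sub.startOffset + "," + sub.endOffset + "] "
                + sub.type + " payload=" + Arrays.toString(sub.payload));
    }
}
```

Here the sub-token for "Shot" keeps the type and payload of the original "PowerShot" token, while its term text and offsets reflect only the slice. Whether the real method deep-copies the payload is not stated in this thread; the copy in the sketch is a design choice made for the illustration.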
[jira] Issue Comment Edited: (SOLR-532) WordDelimiterFilter ignores payloads
Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641404#action_12641404 ]
gsingers edited comment on SOLR-532 at 10/21/08 8:32 AM:
----------------------------------------------------------------
I consolidated this down to take advantage of Lucene's new clone method:
{code}
Index: src/java/org/apache/solr/analysis/WordDelimiterFilter.java
===================================================================
--- src/java/org/apache/solr/analysis/WordDelimiterFilter.java (revision 706648)
+++ src/java/org/apache/solr/analysis/WordDelimiterFilter.java (working copy)
@@ -236,11 +236,7 @@
       startOff += start;
     }
-    Token newTok = new Token(startOff,
-            endOff,
-            orig.type());
-    newTok.setTermBuffer(orig.termBuffer(), start, (end - start));
-    return newTok;
+    return (Token)orig.clone(orig.termBuffer(), start, (end - start), startOff, endOff);
   }
{code}
I will likely commit today or tomorrow. Let me know if this works for you, Tricia. The tests pass for me.
[jira] Commented: (SOLR-532) WordDelimiterFilter ignores payloads
Posted by "Tricia Williams (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641694#action_12641694 ]
Tricia Williams commented on SOLR-532:
--------------------------------------
Thanks, Grant. Using the new clone method is much cleaner. It works for me after catching up with the new slf4j logging. Thanks, too, for committing it!
[jira] Work started: (SOLR-532) WordDelimiterFilter ignores payloads
Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on SOLR-532 started by Grant Ingersoll.
[jira] Updated: (SOLR-532) WordDelimiterFilter ignores payloads
Posted by "Tricia Williams (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/SOLR-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tricia Williams updated SOLR-532:
---------------------------------
Attachment: SOLR-532-WordDelimiterFilter.patch
Quick fix. Does this need a unit test to go with it?