Posted to dev@lucene.apache.org by "Robert Muir (Created) (JIRA)" <ji...@apache.org> on 2012/03/15 18:21:37 UTC
[jira] [Created] (LUCENE-3873) tie MockGraphTokenFilter into all
analyzers tests
tie MockGraphTokenFilter into all analyzers tests
-------------------------------------------------
Key: LUCENE-3873
URL: https://issues.apache.org/jira/browse/LUCENE-3873
Project: Lucene - Java
Issue Type: Task
Components: modules/analysis
Reporter: Robert Muir
Mike made a MockGraphTokenFilter on LUCENE-3848.
Many filters currently aren't tested with anything but a simple TokenStream.
We should test them with this, too; it might find bugs (zero-length terms,
stacked terms/synonyms, etc.)
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
[jira] [Commented] (LUCENE-3873) tie MockGraphTokenFilter into all
analyzers tests
Posted by "Michael McCandless (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238422#comment-13238422 ]
Michael McCandless commented on LUCENE-3873:
--------------------------------------------
I agree we can use it in specific places for starters...
The patch on LUCENE-3848 mixes in "TokenStream to Automaton" and MockGraphTokenFilter; I'll split that apart and only commit MockGraphTokenFilter here.
One problem is... MockGraphTokenFilter isn't setting offsets currently. I think to do this "correctly" it needs to buffer up pending input tokens until it has seen as many positions as the posLength it wants to output for a random token, and then set that token's offsets from the buffered tokens.
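To make the buffering idea concrete, here is a minimal sketch in plain Java. It is not the code from the patch and does not use Lucene classes: `Tok`, `spanning`, and `GraphOffsetSketch` are hypothetical names, and it assumes every input token advances exactly one position. The point it illustrates is that a token with posLength N can only get correct offsets once N input tokens are buffered: the start offset comes from the first buffered token and the end offset from the last.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GraphOffsetSketch {
    /** Hypothetical stand-in for a token's attributes; not a Lucene class. */
    static final class Tok {
        final String term;
        final int startOffset, endOffset, posLength;
        Tok(String term, int startOffset, int endOffset, int posLength) {
            this.term = term;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
            this.posLength = posLength;
        }
    }

    /**
     * Builds a token that spans up to posLength input positions starting at
     * index 'from'. Assumes each input token occupies exactly one position.
     */
    static Tok spanning(List<Tok> input, int from, int posLength, String term) {
        if (from < 0 || from >= input.size() || posLength < 1) {
            throw new IllegalArgumentException("nothing to span");
        }
        // Buffer pending input tokens until enough positions are covered
        // (or the input runs out, in which case the span is shortened).
        List<Tok> pending = new ArrayList<>();
        for (int i = from; i < input.size() && pending.size() < posLength; i++) {
            pending.add(input.get(i));
        }
        Tok first = pending.get(0);
        Tok last = pending.get(pending.size() - 1);
        // Offsets are only known now: start of the first, end of the last.
        return new Tok(term, first.startOffset, last.endOffset, pending.size());
    }

    public static void main(String[] args) {
        List<Tok> input = Arrays.asList(
            new Tok("wi", 0, 2, 1),
            new Tok("fi", 3, 5, 1),
            new Tok("network", 6, 13, 1));
        Tok t = spanning(input, 0, 2, "wifi");
        System.out.println(t.term + " [" + t.startOffset + "," + t.endOffset
            + ") posLength=" + t.posLength);
    }
}
```

A real TokenFilter would have to do this incrementally, capturing state for each buffered token, which is why the buffering is the awkward part.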
[jira] [Updated] (LUCENE-3873) tie MockGraphTokenFilter into all
analyzers tests
Posted by "Michael McCandless (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-3873:
---------------------------------------
Attachment: LUCENE-3873.patch
Patch... I think it's close, but there are still some nocommits...
I had to rework the original MockGraphTokenFilter to sometimes buffer tokens so
it can set the correct offsets.
I added a few test cases to existing analyzers (SynFilter, Japanese,
Standard), and new direct test cases.
I also created a new MockHoleInjectingTokenFilter...
Tests seem to pass... but it wouldn't surprise me if beasting/jenkins
uncovers something...
[jira] [Resolved] (LUCENE-3873) tie MockGraphTokenFilter into all
analyzers tests
Posted by "Michael McCandless (Resolved) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless resolved LUCENE-3873.
----------------------------------------
Resolution: Fixed
Fix Version/s: 4.0
[jira] [Commented] (LUCENE-3873) tie MockGraphTokenFilter into all
analyzers tests
Posted by "Michael McCandless (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238398#comment-13238398 ]
Michael McCandless commented on LUCENE-3873:
--------------------------------------------
LUCENE-3848 has the MockGraphTokenFilter patch...
[jira] [Updated] (LUCENE-3873) tie MockGraphTokenFilter into all
analyzers tests
Posted by "Robert Muir (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir updated LUCENE-3873:
--------------------------------
Fix Version/s: 3.6.1
[jira] [Commented] (LUCENE-3873) tie MockGraphTokenFilter into all
analyzers tests
Posted by "Robert Muir (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238405#comment-13238405 ]
Robert Muir commented on LUCENE-3873:
-------------------------------------
One way we can tie this in is via LUCENE-3919.
But: I think we can use this filter in some individual tests immediately?
E.g. we can just add a method testRandomGraphs to the filters that do lots
of crazy state-capturing, putting this thing in front of or behind them in
the analyzer and calling checkRandomData?
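As a rough illustration of that testing pattern, the sketch below is a toy, Lucene-free model: it does not use the real checkRandomData from Lucene's test framework, and every name in it (`Tok`, `injectGraph`, `lowerCase`, `checkInvariants`, `checkRandomGraphs`) is hypothetical. It places a random graph-injecting stage in front of a trivial "filter under test" and asserts a few stream invariants over many random inputs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Random;

public class RandomGraphCheckSketch {
    /** Hypothetical token model; not a Lucene class. */
    static final class Tok {
        final String term;
        final int start, end, posInc, posLen;
        Tok(String term, int start, int end, int posInc, int posLen) {
            this.term = term; this.start = start; this.end = end;
            this.posInc = posInc; this.posLen = posLen;
        }
    }

    /** Splits on single spaces and records character offsets. */
    static List<Tok> tokenize(String text) {
        List<Tok> out = new ArrayList<>();
        int pos = 0;
        for (String w : text.split(" ")) {
            if (!w.isEmpty()) out.add(new Tok(w, pos, pos + w.length(), 1, 1));
            pos += w.length() + 1; // skip the word and its separator
        }
        return out;
    }

    /** Randomly stacks a multi-position token on top of some input tokens. */
    static List<Tok> injectGraph(List<Tok> in, Random r) {
        List<Tok> out = new ArrayList<>();
        for (int i = 0; i < in.size(); i++) {
            out.add(in.get(i));
            if (r.nextInt(3) == 0) {
                int len = 1 + r.nextInt(in.size() - i); // positions to span
                Tok last = in.get(i + len - 1);
                out.add(new Tok("x" + len, in.get(i).start, last.end, 0, len));
            }
        }
        return out;
    }

    /** The toy "filter under test": lowercases terms, keeps everything else. */
    static List<Tok> lowerCase(List<Tok> in) {
        List<Tok> out = new ArrayList<>();
        for (Tok t : in) {
            out.add(new Tok(t.term.toLowerCase(Locale.ROOT), t.start, t.end, t.posInc, t.posLen));
        }
        return out;
    }

    /** Throws AssertionError if offsets or positions look inconsistent. */
    static void checkInvariants(List<Tok> toks) {
        int lastStart = 0;
        boolean first = true;
        for (Tok t : toks) {
            if (t.start < 0 || t.end < t.start) throw new AssertionError("bad offsets: " + t.term);
            if (t.start < lastStart) throw new AssertionError("offsets went backwards: " + t.term);
            if (t.posLen < 1 || t.posInc < 0 || (first && t.posInc == 0)) {
                throw new AssertionError("bad position data: " + t.term);
            }
            lastStart = t.start;
            first = false;
        }
    }

    /** Runs the pipeline on random text; returns how many inputs were checked. */
    static int checkRandomGraphs(long seed, int iterations) {
        Random r = new Random(seed);
        int checked = 0;
        for (int iter = 0; iter < iterations; iter++) {
            StringBuilder sb = new StringBuilder();
            int words = r.nextInt(6);
            for (int w = 0; w < words; w++) {
                if (w > 0) sb.append(' ');
                int chars = 1 + r.nextInt(5);
                for (int c = 0; c < chars; c++) sb.append((char) ('A' + r.nextInt(26)));
            }
            checkInvariants(lowerCase(injectGraph(tokenize(sb.toString()), r)));
            checked++;
        }
        return checked;
    }

    public static void main(String[] args) {
        System.out.println(checkRandomGraphs(42L, 1000) + " random inputs passed");
    }
}
```

In an actual Lucene test the analyzer would chain a tokenizer, MockGraphTokenFilter, and the filter under test, and the framework's own consistency checks would replace `checkInvariants`.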
[jira] [Assigned] (LUCENE-3873) tie MockGraphTokenFilter into all
analyzers tests
Posted by "Michael McCandless (Assigned) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless reassigned LUCENE-3873:
------------------------------------------
Assignee: Michael McCandless
[jira] [Updated] (LUCENE-3873) tie MockGraphTokenFilter into all
analyzers tests
Posted by "Michael McCandless (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-3873:
---------------------------------------
Attachment: LUCENE-3873.patch
New patch, fixing all nocommits. I think it's ready...