Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2010/08/17 23:07:16 UTC

[jira] Created: (LUCENE-2606) optimize contrib/regex for flex

optimize contrib/regex for flex
-------------------------------

                 Key: LUCENE-2606
                 URL: https://issues.apache.org/jira/browse/LUCENE-2606
             Project: Lucene - Java
          Issue Type: Improvement
          Components: contrib/*
            Reporter: Robert Muir
             Fix For: 4.0


* Changes RegexCapabilities match(String) to match(BytesRef).
* The jakarta and jdk impls use CharacterIterator/CharSequence matching against the utf16result instead.
* I also reuse the matcher for jdk. I don't see why we didn't do this before, but it makes sense, especially since we reuse the CharSequence (CSQ).
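
A minimal sketch of the idea in the bullets above, using only the JDK. The class name Utf8TermMatcher is hypothetical, and the bytes/offset/length arguments stand in for a BytesRef so the example runs without Lucene; the actual patch may differ. It decodes each UTF-8 term into a reused UTF-16 buffer and resets a single java.util.regex.Matcher against it, rather than building a String and a new Matcher per term.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a per-enum matcher; not Lucene's actual class.
final class Utf8TermMatcher {
    private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
    // Reused UTF-16 buffer; CharBuffer implements CharSequence.
    private CharBuffer utf16 = CharBuffer.allocate(16);
    // One Matcher per instance, reset for every term instead of re-created.
    private final Matcher matcher;

    Utf8TermMatcher(String regex) {
        this.matcher = Pattern.compile(regex).matcher(utf16);
    }

    // bytes/offset/length play the role of a BytesRef-style slice (assumption).
    boolean match(byte[] bytes, int offset, int length) {
        // A UTF-8 sequence never decodes to more UTF-16 chars than it has bytes.
        if (utf16.capacity() < length) {
            utf16 = CharBuffer.allocate(Math.max(length, utf16.capacity() * 2));
        }
        utf16.clear();
        decoder.reset();
        CoderResult result =
            decoder.decode(ByteBuffer.wrap(bytes, offset, length), utf16, true);
        if (result.isError()) {
            throw new IllegalArgumentException("malformed UTF-8 term");
        }
        decoder.flush(utf16);
        utf16.flip();
        // Point the existing Matcher at the refreshed buffer and test the whole term.
        return matcher.reset(utf16).matches();
    }

    public static void main(String[] args) {
        Utf8TermMatcher m = new Utf8TermMatcher("lu.*e");
        for (String term : new String[] {"lucene", "solr"}) {
            byte[] utf8 = term.getBytes(StandardCharsets.UTF_8);
            System.out.println(term + " -> " + m.match(utf8, 0, utf8.length));
        }
    }
}
```

Because the buffer and the Matcher are mutable, an instance like this is meant to be owned by a single enumeration and not shared between threads.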


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2606) optimize contrib/regex for flex

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899822#action_12899822 ] 

Uwe Schindler commented on LUCENE-2606:
---------------------------------------

Looks good! This was also broken in 3.x and 3.0, because it was not thread-safe when the same capabilities object was used in multiple threads.




[jira] Updated: (LUCENE-2606) optimize contrib/regex for flex

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-2606:
--------------------------------

    Attachment: LUCENE-2606.patch

Attached is another iteration:
* Because the Query stores RegexCapabilities, I pulled the 'matcher' stuff out so the enum just calls matcher = capability.compile(pattern);
  This way the capabilities object stores no real state. The only state is the matcher, which is created in the TermsEnum.
* RegexCapabilities is also marked Serializable (LUCENE-961).
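
A rough sketch of the shape described above, assuming hypothetical names (SketchRegexCapabilities, SketchRegexMatcher, JdkSketchCapabilities) and CharSequence terms in place of BytesRef so it runs on the JDK alone. The capabilities object is a stateless, serializable factory, and each caller gets its own matcher from compile(pattern).

```java
import java.io.Serializable;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical names throughout; this only mirrors the shape described above.
interface SketchRegexMatcher {
    boolean match(CharSequence term);
}

// The capabilities object is a stateless factory, so a query can hold it,
// share it across threads, and serialize it.
interface SketchRegexCapabilities extends Serializable {
    SketchRegexMatcher compile(String pattern);
}

final class JdkSketchCapabilities implements SketchRegexCapabilities {
    private static final long serialVersionUID = 1L;

    @Override
    public SketchRegexMatcher compile(String pattern) {
        // All mutable state lives in the returned matcher, one per caller.
        final Matcher matcher = Pattern.compile(pattern).matcher("");
        return term -> matcher.reset(term).matches();
    }
}

final class SketchTermsEnumDemo {
    public static void main(String[] args) {
        // One shared capabilities object, as a query would store it.
        SketchRegexCapabilities capability = new JdkSketchCapabilities();

        // Each enumeration compiles its own matcher and keeps it private.
        SketchRegexMatcher first = capability.compile("ab+");
        SketchRegexMatcher second = capability.compile("c.*");

        System.out.println(first.match("abbb"));
        System.out.println(second.match("abbb"));
    }
}
```

With this split, two threads can hold the same capabilities object safely as long as each one calls compile and uses only the matcher it got back.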





[jira] Updated: (LUCENE-2606) optimize contrib/regex for flex

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-2606:
--------------------------------

    Attachment: LUCENE-2606.patch

Simple patch. We will have to list the break (match(String) -> match(BytesRef)) in
contrib/changes because RegexCapabilities is an interface, so there is no way to do any back compat.





[jira] Commented: (LUCENE-2606) optimize contrib/regex for flex

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900028#action_12900028 ] 

Robert Muir commented on LUCENE-2606:
-------------------------------------

> Looks good! This was also broken in 3.x and 3.0, because it was not thread-safe when the same capabilities object was used in multiple threads.

True, I think we have the opportunity to fix it in 4.x since we have to break the interface anyway.

Should we do anything about 3.x? It seems good to fix bugs, but it would be frustrating (if someone has a custom RegexCapabilities) to break the API in 3.x, then in 4.x again!





[jira] Resolved: (LUCENE-2606) optimize contrib/regex for flex

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir resolved LUCENE-2606.
---------------------------------

    Resolution: Fixed

Committed revision 987129.

