Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2009/06/11 13:21:07 UTC

[jira] Created: (LUCENE-1684) Add matchVersion to StandardAnalyzer

Add matchVersion to StandardAnalyzer
------------------------------------

                 Key: LUCENE-1684
                 URL: https://issues.apache.org/jira/browse/LUCENE-1684
             Project: Lucene - Java
          Issue Type: Improvement
          Components: Analysis
            Reporter: Michael McCandless
            Assignee: Michael McCandless
            Priority: Minor
             Fix For: 2.9


I think we should add a matchVersion arg to StandardAnalyzer.  This
allows us to fix bugs (for new users) while keeping precise back
compat (for users who upgrade).

We've discussed this on java-dev, but I'd like to now make it concrete
(patch attached).  I think it actually works very well, and is a
simple tool to help us carry out our back-compat policy.

I coded up an example with StandardAnalyzer:

  * The ctor now takes a required arg (Version matchVersion).  You
    pass Version.LUCENE_CURRENT to always get the latest & greatest, or
    e.g. Version.LUCENE_24 to match 2.4's bugs/settings/behavior.

  * StandardAnalyzer conditionalizes the "replace invalid acronym" and
    "enable position increment in StopFilter" settings based on
    matchVersion.

  * It also prevents creating zillions of ctors, over time, as we need
    to change settings in the class.  E.g., StandardAnalyzer now has 2
    settings that are version dependent, and there are at least 2 more
    issues open on fixing some more of its bugs.
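
To make the bullets above concrete, here is a minimal, self-contained
sketch of the idea.  It is not the attached patch: MatchVersionSketch,
its nested Version enum, SketchAnalyzer, and onOrAfter are hypothetical
stand-ins, and the two boolean fields only model the two settings named
above rather than driving a real tokenizer.

```java
// Sketch only: Version and SketchAnalyzer are hypothetical stand-ins,
// not the classes from the attached patch.
final class MatchVersionSketch {

    /** Releases in order; LUCENE_CURRENT is last, so it always means "latest". */
    enum Version {
        LUCENE_24, LUCENE_29, LUCENE_CURRENT;

        /** True if this version is the same as, or newer than, other. */
        boolean onOrAfter(Version other) {
            return compareTo(other) >= 0;
        }
    }

    static final class SketchAnalyzer {
        final boolean replaceInvalidAcronym;
        final boolean enableStopPositionIncrements;

        /** matchVersion is required; there is no implicit default. */
        SketchAnalyzer(Version matchVersion) {
            if (matchVersion == null) {
                throw new IllegalArgumentException("matchVersion is required");
            }
            // Both fixes are gated on 2.9, following the javadoc in this issue.
            boolean fixedIn29 = matchVersion.onOrAfter(Version.LUCENE_29);
            this.replaceInvalidAcronym = fixedIn29;
            this.enableStopPositionIncrements = fixedIn29;
        }
    }

    public static void main(String[] args) {
        // Print the settings each version would select.
        for (Version v : Version.values()) {
            SketchAnalyzer a = new SketchAnalyzer(v);
            System.out.println(v + ": replaceInvalidAcronym=" + a.replaceInvalidAcronym
                + ", enableStopPositionIncrements=" + a.enableStopPositionIncrements);
        }
    }
}
```

One constructor covers every release: a new version-dependent fix adds
one more check inside it instead of one more ctor overload.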

The migration is also very clean: we'd only add this to classes on an
"as needed" basis.  On the first release that adds the arg, the
default remains back compatible with the prior release.  Then, going
forward, we are free to fix issues on that class and conditionalize by
matchVersion.
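
The migration path can be sketched the same way.  Everything below is
an assumption made for illustration: MigrationSketch, CompatVersion,
MigratingAnalyzer, the V3_0 constant, and the laterFix flag are all
hypothetical, and treating the pre-existing no-arg ctor as the
"default" is my reading of the paragraph above, not something taken
from the patch.

```java
// Sketch only: all names here are hypothetical, chosen for illustration.
final class MigrationSketch {

    /** Releases in order; V3_0 stands for some hypothetical later release. */
    enum CompatVersion { V2_4, V2_9, V3_0 }

    static final class MigratingAnalyzer {
        final boolean stopPositionIncrements; // changed as of 2.9
        final boolean laterFix;               // hypothetical fix made after 2.9

        /** Pre-existing ctor: stays back compatible with the prior release. */
        @Deprecated
        MigratingAnalyzer() {
            this(CompatVersion.V2_4);
        }

        /** New ctor: the caller states which release to match. */
        MigratingAnalyzer(CompatVersion matchVersion) {
            this.stopPositionIncrements = matchVersion.compareTo(CompatVersion.V2_9) >= 0;
            // A later fix is conditionalized the same way, on a later version.
            this.laterFix = matchVersion.compareTo(CompatVersion.V3_0) >= 0;
        }
    }

    public static void main(String[] args) {
        MigratingAnalyzer upgraded = new MigratingAnalyzer(CompatVersion.V2_9);
        System.out.println("stopPositionIncrements=" + upgraded.stopPositionIncrements
            + ", laterFix=" + upgraded.laterFix);
    }
}
```

Code that upgrades without changing its ctor call sees no behavior
change; code that passes a newer version opts in to each fix.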

The javadoc at the top of StandardAnalyzer clearly calls out which
behaviors are version specific:

{code}
 * <p>You must specify the required {@link Version}
 * compatibility when creating StandardAnalyzer:
 * <ul>
 *   <li> As of 2.9, StopFilter preserves position
 *        increments by default
 *   <li> As of 2.9, Tokens incorrectly identified as acronyms
 *        are corrected (see <a href="https://issues.apache.org/jira/browse/LUCENE-1068">LUCENE-1068</a>)
 * </ul>
 *
{code}


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


[jira] Resolved: (LUCENE-1684) Add matchVersion to StandardAnalyzer

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless resolved LUCENE-1684.
----------------------------------------

    Resolution: Fixed




[jira] Commented: (LUCENE-1684) Add matchVersion to StandardAnalyzer

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719107#action_12719107 ] 

Michael McCandless commented on LUCENE-1684:
--------------------------------------------

Thanks Marvin; I think the approach works very well.  I plan to commit in a day or two...




[jira] Updated: (LUCENE-1684) Add matchVersion to StandardAnalyzer

Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1684:
---------------------------------------

    Attachment: LUCENE-1684.patch




[jira] Commented: (LUCENE-1684) Add matchVersion to StandardAnalyzer

Posted by "Marvin Humphrey (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12718485#action_12718485 ] 

Marvin Humphrey commented on LUCENE-1684:
-----------------------------------------

+1

This approach addresses all of my concerns about 
action-at-a-distance behaviors.

Nice work, Mike.

