Posted to dev@hbase.apache.org by "Kay Kay (JIRA)" <ji...@apache.org> on 2010/01/11 21:38:54 UTC

[jira] Created: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0

Upgrading Lucene 2.2 to Lucene 3.0.0 
-------------------------------------

                 Key: HBASE-2107
                 URL: https://issues.apache.org/jira/browse/HBASE-2107
             Project: Hadoop HBase
          Issue Type: Improvement
            Reporter: Kay Kay


HBase has a utility to export columns as Lucene indices (o.a.h.hbase.mapreduce.BuildTableIndex).

This patch bumps the Lucene version in libraries.properties and addresses some deprecations as part of the move.

Rationale for the upgrade:
====================

A lot has happened in Lucene since 2.2, including improved performance and a recent focus on NRT (Near Real Time) search. Hence we need to keep up and make the utility publish indices for the new version.

Caveats:
=======
An index created by Lucene 3.0 is *not backward-compatible* with Lucene 2.2 code. In other words, as part of this upgrade, indices need to be created all over again, and the libraries interacting with the index (readers / searchers) need to be upgraded to the new version as well.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reassigned HBASE-2107:
----------------------------

    Assignee: Kay Kay



[jira] Updated: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Kay updated HBASE-2107:
---------------------------

    Attachment: HBASE-2107.patch



[jira] Commented: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798894#action_12798894 ] 

stack commented on HBASE-2107:
------------------------------

Thanks for the patch, Kay Kay.



[jira] Updated: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Kay updated HBASE-2107:
---------------------------

    Attachment: HBASE-2107.patch



Re: [jira] Resolved: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0

Posted by Kay Kay <ka...@gmail.com>.
Thanks, stack, for helping commit this.

Out of curiosity: this Lucene tool (BuildTableIndex) seems totally
tangential to HBase, albeit a useful tool for sure. Can it be moved to a
separate contrib to lessen the dependencies of the core HBase tree?
Thoughts?


On 1/11/10 2:30 PM, stack (JIRA) wrote:
>       [ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> stack resolved HBASE-2107.
> --------------------------
>
>         Resolution: Fixed
>      Fix Version/s: 0.21.0
>
> Committed.
>
>    


[jira] Resolved: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-2107.
--------------------------

       Resolution: Fixed
    Fix Version/s: 0.21.0

Committed.



[jira] Commented: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798865#action_12798865 ] 

stack commented on HBASE-2107:
------------------------------

When I try the patch, I get this:

{code}
compile-core-test:
    [javac] Compiling 98 source files to /Users/stack/checkouts/hbase/trunk/build/test
    [javac] /Users/stack/checkouts/hbase/trunk/src/test/org/apache/hadoop/hbase/mapreduce/DisabledBecauseVariableSubstTooLargeExceptionTestTableIndex.java:208: cannot find symbol
    [javac] symbol  : constructor IndexSearcher(java.lang.String)
    [javac] location: class org.apache.lucene.search.IndexSearcher
    [javac]         searcher = new IndexSearcher((new File(indexDirs[0].getPath().
    [javac]                    ^
    [javac] /Users/stack/checkouts/hbase/trunk/src/test/org/apache/hadoop/hbase/mapreduce/DisabledBecauseVariableSubstTooLargeExceptionTestTableIndex.java:213: cannot find symbol
    [javac] symbol  : constructor IndexSearcher(java.lang.String)
    [javac] location: class org.apache.lucene.search.IndexSearcher
    [javac]           searchers[i] = new IndexSearcher((new File(indexDirs[i].getPath().
    [javac]                          ^
    [javac] /Users/stack/checkouts/hbase/trunk/src/test/org/apache/hadoop/hbase/mapreduce/DisabledBecauseVariableSubstTooLargeExceptionTestTableIndex.java:238: cannot find symbol
    [javac] symbol  : method search(org.apache.lucene.search.TermQuery)
    [javac] location: class org.apache.lucene.search.Searcher
    [javac]         int hitCount = searcher.search(new TermQuery(term)).length();
    [javac]                                ^
    [javac] Note: Some input files use or override a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] Note: /Users/stack/checkouts/hbase/trunk/src/test/org/apache/hadoop/hbase/HBaseTestCase.java uses unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.
    [javac] 3 errors
{code}

It's a disabled test.  I can remove it, but if you are going to play with Lucene, you might want to pay attention to it.  The way configuration was passed to Lucene was by embedding a bunch of XML into a Configuration property.  On occasion we were seeing Hadoop complain because too many substitutions/interpolations were happening; it has a max of 20 or so.  (The substitution I'm referring to is the feature where ${hadoop.tmp.dir} gets replaced by whatever the value of the hadoop.tmp.dir property is.)
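
To illustrate the substitution limit mentioned above, here is a minimal, self-contained sketch. It is not Hadoop's actual Configuration code: the class name SubstSketch, the method substitute, and the exact cap of 20 are assumptions chosen for illustration, based only on the "max of 20 or so" figure. It expands one ${name} reference per pass and gives up once the pass budget is spent, which is roughly how a property value with many embedded references could trip the limit.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Simplified stand-in for property expansion; NOT Hadoop's real Configuration class. */
final class SubstSketch {
    /** Assumed cap, taken from the "max of 20 or so" figure in the comment above. */
    static final int MAX_SUBST = 20;

    /** Matches a single ${name} reference (no nested braces, no whitespace). */
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}$\\s]+)\\}");

    /**
     * Expands ${name} references in expr using props, one reference per pass.
     * A string that still needs work after MAX_SUBST passes is rejected.
     */
    static String substitute(Map<String, String> props, String expr) {
        String eval = expr;
        for (int pass = 0; pass < MAX_SUBST; pass++) {
            Matcher m = VAR.matcher(eval);
            if (!m.find()) {
                return eval; // nothing left to expand
            }
            String value = props.get(m.group(1));
            if (value == null) {
                return eval; // unknown property: leave the reference untouched
            }
            // Replace only the first reference, then rescan from the start so
            // that references inside the substituted value are expanded too.
            eval = eval.substring(0, m.start()) + value + eval.substring(m.end());
        }
        throw new IllegalStateException(
            "Variable substitution depth too large: " + MAX_SUBST + " " + expr);
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("hadoop.tmp.dir", "/tmp/hadoop");
        // prints /tmp/hadoop/index
        System.out.println(substitute(props, "${hadoop.tmp.dir}/index"));

        // A value with many references exhausts the pass budget.
        StringBuilder many = new StringBuilder();
        for (int i = 0; i < 25; i++) {
            many.append("${hadoop.tmp.dir};");
        }
        try {
            substitute(props, many.toString());
        } catch (IllegalStateException e) {
            System.out.println("rejected: too many substitutions");
        }
    }
}
```

Under these assumptions, a short value such as ${hadoop.tmp.dir}/index expands fine, while a large blob carrying dozens of references fails no matter how simple each one is.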



[jira] Commented: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798886#action_12798886 ] 

Kay Kay commented on HBASE-2107:
--------------------------------

For now, I have just fixed the compilation error, but left the test disabled as before.




[jira] Commented: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798867#action_12798867 ] 

Kay Kay commented on HBASE-2107:
--------------------------------

Oops, sorry, my bad. Let me add the src/test/** changes to the patch to fix this.



[jira] Commented: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798883#action_12798883 ] 

stack commented on HBASE-2107:
------------------------------

For the exception that caused us to disable this test, see HBASE-26.  HBASE-172 has the actual stack trace.



[jira] Updated: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kay Kay updated HBASE-2107:
---------------------------

    Attachment: HBASE-2107.patch

Auto-generated code stub comments removed. 



[jira] Commented: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798853#action_12798853 ] 

Kay Kay commented on HBASE-2107:
--------------------------------

Deprecated methods / removals:

* Field.Index.TOKENIZED has been renamed to Field.Index.ANALYZED (with the former removed altogether).
* IndexWriter ctor: IndexWriter(String,Analyzer,boolean) has been removed in favor of Directory-based input. In our case, we prefer the file-based Directory implementation.



