Posted to dev@hbase.apache.org by "Kay Kay (JIRA)" <ji...@apache.org> on 2010/01/11 21:38:54 UTC
[jira] Created: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0
Upgrading Lucene 2.2 to Lucene 3.0.0
-------------------------------------
Key: HBASE-2107
URL: https://issues.apache.org/jira/browse/HBASE-2107
Project: Hadoop HBase
Issue Type: Improvement
Reporter: Kay Kay
HBase has a utility to export columns as Lucene indices (o.a.h.hbase.mapreduce.BuildTableIndex).
This patch raises the Lucene version in libraries.properties and addresses the deprecations that stand in the way of the move.
Rationale for the upgrade:
==========================
A lot has happened in Lucene since 2.2, including improved performance and a recent focus on NRT (near-real-time) search. We need to keep up and make the utility publish indices for the new version.
Caveats:
=======
An index created by Lucene 3.0 is *not backward-compatible* with Lucene 2.2 code. In other words, as part of this upgrade, indices need to be created all over again, and the libraries interacting with the index (readers / searchers) need to be upgraded to the new version as well.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack reassigned HBASE-2107:
----------------------------
Assignee: Kay Kay
[jira] Updated: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0
Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kay Kay updated HBASE-2107:
---------------------------
Attachment: HBASE-2107.patch
[jira] Commented: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798894#action_12798894 ]
stack commented on HBASE-2107:
------------------------------
Thanks for the patch, Kay Kay.
[jira] Updated: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0
Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kay Kay updated HBASE-2107:
---------------------------
Attachment: HBASE-2107.patch
Re: [jira] Resolved: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0
Posted by Kay Kay <ka...@gmail.com>.
Thanks, stack, for helping commit this.
Out of curiosity: this Lucene tool (BuildTableIndex) seems totally
tangential to HBase, albeit a useful tool for sure. Can it live in a
separate contrib to lessen the dependencies of the core HBase tree?
Thoughts?
On 1/11/10 2:30 PM, stack (JIRA) wrote:
> [ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> stack resolved HBASE-2107.
> --------------------------
>
> Resolution: Fixed
> Fix Version/s: 0.21.0
>
> Committed.
[jira] Resolved: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack resolved HBASE-2107.
--------------------------
Resolution: Fixed
Fix Version/s: 0.21.0
Committed.
[jira] Commented: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798865#action_12798865 ]
stack commented on HBASE-2107:
------------------------------
When I try the patch, I get this:
{code}
compile-core-test:
[javac] Compiling 98 source files to /Users/stack/checkouts/hbase/trunk/build/test
[javac] /Users/stack/checkouts/hbase/trunk/src/test/org/apache/hadoop/hbase/mapreduce/DisabledBecauseVariableSubstTooLargeExceptionTestTableIndex.java:208: cannot find symbol
[javac] symbol : constructor IndexSearcher(java.lang.String)
[javac] location: class org.apache.lucene.search.IndexSearcher
[javac] searcher = new IndexSearcher((new File(indexDirs[0].getPath().
[javac] ^
[javac] /Users/stack/checkouts/hbase/trunk/src/test/org/apache/hadoop/hbase/mapreduce/DisabledBecauseVariableSubstTooLargeExceptionTestTableIndex.java:213: cannot find symbol
[javac] symbol : constructor IndexSearcher(java.lang.String)
[javac] location: class org.apache.lucene.search.IndexSearcher
[javac] searchers[i] = new IndexSearcher((new File(indexDirs[i].getPath().
[javac] ^
[javac] /Users/stack/checkouts/hbase/trunk/src/test/org/apache/hadoop/hbase/mapreduce/DisabledBecauseVariableSubstTooLargeExceptionTestTableIndex.java:238: cannot find symbol
[javac] symbol : method search(org.apache.lucene.search.TermQuery)
[javac] location: class org.apache.lucene.search.Searcher
[javac] int hitCount = searcher.search(new TermQuery(term)).length();
[javac] ^
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: /Users/stack/checkouts/hbase/trunk/src/test/org/apache/hadoop/hbase/HBaseTestCase.java uses unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 3 errors
{code}
It's a disabled test. I can remove it, but if you are going to play with Lucene, you might want to pay attention to it. The configuration was passed to Lucene by embedding a bunch of XML into a Configuration property. On occasion Hadoop would complain because too many substitutions/interpolations were happening; it has a max of 20 or so. (The substitution I'm referring to is the feature where ${hadoop.tmp.dir} gets replaced by whatever the value of the hadoop.tmp.dir property is.)
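As a rough illustration of the substitution limit described above, here is a minimal, self-contained Java sketch. It is not Hadoop's Configuration code: the class name SubstitutionSketch, the method substitute, and the exact cap of 20 are assumptions made for the example (the comment above only says "20 or so"). It shows why a single property value carrying a large XML blob with many ${...} references can run into such a cap.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a capped ${var} expansion; NOT Hadoop's own code.
class SubstitutionSketch {
    // Assumed cap, mirroring the "max of 20 or so" mentioned in the thread.
    static final int MAX_SUBST = 20;

    // Matches ${name}, where name contains neither '}' nor '$'.
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}$]+)\\}");

    // Expands ${name} references one at a time, failing once more than
    // MAX_SUBST replacements would be needed.
    static String substitute(String expr, Map<String, String> props) {
        String eval = expr;
        int rounds = 0;
        Matcher m;
        while ((m = VAR.matcher(eval)).find()) {
            if (rounds++ == MAX_SUBST) {
                throw new IllegalStateException(
                    "Variable substitution depth too large: " + MAX_SUBST);
            }
            String value = props.get(m.group(1));
            if (value == null) {
                return eval; // unknown variable: leave the remaining text as-is
            }
            // Plain concatenation avoids regex replacement-string escaping issues.
            eval = eval.substring(0, m.start()) + value + eval.substring(m.end());
        }
        return eval;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("hadoop.tmp.dir", "/tmp/hadoop");

        // A single reference expands without trouble.
        System.out.println(substitute("${hadoop.tmp.dir}/index", props));

        // An XML-like blob with 25 references exceeds the assumed cap.
        StringBuilder blob = new StringBuilder();
        for (int i = 0; i < 25; i++) {
            blob.append("<dir>${hadoop.tmp.dir}</dir>");
        }
        try {
            substitute(blob.toString(), props);
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

In this sketch the failure depends only on how many references appear in one value, which is why packing a whole index configuration into one property is fragile; splitting it across several properties, or avoiding ${...} inside the embedded XML, would sidestep the limit.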
[jira] Commented: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0
Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798886#action_12798886 ]
Kay Kay commented on HBASE-2107:
--------------------------------
For now, I have just fixed the compilation errors but left the test disabled as before.
[jira] Commented: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0
Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798867#action_12798867 ]
Kay Kay commented on HBASE-2107:
--------------------------------
Oops. Sorry. My bad. Let me add more (src/test/**) to fix this.
[jira] Commented: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798883#action_12798883 ]
stack commented on HBASE-2107:
------------------------------
For the exception that caused us to disable this test, see HBASE-26. HBASE-172 has the actual stack trace.
[jira] Updated: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0
Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kay Kay updated HBASE-2107:
---------------------------
Attachment: HBASE-2107.patch
Auto-generated code stub comments removed.
[jira] Commented: (HBASE-2107) Upgrading Lucene 2.2 to Lucene 3.0.0
Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798853#action_12798853 ]
Kay Kay commented on HBASE-2107:
--------------------------------
Deprecated methods / removals:
* Field.Index.TOKENIZED was renamed to Field.Index.ANALYZED (with the former removed altogether).
* IndexWriter ctor: IndexWriter(String, Analyzer, boolean) was removed in favor of Directory-based input. In our case, we prefer the file-based Directory implementation.