Posted to solr-user@lucene.apache.org by Jaikit Savla <ja...@yahoo.com.INVALID> on 2015/01/14 09:42:51 UTC
Load existing Lucene sharded indexes onto single Solr collection
Folks,
I have generated multiple (count of 100) sharded Lucene indexes on Hadoop, laid out as shown below. The total indexed data (sum of all the index-*) is 500GB in size, hence the number of shards.

drwxr-x--- 2 index-66
drwxr-x--- 2 index-68
drwxr-x--- 2 index-9
........

Each index directory has this format:

ls index-9
_4.fdt _4.fdx _4.fnm _4_Lucene40_0.frq _4_Lucene40_0.prx _4_Lucene40_0.tim _4_Lucene40_0.tip _4_nrm.cfe _4_nrm.cfs _4.si segments_1 segments.gen write.lock
Now, to load this index, I am currently using the Lucene IndexMergeTool to merge all the shards into one giant index. My question is: is there a way to load the sharded indexes onto a single collection without merging them into one giant index?
Thanks,
Jaikit
Re: Load existing Lucene sharded indexes onto single Solr collection
Posted by Jaikit Savla <ja...@yahoo.com.INVALID>.
Yes, I wanted to get rid of the merge step. But it looks like the merge is not that cumbersome either. Thanks, Mikhail and Erick, for the pointers; that helped.
Jaikit
Re: Load existing Lucene sharded indexes onto single Solr collection
Posted by Erick Erickson <er...@gmail.com>.
You certainly can't do this into a single directory; there would be
zillions of name conflicts.
I believe I saw Uwe make a comment on the Lucene list about using
MultiReaders and keeping the sub-indexes in different directories, but
that's lower-level than Solr has access to.
Plus, you'd have to control index updates _very_ carefully.
So I don't think there's anything built into Solr to work with indexes
like this, so merge is probably your only option here.
Do note that the contrib MapReduceIndexerTool will do most all of this
for you; it includes a --go-live option. That option still copies things
around, though.
Best,
Erick
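To illustrate the name-conflict point: Lucene index directories typically reuse the same file names (segments.gen, write.lock, low-numbered segment files), so shards copied into one directory would overwrite each other. Below is a minimal Python sketch, assuming the file names from the index-9 listing in the original post. The contents of index-66 are hypothetical, since only index-9 was listed, and colliding_names is an illustrative helper, not part of Lucene or Solr.

```python
from collections import Counter

# File names copied from the `ls index-9` listing in the original post.
INDEX_9_FILES = [
    "_4.fdt", "_4.fdx", "_4.fnm",
    "_4_Lucene40_0.frq", "_4_Lucene40_0.prx",
    "_4_Lucene40_0.tim", "_4_Lucene40_0.tip",
    "_4_nrm.cfe", "_4_nrm.cfs", "_4.si",
    "segments_1", "segments.gen", "write.lock",
]


def colliding_names(shards):
    """Return the sorted file names that occur in more than one shard."""
    counts = Counter(
        name for files in shards.values() for name in set(files)
    )
    return sorted(name for name, n in counts.items() if n > 1)


# Assumption: index-66 was not listed in the post; these names are a
# hypothetical subset chosen only to show the overlap.
shards = {
    "index-9": INDEX_9_FILES,
    "index-66": ["_4.fdt", "segments_1", "segments.gen", "write.lock"],
}

print(colliding_names(shards))
```

Any name this reports would be clobbered if both shards were copied into one directory, which is why the shards need either a real merge or separate directories.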
Re: Load existing Lucene sharded indexes onto single Solr collection
Posted by Jaikit Savla <ja...@yahoo.com.INVALID>.
This solution will merge the index as well. I want to find out whether a merge is "required" before loading indexes onto Solr. If loading without a merge is possible, then I can just point solrconfig.xml to the directory where I have all the shards.
Jaikit
On Wednesday, January 14, 2015 1:11 AM, Mikhail Khludnev <mk...@griddynamics.com> wrote:
> https://cwiki.apache.org/confluence/display/solr/CoreAdmin+API#CoreAdminAPI-MERGEINDEXES ?
Re: Load existing Lucene sharded indexes onto single Solr collection
Posted by Mikhail Khludnev <mk...@griddynamics.com>.
On Wed, Jan 14, 2015 at 11:42 AM, Jaikit Savla <
jaikit.savla@yahoo.com.invalid> wrote:
> Now to load this index, I am currently using Lucene IndexMergeTool to
> merge all the shards into one giant index. My question is, is there a way
> to load sharded index without merging into one giant index on to single
> collection ?
https://cwiki.apache.org/confluence/display/solr/CoreAdmin+API#CoreAdminAPI-MERGEINDEXES ?
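
For reference, a MERGEINDEXES request for shards like these could be built roughly as follows. This is a minimal Python sketch that only constructs the URL. The host, target core name, and shard paths are hypothetical, and merge_indexes_url is an illustrative helper rather than anything shipped with Solr. It assumes the target core already exists, and it is still a merge, not a way to skip one.

```python
from urllib.parse import urlencode


def merge_indexes_url(solr_base, target_core, index_dirs):
    """Build a CoreAdmin MERGEINDEXES URL merging index_dirs into target_core."""
    params = [("action", "mergeindexes"), ("core", target_core)]
    # The indexDir parameter is repeated once per source index directory.
    params += [("indexDir", path) for path in index_dirs]
    return solr_base.rstrip("/") + "/admin/cores?" + urlencode(params)


# Hypothetical values: adjust host, core name, and paths to your setup.
url = merge_indexes_url(
    "http://localhost:8983/solr",
    "collection1",
    ["/data/shards/index-9", "/data/shards/index-66"],
)
print(url)
```

The resulting URL can be issued with any HTTP client. A commit on the target core is normally needed afterwards before the merged documents become searchable.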
--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics
<http://www.griddynamics.com>
<mk...@griddynamics.com>