Posted to solr-user@lucene.apache.org by Bruno Mannina <bm...@free.fr> on 2014/02/18 13:28:13 UTC

Indexed a new big database while the old is running?

Dear Solr Users,

We currently have a Solr index with around 88,000,000 docs.
Everything works fine :)

Each year we receive a new backfile with the same content (but improved).

Indexing these docs takes several days on Solr,
so is it possible to create a new collection (restarting Solr) and
index these new 88,000,000 docs without stopping the current collection?

We have around 1 million connections per month.

Do you think this new indexing could cause problems for normal use of Solr?
Note: the new database will not be used until the current collection is
stopped.

Thanks for your comments,
Bruno


Re: Indexed a new big database while the old is running?

Posted by Bruno Mannina <bm...@free.fr>.
Hi Shawn,

Thanks for your answer.

Currently we have no performance problems because we only do select requests.
We have 4 CPUs, 8 cores, 24 GB RAM.

I know how to create an alias; my question was only about performance,
and you are right:
it is impossible to answer this question without more information about my
system, sorry.

I will run a real test and check whether performance drops; if it does, I will
stop the new indexing.
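
A minimal sketch of such a check, assuming select latency is the metric of interest. The helper names, the sample count, and the 1.5x tolerance are illustrative assumptions, and run_query stands for whatever callable sends one select request to the live collection:

```python
import time
from statistics import median


def median_latency(run_query, samples=20):
    """Return the median wall-clock time (seconds) of `samples` calls to run_query.

    run_query is a hypothetical callable that sends one select request.
    """
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return median(timings)


def is_degraded(baseline, current, tolerance=1.5):
    """True if current latency exceeds the baseline by more than the tolerance factor.

    The default factor of 1.5 is an arbitrary assumption, not a Solr recommendation.
    """
    return current > baseline * tolerance
```

One way to use it: record median_latency before the new indexing starts, measure again while it runs, and stop the indexing if is_degraded returns True.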

If you have more information about indexing performance with my
server config, don't hesitate to
write to me. :)

Have a nice day,

Regards,
Bruno


On 18/02/2014 16:30, Shawn Heisey wrote:
> On 2/18/2014 5:28 AM, Bruno Mannina wrote:
>> We have actually a SOLR db with around 88 000 000 docs.
>> All work fine :)
>>
>> We receive each year a new backfile with the same content (but improved).
>>
>> Index these docs takes several days on SOLR,
>> So is it possible to create a new collection (restart SOLR) and
>> Index these new 88 000 000 docs without stopping the current collection ?
>>
>> We have around 1 million connections by month.
>>
>> Do you think that this new indexation may cause problem to SOLR using?
>> Note: new database will not be used until the current collection will be
>> stopped.
> You can instantly switch between collections by using the alias feature.
>   To do this, you would have collections named something like test201302
> and test201402, then you would create an alias named 'test' that points
> to one of these collections.  Your code can use 'test' as the collection
> name.
>
> Without a lot more information, it's impossible to say whether building
> a new collection will cause performance problems for the existing
> collection.
>
> It does seem like a problem that rebuilding the index takes several
> days.  You might already be having performance problems.  It's also
> possible that there's an aspect to this that I am not seeing, and that
> several days is perfectly normal for YOUR index.
>
> Not enough RAM is the most common reason for performance issues on a
> large index:
>
> http://wiki.apache.org/solr/SolrPerformanceProblems
>
> Thanks,
> Shawn
>
>
>


Re: Indexed a new big database while the old is running?

Posted by Shawn Heisey <so...@elyograg.org>.
On 2/18/2014 5:28 AM, Bruno Mannina wrote:
> We have actually a SOLR db with around 88 000 000 docs.
> All work fine :)
> 
> We receive each year a new backfile with the same content (but improved).
> 
> Index these docs takes several days on SOLR,
> So is it possible to create a new collection (restart SOLR) and
> Index these new 88 000 000 docs without stopping the current collection ?
> 
> We have around 1 million connections by month.
> 
> Do you think that this new indexation may cause problem to SOLR using?
> Note: new database will not be used until the current collection will be
> stopped.

You can instantly switch between collections by using the alias feature.
To do this, you would have collections named something like test201302
and test201402, then you would create an alias named 'test' that points
to one of these collections.  Your code can use 'test' as the collection
name.
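
As a rough sketch of the alias switch, assuming Solr runs in SolrCloud mode (collection aliases are a SolrCloud feature) and is reachable at the hypothetical base URL below, the Collections API request can be built like this. The function name and base URL are illustrative assumptions; the alias and collection names are the ones from the example above:

```python
from urllib.parse import urlencode


def create_alias_url(base_url, alias, collection):
    """Build a Collections API CREATEALIAS request URL.

    Sending this request for an alias that already exists repoints it,
    which is what makes the switch between collections immediate.
    """
    params = {"action": "CREATEALIAS", "name": alias, "collections": collection}
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Hypothetical base URL; adjust host, port and path to your installation.
SOLR = "http://localhost:8983/solr"

# Point the alias 'test' at the newly built collection.
print(create_alias_url(SOLR, "test", "test201402"))
```

Requesting the printed URL (with a browser, curl, or urllib.request.urlopen) after the new collection is fully built moves 'test' from test201302 to test201402 without client changes.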

Without a lot more information, it's impossible to say whether building
a new collection will cause performance problems for the existing
collection.

It does seem like a problem that rebuilding the index takes several
days.  You might already be having performance problems.  It's also
possible that there's an aspect to this that I am not seeing, and that
several days is perfectly normal for YOUR index.

Not enough RAM is the most common reason for performance issues on a
large index:

http://wiki.apache.org/solr/SolrPerformanceProblems

Thanks,
Shawn