Posted to solr-user@lucene.apache.org by Max Lynch <ih...@gmail.com> on 2010/08/03 23:38:03 UTC

Duplicate a core

Is it possible to duplicate a core?  I want to have one core contain only
documents within a certain date range (ex: 3 days old), and one core with
all documents that have ever been in the first core.  The small core is then
replicated to other servers which do "real-time" processing on it, but the
"archive" core exists for longer term searching.

I understand I could just connect to both cores from my indexer, but I would
like to not have to send duplicate documents across the network to save
bandwidth.

Is this possible?

Thanks.

Re: Duplicate a core

Posted by Max Lynch <ih...@gmail.com>.
What I'm doing now is just adding the documents to the other core each night
and deleting old documents from the other core when I'm finished.  Is there
a better way?

Re: Duplicate a core

Posted by Chris Hostetter <ho...@fucit.org>.
: Is it possible to duplicate a core?  I want to have one core contain only
: documents within a certain date range (ex: 3 days old), and one core with
: all documents that have ever been in the first core.  The small core is then
: replicated to other servers which do "real-time" processing on it, but the
: "archive" core exists for longer term searching.

It's not something i've ever dealt with, but if i were going to pursue it 
i would investigate whether this works...

1) have three+ solr instances: "master", "archive" and one or more "query" 
   machines
2) index everything to core named "recent" on server "master"
3) configure the "query" machines to replicate "recent" from "master"
4) configure the "archive" machine to replicate "recent" from "master"
5) configure the "archive" machine to also have an "all" core
6) on some timed basis:
   - delete docs from "recent" on "master" that are *older* than X
   - delete docs from "recent" on "archive" that are *newer* than X
   - use the index merge command on "archive" to merge the "recent" 
     core into the "all" core
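
A rough sketch of what the timed job in step 6 might send, in Python. It only 
builds the request bodies and the URL; it does not talk to Solr. The field 
name "timestamp", the host "archive:8983", the cutoff value and the index 
path are all made up for illustration, and the exclusive {X TO *} range 
assumes your query parser accepts that form. The mergeindexes action with 
the core and indexDir parameters is the CoreAdmin merge command.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import escape


def older_than(field, cutoff):
    # Range query matching docs at or before the cutoff (inclusive).
    return "%s:[* TO %s]" % (field, cutoff)


def newer_than(field, cutoff):
    # Range query matching docs strictly after the cutoff (exclusive),
    # so a doc exactly at the cutoff is only ever treated as "older".
    return "%s:{%s TO *}" % (field, cutoff)


def delete_query_xml(query):
    # Body for a delete-by-query POST to a core's /update handler.
    return "<delete><query>%s</query></delete>" % escape(query)


def merge_url(solr_base, target_core, index_dir):
    # CoreAdmin URL that merges the index at index_dir into target_core.
    params = urlencode({
        "action": "mergeindexes",
        "core": target_core,
        "indexDir": index_dir,
    })
    return "%s/admin/cores?%s" % (solr_base.rstrip("/"), params)


if __name__ == "__main__":
    cutoff = "2010-08-01T00:00:00Z"  # hypothetical value of X
    # POST to "recent" on "master":
    print(delete_query_xml(older_than("timestamp", cutoff)))
    # POST to "recent" on "archive":
    print(delete_query_xml(newer_than("timestamp", cutoff)))
    # then merge "recent" into "all" on "archive" (hypothetical path):
    print(merge_url("http://archive:8983/solr", "all",
                    "/var/solr/recent/data/index"))
```

Using one fixed cutoff for both deletes (rather than NOW-based date math 
evaluated at two different moments) keeps the two ranges from overlapping 
or leaving a gap. Remember that a commit is needed after each delete and 
after the merge before the changes are visible.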


...i'm pretty sure that merge command will require that you shut down both 
cores on archive during the merge, but that's a good idea anyway.

if you need continuous searching of the "all" core to be available, then 
just set up that core on "archive" as a repeater and have some 
"archive-query" machines slaving off of it.


that should work.



-Hoss