Posted to solr-user@lucene.apache.org by Pravin Karne <pr...@persistent.co.in> on 2009/10/09 14:40:56 UTC

Does Solr support distributed index storage?

Hi,
I am new to Solr. I have configured Solr successfully and it is working smoothly.

I have one question:

I want to index a large amount of data (around 100GB). Can we store the index on different machines as a distributed system?

There would be one master and several slaves, and we would have to keep the data in sync across all the nodes.

So when I send an update request, Solr will update that record on the corresponding node.

In short, I want to create a scalable and optimal search system.

Is this possible with Solr?

Please help with this. Any pointers would be highly appreciated.

Thanks in advance


-Pravin

DISCLAIMER
==========
This e-mail may contain privileged and confidential information which is the property of Persistent Systems Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Ltd. does not accept any liability for virus infected mails.

RE: Does Solr support distributed index storage?

Posted by Pravin Karne <pr...@persistent.co.in>.
I am looking at one large index with 100GB of data.

How can I store this on a distributed system?

-Thanks

-----Original Message-----
From: Shalin Shekhar Mangar [mailto:shalinmangar@gmail.com] 
Sent: Friday, October 09, 2009 6:51 PM
To: solr-user@lucene.apache.org
Subject: Re: Does Solr support distributed index storage?

On Fri, Oct 9, 2009 at 6:10 PM, Pravin Karne <pr...@persistent.co.in> wrote:

> Hi,
> I am new to solr. I have configured solr successfully and its working
> smoothly.
>
> I have one query:
>
> I want index large data(around 100GB).So can we store these indexes on
> different machine as distributed system.
>
>
Are you talking about one large index with 100GB of data? Or do you plan to
shard the data into multiple smaller indexes and use Solr's distributed
search?





Re: Does Solr support distributed index storage?

Posted by Camilo Aguilar <ca...@gmail.com>.
Hi!

I have the same question.

Thanks in advance

On Mon, Oct 12, 2009 at 1:55 PM, Pieter Steyn <pi...@gmail.com> wrote:

> Sorry for the hijack, but is replication necessary when using a cluster
> file system such as GFS2, where the files are the same for any
> instance of Solr?



-- 
  There are two ways of constructing a software design: One way is to make
it so simple that there are obviously no deficiencies, and the other way is
to make it so complicated that there are no obvious deficiencies. The first
method is far more difficult.  -C. A. R. Hoare

  Any fool can write code that a computer can understand. Good programmers
write code that humans can understand. - Martin Fowler

Re: Does Solr support distributed index storage?

Posted by Pieter Steyn <pi...@gmail.com>.
Sorry for the hijack, but is replication necessary when using a cluster
file system such as GFS2, where the files are the same for any
instance of Solr?


On Mon, Oct 12, 2009 at 8:36 PM, Dan Trainor <dt...@toolbox.com> wrote:
> Solr in itself does not replicate.  Instead, Solr relies on an underlying
> rsync setup to keep these indices synced throughout the collective.  When
> you break it down, it's simply rsync with a configuration file making all the
> nodes "aware" that they participate in this configuration.  Wrap a cron job
> around this between all the nodes, and they simply replicate raw data from
> one "master" to one or more slaves.

Re: Does Solr support distributed index storage?

Posted by Dan Trainor <dt...@toolbox.com>.
On 10/12/2009 10:49 AM, Chaitali Gupta wrote:
> Hi,
>
> How should we set up master and slaves in Solr? Which configuration files and parameters do we need to change, and how?
>
> Thanks,
> Chaitali

Hi -

I think Shalin was pretty clear on that; it is documented very well at 
http://wiki.apache.org/solr/SolrReplication .

I am responding, however, to explain something that took me a bit of 
time to wrap my brain around, in the hopes that it helps you and perhaps 
some others.

Solr in itself does not replicate.  Instead, Solr relies on an 
underlying rsync setup to keep these indices synced throughout the 
collective.  When you break it down, it's simply rsync with a 
configuration file making all the nodes "aware" that they participate in 
this configuration.  Wrap a cron job around this between all the nodes, 
and they simply replicate raw data from one "master" to one or more 
slaves.

I would suggest reading up on how snapshots are performed and how the 
log files are created and what they do.  Of course it would benefit you 
to know the ins and outs of all the elements that help Solr replicate, 
but it's been my experience that most of it has to do with those 
particular items.
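
For comparison with the rsync-and-cron approach described above: as far as I know, the SolrReplication wiki page linked earlier covers the HTTP-driven ReplicationHandler that arrives with Solr 1.4, where slaves poll the master over HTTP instead of relying on external scripts. The sketch below only builds the URLs for a few of its commands. The host names and the "/replication" handler path are assumptions for illustration, and the command list is deliberately partial; check the wiki page before relying on any of it.

```python
from urllib.parse import urlencode

# A few ReplicationHandler commands; the wiki page lists more.
KNOWN_COMMANDS = {"indexversion", "details", "fetchindex"}


def replication_url(core_url, command):
    """Build a URL for one replication command on the given Solr instance.

    Assumes the handler is registered at "/replication" in solrconfig.xml.
    """
    if command not in KNOWN_COMMANDS:
        raise ValueError(f"unexpected replication command: {command}")
    return f"{core_url}/replication?{urlencode({'command': command})}"


# Hypothetical hosts: ask the master which index version it has,
# then ask a slave to pull the index right away.
print(replication_url("http://master.example.com:8983/solr", "indexversion"))
print(replication_url("http://slave1.example.com:8983/solr", "fetchindex"))
```

With that handler, the polling interval is configured on the slave, so no cron job is needed for the periodic pull.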

Thanks
-dant


Re: Does Solr support distributed index storage?

Posted by Chaitali Gupta <ch...@yahoo.com>.
Hi, 

How should we set up master and slaves in Solr? Which configuration files and parameters do we need to change, and how?

Thanks, 
Chaitali 

--- On Mon, 10/12/09, Shalin Shekhar Mangar <sh...@gmail.com> wrote:

From: Shalin Shekhar Mangar <sh...@gmail.com>
Subject: Re: Does Solr support distributed index storage?
To: solr-user@lucene.apache.org
Date: Monday, October 12, 2009, 3:17 AM

On Mon, Oct 12, 2009 at 10:27 AM, Pravin Karne <
pravin_karne@persistent.co.in> wrote:

> How to set master/slave setup for solr.
>
>
Index documents only on the master. Put the slaves behind a load balancer
and query only on slaves. Setup replication between the master and slaves.
See http://wiki.apache.org/solr/SolrReplication

-- 
Regards,
Shalin Shekhar Mangar.



      

Re: Does Solr support distributed index storage?

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
On Mon, Oct 12, 2009 at 10:27 AM, Pravin Karne <
pravin_karne@persistent.co.in> wrote:

> How to set master/slave setup for solr.
>
>
Index documents only on the master. Put the slaves behind a load balancer
and query only the slaves. Set up replication between the master and slaves.
See http://wiki.apache.org/solr/SolrReplication
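
The split described above (writes go to the master, reads go to the slaves) can be sketched from the client side as follows. This is only an illustration under assumptions: the host names are hypothetical, the `SolrEndpoints` class is made up for this example, and the round-robin rotation stands in for what a real load balancer would do.

```python
from itertools import cycle
from urllib.parse import urlencode


class SolrEndpoints:
    """Pick the right URL for each kind of request in a master/slave setup."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = cycle(slaves)  # simple round-robin over the slaves

    def update_url(self):
        # All indexing traffic (add/delete/commit) goes to the master only.
        return f"{self.master}/update"

    def select_url(self, query):
        # Queries go to the slaves only, one after another.
        slave = next(self._slaves)
        return f"{slave}/select?{urlencode({'q': query})}"


endpoints = SolrEndpoints(
    "http://master.example.com:8983/solr",
    ["http://slave1.example.com:8983/solr", "http://slave2.example.com:8983/solr"],
)
print(endpoints.update_url())
print(endpoints.select_url("solr"))
print(endpoints.select_url("solr"))
```

Replication itself is not shown here; it is configured on the Solr side as described on the wiki page.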

-- 
Regards,
Shalin Shekhar Mangar.

RE: Does Solr support distributed index storage?

Posted by Pravin Karne <pr...@persistent.co.in>.
How do I set up a master/slave configuration for Solr?

What are the configuration steps for this?



Re: Does Solr support distributed index storage?

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
On Fri, Oct 9, 2009 at 6:10 PM, Pravin Karne <pr...@persistent.co.in> wrote:

> Hi,
> I am new to solr. I have configured solr successfully and its working
> smoothly.
>
> I have one query:
>
> I want index large data(around 100GB).So can we store these indexes on
> different machine as distributed system.
>
>
Are you talking about one large index with 100GB of data? Or do you plan to
shard the data into multiple smaller indexes and use Solr's distributed
search?
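
To illustrate the second option: with distributed search, a query is sent to one Solr instance together with a `shards` parameter listing every shard, and that instance merges the results. A minimal sketch of building such a request URL follows; the host names are hypothetical and the helper function is made up for this example.

```python
from urllib.parse import urlencode

# Hypothetical shard locations, written as host:port/path without the scheme.
SHARDS = ["shard1.example.com:8983/solr", "shard2.example.com:8983/solr"]


def distributed_query_url(base_url, query, shards, rows=10):
    """Build a select URL that asks one Solr node to query all listed shards."""
    params = {"q": query, "rows": rows, "shards": ",".join(shards)}
    return f"{base_url}/select?{urlencode(params)}"


url = distributed_query_url("http://shard1.example.com:8983/solr", "title:solr", SHARDS)
print(url)
```

Any shard can receive the request; the one that does acts as the aggregator for that query.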


> So there will be one master and more slave . Also we have to keep these
> data in sync over all the node.
>
> So when I send update request solr will update that record from
> corresponding node.
>
>
Solr will not update the corresponding node automatically. You have to make
sure to send the add/delete request to the master of the correct shard. Solr
does not support an update operation (it is always a replace by uniqueKey).
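
One way to make sure a document always reaches the master of the same shard is to derive the shard from the uniqueKey on the client. The sketch below assumes a hash-based scheme (CRC32 modulo the number of shards), which is a choice made for this example, not something Solr prescribes; the master URLs and function names are hypothetical.

```python
import zlib
from xml.sax.saxutils import escape, quoteattr

# Hypothetical master URLs, one per shard.
SHARD_MASTERS = [
    "http://master-shard0.example.com:8983/solr",
    "http://master-shard1.example.com:8983/solr",
]


def shard_for(unique_key, num_shards):
    """Map a uniqueKey to a shard index with a stable hash (CRC32)."""
    return zlib.crc32(unique_key.encode("utf-8")) % num_shards


def add_request(doc):
    """Return (update_url, xml_body) for a document dict with an 'id' field.

    Re-sending a document with an existing id replaces the old one entirely,
    so the full document must be sent even if only one field changed.
    """
    master = SHARD_MASTERS[shard_for(doc["id"], len(SHARD_MASTERS))]
    fields = "".join(
        f"<field name={quoteattr(name)}>{escape(str(value))}</field>"
        for name, value in doc.items()
    )
    return f"{master}/update", f"<add><doc>{fields}</doc></add>"


def delete_request(unique_key):
    """Return (update_url, xml_body) that deletes a document by its id."""
    master = SHARD_MASTERS[shard_for(unique_key, len(SHARD_MASTERS))]
    return f"{master}/update", f"<delete><id>{escape(unique_key)}</id></delete>"


print(add_request({"id": "doc-1", "title": "a & b"}))
print(delete_request("doc-1"))
```

Because the add and the delete for a given id hash to the same shard, a later replace or delete lands on the master that holds the earlier copy. Note that changing the number of shards changes the mapping, so that would require reindexing.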


> In short I want to create scalable and optimal search system.
>
> Is this possible with solr?
>
>
Of course you can create a scalable and optimal search system with Solr. We
do that all the time ;)

-- 
Regards,
Shalin Shekhar Mangar.