Posted to solr-user@lucene.apache.org by "Dilip.TS" <di...@starmarksv.com> on 2008/01/16 09:51:10 UTC
RE: Solr replication
Hi Bill,
I have some questions regarding the SOLR collection distribution.
1) Is it possible to perform index operations on the slave server using
SOLR collection distribution and still have the master server updated with
these changes?
2) I have a requirement of having more than one solr instance (with a
corresponding data directory for each solr core). Is it possible to maintain
different solr cores and still achieve SOLR collection distribution for all
of these cores independently? If yes, then how?
Regards,
Dilip
-----Original Message-----
From: Bill Au [mailto:bill.w.au@gmail.com]
Sent: Monday, January 14, 2008 9:40 PM
To: dilip.ts@starmarksv.com
Subject: Re: Solr replication
Yes, you need the same changes in scripts.conf on the slave server but you
don't need the post commit hook enabled on the slave server.
The post commit hook is used to create snapshots. You will see a new
snapshot in the data directory every time you do a commit on the master
server. There is no need to create snapshots on the slave server as the
slave server copies the snapshots from the master server.
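As a rough sketch of what that could look like, here is a scripts.conf for the slave that reuses the keys and values from the master file quoted below. It assumes the slave keeps its index under the same local path as the master and runs Solr on the same port; neither is confirmed in this thread.

```shell
# scripts.conf on the slave (illustrative; values copied from the master file in this thread)
user=
solr_hostname=localhost          # the slave's own Solr, as seen from the slave
solr_port=8983
rsyncd_port=18983                # port of the rsync daemon running on the master
data_dir=/usr/solr/data/data_tenantID_1         # assumption: same local path on the slave
webapp_name=solr
master_host=192.168.168.50       # the master's address
master_data_dir=/usr/solr/data/data_tenantID_1  # where the master writes its snapshots
master_status_dir=/usr/solr/logs
```

The postCommit listener in solrconfig.xml is left out on the slave, since snapshots are only created on the master.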
The scripts are designed to run under Unix/Linux. They use symbolic links
and Unix/Linux commands like scp, ssh, rsync, cp. I don't know much about
Windows so I don't know for sure if all the Unix/Linux stuff used by the
scripts is available in Windows or not.
Bill
On 1/14/08, Dilip.TS <di...@starmarksv.com> wrote:
Hi Bill,
I'm trying to use the solr collection distribution
and have made the following changes:
1) Changes made on the master server on Linux
# In scripts.conf file
user=
solr_hostname=localhost
solr_port=8983
rsyncd_port=18983
data_dir=/usr/solr/data/data_tenantID_1
webapp_name=solr
master_host=192.168.168.50
master_data_dir=/usr/solr/data/data_tenantID_1
master_status_dir=/usr/solr/logs
2) Enabled the postCommit listener in solrconfig.xml
<!-- A postCommit event is fired after every commit or optimize command -->
<listener event="postCommit" class="solr.RunExecutableListener">
  <str name="exe">/usr/solr/bin/snapshooter</str>
  <str name="dir">/usr/solr/bin</str>
  <bool name="wait">true</bool>
  <!--arr name="args"><str>-u jetty-6.1.6</str> <str>-d /opt/solr/data</str></arr-->
  <arr name="env"> </arr>
</listener>
I ran the Embedded Solr folder, added a document to it,
and did a search for a word on the same server.
I observed the following in the console:
INFO: query parser default operator is OR
Jan 14, 2008 3:37:38 PM org.apache.solr.schema.IndexSchema readSchema
INFO: unique key field: id
Jan 14, 2008 3:37:38 PM org.apache.solr.core.SolrCore <init>
INFO: Opening new SolrCore at //usr//solr/, dataDir=//usr//solr//data//data_tenantID_1
Jan 14, 2008 3:37:38 PM org.apache.solr.core.SolrCore parseListener
INFO: Searching for listeners: //listener[@event="firstSearcher"]
Jan 14, 2008 3:37:38 PM org.apache.solr.core.SolrCore parseListener
INFO: Searching for listeners: //listener[@event="newSearcher"]
Jan 14, 2008 3:37:39 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created xslt: org.apache.solr.request.XSLTResponseWriter
Jan 14, 2008 3:37:39 PM org.apache.solr.request.XSLTResponseWriter init
INFO: xsltCacheLifetimeSeconds=5
Jan 14, 2008 3:37:39 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created standard: org.apache.solr.handler.StandardRequestHandler
.
.
.
.
INFO: Opening Searcher@2bb514 main
Jan 14, 2008 3:37:39 PM org.apache.solr.core.SolrCore registerSearcher
INFO: Registered new searcher Searcher@2bb514 main
Jan 14, 2008 3:37:39 PM org.apache.solr.update.UpdateHandler parseEventListeners
INFO: added SolrEventListener for postCommit: org.apache.solr.core.RunExecutableListener{exe=/usr/solr/bin/snapshooter,dir=/usr/solr/bin,wait=true,env=[]}
Jan 14, 2008 3:37:39 PM org.apache.solr.update.DirectUpdateHandler2$CommitTracker <init>
INFO: AutoCommit: disabled
In the above console output I find
"postCommit: org.apache.solr.core.RunExecutableListener{exe=/usr/solr/bin/snapshooter,dir=/usr/solr/bin,wait=true,env=[]}"
being called after doing a commit.
This is the scenario for an add/search done on the same master server on
Linux.
1) I would like to know whether we require similar entries in scripts.conf
and the postCommit listener enabled in solrconfig.xml on the slave server too.
If yes, should these entries on the slave server be identical to those on
the master, or different?
2) Also, can we have the Linux machine acting as the master server while the
slave runs on a Windows machine?
Thanks in advance.
Regards
Dilip
-----Original Message-----
From: Bill Au [mailto:bill.w.au@gmail.com ]
Sent: Saturday, December 15, 2007 1:08 AM
To: solr-user@lucene.apache.org; dilip.ts@starmarksv.com
Cc: hossman_lucene@fucit.org; yuhui.jin@gmail.com
Subject: Re: Solr replication
On Dec 14, 2007 7:00 AM, Dilip.TS <dilip.ts@starmarksv.com > wrote:
> Hi,
> I have the following requirement for SOLR Collection Distribution using
> Embedded Solr with the Jetty server:
>
> I have different data folders for multiple instances of SOLR within the
> same application.
> I'm using the same SOLR_HOME with a single bin and conf folder.
>
> My query is:
> 1) Is it possible to have the same SOLR_HOME for multiple solr instances
> and still be able to achieve Solr Distribution?
> (As I understand it, we need a different rsync port for different
> solr instances.)
Yes, solr distribution will work for multiple solr instances even if they
all use the same SOLR_HOME.
All the distribution scripts have a command line argument for specifying the
data directory.
>
> 2) Can I get some more information about how to start this rsyncd daemon,
> and which is the best way of doing it, i.e. to start it during system
> reboot or manually?
Please note that the rsyncd
-CollectionDistributionScripts#head-1e6cdce516ecf1eb31bffceaccf2abeb72bdce81
So it is best to configure the master server to run the rsyncd-start script
at system boot time. If the rsync daemon has for some reason been
disabled, it will not be started automatically at system reboot even if it
is configured to do so. If rsyncd is started manually, then one will have
to remember to start it every time the master server is rebooted.
>
> 3) Let me know if my understanding is correct. We require 1 master server
> and a minimum of 1 slave server.
> The master server and the slave server cannot be running on the same
> machine. Am I right?
>
> In the case of SOLR Distribution, if the SOLR server acts as the
> master server, then what about the slave server? Is it the application
> server that calls the master SOLR server that acts as the slave server?
Both the master and slave are SOLR servers. Typically they are on different
machines.
It doesn't make sense (at least not to me) to have both of them on the same
machine.
>
> 4) I see the following in the scripts.conf file for the master server:
> solr_port=8983
> rsyncd_port=18983
>
> +Enable and start rsync:
> rsyncd-enable; rsyncd-start
> +Run snapshooter:
> snapshooter
>
> Just to confirm, is it mandatory that the solr master server should have
> the solr_port set to 8983 only?
It does not have to be 8983. That's just an example.
>
>
> 5) How do we enable and start rsync? The link to
> SolrCollectionDistributionScripts mentions
> starting the rsyncd daemon either at system boot time or manually.
> Which method is preferable?
> How do we achieve this? I am not clear on this.
>
> 6) How do we set up crontab to run snappuller and snapinstaller
> periodically?
How to start rsyncd at system boot time and how to set up crontab to run
snappuller and snapinstaller depend on the OS that Solr is running on.
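As one possible setup on a Linux system, the small script below only prints candidate crontab lines; it does not install anything. It assumes a Vixie-style cron that understands `@reboot` and `*/30` (not every cron does), that the scripts live in /usr/solr/bin as in the solrconfig.xml quoted in this thread, and that rsyncd-enable has already been run once on the master. The 30-minute interval is an arbitrary choice for illustration.

```shell
#!/bin/sh
# Dry run: print crontab lines for review instead of installing them.
bin_dir=/usr/solr/bin   # script location used in this thread

master_cron() {
    # Start the rsync daemon whenever the master boots.
    # @reboot is a cron extension; assumed to be available here.
    printf '@reboot %s/rsyncd-start\n' "$bin_dir"
}

slave_cron() {
    # Every 30 minutes: pull the newest snapshot, and install it only if the pull succeeded.
    printf '*/30 * * * * %s/snappuller && %s/snapinstaller\n' "$bin_dir" "$bin_dir"
}

master_cron
slave_cron
```

After review, the first printed line could be added with `crontab -e` on the master and the second on each slave. On systems without `@reboot`, an init script calling rsyncd-start would be the usual alternative.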
>
>
>
> Regards,
> Dilip TS
> Starmark Services Pvt. Ltd.
>
>
Re: Solr replication
Posted by Bill Au <bi...@gmail.com>.
My answers inline...
On Jan 16, 2008 3:51 AM, Dilip.TS <di...@starmarksv.com> wrote:
> Hi Bill,
> I have some questions regarding the SOLR collection distribution.
> 1) Is it possible to perform index operations on the slave server
> using SOLR collection distribution and still have the master server
> updated with these changes?
No. The replication process is only one way, from the master to the slave.
The idea behind it is that the slave servers would be for query only and the
number of slaves can be increased or decreased according to traffic load.
> 2) I have a requirement of having more than one solr instance (with a
> corresponding data directory for each solr core). Is it possible to
> maintain different solr cores and still achieve SOLR collection
> distribution for all of these cores independently? If yes, then how?
Does each solr instance have its own solr home? If so you can use
replication within each instance by simply adjusting the parameters in
scripts.conf for each instance. Even if they all share a single solr home,
the replication related scripts all have command line options to override
values set in scripts.conf:
http://wiki.apache.org/solr/SolrCollectionDistributionScripts
So you can invoke the scripts for each instance by setting the data
directory on the command line.
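A minimal sketch of that idea, written as a dry run that only prints the commands: it builds one snapshooter invocation per data directory, using the `-d` option that appears in the commented-out args of the solrconfig.xml quoted earlier in this thread. The second directory name is hypothetical, and the other scripts (snappuller, snapinstaller) may need further per-instance overrides such as the master data directory or rsync port, so check each script's usage message before relying on this.

```shell
#!/bin/sh
# Dry run: print one snapshooter command per data directory instead of running it.
bin_dir=/usr/solr/bin        # script location used in this thread
data_root=/usr/solr/data     # parent of the per-instance data directories
# data_tenantID_1 comes from the thread; data_tenantID_2 is a hypothetical second instance
instances="data_tenantID_1 data_tenantID_2"

snapshot_cmds() {
    for name in $instances; do
        # -d overrides data_dir from scripts.conf for this one invocation
        echo "$bin_dir/snapshooter -d $data_root/$name"
    done
}

snapshot_cmds
```

Keeping the shared scripts.conf for the common settings and passing only the data directory per call means a single bin and conf folder can serve every instance.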