Posted to solr-user@lucene.apache.org by Mike Franon <ko...@gmail.com> on 2011/03/01 20:49:05 UTC
solr different sizes on master and slave
I was curious why the size would be dramatically different even though
the index versions are the same.
One is 1.2 GB, and on the slave it is 512 MB.
I would think they should both be the same size, no?
Thanks
Re: solr different sizes on master and slave
Posted by Markus Jelsma <ma...@openindex.io>.
Yes. But keep in mind that Solr may actually be using an index.<TIMESTAMP>
directory for its live search. Check either the replication.properties file or
the replication page to see which index directory it uses.
If it uses an index.<TIMESTAMP> directory, you can safely move it to index and
remove or modify replication.properties.
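For reference, a minimal Python sketch of how one might check which directory a slave is serving from. It assumes the data directory holds an index.properties file in Java properties format with an "index" key, which is the file other posts in this thread name. The key name, the fallback to plain "index", and the helper name live_index_dir are assumptions made for illustration.

```python
from pathlib import Path


def live_index_dir(data_dir, props_name="index.properties", key="index"):
    """Return the index directory name recorded in the properties file.

    Falls back to "index" when the file or the key is missing
    (assumed default, not confirmed in this thread).
    """
    props = Path(data_dir) / props_name
    if not props.is_file():
        return "index"
    # Java properties files are Latin-1 encoded by convention.
    for raw in props.read_text(encoding="latin-1").splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key and value.strip():
            return value.strip()
    return "index"
```

Calling live_index_dir on the slave's data directory would return a name such as index.20110218125400 when the properties file points at a timestamped copy.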
On Wednesday 02 March 2011 15:03:54 Mike Franon wrote:
> Is it OK if I just delete the old copies manually, or maybe run a
> script that does it?
--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
Re: solr different sizes on master and slave
Posted by Mike Franon <ko...@gmail.com>.
Is it OK if I just delete the old copies manually, or maybe run a
script that does it?
On Tue, Mar 1, 2011 at 7:47 PM, Markus Jelsma
<ma...@openindex.io> wrote:
> Indeed, the slave should not have useless copies, but it does, at least in
> 1.4.0. I haven't seen it in 3.x, but that was just a small test that did not
> exactly match my other production installs.
>
> In 1.4.0 Solr does not remove old copies at startup, and it does not cleanly
> abort running replications at shutdown. Between shutdown and startup there
> might be a higher index version; it will then proceed as expected: download
> the new version and continue. Old copies will appear.
>
> There is an earlier thread I started, but without a patch. You can, however,
> work around the problem by letting Solr delete a running replication:
> 1) disable polling and then 2) abort replication. You can also write a script
> that will compare current and available replication directories before
> startup and act accordingly.
Re: solr different sizes on master and slave
Posted by Markus Jelsma <ma...@openindex.io>.
Indeed, the slave should not have useless copies, but it does, at least in
1.4.0. I haven't seen it in 3.x, but that was just a small test that did not
exactly match my other production installs.
In 1.4.0 Solr does not remove old copies at startup, and it does not cleanly
abort running replications at shutdown. Between shutdown and startup there
might be a higher index version; it will then proceed as expected: download
the new version and continue. Old copies will appear.
There is an earlier thread I started, but without a patch. You can, however,
work around the problem by letting Solr delete a running replication:
1) disable polling and then 2) abort replication. You can also write a script
that will compare current and available replication directories before
startup and act accordingly.
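The two-step workaround above (disable polling, then abort the running replication) can be sketched in Python against the slave's replication handler. This assumes the handler is registered at the conventional /replication path and accepts the disablepoll and abortfetch commands. The base URL passed in and the helper names are placeholders for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def replication_url(base_url, command):
    """Build a replication-handler URL for one command."""
    query = urlencode({"command": command})
    return base_url.rstrip("/") + "/replication?" + query


def stop_running_replication(base_url, opener=urlopen):
    """Disable polling first, then abort any fetch in progress.

    Returns a list of (command, HTTP status) pairs. The opener
    argument exists so the calls can be stubbed out when testing.
    """
    results = []
    for command in ("disablepoll", "abortfetch"):
        with opener(replication_url(base_url, command)) as resp:
            results.append((command, resp.status))
    return results
```

A call such as stop_running_replication("http://localhost:8983/solr") would be run against the slave before shutting it down. Polling can be turned back on afterwards with the enablepoll command.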
> The slave should not keep multiple copies _permanently_, but it might
> temporarily, after it has fetched the new files from the master but before
> it has committed them and fully warmed the new index searchers on the
> slave. Could that be what's going on? Is your slave just still working
> on committing and warming the new version(s) of the index?
>
> [If you 'commit' to the slave (and a replication pull counts as a
> 'commit') so quickly that you get overlapping commits before the slave has
> been able to warm a new index... it's going to be trouble all around.]
Re: solr different sizes on master and slave
Posted by Mike Franon <ko...@gmail.com>.
Thank you very much for this info, that helps a lot!
On Wed, Mar 2, 2011 at 10:05 AM, Jayendra Patil
<ja...@gmail.com> wrote:
> Hi Mike,
>
> There was an issue with the Snappuller wherein it failed to clean up
> the old index directories on the slave side:
> https://issues.apache.org/jira/browse/SOLR-2156
>
> The patch can be applied to fix the issue.
> You can also delete the old index directories, except for the current
> one, which is named in index.properties.
>
> Regards,
> Jayendra
Re: solr different sizes on master and slave
Posted by Jayendra Patil <ja...@gmail.com>.
Hi Mike,
There was an issue with the Snappuller wherein it failed to clean up
the old index directories on the slave side:
https://issues.apache.org/jira/browse/SOLR-2156
The patch can be applied to fix the issue.
You can also delete the old index directories, except for the current
one, which is named in index.properties.
Regards,
Jayendra
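The manual cleanup described above can be sketched as a small Python script. This is only an illustration under stated assumptions: it treats any directory named index.<14 digits> as a replication copy, takes the name of the live directory as an argument (read it from index.properties first), and deliberately never touches the plain "index" directory. The function names are made up for this sketch, and it defaults to a dry run.

```python
import re
import shutil
from pathlib import Path

# Matches names like index.20110204010900 (assumed yyyyMMddHHmmss stamp).
STAMPED = re.compile(r"index\.\d{14}")


def stale_index_dirs(data_dir, keep):
    """List timestamped index directories other than the one to keep."""
    return sorted(
        p for p in Path(data_dir).iterdir()
        if p.is_dir() and p.name != keep and STAMPED.fullmatch(p.name)
    )


def remove_stale(data_dir, keep, dry_run=True):
    """Delete stale copies and return their names.

    With dry_run=True (the default) nothing is deleted; the names
    are only reported so they can be checked by hand first.
    """
    stale = stale_index_dirs(data_dir, keep)
    for path in stale:
        if not dry_run:
            shutil.rmtree(path)
    return [path.name for path in stale]
```

Running it once with the default dry run and checking the returned names against index.properties before passing dry_run=False is the safer order, ideally while no replication is in progress.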
On Tue, Mar 1, 2011 at 4:27 PM, Mike Franon <ko...@gmail.com> wrote:
> OK, doing some more research, I noticed that the slave has multiple
> index folders, for example:
>
> index
> index.20110204010900
> index.20110204013355
> index.20110218125400
>
> There is also an index.properties file that shows which index it is using.
>
> I am just curious: why does it keep multiple copies? Is there a
> setting somewhere I can change to keep only one copy, so as not to waste
> space?
>
> Thanks
Re: solr different sizes on master and slave
Posted by Mike Franon <ko...@gmail.com>.
Right now I have the slave polling every 10 seconds, because we want
to make sure they stay in sync. I have users who post
directly from a web application. But I do notice it syncs very quickly,
because usually the update is only one or two records at a time.
I am thinking maybe 10 seconds is too fast?
On Tue, Mar 1, 2011 at 4:40 PM, Jonathan Rochkind <ro...@jhu.edu> wrote:
> The slave should not keep multiple copies _permanently_, but it might
> temporarily, after it has fetched the new files from the master but before it
> has committed them and fully warmed the new index searchers on the slave. Could
> that be what's going on? Is your slave just still working on committing and
> warming the new version(s) of the index?
>
> [If you 'commit' to the slave (and a replication pull counts as a 'commit')
> so quickly that you get overlapping commits before the slave has been able to
> warm a new index... it's going to be trouble all around.]
Re: solr different sizes on master and slave
Posted by Jonathan Rochkind <ro...@jhu.edu>.
The slave should not keep multiple copies _permanently_, but it might
temporarily, after it has fetched the new files from the master but before
it has committed them and fully warmed the new index searchers on the
slave. Could that be what's going on? Is your slave just still working
on committing and warming the new version(s) of the index?
[If you 'commit' to the slave (and a replication pull counts as a
'commit') so quickly that you get overlapping commits before the slave has
been able to warm a new index... it's going to be trouble all around.]
On 3/1/2011 4:27 PM, Mike Franon wrote:
> OK, doing some more research, I noticed that the slave has multiple
> index folders, for example:
>
> index
> index.20110204010900
> index.20110204013355
> index.20110218125400
>
> There is also an index.properties file that shows which index it is using.
>
> I am just curious: why does it keep multiple copies? Is there a
> setting somewhere I can change to keep only one copy, so as not to waste
> space?
>
> Thanks
Re: solr different sizes on master and slave
Posted by Mike Franon <ko...@gmail.com>.
OK, doing some more research, I noticed that the slave has multiple
index folders, for example:
index
index.20110204010900
index.20110204013355
index.20110218125400
There is also an index.properties file that shows which index it is using.
I am just curious: why does it keep multiple copies? Is there a
setting somewhere I can change to keep only one copy, so as not to waste
space?
Thanks
On Tue, Mar 1, 2011 at 3:26 PM, Mike Franon <ko...@gmail.com> wrote:
> No pending commits. It looks like there are almost two copies
> of the index on the master; not sure how that happened.
Re: solr different sizes on master and slave
Posted by Mike Franon <ko...@gmail.com>.
No pending commits. It looks like there are almost two copies
of the index on the master; not sure how that happened.
On Tue, Mar 1, 2011 at 3:08 PM, Markus Jelsma
<ma...@openindex.io> wrote:
> Are there pending commits on the master?
Re: solr different sizes on master and slave
Posted by Markus Jelsma <ma...@openindex.io>.
Are there pending commits on the master?
> I was curious why the size would be dramatically different even though
> the index versions are the same.
>
> One is 1.2 GB, and on the slave it is 512 MB.
>
> I would think they should both be the same size, no?
>
> Thanks