Posted to dev@lucene.apache.org by Shalin Shekhar Mangar <sh...@gmail.com> on 2015/06/10 06:46:07 UTC

Re: [CONF] Apache Solr Reference Guide > Index Replication

Hi Shawn,

That comment about commitReserveDuration is not accurate. It has nothing to
do with the size of the index. It is the time for which commits are
reserved even if no request has accessed them. A commit and its related
files will never be deleted as long as a request is fetching them. The only
reason why you would increase commitReserveDuration is when you have a
cross datacenter repeater setup and requests to fetch index files are made
more than 10 seconds apart by the repeater (due to whatever reasons). Heavy
GC can also cause the commitReserveDuration to be exceeded.
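
To make the timing concrete, here is a minimal sketch of a purely time-based reserve. It is an illustration only, not the ReplicationHandler code: the class name ReserveWindowSketch and its methods are hypothetical, time is passed in explicitly as milliseconds, and the model ignores the time spent inside a file transfer. It shows that a 12 second gap between requests outlives a 10 second reserve but not a 30 second one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a time-based commit reserve; not Solr's actual code.
final class ReserveWindowSketch {
    // commit generation -> time (ms) until which the commit stays reserved
    private final Map<Long, Long> reservedUntilMs = new HashMap<>();
    private final long reserveDurationMs;

    ReserveWindowSketch(long reserveDurationMs) {
        this.reserveDurationMs = reserveDurationMs;
    }

    // Each replication request pushes the expiry out to now + duration.
    void touch(long generation, long nowMs) {
        reservedUntilMs.merge(generation, nowMs + reserveDurationMs, Math::max);
    }

    boolean isReserved(long generation, long nowMs) {
        Long until = reservedUntilMs.get(generation);
        return until != null && nowMs < until;
    }

    // True if every request after the first arrives while the commit is still reserved.
    static boolean survives(long reserveDurationMs, long[] requestTimesMs) {
        ReserveWindowSketch r = new ReserveWindowSketch(reserveDurationMs);
        long generation = 1L;
        for (int i = 0; i < requestTimesMs.length; i++) {
            if (i > 0 && !r.isReserved(generation, requestTimesMs[i])) {
                return false; // the reserve lapsed before this request arrived
            }
            r.touch(generation, requestTimesMs[i]);
        }
        return true;
    }

    public static void main(String[] args) {
        long[] slowRepeater = {0, 4_000, 16_000}; // last two requests are 12 s apart
        System.out.println(survives(10_000, slowRepeater)); // false
        System.out.println(survives(30_000, slowRepeater)); // true
    }
}
```

In this model the index size never appears; only the gap between consecutive requests matters.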

On Tue, Jun 9, 2015 at 10:34 PM, Shawn Heisey (Confluence) <
confluence@apache.org> wrote:

>          Shawn Heisey
> <https://cwiki.apache.org/confluence/display/~elyograg> edited the page:
>     *Index Replication*
> <https://cwiki.apache.org/confluence/display/solr/Index+Replication>
>
> *Comment:* Improved docs for commitReserveDuration.
>



-- 
Regards,
Shalin Shekhar Mangar.

Re: [CONF] Apache Solr Reference Guide > Index Replication

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
On Wed, Jun 10, 2015 at 8:01 PM, Shawn Heisey <ap...@elyograg.org> wrote:

> On 6/10/2015 12:49 AM, Shalin Shekhar Mangar wrote:
> > When the actual transfer of a file starts, then the whole commit point
> > is "saved" i.e. it won't be released until the entire file transfer is
> > complete (this is different from a "reserve" which is time based). At
> > the end of the transfer of a single file, the reserve is again extended
> > by the commitReserveDuration to allow for the next file to be fetched.
> >
> > In my practical experience, increasing the commitReserveDuration in a
> > well-behaved Solr install has only been necessary for repeater setups.
> >
> > You can see this code in ReplicationHandler.DirectoryFileStream.write()
> > method.
>
> It's all starting to make sense. Before the file transfer starts (in the
> initWrite() method), the commit point is explicitly saved, and not
> released until the file transfer completes.
>
> Do you think there's any chance that a delete could manage to happen
> between the two statements in releaseCommitPointAndExtendReserve()?  I
> wonder if maybe those two statements should be reversed so the reserve
> is extended before the commit point is released.  I don't know enough
> about the lower layers to know whether that's a safe thing to do.
>

Yeah, I was wondering the same thing looking at the code again today. It
makes sense to extend the reserve first and then release the commit point.
Can you please open an issue?
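
A rough sketch of why the order matters, under stated assumptions: ReleaseOrderSketch is a hypothetical single-commit model, not the real releaseCommitPointAndExtendReserve(), and the "in between" check stands in for a deletion that happens to run between the two statements. The scenario is a transfer that outlasts the reserve, so only the save is protecting the commit when the transfer ends.

```java
// Hypothetical single-commit model used only to illustrate statement ordering.
final class ReleaseOrderSketch {
    private final long reserveDurationMs;
    private long reservedUntilMs; // time-based reserve
    private int saveCount;        // transfers currently holding the commit

    ReleaseOrderSketch(long reserveDurationMs) {
        this.reserveDurationMs = reserveDurationMs;
    }

    void reserve(long nowMs) {
        reservedUntilMs = Math.max(reservedUntilMs, nowMs + reserveDurationMs);
    }

    void save() { saveCount++; }

    void release() { saveCount--; }

    boolean isDeletable(long nowMs) {
        return saveCount == 0 && nowMs >= reservedUntilMs;
    }

    // Release first: returns what a deletion check would see between the two statements.
    boolean releaseThenExtend(long nowMs) {
        release();
        boolean deletableInBetween = isDeletable(nowMs);
        reserve(nowMs);
        return deletableInBetween;
    }

    // Extend first: same probe, opposite order.
    boolean extendThenRelease(long nowMs) {
        reserve(nowMs);
        boolean deletableInBetween = isDeletable(nowMs);
        release();
        return deletableInBetween;
    }

    // A commit reserved at t=0 with a 10 s reserve and saved by one transfer.
    static ReleaseOrderSketch duringLongTransfer() {
        ReleaseOrderSketch c = new ReleaseOrderSketch(10_000);
        c.reserve(0);
        c.save();
        return c;
    }

    public static void main(String[] args) {
        long transferEndMs = 60_000; // the reserve expired long ago
        System.out.println(duringLongTransfer().releaseThenExtend(transferEndMs)); // true
        System.out.println(duringLongTransfer().extendThenRelease(transferEndMs)); // false
    }
}
```

With the release first, the model has a window in which the commit is neither saved nor reserved; with the extend first, it does not. Whether the real code can actually hit that window depends on locking in the lower layers, which this sketch does not model.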


>
> Thanks,
> Shawn
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>


-- 
Regards,
Shalin Shekhar Mangar.

Re: [CONF] Apache Solr Reference Guide > Index Replication

Posted by Shawn Heisey <ap...@elyograg.org>.
On 6/10/2015 12:49 AM, Shalin Shekhar Mangar wrote:
> When the actual transfer of a file starts, then the whole commit point
> is "saved" i.e. it won't be released until the entire file transfer is
> complete (this is different from a "reserve" which is time based). At
> the end of the transfer of a single file, the reserve is again extended
> by the commitReserveDuration to allow for the next file to be fetched.
> 
> In my practical experience, increasing the commitReserveDuration in a
> well-behaved Solr install has only been necessary for repeater setups.
> 
> You can see this code in ReplicationHandler.DirectoryFileStream.write()
> method.

It's all starting to make sense. Before the file transfer starts (in the
initWrite() method), the commit point is explicitly saved, and not
released until the file transfer completes.

Do you think there's any chance that a delete could manage to happen
between the two statements in releaseCommitPointAndExtendReserve()?  I
wonder if maybe those two statements should be reversed so the reserve
is extended before the commit point is released.  I don't know enough
about the lower layers to know whether that's a safe thing to do.

Thanks,
Shawn




Re: [CONF] Apache Solr Reference Guide > Index Replication

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
Comments inline:

On Wed, Jun 10, 2015 at 11:45 AM, Shawn Heisey <ap...@elyograg.org> wrote:

> On 6/9/2015 10:46 PM, Shalin Shekhar Mangar wrote:
> > That comment about commitReserveDuration is not accurate. It has nothing
> > to do with the size of the index. It is the time for which commits are
> > reserved even if no request has accessed them. A commit and its related
> > files will never be deleted as long as a request is fetching them. The
> > only reason why you would increase commitReserveDuration is when you
> > have a cross datacenter repeater setup and requests to fetch index files
> > are made more than 10 seconds apart by the repeater (due to whatever
> > reasons). Heavy GC can also cause the commitReserveDuration to be
> > exceeded.
>
> I've reverted the reference guide change.  Once I can figure out exactly
> what happens here, I'll work on the wiki page as well and figure out
> whether the ref guide needs any new edits.
>
> I'm finding the code very difficult to follow, which is pretty normal
> for me -- there are a lot of inheritance levels and it takes a lot of
> skill and time to decipher it.
>
> I'd like to write something in the docs that explains in simple terms
> when a user would actually need to increase this value, so I need to
> understand what events will reset the deletion timer.
>
> Is it separate calls to the replication handler, to get a file list,
> request a file, etc?  If it is, then if the index is dozens of gigabytes
> and optimized to a single segment, there will be multiple files that
> take a lot longer than the 10 second default to transfer.
>

If the time between successive API calls such as "filelist" and
"filecontent" exceeds 10 seconds (the default commitReserveDuration), then
the commit is released (i.e. free to be deleted).


> If the timer is continually reset throughout the actual transfer of a
> file, then this is a lot less brittle than I had originally suspected.
> I can't find any evidence that this is how it works, but as I said
> before, the code is hard to follow.
>

When the actual transfer of a file starts, then the whole commit point is
"saved" i.e. it won't be released until the entire file transfer is
complete (this is different from a "reserve" which is time based). At the
end of the transfer of a single file, the reserve is again extended by the
commitReserveDuration to allow for the next file to be fetched.

In my practical experience, increasing the commitReserveDuration in a
well-behaved Solr install has only been necessary for repeater setups.

You can see this code in ReplicationHandler.DirectoryFileStream.write()
method.
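
As a minimal sketch of the distinction, assuming a simplified model rather than the actual ReplicationHandler code: the class CommitTrackerSketch and its method names below are hypothetical, and time is passed in as explicit milliseconds. A "save" is count-based and holds the commit for as long as a transfer runs; a "reserve" is time-based and expires on its own.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of "saved" vs "reserved" commit points; not Solr's actual code.
final class CommitTrackerSketch {
    private final Map<Long, Long> reservedUntilMs = new HashMap<>(); // generation -> expiry
    private final Map<Long, Integer> saveCounts = new HashMap<>();   // generation -> active transfers
    private final long reserveDurationMs;

    CommitTrackerSketch(long reserveDurationMs) {
        this.reserveDurationMs = reserveDurationMs;
    }

    // Time-based: keeps the commit for reserveDurationMs from now.
    void reserve(long generation, long nowMs) {
        reservedUntilMs.merge(generation, nowMs + reserveDurationMs, Math::max);
    }

    // Count-based: held until the matching release(), however long that takes.
    void save(long generation) {
        saveCounts.merge(generation, 1, Integer::sum);
    }

    void release(long generation) {
        saveCounts.computeIfPresent(generation, (g, n) -> n > 1 ? n - 1 : null);
    }

    boolean isDeletable(long generation, long nowMs) {
        boolean saved = saveCounts.containsKey(generation);
        boolean reserved = reservedUntilMs.getOrDefault(generation, 0L) > nowMs;
        return !saved && !reserved;
    }

    public static void main(String[] args) {
        CommitTrackerSketch t = new CommitTrackerSketch(10_000);
        long gen = 7;
        t.reserve(gen, 0);      // file list requested at t=0s
        t.save(gen);            // file transfer starts at t=5s
        System.out.println(t.isDeletable(gen, 40_000)); // false: mid-transfer, saved
        t.reserve(gen, 65_000); // transfer ends at t=65s: extend the reserve
        t.release(gen);         // then drop the save
        System.out.println(t.isDeletable(gen, 70_000)); // false: inside the extended reserve
        System.out.println(t.isDeletable(gen, 76_000)); // true: no further request arrived
    }
}
```

The 60 second transfer never puts the commit at risk in this model, which is why file size alone is not a reason to raise commitReserveDuration.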


>
> Thanks,
> Shawn
>
>
>
>


-- 
Regards,
Shalin Shekhar Mangar.

Re: [CONF] Apache Solr Reference Guide > Index Replication

Posted by Shawn Heisey <ap...@elyograg.org>.
On 6/9/2015 10:46 PM, Shalin Shekhar Mangar wrote:
> That comment about commitReserveDuration is not accurate. It has nothing
> to do with the size of the index. It is the time for which commits are
> reserved even if no request has accessed them. A commit and its related
> files will never be deleted as long as a request is fetching them. The
> only reason why you would increase commitReserveDuration is when you
> have a cross datacenter repeater setup and requests to fetch index files
> are made more than 10 seconds apart by the repeater (due to whatever
> reasons). Heavy GC can also cause the commitReserveDuration to be exceeded.

I've reverted the reference guide change.  Once I can figure out exactly
what happens here, I'll work on the wiki page as well and figure out
whether the ref guide needs any new edits.

I'm finding the code very difficult to follow, which is pretty normal
for me -- there are a lot of inheritance levels and it takes a lot of
skill and time to decipher it.

I'd like to write something in the docs that explains in simple terms
when a user would actually need to increase this value, so I need to
understand what events will reset the deletion timer.

Is it separate calls to the replication handler, to get a file list,
request a file, etc?  If it is, then if the index is dozens of gigabytes
and optimized to a single segment, there will be multiple files that
take a lot longer than the 10 second default to transfer.

If the timer is continually reset throughout the actual transfer of a
file, then this is a lot less brittle than I had originally suspected.
I can't find any evidence that this is how it works, but as I said
before, the code is hard to follow.

Thanks,
Shawn

