You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Dawid Weiss (JIRA)" <ji...@apache.org> on 2017/03/08 11:05:37 UTC

[jira] [Updated] (LUCENE-7700) Move throughput control and merge aborting out of IndexWriter's core?

     [ https://issues.apache.org/jira/browse/LUCENE-7700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dawid Weiss updated LUCENE-7700:
--------------------------------
    Attachment: LUCENE-7700.patch

Thanks for comments Mike!

bq. Looks like javadocs for the private MergeRateLimiter.maybePause method are stale?

Corrected. I also changed some internal comments concerning waits < 1ms. (these are
possible with the new locks API, but we still don't bother) and introduced some 
more informative constants where appropriate.

bq.Why are we creating MergeRateLimiter on init of MergeThread and then again in CMS.wrapForMerge? 
Seems like we could cast the current thread to MergeThread and get its already-created instance?

Good catch, corrected.

bq. Why not updateIOThrottle in the main CMS thread, not the merge thread? Else, I think we have 
an added delay, from when a backlog'd merge shows up, to when the already running merge threads kick up their IO throttle?

I admit I didn't try to understand all of the CMS's rate-limit logic as it was quite complex, so
I don't understand you exactly here. Start of the merge thread seemed like a sensible place to run 
the update IO throttle update, but I moved it back -- doesn't seem to hurt anything.

bq. Maybe add a comment to OneMergeProgress.owner and .setMergeThread that it's only used for catching misuse?

Done.

bq. Can we rename OneMergeProgress.pauseTimes -> pauseTimesNanos or NS?

Hehe... sure, sure. 

bq. You can just remove the //private final Directory mergeDirectory from IW.

Done.

bq. Hmm it looks like CFS building is still unthrottled?

Already discussed. 

Running tests now.

> Move throughput control and merge aborting out of IndexWriter's core?
> ---------------------------------------------------------------------
>
>                 Key: LUCENE-7700
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7700
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Minor
>         Attachments: LUCENE-7700.patch, LUCENE-7700.patch, LUCENE-7700.patch
>
>
> Here is a bit of a background:
> - I wanted to implement a custom merging strategy that would have a custom i/o flow control (global),
> - currently, the CMS is tightly bound with a few classes -- MergeRateLimiter, OneMerge, IndexWriter.
> Looking at the code it seems to me that everything with respect to I/O control could be nicely pulled out into classes that explicitly control the merging process, that is only MergePolicy and MergeScheduler. By default, one could even run without any additional I/O accounting overhead (which is currently in there, even if one doesn't use the CMS's throughput control).
> Such refactoring would also give a chance to nicely move things where they belong -- job aborting into OneMerge (currently in RateLimiter), rate limiter lifecycle bound to OneMerge (MergeScheduler could then use per-merge or global accounting, as it pleases).
> Just a thought and some initial refactorings for discussion.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org