Posted to dev@lucene.apache.org by "Shai Erera (JIRA)" <ji...@apache.org> on 2015/11/05 11:57:27 UTC

[jira] [Commented] (LUCENE-6849) Add IndexWriter API to write segment(s) without refreshing them

    [ https://issues.apache.org/jira/browse/LUCENE-6849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14991515#comment-14991515 ] 

Shai Erera commented on LUCENE-6849:
------------------------------------

LGTM. And +1 on making both flush methods public API. It's an expert API, and I believe users who intend to call {{flush()}} can also understand the implications of calling {{flush(true, true)}}. Later we can consider consolidating and enhancing this with a {{flush(FlushOptions)}} method, where {{FlushOptions}} lets you specify whether you want to merge, applyDeletes, a segment size flush threshold, etc.
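
To make the {{flush(FlushOptions)}} idea a bit more concrete, here is a minimal sketch of what such an options holder might look like. Everything in it is hypothetical: {{FlushOptions}} does not exist in Lucene, and the setter names, the defaults, and the "0 means flush everything" convention for the size threshold are assumptions made only for illustration.

```java
/**
 * Hypothetical options holder for a possible flush(FlushOptions) method.
 * Not part of Lucene; names and defaults are assumptions for illustration.
 */
final class FlushOptions {
    private boolean triggerMerge;           // run merges after flushing? (assumed default: no)
    private boolean applyDeletes;           // apply buffered deletes? (assumed default: no)
    private double minSegmentSizeMB = 0.0;  // assumed convention: 0 means "flush every in-memory segment"

    FlushOptions setTriggerMerge(boolean triggerMerge) {
        this.triggerMerge = triggerMerge;
        return this;
    }

    FlushOptions setApplyDeletes(boolean applyDeletes) {
        this.applyDeletes = applyDeletes;
        return this;
    }

    /** Only flush in-memory segments at least this large; rejects negative or NaN values. */
    FlushOptions setMinSegmentSizeMB(double mb) {
        if (Double.isNaN(mb) || mb < 0.0) {
            throw new IllegalArgumentException("minSegmentSizeMB must be >= 0, got " + mb);
        }
        this.minSegmentSizeMB = mb;
        return this;
    }

    boolean getTriggerMerge() { return triggerMerge; }
    boolean getApplyDeletes() { return applyDeletes; }
    double getMinSegmentSizeMB() { return minSegmentSizeMB; }

    public static void main(String[] args) {
        // Roughly the intent of flush(true, true), expressed through the options object.
        FlushOptions opts = new FlushOptions()
            .setTriggerMerge(true)
            .setApplyDeletes(true);
        System.out.println("merge=" + opts.getTriggerMerge()
            + " applyDeletes=" + opts.getApplyDeletes()
            + " minSegmentSizeMB=" + opts.getMinSegmentSizeMB());
    }
}
```

With something along these lines, new knobs could be added later without growing the number of {{flush}} overloads, which is the main appeal of consolidating.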

A few comments:

* If you make the second flush() public
** I think we should perhaps document (in {{flush()}}) when you should use the second one?
** We should add a testFlushNoCommitButMergeAndApplyDeletes?
** Do you want to also add the second flush() variant to RandomIndexWriter?
* In {{RandomIndexWriter.maybeFlushOrCommit}}, should we also sometimes randomly apply deletes and trigger merges?


> Add IndexWriter API to write segment(s) without refreshing them
> ---------------------------------------------------------------
>
>                 Key: LUCENE-6849
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6849
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: Trunk, 5.4
>
>         Attachments: LUCENE-6849.patch
>
>
> Today, the only way to have {{IndexWriter}} free up some heap is to invoke refresh or flush or close it, but these are all quite costly, and do much more than simply "move bytes to disk".
> I think we should add a simple API, e.g. "move the biggest in-memory segment to disk" to 1) give more granularity (there could be multiple in-memory segments), and 2) only move bytes to disk (not refresh, not fsync, etc.).
> This way apps that want to be more careful on how heap is used can have more control.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
