Posted to dev@lucene.apache.org by "Shai Erera (JIRA)" <ji...@apache.org> on 2013/10/08 08:25:42 UTC

[jira] [Commented] (LUCENE-5263) Deletes may be silently lost if disk fills up and then frees up

    [ https://issues.apache.org/jira/browse/LUCENE-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13788959#comment-13788959 ] 

Shai Erera commented on LUCENE-5263:
------------------------------------

These are nice catches :). A few comments:

* This is not only about disk-full, but about any transient IOE that occurs (e.g. temporarily running out of file handles). So maybe change the issue's description (and also the CHANGES entry text)?

* In TestIndexWriterReader I see this:
{code}
     r1.close();
+    assertTrue(r2.isCurrent());
     writer.close();
     assertTrue(r2.isCurrent());
{code}
Did you intend to check r2.isCurrent before and after writer.close?

* Do you think we should move FakeIOE to MDW (MockDirectoryWrapper) so that other tests can use it? I already use it in two other places.

* In ReaderPool.release: why don't you call writer.checkpoint()? It both calls deleter.checkpoint() and increments changeCount. Did you want to avoid it also notifying sis.changed()? If so, maybe add a comment explaining why we don't do it?
** If you choose to keep the code as it is, there's a place in release() that calls the deleter and then increments changeCount, whereas in all other places the order is reversed. Maybe change it to be consistent? I don't know whether it's a problem that changeCount isn't incremented if, e.g., deleter.checkpoint throws an exception.
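
To make the ordering concern concrete, here is a toy sketch (not the IndexWriter code; the class and method names are made up for illustration, and only changeCount and the idea of deleter.checkpoint come from the patch). It assumes the checkpoint can throw a transient IOException and shows that the counter is bumped in only one of the two orders:

```java
import java.io.IOException;

/** Toy model only; these are NOT Lucene's IndexWriter internals. */
class CheckpointOrderDemo {
  long changeCount;          // stands in for IndexWriter's change counter
  boolean failCheckpoint;    // when true, the fake checkpoint throws

  /** Hypothetical stand-in for deleter.checkpoint(...), which may hit a transient IOE. */
  void deleterCheckpoint() throws IOException {
    if (failCheckpoint) {
      throw new IOException("fake transient failure");
    }
  }

  /** Deleter first, then the counter: the increment is skipped if the checkpoint throws. */
  void checkpointThenIncrement() throws IOException {
    deleterCheckpoint();
    changeCount++;
  }

  /** Counter first, then the deleter: the change is recorded even if the checkpoint throws. */
  void incrementThenCheckpoint() throws IOException {
    changeCount++;
    deleterCheckpoint();
  }
}
```

With failCheckpoint set, checkpointThenIncrement() leaves changeCount at 0 while incrementThenCheckpoint() leaves it at 1. Whether the real code cares about that difference is exactly the open question above.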

* Separately, I think it would be useful to have a utility method like Utils.throwEx(Throwable t) that does the "if instanceof" logic (checking for IOE, RuntimeE, Error etc.).
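
A rough sketch of what such a helper might look like. Only the name Utils.throwEx(Throwable) comes from the suggestion above; the body, the null handling and the wrapping of other checked exceptions are my assumptions, not existing code:

```java
import java.io.IOException;

/** Hypothetical helper; nothing here is an existing Lucene class. */
final class Utils {
  private Utils() {}

  /**
   * Rethrows t as its original type where possible.
   * Assumption: a null argument means "nothing went wrong" and is a no-op.
   */
  static void throwEx(Throwable t) throws IOException {
    if (t == null) {
      return;
    }
    if (t instanceof IOException) {
      throw (IOException) t;
    }
    if (t instanceof RuntimeException) {
      throw (RuntimeException) t;
    }
    if (t instanceof Error) {
      throw (Error) t;
    }
    // Any other checked exception: wrap it so the caller's signature stays small.
    throw new RuntimeException(t);
  }
}
```

A caller that collects a failure in a Throwable variable could then end with a single Utils.throwEx(priorE) instead of repeating the instanceof chain.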

Otherwise looks good!

> Deletes may be silently lost if disk fills up and then frees up
> ---------------------------------------------------------------
>
>                 Key: LUCENE-5263
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5263
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 5.0, 4.6
>
>         Attachments: LUCENE-5263.patch
>
>
> This case is tricky to handle, yet I think realistic: disk fills up
> temporarily, causes an exception in writeLiveDocs, and then the app
> keeps using the IW instance.
> Meanwhile disk later frees up again, IW is closed "successfully".  In
> certain cases, we can silently lose deletes.
> I had already committed
> TestIndexWriterDeletes.testNoLostDeletesOnDiskFull, and Jenkins seems
> happy with it so far, but when I added fangs to the test (cutover to
> RandomIndexWriter from IndexWriter, allow IOE during getReader, add
> randomness to when exc is thrown, etc.), it uncovered some real/nasty
> bugs:
>   * ReaderPool.dropAll was suppressing any exception it hit, because
>     {code}if (priorE != null){code} should instead be {code}if (priorE == null){code}
>   * After a merge, we have to write deletes before committing the
>     segment, because an exception when writing deletes means we need
>     to abort the merge
>   * Several places that were directly calling deleter.checkpoint must
>     also increment the changeCount else on close IW thinks there are
>     no changes and doesn't write a new segments file.
>   * closeInternal was dropping pooled readers after writing the
>     segments file, which would lose deletes still buffered due to a
>     previous exc.
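
For reference, the priorE bug from the first bullet of the quoted description can be illustrated with a minimal sketch of the "remember the first exception, keep going, rethrow at the end" pattern. This is not the actual ReaderPool.dropAll code; the class name, the use of Closeable and the method signature are assumptions made to keep the example self-contained:

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.List;

/** Illustrative only; not the real ReaderPool. */
class DropAllSketch {
  /** Closes every reader, remembers the first failure and rethrows it at the end. */
  static void dropAll(List<? extends Closeable> readers) throws IOException {
    IOException priorE = null;
    for (Closeable r : readers) {
      try {
        r.close();
      } catch (IOException e) {
        // With "!=" here, priorE (which starts out null) would never be set,
        // so every failure would be swallowed.
        if (priorE == null) {
          priorE = e;
        }
      }
    }
    if (priorE != null) {
      throw priorE;
    }
  }
}
```

In this sketch all readers still get closed after a failure, and the caller sees the first exception rather than none.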


