Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2008/01/01 15:15:43 UTC

[jira] Updated: (LUCENE-1112) Document is partially indexed on an unhandled exception

     [ https://issues.apache.org/jira/browse/LUCENE-1112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1112:
---------------------------------------

    Attachment: LUCENE-1112.patch

Patch attached.  All tests pass.  I plan to commit in a day or two.

Here are the changes:

  * No longer throw an exception when a massive term is hit.  Instead,
    we now print this message to infoStream, if set:

      WARNING: document contains at least one immense term (longer than the max length 16383), all of which were skipped.  Please correct the analyzer to not produce such terms.  The prefix of the first immense term is: 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...'

  * Still increment position when we hit a massive term

  * An unhandled "non-aborting" exception immediately marks the
    document that hit the exception as deleted.  I added comments at
    the top of DocumentsWriter to explain aborting vs non-aborting
    exceptions.  This change also adds the infrastructure for
    deleting by docID, which we've discussed adding to IndexWriter
    in the past, but I haven't exposed any public APIs for doing so.

  * No longer log to infoStream how many docs were deleted on flush
    since that deletion count is not accurate when mixing delete by
    term and by docID.
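
For readers who want a concrete picture of the first three changes, here
is a minimal, self-contained sketch.  It is not the code in the patch
and does not use any Lucene API: the class AllOrNoneIndexSketch, its
methods, and the use of a null token to simulate an analyzer failure are
all hypothetical, chosen only to illustrate the behavior described
above (skip an immense term with a warning, still advance the position,
and mark the document deleted by docID when an exception is hit).

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; this is not Lucene's DocumentsWriter. */
public class AllOrNoneIndexSketch {
    private final int maxTermLength;
    // term -> list of {docID, position} pairs
    private final Map<String, List<int[]>> postings = new HashMap<>();
    // docIDs marked deleted because indexing them failed part-way
    private final BitSet deletedDocs = new BitSet();
    // stands in for infoStream
    private final List<String> warnings = new ArrayList<>();
    private int nextDocId = 0;

    public AllOrNoneIndexSketch(int maxTermLength) {
        this.maxTermLength = maxTermLength;
    }

    /** Indexes one document; a null token simulates an analyzer failure. */
    public int addDocument(List<String> tokens) {
        int docId = nextDocId++; // the docID is consumed even if indexing fails
        try {
            int position = 0;
            boolean warned = false;
            for (String token : tokens) {
                if (token == null) {
                    throw new IllegalStateException("simulated analyzer failure");
                }
                if (token.length() > maxTermLength) {
                    // Skip the immense term and warn once per document.
                    if (!warned) {
                        warnings.add("doc " + docId + ": skipped immense term(s)");
                        warned = true;
                    }
                } else {
                    postings.computeIfAbsent(token, k -> new ArrayList<>())
                            .add(new int[] {docId, position});
                }
                position++; // advance even when the term was skipped
            }
        } catch (RuntimeException e) {
            // Some postings may already be written, so hide the whole document.
            deletedDocs.set(docId);
            throw e;
        }
        return docId;
    }

    /** Positions of a term, counting only documents not marked deleted. */
    public List<Integer> positions(String term) {
        List<Integer> result = new ArrayList<>();
        List<int[]> entries = postings.getOrDefault(term, Collections.emptyList());
        for (int[] entry : entries) {
            if (!deletedDocs.get(entry[0])) {
                result.add(entry[1]);
            }
        }
        return result;
    }

    public boolean isDeleted(int docId) {
        return deletedDocs.get(docId);
    }

    public List<String> warnings() {
        return warnings;
    }

    public static void main(String[] args) {
        AllOrNoneIndexSketch index = new AllOrNoneIndexSketch(5);
        List<String> doc = new ArrayList<>();
        doc.add("a");
        doc.add("toolongterm"); // longer than the limit of 5, so it is skipped
        doc.add("b");
        index.addDocument(doc);
        System.out.println("positions of b: " + index.positions("b"));
        System.out.println("warnings: " + index.warnings());
    }
}
```

In this sketch the partially written postings are left in place and
simply filtered out through the deleted set, which is the sense in which
the document becomes "all or none" from a reader's point of view.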


> Document is partially indexed on an unhandled exception
> -------------------------------------------------------
>
>                 Key: LUCENE-1112
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1112
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.3
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.3
>
>         Attachments: lucene-1112-test.patch, LUCENE-1112.patch
>
>
> With LUCENE-843, it's now possible for a subset of a document's
> fields/terms to be indexed or stored when an exception is hit.  This
> was not the case in the past (it was "all or none").
> I plan to make it "all or none" again by immediately marking a
> document as deleted if any exception is hit while indexing it.
> Discussion leading up to this:
>   http://www.gossamer-threads.com/lists/lucene/java-dev/56103
