Posted to java-user@lucene.apache.org by Joe Ye <yu...@gmail.com> on 2017/07/04 12:39:26 UTC

DocValue update methods don't appear to throw exception if the document doesn't exist

Hi,

I'm using Lucene core 6.6.

I noticed an issue that DocValue update methods
(indexWriter.updateNumericDocValue
& indexWriter.updateBinaryDocValue) don't appear to throw an exception or
return any error code if the document doesn't exist. Is this intentional? I
don't want to check the existence of the document before each docValue
update.


Kind regards,

Joe

Re: DocValue update methods don't appear to throw exception if the document doesn't exist

Posted by Michael McCandless <lu...@mikemccandless.com>.
This is by design: you are able to add a doc values field to a previously
indexed document even if that document didn't originally index that doc
values field.

The update brings the doc values field into existence for that document.
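
To illustrate the two observable effects described above, here is a toy in-memory model in plain Java. It is not Lucene code and says nothing about Lucene's internals; the class `ToyDocValuesStore` and all of its methods are hypothetical names made up for this sketch. It only shows that an update aimed at a document that does not exist is a silent no-op, and that an update aimed at a document that never had the field brings the field into existence for it.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (not Lucene): documents keyed by an id, each holding numeric doc values. */
final class ToyDocValuesStore {
    // id -> (field name -> value)
    private final Map<String, Map<String, Long>> docs = new HashMap<>();

    /** Adds a document that has no doc values fields yet. */
    void addDocument(String id) {
        docs.putIfAbsent(id, new HashMap<>());
    }

    /** Sets field=value on the document with this id, if any; otherwise does nothing. */
    void updateNumericDocValue(String id, String field, long value) {
        Map<String, Long> fields = docs.get(id);
        if (fields != null) {
            fields.put(field, value); // creates the field if the document lacked it
        }
    }

    /** Returns the value, or null if the document or the field does not exist. */
    Long get(String id, String field) {
        Map<String, Long> fields = docs.get(id);
        return fields == null ? null : fields.get(field);
    }

    public static void main(String[] args) {
        ToyDocValuesStore store = new ToyDocValuesStore();
        store.addDocument("doc-1");
        store.updateNumericDocValue("doc-1", "views", 42L);  // field comes into existence
        store.updateNumericDocValue("missing", "views", 7L); // no match, no exception
        System.out.println(store.get("doc-1", "views"));
        System.out.println(store.get("missing", "views"));
    }
}
```

In this sketch the caller cannot tell the two cases apart from the call itself, which mirrors the behaviour Joe observed.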

Mike McCandless

http://blog.mikemccandless.com

On Tue, Jul 4, 2017 at 8:39 AM, Joe Ye <yu...@gmail.com> wrote:

> Hi,
>
> I'm using Lucene core 6.6.
>
> I noticed an issue that DocValue update methods
> (indexWriter.updateNumericDocValue & indexWriter.updateBinaryDocValue)
> don't appear to throw an exception or return any error code if the document
> doesn't exist. Is this intentional? I don't want to check the existence of
> the document before each docValue update.
>
>
> Kind regards,
>
> Joe

Re: DocValue update methods don't appear to throw exception if the document doesn't exist

Posted by Michael McCandless <lu...@mikemccandless.com>.
Trejkaz described it correctly: you must commit to make your recent
indexing changes durable on disk, and visible to a newly opened reader.

If the JVM, OS, or hardware crashes, then any changes since the last commit
are lost.

Mike McCandless

http://blog.mikemccandless.com

On Fri, Jul 7, 2017 at 11:58 AM, Trejkaz <tr...@trypticon.org> wrote:

> On Thu, Jul 6, 2017 at 8:28 PM, Joe Ye <yu...@gmail.com> wrote:
> > Thanks very much TX!
> >
> > Regarding "But the updates don't actually occur during the call", could you
> > elaborate on this a bit more? So when would the actual update occur, by
> > which I mean persisting to disk?
>
> The same as any other updates - when you call commit().
>
> > Is there a cache of a number of docValues updates before committing to disk?
>
> Someone who knows the internals better would probably have to answer
> this one. I don't know how merging of updates to doc values works. (I
> am guessing the doc values generation comes into play here?) But
> flushing to disk and committing changes are two different things
> anyway. Lucene will periodically flush changes to disk when it decides
> that it can't keep more in memory. This is not the same as it actually
> committing, which makes the changes visible to newly-opened readers.
> You just end up with files on disk which aren't referenced from the
> segments file yet.
>
> > If so, what happens if a crash occurs before those updates are committed?
>
> Hopefully none of the updates occur, but the science hasn't been done.
> (It seems like a fairly easy experiment to do if you really want to
> test it.)
>
> TX
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Re: DocValue update methods don't appear to throw exception if the document doesn't exist

Posted by Trejkaz <tr...@trypticon.org>.
On Thu, Jul 6, 2017 at 8:28 PM, Joe Ye <yu...@gmail.com> wrote:
> Thanks very much TX!
>
> Regarding "But the updates don't actually occur during the call", could you
> elaborate on this a bit more? So when would the actual update occur, by
> which I mean persisting to disk?

The same as any other updates - when you call commit().

> Is there a cache of a number of docValues updates before committing to disk?

Someone who knows the internals better would probably have to answer
this one. I don't know how merging of updates to doc values works. (I
am guessing the doc values generation comes into play here?) But
flushing to disk and committing changes are two different things
anyway. Lucene will periodically flush changes to disk when it decides
that it can't keep more in memory. This is not the same as it actually
committing, which makes the changes visible to newly-opened readers.
You just end up with files on disk which aren't referenced from the
segments file yet.
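
As a rough illustration of the flush/commit distinction in the paragraph above, here is a toy model in plain Java. It is only a sketch under simplifying assumptions: `ToyCommitModel` and its methods are hypothetical names, lists of strings stand in for files and for the segments file, and nothing here reflects how Lucene actually stores or merges doc values updates.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Lucene) of the difference between flushing and committing. */
final class ToyCommitModel {
    private final List<String> inMemory = new ArrayList<>();     // buffered, not yet written
    private final List<String> flushedFiles = new ArrayList<>(); // written, but unreferenced
    private List<String> committed = new ArrayList<>();          // listed in the "segments file"

    /** Buffers a change in memory. */
    void update(String change) {
        inMemory.add(change);
    }

    /** Writes buffered changes out; they are still not part of any commit. */
    void flush() {
        flushedFiles.addAll(inMemory);
        inMemory.clear();
    }

    /** Flushes, then publishes everything written so far as the new commit point. */
    void commit() {
        flush();
        List<String> next = new ArrayList<>(committed);
        next.addAll(flushedFiles);
        committed = next;
        flushedFiles.clear();
    }

    /** Simulates a crash: everything since the last commit is lost. */
    void crash() {
        inMemory.clear();
        flushedFiles.clear();
    }

    /** What a newly opened reader would see: committed changes only. */
    List<String> openReader() {
        return new ArrayList<>(committed);
    }

    public static void main(String[] args) {
        ToyCommitModel index = new ToyCommitModel();
        index.update("set views=1 on doc-1");
        index.flush();
        System.out.println("after flush:  " + index.openReader());
        index.commit();
        System.out.println("after commit: " + index.openReader());
        index.update("set views=2 on doc-1");
        index.flush();
        index.crash();
        System.out.println("after crash:  " + index.openReader());
    }
}
```

In this model a flushed change sits in `flushedFiles` without being visible to a new reader, and a crash discards it, which is the outcome one would hope for in the crash question that follows.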

> If so, what happens if a crash occurs before those updates are committed?

Hopefully none of the updates occur, but the science hasn't been done.
(It seems like a fairly easy experiment to do if you really want to
test it.)

TX


Re: DocValue update methods don't appear to throw exception if the document doesn't exist

Posted by Joe Ye <yu...@gmail.com>.
Thanks very much TX!

Regarding "But the updates don't actually occur during the call", could you
elaborate on this a bit more? So when would the actual update occur, by
which I mean persisting to disk? Is there a cache of a number of docValues
updates before committing to disk? If so, what happens if a crash occurs
before those updates are committed?

Many thanks,
Joe


On Tue, Jul 4, 2017 at 10:53 PM, Trejkaz <tr...@trypticon.org> wrote:

> On Tue, 4 Jul 2017 at 22:39, Joe Ye <yu...@gmail.com> wrote:
>
> > Hi,
> >
> > I'm using Lucene core 6.6.
> >
> > I noticed an issue that DocValue update methods
> > (indexWriter.updateNumericDocValue
> > & indexWriter.updateBinaryDocValue) don't appear to throw an exception or
> > return any error code if the document doesn't exist. Is this intentional? I
> > don't want to check the existence of the document before each docValue
> > update.
>
>
> Given that they take Term or Query, that's what one would intuitively
> expect from such an API (it will match 0..n docs.)
>
> But the updates don't actually occur during the call, so there is no way
> for it to know how many updates will happen in advance anyway. (Otherwise
> it would be nice to know the number of docs it updated, like in JDBC.)
>
> TX
>

Re: DocValue update methods don't appear to throw exception if the document doesn't exist

Posted by Trejkaz <tr...@trypticon.org>.
On Tue, 4 Jul 2017 at 22:39, Joe Ye <yu...@gmail.com> wrote:

> Hi,
>
> I'm using Lucene core 6.6.
>
> I noticed an issue that DocValue update methods
> (indexWriter.updateNumericDocValue
> & indexWriter.updateBinaryDocValue) don't appear to throw an exception or
> return any error code if the document doesn't exist. Is this intentional? I
> don't want to check the existence of the document before each docValue
> update.


Given that they take Term or Query, that's what one would intuitively
expect from such an API (it will match 0..n docs.)

But the updates don't actually occur during the call, so there is no way
for it to know how many updates will happen in advance anyway. (Otherwise
it would be nice to know the number of docs it updated, like in JDBC.)
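
The point about deferred updates can be sketched with a toy model in plain Java. This is not Lucene code: `ToyDeferredUpdates`, `updateValue`, and `applyPending` are hypothetical names, and the sketch assumes nothing about when or how Lucene really resolves its buffered updates. It only shows why a call that merely queues a request cannot report a JDBC-style count, and that a term can end up matching zero, one, or many documents.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Toy model (not Lucene): the update call only queues a request; matching happens later. */
final class ToyDeferredUpdates {
    /** A document with one term to match on and one numeric value. */
    static final class Doc {
        final String tag;
        long value;
        Doc(String tag, long value) { this.tag = tag; this.value = value; }
    }

    /** A queued request: set newValue on every document whose tag equals term. */
    private static final class Pending {
        final String term;
        final long newValue;
        Pending(String term, long newValue) { this.term = term; this.newValue = newValue; }
    }

    private final List<Doc> docs = new ArrayList<>();
    private final Queue<Pending> pending = new ArrayDeque<>();

    void addDocument(String tag, long value) {
        docs.add(new Doc(tag, value));
    }

    /** Only records the request; nothing is matched yet, so there is no count to return. */
    void updateValue(String term, long newValue) {
        pending.add(new Pending(term, newValue));
    }

    /** Resolves all queued requests and returns how many documents were changed in total. */
    int applyPending() {
        int touched = 0;
        while (!pending.isEmpty()) {
            Pending p = pending.poll();
            for (Doc d : docs) {
                if (d.tag.equals(p.term)) {
                    d.value = p.newValue;
                    touched++;
                }
            }
        }
        return touched;
    }

    long valueAt(int index) {
        return docs.get(index).value;
    }

    public static void main(String[] args) {
        ToyDeferredUpdates index = new ToyDeferredUpdates();
        index.addDocument("news", 1L);
        index.addDocument("news", 2L);
        index.addDocument("blog", 3L);
        index.updateValue("news", 10L);   // will match two documents
        index.updateValue("sports", 99L); // will match none, and that is not an error
        System.out.println("changed: " + index.applyPending());
    }
}
```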

TX