Posted to user@lucy.apache.org by Edwin Crockford <ec...@invicro.com> on 2014/12/08 21:05:22 UTC

[lucy-user] Indexing error message

Repeatedly get errors like this:

/Can't Seek '/home/ipacs/ipacs/index/webdisk/seg_qjqo/highlight.ix' past 
EOF (8 > 0)/

Anybody have an idea what is causing this?

Thanks

Edwin

RE: [lucy-user] Indexing error message

Posted by "Zebrowski, Zak" <za...@mitre.org>.
Hello Edwin,
Seems like the index reader is trying to read past the end of the file.  My guess is a hard drive error, or a disk that was full at the time the index was being created.
Good luck,
Zak

-----Original Message-----
From: Edwin Crockford [mailto:ecrockford@invicro.com] 
Sent: Monday, December 08, 2014 3:05 PM
To: user@lucy.apache.org
Subject: [lucy-user] Indexing error message

Repeatedly get errors like this:

/Can't Seek '/home/ipacs/ipacs/index/webdisk/seg_qjqo/highlight.ix' past 
EOF (8 > 0)/

Anybody have an idea what is causing this?

Thanks

Edwin

Re: [lucy-user] Indexing error message

Posted by Edwin Crockford <ec...@invicro.com>.
Thanks Marvin, it could well be a manifestation of the same issue we 
previously talked about. At least it gives me a place to start. I'll 
talk with the ops guys to see how and when they run the bulk indexer and 
see if we can sort out a better way of running it.

Thanks again
Edwin

On 08/12/2014 23:18, Marvin Humphrey wrote:
> On Mon, Dec 8, 2014 at 12:55 PM, Edwin Crockford <ec...@invicro.com> wrote:
>> Hi Marvin,
>>
>> Thanks for the quick reply, here's a fragment of the cfmeta.json file for
>> the segment:
>>
>> {
>>    "files": {
>>      "documents.dat": {
>>        "length": "17556716",
>>        "offset": "0"
>>      },
>>      "documents.ix": {
>>        "length": "238760",
>>        "offset": "17556720"
>>      },
>>      "highlight.dat": {
>>        "length": "47793",
>>        "offset": "17795480"
>>      },
>>      "highlight.ix": {
>>        "length": "0",
>>        "offset": "17843280"
>>      },
>>
>>
>> Not quite sure what the format is, but it has a 0 length for "highlight.ix"
>> while highlight.dat has a largish length. Is this some failure in the
>> highlighting mechanism?
> Looking at the code in HighlightWriter.c, nothing jumps out at me.  I can't
> see how it's possible to write to highlight.dat without also writing to
> highlight.ix.  And for what it's worth, HighlightWriter's codebase has been
> largely stable since 2009, receiving only minor modifications.
>
> Disk filling up also seems unlikely -- lots of other files are written to at
> the same time as highlight.ix (e.g. documents.ix) and those don't exhibit the
> same problem; we should detect that a flush had failed when the file
> descriptor gets closed; the "compound files" cf.dat and cfmeta.json get
> written *after* highlight.ix, at which point you need *more* disk space.
>
> Instead, I speculate that what we are looking at is a different manifestation of the
> same problem we talked about in August 2013.
>
>      http://markmail.org/message/vynzixtoxfxhcx42
>
>      > I believe we have traced the issue back to an interaction between two
>      > different systems (one doing bulk updates and another doing on the fly
>      > single document indexing) attempting updates at the same time. I think
>      > there was a way around the locking that caused the problem, does that
>      > seem plausible?
>
>      Yes, that makes sense. The error can be explained by having two Indexers
>      trying to write to the same segment. One of them happens to delete the temp
>      file "lextemp" first, and then the other can't find it and throws an
>      exception.
>
>      Only one Indexer may operate on a given index at a time. A BackgroundMerger
>      may operate at the same time as an Indexer, but even it must acquire the
>      write lock briefly (once at the start of its run and once again at the
>      end). While Lucy's locking APIs provide the technical capacity to
>      disable the locking mechanism, it is not generally possible to get
>      around the need for locking in order to coordinate write access to an
>      index.
>
> Generally, when you disable locking and two writers attempt to write the same
> segment, the first Indexer will crash before commit() completes and the index
> will be left in a consistent state.  If you are unlucky, though, there's a
> possibility you'll get corrupt data instead.
>
> For that to happen, the second indexing process would have to start up while
> the first was nearly done and in the process of assembling the compound file
> `cf.dat` from temporary files such as `highlight.ix`.  The second process
> "cleans up" the temp files from the "crashed" first process and initializes
> new empty files.  The first process doesn't realize that its own
> highlight.ix file has been clobbered and slurps the new empty file into
> cf.dat.
>
> How was your issue from last year resolved?
>
> Marvin Humphrey


Re: [lucy-user] Indexing error message

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Mon, Dec 8, 2014 at 12:55 PM, Edwin Crockford <ec...@invicro.com> wrote:
> Hi Marvin,
>
> Thanks for the quick reply, here's a fragment of the cfmeta.json file for
> the segment:
>
> {
>   "files": {
>     "documents.dat": {
>       "length": "17556716",
>       "offset": "0"
>     },
>     "documents.ix": {
>       "length": "238760",
>       "offset": "17556720"
>     },
>     "highlight.dat": {
>       "length": "47793",
>       "offset": "17795480"
>     },
>     "highlight.ix": {
>       "length": "0",
>       "offset": "17843280"
>     },
>
>
> Not quite sure what the format is, but it has a 0 length for "highlight.ix"
> while highlight.dat has a largish length. Is this some failure in the
> highlighting mechanism?

Looking at the code in HighlightWriter.c, nothing jumps out at me.  I can't
see how it's possible to write to highlight.dat without also writing to
highlight.ix.  And for what it's worth, HighlightWriter's codebase has been
largely stable since 2009, receiving only minor modifications.

Disk filling up also seems unlikely -- lots of other files are written to at
the same time as highlight.ix (e.g. documents.ix) and those don't exhibit the
same problem; we should detect that a flush had failed when the file
descriptor gets closed; the "compound files" cf.dat and cfmeta.json get
written *after* highlight.ix, at which point you need *more* disk space.

Instead, I speculate that what we are looking at is a different manifestation of the
same problem we talked about in August 2013.

    http://markmail.org/message/vynzixtoxfxhcx42

    > I believe we have traced the issue back to an interaction between two
    > different systems (one doing bulk updates and another doing on the fly
    > single document indexing) attempting updates at the same time. I think
    > there was a way around the locking that caused the problem, does that
    > seem plausible?

    Yes, that makes sense. The error can be explained by having two Indexers
    trying to write to the same segment. One of them happens to delete the temp
    file "lextemp" first, and then the other can't find it and throws an
    exception.

    Only one Indexer may operate on a given index at a time. A BackgroundMerger
    may operate at the same time as an Indexer, but even it must acquire the
    write lock briefly (once at the start of its run and once again at the
    end). While Lucy's locking APIs provide the technical capacity to disable
    the locking mechanism, it is not generally possible to get around the need
    for locking in order to coordinate write access to an index.
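The single-writer requirement quoted above can also be enforced at the
application level when two independent systems (a bulk indexer and an
on-the-fly indexer) share one index.  A minimal sketch using an advisory
flock(2) lock file -- a generic Python illustration only; the lock-file name
is hypothetical and this is not Lucy's own internal locking API:

```python
# Generic single-writer guard using an advisory flock(2) lock file.
# Illustrates the coordination requirement only; Lucy's own write lock
# (managed internally by its Indexer) is the real mechanism.
import fcntl

class WriteLock:
    def __init__(self, path):
        self.path = path
        self.fh = None

    def __enter__(self):
        self.fh = open(self.path, "w")
        # Raises BlockingIOError if another process already holds the lock.
        fcntl.flock(self.fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return self

    def __exit__(self, *exc):
        fcntl.flock(self.fh, fcntl.LOCK_UN)
        self.fh.close()

# Usage sketch: both indexing systems would wrap their write sessions in
# the same lock file (path hypothetical), serializing index writes:
#
# with WriteLock("/path/to/index/write.lock"):
#     run_indexer()
```

A blocking variant (dropping LOCK_NB) would make the second writer wait
instead of failing, which may suit a bulk-plus-interactive setup better.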

Generally, when you disable locking and two writers attempt to write the same
segment, the first Indexer will crash before commit() completes and the index
will be left in a consistent state.  If you are unlucky, though, there's a
possibility you'll get corrupt data instead.

For that to happen, the second indexing process would have to start up while
the first was nearly done and in the process of assembling the compound file
`cf.dat` from temporary files such as `highlight.ix`.  The second process
"cleans up" the temp files from the "crashed" first process and initializes
new empty files.  The first process doesn't realize that its own
highlight.ix file has been clobbered and slurps the new empty file into
cf.dat.
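That clobbering sequence can be reproduced in miniature outside Lucy.  A toy
Python sketch (file names borrowed from this thread; no Lucy code involved):

```python
# Toy reproduction of the race described above: writer B "cleans up"
# writer A's temp file, and A then slurps the empty replacement.
import os
import tempfile

workdir = tempfile.mkdtemp()
temp_ix = os.path.join(workdir, "highlight.ix")
cf_dat = os.path.join(workdir, "cf.dat")

# Process A: writes one 8-byte file pointer (for doc number 1).
with open(temp_ix, "wb") as f:
    f.write((0).to_bytes(8, "big"))

# Process B: believing A has crashed, deletes the temp file and
# initializes a new, empty one.
os.remove(temp_ix)
open(temp_ix, "wb").close()

# Process A: unaware of the clobber, slurps highlight.ix into cf.dat.
with open(temp_ix, "rb") as src, open(cf_dat, "wb") as dst:
    data = src.read()
    dst.write(data)

# cfmeta.json would now record "length": "0" for highlight.ix, and a
# later reader seeking 8 bytes in fails with "past EOF (8 > 0)".
print(len(data))  # 0
```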

How was your issue from last year resolved?

Marvin Humphrey

Re: [lucy-user] Indexing error message

Posted by Edwin Crockford <ec...@invicro.com>.
Hi Marvin,

Thanks for the quick reply, here's a fragment of the cfmeta.json file 
for the segment:

{
   "files": {
     "documents.dat": {
       "length": "17556716",
       "offset": "0"
     },
     "documents.ix": {
       "length": "238760",
       "offset": "17556720"
     },
     "highlight.dat": {
       "length": "47793",
       "offset": "17795480"
     },
     "highlight.ix": {
       "length": "0",
       "offset": "17843280"
     },


Not quite sure what the format is, but it has a 0 length for 
"highlight.ix" while highlight.dat has a largish length. Is this some 
failure in the highlighting mechanism?

Regards
Edwin

On 08/12/2014 20:32, Marvin Humphrey wrote:
> On Mon, Dec 8, 2014 at 12:05 PM, Edwin Crockford <ec...@invicro.com> wrote:
>> Repeatedly get errors like this:
>>
>> /Can't Seek '/home/ipacs/ipacs/index/webdisk/seg_qjqo/highlight.ix' past EOF
>> (8 > 0)/
>>
>> Anybody have an idea what is causing this?
> The `highlight.ix` virtual file is a sequence of 8-byte file pointers, each of
> which points into a variable size blob in the virtual file `highlight.dat`.
> Lucy document numbers for each segment begin at 1, and the length of
> `highlight.ix` should be `highest_doc_num * 8`.  If the file's length is 0,
> that implies that there are no documents in that segment.
>
> The next step when debugging this is to examine the contents of cfmeta.json
> for the specific segment.  Does the segment really contain no documents?  Is
> the virtual file `documents.ix`, which follows the same format, also
> zero-length?
>
> Marvin Humphrey


Re: [lucy-user] Indexing error message

Posted by Marvin Humphrey <ma...@rectangular.com>.
On Mon, Dec 8, 2014 at 12:05 PM, Edwin Crockford <ec...@invicro.com> wrote:
> Repeatedly get errors like this:
>
> /Can't Seek '/home/ipacs/ipacs/index/webdisk/seg_qjqo/highlight.ix' past EOF
> (8 > 0)/
>
> Anybody have an idea what is causing this?

The `highlight.ix` virtual file is a sequence of 8-byte file pointers, each of
which points into a variable size blob in the virtual file `highlight.dat`.
Lucy document numbers for each segment begin at 1, and the length of
`highlight.ix` should be `highest_doc_num * 8`.  If the file's length is 0,
that implies that there are no documents in that segment.

The next step when debugging this is to examine the contents of cfmeta.json
for the specific segment.  Does the segment really contain no documents?  Is
the virtual file `documents.ix`, which follows the same format, also
zero-length?
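That length arithmetic is easy to sanity-check directly against cfmeta.json.
A hypothetical helper, assuming only the string-valued "length" fields
visible in the fragment posted in this thread:

```python
import json

def check_ix_lengths(cfmeta_text):
    """Flag .ix virtual files whose length is not a multiple of 8, or is
    zero while the matching .dat file is non-empty -- the symptom seen
    with highlight.ix in this thread."""
    files = json.loads(cfmeta_text)["files"]
    problems = []
    for name, info in files.items():
        if not name.endswith(".ix"):
            continue
        ix_len = int(info["length"])  # lengths are stored as strings
        dat = files.get(name[:-3] + ".dat", {})
        dat_len = int(dat.get("length", "0"))
        if ix_len % 8 != 0 or (ix_len == 0 and dat_len > 0):
            problems.append(name)
    return problems

# The fragment Edwin posted, reduced to valid JSON:
cfmeta = """{
  "files": {
    "documents.dat": {"length": "17556716", "offset": "0"},
    "documents.ix": {"length": "238760", "offset": "17556720"},
    "highlight.dat": {"length": "47793", "offset": "17795480"},
    "highlight.ix": {"length": "0", "offset": "17843280"}
  }
}"""

print(check_ix_lengths(cfmeta))  # ['highlight.ix']
```

On Edwin's data this flags only highlight.ix: documents.ix is 238760 bytes,
a clean multiple of 8 (29845 docs), while highlight.ix is zero-length
despite a 47793-byte highlight.dat.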

Marvin Humphrey