Posted to user@nutch.apache.org by charlie w <sp...@gmail.com> on 2007/08/08 04:34:37 UTC

index locking in nutch

Is there documentation that explains how Nutch does locking?  According to
the Lucene doc, the lock should go in java.io.tmpdir, but I never see
anything looking like a lock file appear there.  I do see a file "write.lock"
in the directory where the Lucene index lives.

But strangely, that file is never removed, and sometimes I then get lock
acquire timeouts.  The Luke application, in particular, always complains
that the index is locked.

I am only crawling to the local filesystem, not to HDFS.  Is there something
special Nutch does to lock the Lucene index?

Re: index locking in nutch

Posted by DES <sa...@gmail.com>.

Look at the way Hadoop handles file locks. AFAIK the newest versions of
Hadoop (0.13.x) don't support file locks anymore, so consider using Hadoop
0.12.x.

On 8/8/07, charlie w <sp...@gmail.com> wrote:
> Is there documentation that explains how Nutch does locking?  According to
> the Lucene doc, the lock should go in java.io.tmpdir, but I never see
> anything looking like a lock file appear there.  I do see a file "write.lock"
> in the directory where the Lucene index lives.
>
> But strangely, that file is never removed, and sometimes I then get lock
> acquire timeouts.  the Luke application, in particular, always complains
> that the index is locked.
>
> I am only crawling to the local filesystem, not a HDFS.  Is there something
> special Nutch does to lock the Lucene index?
>
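
For the symptom described in this thread (a "write.lock" file that stays in
the index directory), a small check like the one below may help confirm
whether the file is present and how old it is before opening the index in
another tool. This is only a sketch: the class and method names are
hypothetical, it uses the Java standard library rather than any Nutch, Lucene
or Hadoop API, and the only detail taken from the thread is the file name
"write.lock". A leftover file does not by itself prove that a writer is still
active.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical diagnostic helper; not part of Nutch, Lucene or Hadoop.
final class LockFileCheck {
    // File name as reported in the original message.
    static final String LOCK_FILE_NAME = "write.lock";

    // True if a regular file named write.lock exists in the index directory.
    static boolean lockFilePresent(Path indexDir) {
        return Files.isRegularFile(indexDir.resolve(LOCK_FILE_NAME));
    }

    // Milliseconds between the lock file's last modification and nowMillis.
    // Throws IOException if the file does not exist or cannot be read.
    static long lockFileAgeMillis(Path indexDir, long nowMillis) throws IOException {
        long modified = Files.getLastModifiedTime(indexDir.resolve(LOCK_FILE_NAME)).toMillis();
        return nowMillis - modified;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java LockFileCheck <index-directory>");
            return;
        }
        Path indexDir = Paths.get(args[0]);
        if (lockFilePresent(indexDir)) {
            long ageSeconds = lockFileAgeMillis(indexDir, System.currentTimeMillis()) / 1000;
            System.out.println(LOCK_FILE_NAME + " found, last modified " + ageSeconds + " s ago");
        } else {
            System.out.println("no " + LOCK_FILE_NAME + " in " + indexDir);
        }
    }
}
```

It would be run with the index directory as its only argument. If the file is
old and no crawl or indexing job is running, that suggests (but does not
prove) that an earlier writer exited without releasing the lock.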