You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Paul <pa...@waite.net.nz> on 2004/04/11 03:00:50 UTC

Optimize Crash

The problem I have is that when I try to execute an optimize on my Lucene
index I get the following error thrown (see below).

If anyone can help, and the answer requires some digging, then I have the
very index tarred and gzipped for anon FTP access at ftp.catalyst.net.nz (in
the "pub" sub-directory). This is 462Mb, and unpacks to roughly twice that
size. There is also a README file there.

Here is the error I get very quickly when optimize runs:

--- CUT ---
java.lang.ArrayIndexOutOfBoundsException: 111 >= 23
        at java.util.Vector.elementAt(Vector.java(Compiled Code))
        at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java(Compiled 
Code))
        at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java(Compiled 
Code))
        at 
org.apache.lucene.index.SegmentReader.document(SegmentReader.java(Compiled 
Code))
        at 
org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java(Compiled 
Code))
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:92)
        at 
org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:473)
        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:354)
        at nz.net.catalyst.lucene.server.Optimize.execute(Optimize.java:80)
        at nz.net.catalyst.lucene.server.Control.optimize(Control.java:87)
        at nz.net.catalyst.lucene.server.Control.execute(Control.java:49)
        at nz.net.catalyst.lucene.server.Dialogue.process(Dialogue.java:111)
        at nz.net.catalyst.lucene.server.Session.communicate(Session.java:125)
        at nz.net.catalyst.SocketClient.run(SocketClient.java:70)
        at java.lang.Thread.run(Thread.java:512)
--- CUT ---

This was actually thrown by Lucene v1.4-rc2, which I was testing to see if it
solved my problem. I am currently running v1.3-Final on my live site and this 
does the same thing. This is running on Debian Linux, Woody, and is using the
IBM Runtime Environment for Linux Java(TM) 2 Technology Edition, Version
1.3.1, JRE.

It should be noted that I have had this problem before, and I solved it by
completely re-indexing the article set from scratch (starting with no index at
all). After that process, the optimize worked fine. Then somewhere along the
line of many days indexing new articles, and doing an optimise every day at
about 3.30am, the problem has returned.

The articles being indexed are all homogeneous in terms of fields being 
indexed, details below:

FIELD DEFINITIONS
Field name      Field type      Stored?         Indexed?
----------      ----------      -------         --------
Domain          Text            STORED          INDEXED
Id              Id              STORED          INDEXED
date            Date            STORED          INDEXED
datetime        Date            STORED          INDEXED
added           Date            STORED          INDEXED
category        Text            STORED          INDEXED
subcategory     Text            STORED          INDEXED
source          Text            STORED          INDEXED
title           Text            STORED          NOT INDEXED
slug            Text            STORED          NOT INDEXED
type            Text            STORED          NOT INDEXED
sourcetype      Text            NOT STORED      INDEXED


Any help greatly appreciated.

Cheers,
Paul.

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org