You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Ganesh <em...@yahoo.co.in> on 2010/02/02 09:23:59 UTC

Re: [Bulk] RE: Exception while adding document in 3.0

I tried My App with v2.9.0 without reusing Documents and it is working fine.

Based on the suggestion from group (refer attached mail), I modified the code to use v3.0.0 and in order to increase performance, I am reusing the Documents and Fields objects as suggessted. I am facing problems with this approach. 

Documents cannot be re-used in v3.0?

Regards
Ganesh

----- Original Message ----- 
From: "Uwe Schindler" <uw...@thetaphi.de>
To: <ja...@lucene.apache.org>
Sent: Tuesday, February 02, 2010 1:41 PM
Subject: [Bulk] RE: Exception while adding document in 3.0


> They can.
> 
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
> 
>> -----Original Message-----
>> From: Glen Newton [mailto:glen.newton@gmail.com]
>> Sent: Tuesday, February 02, 2010 9:03 AM
>> To: java-user@lucene.apache.org
>> Subject: Re: Exception while adding document in 3.0
>> 
>> Documents cannot be re-used in v3.0?
>>  http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
>> 
>> -glen
>> http://zzzoot.blogspot.com/
>> 
>> On 2 February 2010 02:55, Simon Willnauer
>> <si...@googlemail.com> wrote:
>> > Ganesh,
>> >
>> > do you reuse your Document instances in any way or do you create new
>> > docs for each add?
>> >
>> > simon
>> >
>> > On Tue, Feb 2, 2010 at 7:18 AM, Ganesh <em...@yahoo.co.in> wrote:
>> >> I am getting below exception, while adding documents. I am adding
>> documents continously and at some point, i am getting the below
>> exception. This exception is not occuring with v2.9.0
>> >>
>> >>  Exception: Index: 21, Size: 2
>> >>  java.util.ArrayList.RangeCheck(Unknown Source)
>> >>  java.util.ArrayList.get(Unknown Source)
>> >>
>>  org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(Doc
>> FieldProcessorPerThread.java:175)
>> >>
>>  org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter
>> .java:779)
>> >>
>>  org.apache.lucene.index.DocumentsWriter.addDocument(DocumentsWriter.ja
>> va:757)
>> >>
>>  org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2472)
>> >>
>>  org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2446)
>> >>
>> >> Regards
>> >> Ganesh
>> >> Send instant messages to your online friends
>> http://in.messenger.yahoo.com
>> >>
>> >> --------------------------------------------------------------------
>> -
>> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> >> For additional commands, e-mail: java-user-help@lucene.apache.org
>> >>
>> >>
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> > For additional commands, e-mail: java-user-help@lucene.apache.org
>> >
>> >
>> 
>> 
>> 
>> --
>> 
>> -
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>

Re: [Bulk] RE: [Bulk] RE: Exception while adding document in 3.0

Posted by Ian Lea <ia...@gmail.com>.
> But still i didn't see much increase in performance by reusing documents and field objects.

Oh well, some you win, some you lose.  Although it sounds like you're
still winning.  Are you following all the other advice on
http://wiki.apache.org/lucene-java/ImproveIndexingSpeed.  Maybe the
best advice is to run a profiler to see where the time really is being
spent.


--
Ian.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: [Bulk] RE: [Bulk] RE: Exception while adding document in 3.0

Posted by Ganesh <em...@yahoo.co.in>.
Yes. I am using the objects across threads. Thanks for pointing out.

But still i didn't see much increase in performance by reusing documents and field objects. 

Regards
Ganesh

----- Original Message ----- 
From: "Uwe Schindler" <uw...@thetaphi.de>
To: <ja...@lucene.apache.org>
Sent: Tuesday, February 02, 2010 2:02 PM
Subject: [Bulk] RE: [Bulk] RE: Exception while adding document in 3.0


They can be reused, but the exception and looking into the code shows that you are doing it wrong. You can reuse Documents but only under two conditions:

a) In one thread, not one document in multithreaded app - one document per thread! And using one document in more than one thread is what you are likely doing. This exception cannot occur in one thread (just look into the lucene source code): What it does: it iterates over all fields of the document using a for-loop. A RangeCheck exception can only occur, if the document is modified during this iteration.
b) Only reuse the Document instance after the IndexWriter.xxxDocumentis finished! So don't prepare the document and the do the IndexWriter stuffs in another task, this would exactly produce this error.

Your problem is exactly the same as when you create an ArrayList, then iterate over the ArrayList using a for-loop and remove elements from the ArrayList in a parallel thread.

Both of above conditions are mentioned in the FAQ (indirectly).

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Ganesh [mailto:emailgane@yahoo.co.in]
> Sent: Tuesday, February 02, 2010 9:24 AM
> To: java-user@lucene.apache.org
> Subject: Re: [Bulk] RE: Exception while adding document in 3.0
> 
> I tried My App with v2.9.0 without reusing Documents and it is working
> fine.
> 
> Based on the suggestion from group (refer attached mail), I modified
> the code to use v3.0.0 and in order to increase performance, I am
> reusing the Documents and Fields objects as suggessted. I am facing
> problems with this approach.
> 
> Documents cannot be re-used in v3.0?
> 
> Regards
> Ganesh
> 
> ----- Original Message -----
> From: "Uwe Schindler" <uw...@thetaphi.de>
> To: <ja...@lucene.apache.org>
> Sent: Tuesday, February 02, 2010 1:41 PM
> Subject: [Bulk] RE: Exception while adding document in 3.0
> 
> 
> > They can.
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: uwe@thetaphi.de
> >
> >> -----Original Message-----
> >> From: Glen Newton [mailto:glen.newton@gmail.com]
> >> Sent: Tuesday, February 02, 2010 9:03 AM
> >> To: java-user@lucene.apache.org
> >> Subject: Re: Exception while adding document in 3.0
> >>
> >> Documents cannot be re-used in v3.0?
> >>  http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
> >>
> >> -glen
> >> http://zzzoot.blogspot.com/
> >>
> >> On 2 February 2010 02:55, Simon Willnauer
> >> <si...@googlemail.com> wrote:
> >> > Ganesh,
> >> >
> >> > do you reuse your Document instances in any way or do you create
> new
> >> > docs for each add?
> >> >
> >> > simon
> >> >
> >> > On Tue, Feb 2, 2010 at 7:18 AM, Ganesh <em...@yahoo.co.in>
> wrote:
> >> >> I am getting below exception, while adding documents. I am adding
> >> documents continously and at some point, i am getting the below
> >> exception. This exception is not occuring with v2.9.0
> >> >>
> >> >>  Exception: Index: 21, Size: 2
> >> >>  java.util.ArrayList.RangeCheck(Unknown Source)
> >> >>  java.util.ArrayList.get(Unknown Source)
> >> >>
> >>
> org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(Doc
> >> FieldProcessorPerThread.java:175)
> >> >>
> >>
> org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter
> >> .java:779)
> >> >>
> >>
> org.apache.lucene.index.DocumentsWriter.addDocument(DocumentsWriter.ja
> >> va:757)
> >> >>
> >>
> org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2472)
> >> >>
> >>
> org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2446)
> >> >>
> >> >> Regards
> >> >> Ganesh
> >> >> Send instant messages to your online friends
> >> http://in.messenger.yahoo.com
> >> >>
> >> >> -----------------------------------------------------------------
> ---
> >> -
> >> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >> >>
> >> >>
> >> >
> >> > ------------------------------------------------------------------
> ---
> >> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >>
> >> -
> >>
> >> --------------------------------------------------------------------
> -
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Send instant messages to your online friends http://in.messenger.yahoo.com 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: [Bulk] RE: Exception while adding document in 3.0

Posted by Uwe Schindler <uw...@thetaphi.de>.
They can be reused, but the exception and looking into the code shows that you are doing it wrong. You can reuse Documents but only under two conditions:

a) In one thread, not one document in multithreaded app - one document per thread! And using one document in more than one thread is what you are likely doing. This exception cannot occur in one thread (just look into the lucene source code): What it does: it iterates over all fields of the document using a for-loop. A RangeCheck exception can only occur, if the document is modified during this iteration.
b) Only reuse the Document instance after the IndexWriter.xxxDocumentis finished! So don't prepare the document and the do the IndexWriter stuffs in another task, this would exactly produce this error.

Your problem is exactly the same as when you create an ArrayList, then iterate over the ArrayList using a for-loop and remove elements from the ArrayList in a parallel thread.

Both of above conditions are mentioned in the FAQ (indirectly).

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Ganesh [mailto:emailgane@yahoo.co.in]
> Sent: Tuesday, February 02, 2010 9:24 AM
> To: java-user@lucene.apache.org
> Subject: Re: [Bulk] RE: Exception while adding document in 3.0
> 
> I tried My App with v2.9.0 without reusing Documents and it is working
> fine.
> 
> Based on the suggestion from group (refer attached mail), I modified
> the code to use v3.0.0 and in order to increase performance, I am
> reusing the Documents and Fields objects as suggessted. I am facing
> problems with this approach.
> 
> Documents cannot be re-used in v3.0?
> 
> Regards
> Ganesh
> 
> ----- Original Message -----
> From: "Uwe Schindler" <uw...@thetaphi.de>
> To: <ja...@lucene.apache.org>
> Sent: Tuesday, February 02, 2010 1:41 PM
> Subject: [Bulk] RE: Exception while adding document in 3.0
> 
> 
> > They can.
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: uwe@thetaphi.de
> >
> >> -----Original Message-----
> >> From: Glen Newton [mailto:glen.newton@gmail.com]
> >> Sent: Tuesday, February 02, 2010 9:03 AM
> >> To: java-user@lucene.apache.org
> >> Subject: Re: Exception while adding document in 3.0
> >>
> >> Documents cannot be re-used in v3.0?
> >>  http://wiki.apache.org/lucene-java/ImproveIndexingSpeed
> >>
> >> -glen
> >> http://zzzoot.blogspot.com/
> >>
> >> On 2 February 2010 02:55, Simon Willnauer
> >> <si...@googlemail.com> wrote:
> >> > Ganesh,
> >> >
> >> > do you reuse your Document instances in any way or do you create
> new
> >> > docs for each add?
> >> >
> >> > simon
> >> >
> >> > On Tue, Feb 2, 2010 at 7:18 AM, Ganesh <em...@yahoo.co.in>
> wrote:
> >> >> I am getting below exception, while adding documents. I am adding
> >> documents continously and at some point, i am getting the below
> >> exception. This exception is not occuring with v2.9.0
> >> >>
> >> >>  Exception: Index: 21, Size: 2
> >> >>  java.util.ArrayList.RangeCheck(Unknown Source)
> >> >>  java.util.ArrayList.get(Unknown Source)
> >> >>
> >>
> org.apache.lucene.index.DocFieldProcessorPerThread.processDocument(Doc
> >> FieldProcessorPerThread.java:175)
> >> >>
> >>
> org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter
> >> .java:779)
> >> >>
> >>
> org.apache.lucene.index.DocumentsWriter.addDocument(DocumentsWriter.ja
> >> va:757)
> >> >>
> >>
> org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2472)
> >> >>
> >>
> org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2446)
> >> >>
> >> >> Regards
> >> >> Ganesh
> >> >> Send instant messages to your online friends
> >> http://in.messenger.yahoo.com
> >> >>
> >> >> -----------------------------------------------------------------
> ---
> >> -
> >> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >> >>
> >> >>
> >> >
> >> > ------------------------------------------------------------------
> ---
> >> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >>
> >> -
> >>
> >> --------------------------------------------------------------------
> -
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org