Posted to java-user@lucene.apache.org by bebe1437 <ed...@gmail.com> on 2017/09/01 04:07:31 UTC

ArrayIndexOutOfBoundsException: -65536 during full-import from old index

My Solr version is 5.5.4.
I set docValues="true" on some old fields
and use dataimport to reindex,
but it keeps throwing this exception:

Caused by: java.lang.ArrayIndexOutOfBoundsException: -65536
	at org.apache.lucene.index.TermsHashPerField.writeByte(TermsHashPerField.java:197)

I checked the issue: https://issues.apache.org/jira/browse/LUCENE-1995
and the other article:
http://lucene.472066.n3.nabble.com/ArrayIndexOutOfBoundsException-65536-td3661945.html

I have 1M docs, but the index is only around 300-400 MB, and I kept
ramBufferSizeMB at its default. I'm not sure what the problem is: updating
those old documents through the API works, but it fails with dataimport.

and here is the exception:

org.apache.solr.common.SolrException: Exception writing document id
uw9G000007301O_T00000Ox01t_E000005L09u_E000005L09u to the index; possible
analysis error.
	at
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:180)
	at
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:68)
	at
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)
	at
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:934)
	at
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1089)
	at
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:712)
	at
org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
	at org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:74)
	at
org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:260)
	at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:524)
	at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:414)
	at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:329)
	at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
	at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
	at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
	at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexWriter
is closed
	at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:720)
	at org.apache.lucene.index.IndexWriter.ensureOpen(IndexWriter.java:734)
	at
org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1473)
	at
org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:282)
	at
org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:214)
	at
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:169)
	... 15 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: -65536
	at org.apache.lucene.index.TermsHashPerField.writeByte(TermsHashPerField.java:197)
	at
org.apache.lucene.index.TermsHashPerField.writeVInt(TermsHashPerField.java:223)
	at
org.apache.lucene.index.FreqProxTermsWriterPerField.writeProx(FreqProxTermsWriterPerField.java:82)
	at
org.apache.lucene.index.FreqProxTermsWriterPerField.newTerm(FreqProxTermsWriterPerField.java:122)
	at
org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:177)
	at
org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:682)
	at
org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:365)
	at
org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:321)
	at
org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:234)
	at
org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:450)
	at
org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1477)
	... 18 more





--
Sent from: http://lucene.472066.n3.nabble.com/Lucene-Java-Users-f532864.html

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: ArrayIndexOutOfBoundsException: -65536 during full-import from old index

Posted by bebe1437 <ed...@gmail.com>.
I figured out the problem.

I wrote a custom NGramFilter that takes the token's length as the default
maxGramSize, and some documents are filled with nonsense data like
'xakldjfklajsdfklajdslkf'.
When a token is too long for the NGramFilter to handle, it crashes the
IndexWriter.
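For anyone hitting the same thing: the fix is to clamp maxGramSize to a fixed bound instead of defaulting it to the token's length. This is not the poster's actual filter, just a plain-Java sketch of the clamping idea; note that the number of grams grows quadratically with token length when maxGram tracks the token:

```java
import java.util.ArrayList;
import java.util.List;

public class ClampedNGrams {
    /**
     * Generate all n-grams of sizes minGram..maxGram from a token.
     * Clamping maxGram to a fixed bound (instead of letting it default to
     * the token's length) keeps the output small even for junk tokens.
     */
    static List<String> ngrams(String token, int minGram, int maxGram) {
        int clampedMax = Math.min(maxGram, token.length());
        List<String> grams = new ArrayList<>();
        for (int n = minGram; n <= clampedMax; n++) {
            for (int i = 0; i + n <= token.length(); i++) {
                grams.add(token.substring(i, i + n));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        String junk = "xakldjfklajsdfklajdslkf"; // 23 chars of nonsense
        // maxGram = token length: n*(n+1)/2 grams per token
        System.out.println(ngrams(junk, 1, junk.length()).size()); // prints 276
        // clamped maxGram: bounded output
        System.out.println(ngrams(junk, 1, 3).size()); // prints 66
    }
}
```

In a real Lucene analysis chain, the same effect can be had by passing fixed min/max gram sizes to the n-gram filter, or by dropping overlong tokens before the filter runs.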





Re: ArrayIndexOutOfBoundsException: -65536 during full-import from old index

Posted by Michael McCandless <lu...@mikemccandless.com>.
Is it possible that exception is thrown when trying to index an extremely
large document?

Mike McCandless

http://blog.mikemccandless.com


Re: ArrayIndexOutOfBoundsException: -65536 during full-import from old index

Posted by bebe1437 <ed...@gmail.com>.
Update:
Some documents throw the same exception even when updated through the API,
while others that update fine through the API still throw it when using
dataimport.


