Posted to java-user@lucene.apache.org by andi rexha <a_...@hotmail.com> on 2014/09/15 15:23:56 UTC

Exception while using a custom analyzer in parallel indexing!

Hi,
I have an index writer that is used by a pool of threads to index documents. The index writer uses a "PerFieldAnalyzerWrapper":

this.analyzer = new PerFieldAnalyzerWrapper(DEFAULT_ANALYZER, fields);


If I add the documents from a single thread, I don't get any exception. When I add the documents through a pool of threads, I get the exception below:


java.lang.IllegalStateException: TokenStream contract violation: reset()/close() call missing, reset() called multiple times, or subclass does not call super.reset(). Please see Javadocs of TokenStream class for more information about the correct consuming workflow.
    at org.apache.lucene.analysis.Tokenizer$1.read(Tokenizer.java:110)
    at org.apache.lucene.analysis.standard.StandardTokenizerImpl.zzRefill(StandardTokenizerImpl.java:1023)
    at org.apache.lucene.analysis.standard.StandardTokenizerImpl.getNextToken(StandardTokenizerImpl.java:1230)
    at org.apache.lucene.analysis.standard.StandardTokenizer.incrementToken(StandardTokenizer.java:178)
    at org.apache.lucene.analysis.standard.StandardFilter.incrementToken(StandardFilter.java:49)
    at org.apache.lucene.analysis.core.LowerCaseFilter.incrementToken(LowerCaseFilter.java:54)
    at at.knowcenter.ir.index.analyzer.StemmingTokenStream.incrementToken(StemmingTokenStream.java:94)
    at org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:108)
    at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:248)
    at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:253)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:465)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1537)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1207)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1188)

When I construct the "PerFieldAnalyzerWrapper" with only the default analyzer and no per-field analyzers:

this.analyzer = new PerFieldAnalyzerWrapper(DEFAULT_ANALYZER);

I don't get the exception.

It looks like something is wrong with this use case. Does anybody know how to handle this problem?
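
For reference, here is how I read the contract named in the exception message, written as a small stand-alone model. This is not Lucene code: ModelStream and every other name in it are hypothetical, and it only imitates the "reset, consume, close" workflow. It shows that one stateful stream instance used by two interleaved consumers can fail in this way, while one instance per thread does not. Whether something like this actually happens with one of the analyzers in "fields" is only an assumption on my side.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TokenStreamContractModel {

    /** Hypothetical stand-in for a stateful token stream; NOT a Lucene class. */
    static final class ModelStream {
        private String[] tokens = new String[0];
        private int pos = 0;
        private boolean ready = false; // true only between reset() and close()

        void setText(String text) { tokens = text.split(" "); ready = false; }

        void reset() { pos = 0; ready = true; }

        /** Returns the next token, or null when the input is exhausted. */
        String next() {
            if (!ready) {
                throw new IllegalStateException("contract violation: reset()/close() call missing");
            }
            return pos < tokens.length ? tokens[pos++] : null;
        }

        void close() { ready = false; }
    }

    /** The expected workflow: set input, reset, consume all tokens, close. */
    static List<String> consume(ModelStream stream, String text) {
        stream.setText(text);
        stream.reset();
        List<String> out = new ArrayList<>();
        for (String t = stream.next(); t != null; t = stream.next()) {
            out.add(t);
        }
        stream.close();
        return out;
    }

    /** Replays, deterministically, two consumers interleaving on ONE instance. */
    static boolean sharedInstanceFails() {
        ModelStream shared = new ModelStream();
        shared.setText("a b c");
        shared.reset();
        shared.next();      // consumer 1 reads the first token
        shared.close();     // consumer 2 finishes and closes the same instance
        try {
            shared.next();  // consumer 1 continues on a closed stream
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    /** Gives every worker thread its own instance; returns how many tasks succeeded. */
    static int perThreadInstancesSucceed(int tasks) {
        ThreadLocal<ModelStream> local = ThreadLocal.withInitial(ModelStream::new);
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            Callable<List<String>> task = () -> consume(local.get(), "a b c");
            List<Future<List<String>>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                futures.add(pool.submit(task));
            }
            int ok = 0;
            for (Future<List<String>> f : futures) {
                if (f.get().size() == 3) {
                    ok++;
                }
            }
            return ok;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("shared instance fails: " + sharedInstanceFails());
        System.out.println("per-thread successes: " + perThreadInstancesSucceed(8));
    }
}
```

If this model is close to what happens inside the index writer, the thing to check would be whether any analyzer passed in "fields" hands the same stream instance to more than one thread.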

Thank you in advance!