Posted to dev@lucene.apache.org by "Robert Tarrall (JIRA)" <ji...@apache.org> on 2015/09/09 05:50:45 UTC
[jira] [Created] (LUCENE-6788) Mishandling of Integer.MIN_VALUE in FuzzySet leads to AssertionError
Robert Tarrall created LUCENE-6788:
--------------------------------------
Summary: Mishandling of Integer.MIN_VALUE in FuzzySet leads to AssertionError
Key: LUCENE-6788
URL: https://issues.apache.org/jira/browse/LUCENE-6788
Project: Lucene - Core
Issue Type: Bug
Components: core/index
Affects Versions: 4.10.4, Trunk
Reporter: Robert Tarrall
Reindexing some data in the DataStax Enterprise Search product (which uses Solr) led to these stack traces:
ERROR [Lucene Merge Thread #13430] 2015-09-08 11:14:36,582 CassandraDaemon.java (line 258) Exception in thread Thread[Lucene Merge Thread #13430,6,main]
org.apache.lucene.index.MergePolicy$MergeException: java.lang.AssertionError
    at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:545)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:518)
Caused by: java.lang.AssertionError
    at org.apache.lucene.codecs.bloom.FuzzySet.mayContainValue(FuzzySet.java:216)
    at org.apache.lucene.codecs.bloom.FuzzySet.contains(FuzzySet.java:165)
    at org.apache.lucene.codecs.bloom.BloomFilteringPostingsFormat$BloomFilteredFieldsProducer$BloomFilteredTermsEnum.seekExact(BloomFilteringPostingsFormat.java:351)
    at org.apache.lucene.index.BufferedUpdatesStream.applyTermDeletes(BufferedUpdatesStream.java:414)
    at org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:283)
    at org.apache.lucene.index.IndexWriter._mergeInit(IndexWriter.java:3838)
    at org.apache.lucene.index.IndexWriter.mergeInit(IndexWriter.java:3799)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3651)
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:405)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:482)
In tracking down the cause of the stack trace, I noticed this:
https://github.com/apache/lucene-solr/blob/trunk/lucene/codecs/src/java/org/apache/lucene/codecs/bloom/FuzzySet.java#L164
It is possible for the Murmur2 hash to return Integer.MIN_VALUE (e.g. when hashing "WeH44wlbCK"). Because int arithmetic wraps on overflow and +2147483648 is not representable as an int, multiplying Integer.MIN_VALUE by -1 returns Integer.MIN_VALUE again, so the "positiveHash >= 0" assertion at line 217 fails.
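The overflow can be reproduced outside Lucene. The following is a minimal sketch; the class name MinValueNegationDemo and the method naivePositive are made up for illustration and only imitate the "multiply negatives by -1" pattern described above, not the actual FuzzySet code.

```java
// Hypothetical demo class; not part of Lucene.
final class MinValueNegationDemo {

    // Imitates the "if the hash is negative, multiply it by -1" pattern.
    static int naivePositive(int hash) {
        if (hash < 0) {
            hash = hash * -1; // wraps around for Integer.MIN_VALUE
        }
        return hash;
    }

    public static void main(String[] args) {
        // An ordinary negative value is flipped as expected.
        System.out.println(naivePositive(-5));
        // Integer.MIN_VALUE has no positive counterpart in 32 bits,
        // so the result is still negative.
        System.out.println(naivePositive(Integer.MIN_VALUE));
        System.out.println(Integer.MIN_VALUE * -1 == Integer.MIN_VALUE);
    }
}
```

Any check of the form "result >= 0" placed after naivePositive would therefore fail for this single input.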
We could special-case Integer.MIN_VALUE and map it to 42 or some other magic number. Since the same "* -1" logic also appears on line 236, perhaps the fix should be part of the hash function?
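As a rough sketch of what such a shared helper might look like (the class NonNegativeHashSketch, its methods, and the constant are hypothetical names, not existing Lucene API, and this is not necessarily how the issue should be resolved):

```java
// Hypothetical helper; names and the replacement constant are assumptions.
final class NonNegativeHashSketch {

    // Arbitrary stand-in for Integer.MIN_VALUE, echoing the "42" idea above.
    static final int MIN_VALUE_REPLACEMENT = 42;

    // Option 1: keep the existing "* -1" behaviour for every other value
    // and special-case the one input that cannot be negated.
    static int toNonNegative(int hash) {
        if (hash == Integer.MIN_VALUE) {
            return MIN_VALUE_REPLACEMENT;
        }
        return hash < 0 ? -hash : hash;
    }

    // Option 2: clear the sign bit. Always non-negative, but negative
    // hashes map to different values than they do with "* -1".
    static int toNonNegativeMasked(int hash) {
        return hash & Integer.MAX_VALUE;
    }
}
```

Option 1 leaves the result for every hash other than Integer.MIN_VALUE unchanged. Option 2 is branch-free but changes the mapping for all negative hashes, so it would presumably not agree with filters that were built using the "* -1" logic.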