Posted to dev@gora.apache.org by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2013/05/18 23:01:16 UTC
[jira] [Created] (GORA-225) Various Issues with MemStore
Lewis John McGibbney created GORA-225:
-----------------------------------------
Summary: Various Issues with MemStore
Key: GORA-225
URL: https://issues.apache.org/jira/browse/GORA-225
Project: Apache Gora
Issue Type: Bug
Components: gora-core, testing
Affects Versions: 0.3
Environment: Nutch 2.x HEAD, gora-core 0.3
Reporter: Lewis John McGibbney
Fix For: 0.4
In Nutch we have numerous testing scenarios which simulate persistence of data to Gora in some form or other. This has worked well until now.
Now that the gora-sql-0.1.1-incubating artifact is incompatible with gora-core 0.3, this situation needs to be addressed in order to keep some degree of integrity within the Nutch codebase.
Specifically, a number of tests [0][1][2][3] all extend a utility testing class which uses functionality from the gora-sql artifact.
My initial solution was to switch to using MemStore... which brought me to logging this issue!
Test [0] fails with the following unhelpful logging... I need to debug this much more thoroughly
{code}
Testcase: testGenerateHighest took 1.845 sec
FAILED
expected:<2> but was:<0>
junit.framework.AssertionFailedError: expected:<2> but was:<0>
at org.apache.nutch.crawl.TestGenerator.testGenerateHighest(TestGenerator.java:78)
Testcase: testGenerateHostLimit took 1.207 sec
FAILED
expected:<1> but was:<0>
junit.framework.AssertionFailedError: expected:<1> but was:<0>
at org.apache.nutch.crawl.TestGenerator.testGenerateHostLimit(TestGenerator.java:134)
Testcase: testGenerateDomainLimit took 1.175 sec
FAILED
expected:<1> but was:<0>
junit.framework.AssertionFailedError: expected:<1> but was:<0>
at org.apache.nutch.crawl.TestGenerator.testGenerateDomainLimit(TestGenerator.java:185)
Testcase: testFilter took 2.31 sec
FAILED
expected:<3> but was:<0>
junit.framework.AssertionFailedError: expected:<3> but was:<0>
at org.apache.nutch.crawl.TestGenerator.testFilter(TestGenerator.java:239)
{code}
Tests [1][2] fail identically with the following stack trace
{code}
Testcase: testInject took 1.931 sec
Caused an ERROR
null
java.util.NoSuchElementException
at java.util.TreeMap.key(TreeMap.java:1221)
at java.util.TreeMap.firstKey(TreeMap.java:285)
at org.apache.gora.memory.store.MemStore.execute(MemStore.java:122)
at org.apache.nutch.util.CrawlTestUtil.readContents(CrawlTestUtil.java:112)
at org.apache.nutch.crawl.TestInjector.readDb(TestInjector.java:104)
at org.apache.nutch.crawl.TestInjector.testInject(TestInjector.java:62)
{code}
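For context on the trace above: TreeMap.firstKey() is documented to throw NoSuchElementException when the map is empty, so the trace suggests the underlying map held no entries when MemStore.execute was called. The sketch below only demonstrates that JDK behaviour and one possible guard. The class and method names (EmptyTreeMapDemo, firstKeyOrNull, firstKeyThrows) are hypothetical and are not taken from the MemStore source.

```java
import java.util.NoSuchElementException;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical demo class; not part of Gora or Nutch.
class EmptyTreeMapDemo {

    // Returns the first key, or null when the map is empty,
    // instead of letting firstKey() throw NoSuchElementException.
    static <K, V> K firstKeyOrNull(SortedMap<K, V> map) {
        return map.isEmpty() ? null : map.firstKey();
    }

    // Returns true if calling firstKey() directly on the map throws.
    static boolean firstKeyThrows(SortedMap<?, ?> map) {
        try {
            map.firstKey();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        SortedMap<String, String> store = new TreeMap<>();
        System.out.println("firstKey() throws on empty map: " + firstKeyThrows(store));
        System.out.println("guarded lookup on empty map: " + firstKeyOrNull(store));

        store.put("http://example.org/", "page");
        System.out.println("guarded lookup after put: " + firstKeyOrNull(store));
    }
}
```

Whether returning null, an empty result, or something else is the right behaviour for a query against an empty store is a design decision for MemStore; the guard here is only one option.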
Finally, a multithreaded test in [3] fails with the following
{code}
java.util.ConcurrentModificationException
at java.util.TreeMap$NavigableSubMap$SubMapIterator.nextEntry(TreeMap.java:1594)
at java.util.TreeMap$NavigableSubMap$SubMapKeyIterator.next(TreeMap.java:1655)
at org.apache.gora.memory.store.MemStore$MemResult.nextInner(MemStore.java:81)
at org.apache.gora.query.impl.ResultBase.next(ResultBase.java:112)
at org.apache.nutch.storage.TestGoraStorage.readWrite(TestGoraStorage.java:74)
at org.apache.nutch.storage.TestGoraStorage.access$100(TestGoraStorage.java:41)
at org.apache.nutch.storage.TestGoraStorage$1.call(TestGoraStorage.java:107)
at org.apache.nutch.storage.TestGoraStorage$1.call(TestGoraStorage.java:102)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
{code}
I believe that the final failure is due to the use of TreeMap [5] as a private object in MemStore. TreeMap is not synchronized. If multiple threads access a map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with an existing key is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the map. If no such object exists, the map should be "wrapped" using the Collections.synchronizedSortedMap method. This is best done at creation time, to prevent accidental unsynchronized access to the map, e.g.
SortedMap m = Collections.synchronizedSortedMap(new TreeMap(...));
N.B. The note on TreeMap above comes straight from the Oracle Javadoc [5].
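One caveat worth keeping in mind: the Javadoc for the Collections.synchronized* wrappers states that iteration over the map's collection views (including views of a subMap) must still be manually synchronized on the wrapped map, so wrapping alone may not remove the exception seen in MemResult.nextInner. The sketch below reproduces the fail-fast behaviour deterministically in a single thread and contrasts it with java.util.concurrent.ConcurrentSkipListMap, whose iterators are weakly consistent and do not throw ConcurrentModificationException. Using ConcurrentSkipListMap is only a possible alternative assumed here, not something the MemStore code is known to do; the class and method names are hypothetical.

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical demo class; not part of Gora or Nutch.
class SortedMapIterationDemo {

    // Scans the keys and inserts a new key midway, mimicking a writer that
    // modifies the map while a reader is iterating over it.
    // Returns true if the scan fails with ConcurrentModificationException.
    static boolean failsWhenModifiedDuringScan(NavigableMap<String, String> map) {
        map.put("a", "1");
        map.put("b", "2");
        map.put("c", "3");
        try {
            Iterator<String> keys = map.keySet().iterator();
            keys.next();           // the "reader" has started scanning
            map.put("d", "4");     // structural modification by a "writer"
            while (keys.hasNext()) {
                keys.next();       // a fail-fast iterator throws here
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("TreeMap scan fails: "
                + failsWhenModifiedDuringScan(new TreeMap<String, String>()));
        System.out.println("ConcurrentSkipListMap scan fails: "
                + failsWhenModifiedDuringScan(new ConcurrentSkipListMap<String, String>()));
    }
}
```

The trade-off is that a weakly consistent iterator may or may not reflect writes made after the scan started, which may or may not be acceptable for a test-only store.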
[0] http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/crawl/TestGenerator.java?view=markup
[1] http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/crawl/TestInjector.java?view=markup
[2] http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/fetcher/TestFetcher.java?view=markup
[3] http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/storage/TestGoraStorage.java?view=markup
[4] http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/util/AbstractNutchTest.java?view=markup
[5] http://docs.oracle.com/javase/6/docs/api/java/util/TreeMap.html
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira