Posted to dev@gora.apache.org by "Sergey Weiss (JIRA)" <ji...@apache.org> on 2014/10/07 13:24:34 UTC

[jira] [Commented] (GORA-227) Failing assertions when putting and getting Values using MemStore#execute

    [ https://issues.apache.org/jira/browse/GORA-227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161775#comment-14161775 ] 

Sergey Weiss commented on GORA-227:
-----------------------------------

Hello!

I have debugged TestGenerator and, from what I saw, it fails because the query is executed on a different MemStore instance than the one that holds the injected web pages. That is, when GeneratorJob initializes its mapper and reducer, it creates a new MemStore instance for each. Each of these two instances holds its own internal map and knows nothing about the MemStore created by TestGenerator (and populated with web pages).

What is the best way to address this issue? Should we amend DataStoreFactory so that it returns a single MemStore instance, or should all MemStore instances share their state? Any suggestions?
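
To make the two behaviours concrete, here is a minimal, self-contained sketch. It does not use Gora's real classes: InMemoryStore and CachingStoreFactory are hypothetical stand-ins that I made up for illustration, and the store name "webpage" is likewise an assumption. It only shows why a freshly created in-memory store sees no records, and how a factory that caches one instance per name would let the test and the job see the same data.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SharedMemStoreSketch {

    /** Hypothetical stand-in for an in-memory store; NOT Gora's MemStore. */
    static class InMemoryStore<K, V> {
        // Each instance owns its own map, which is the root of the problem.
        private final Map<K, V> map = new ConcurrentHashMap<>();

        void put(K key, V value) { map.put(key, value); }
        V get(K key) { return map.get(key); }
        int size() { return map.size(); }
    }

    /** Hypothetical factory that hands out one cached store per name. */
    static class CachingStoreFactory {
        private static final Map<String, InMemoryStore<?, ?>> CACHE =
                new ConcurrentHashMap<>();

        @SuppressWarnings("unchecked")
        static <K, V> InMemoryStore<K, V> getStore(String name) {
            // Create the store on first request, reuse it afterwards.
            return (InMemoryStore<K, V>) CACHE.computeIfAbsent(
                    name, n -> new InMemoryStore<K, V>());
        }
    }

    /** Mimics the failing case: the "job" builds its own store instance. */
    public static int countSeenByFreshInstance() {
        InMemoryStore<String, String> testStore = new InMemoryStore<>();
        testStore.put("com.example:http/a", "page a");
        testStore.put("com.example:http/b", "page b");

        // A second, independent instance knows nothing about testStore.
        InMemoryStore<String, String> jobStore = new InMemoryStore<>();
        return jobStore.size();
    }

    /** Mimics the proposed fix: both sides obtain the store from the factory. */
    public static int countSeenBySharedInstance() {
        InMemoryStore<String, String> testStore =
                CachingStoreFactory.getStore("webpage");
        testStore.put("com.example:http/a", "page a");
        testStore.put("com.example:http/b", "page b");

        // Same name, so the factory returns the already populated instance.
        InMemoryStore<String, String> jobStore =
                CachingStoreFactory.getStore("webpage");
        return jobStore.size();
    }

    public static void main(String[] args) {
        System.out.println("fresh instance sees:  " + countSeenByFreshInstance());
        System.out.println("shared instance sees: " + countSeenBySharedInstance());
    }
}
```

The alternative mentioned above, letting all instances share their state, would amount to making the backing map static instead of caching instances in the factory. Either way the shared state would need to be cleared between tests, which this sketch leaves out.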

> Failing assertions when putting and getting Values using MemStore#execute
> -------------------------------------------------------------------------
>
>                 Key: GORA-227
>                 URL: https://issues.apache.org/jira/browse/GORA-227
>             Project: Apache Gora
>          Issue Type: Sub-task
>          Components: gora-core
>    Affects Versions: 0.3
>         Environment: gora-core 0.3, Nutch 2.x HEAD
>            Reporter: Lewis John McGibbney
>             Fix For: 0.6
>
>
> Test [0] fails with the following useless logging... I need to DEBUG this much more thoroughly
> {code}
> Testcase: testGenerateHighest took 1.845 sec
> 	FAILED
> expected:<2> but was:<0>
> junit.framework.AssertionFailedError: expected:<2> but was:<0>
> 	at org.apache.nutch.crawl.TestGenerator.testGenerateHighest(TestGenerator.java:78)
> Testcase: testGenerateHostLimit took 1.207 sec
> 	FAILED
> expected:<1> but was:<0>
> junit.framework.AssertionFailedError: expected:<1> but was:<0>
> 	at org.apache.nutch.crawl.TestGenerator.testGenerateHostLimit(TestGenerator.java:134)
> Testcase: testGenerateDomainLimit took 1.175 sec
> 	FAILED
> expected:<1> but was:<0>
> junit.framework.AssertionFailedError: expected:<1> but was:<0>
> 	at org.apache.nutch.crawl.TestGenerator.testGenerateDomainLimit(TestGenerator.java:185)
> Testcase: testFilter took 2.31 sec
> 	FAILED
> expected:<3> but was:<0>
> junit.framework.AssertionFailedError: expected:<3> but was:<0>
> 	at org.apache.nutch.crawl.TestGenerator.testFilter(TestGenerator.java:239)
> {code}
> However, so far I have found commonality in the fact that the tests all use the following code:
> {code}
>   public static ArrayList<URLWebPage> readContents(DataStore<String,WebPage> store,
>       Mark requiredMark, String... fields) throws Exception {
>     ArrayList<URLWebPage> l = new ArrayList<URLWebPage>();
>     Query<String, WebPage> query = store.newQuery();
>     if (fields != null) {
>       query.setFields(fields);
>     }
>     Result<String, WebPage> results = store.execute(query);
>     while (results.next()) {
>       try {
>         WebPage page = results.get();
>         String url = results.getKey();
>         if (page == null)
>           continue;
>         if (requiredMark != null && requiredMark.checkMark(page) == null)
>           continue;
>         l.add(new URLWebPage(TableUtil.unreverseUrl(url), (WebPage)page.clone()));
>       } catch (Exception e) {
>         e.printStackTrace();
>       }
>     }
>     return l;
>   }
> {code}
> and also that the assertions are all of the type
> {code}
>     ArrayList<URLWebPage> fetchList = CrawlTestUtil.readContents(webPageStore, Mark.GENERATE_MARK, FIELDS);
>     // verify we got right amount of records
>     assertEquals(1, fetchList.size());
> {code}
> [0] http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/crawl/TestGenerator.java?view=markup



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)