Posted to dev@nutch.apache.org by "Julien Nioche (JIRA)" <ji...@apache.org> on 2014/05/16 13:23:02 UTC

[jira] [Created] (NUTCH-1777) Fetcher not getting all the entries in input

Julien Nioche created NUTCH-1777:
------------------------------------

             Summary: Fetcher not getting all the entries in input
                 Key: NUTCH-1777
                 URL: https://issues.apache.org/jira/browse/NUTCH-1777
             Project: Nutch
          Issue Type: Bug
          Components: fetcher
    Affects Versions: 2.2.1
            Reporter: Julien Nioche
             Fix For: 2.3


See the comments in [NUTCH-1714]:

bq. The Generator marks 50K entries with GENERATE_MARK but the Fetcher shows only 49,461 as Map Input Records (and the same number as Reduce input records) => looks like we are not getting all the records we should be getting. I dumped the content of the table pre-fetching and it contains the right number of entries i.e. 50K

This was noticed after applying [NUTCH-1714] and [NUTCH-1674], but the problem may already have been present before those changes.
