Posted to dev@nutch.apache.org by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2014/05/13 23:43:15 UTC

[jira] [Created] (NUTCH-1775) IndexingFilter: document origin of passed CrawlDatum

Sebastian Nagel created NUTCH-1775:
--------------------------------------

             Summary: IndexingFilter: document origin of passed CrawlDatum
                 Key: NUTCH-1775
                 URL: https://issues.apache.org/jira/browse/NUTCH-1775
             Project: Nutch
          Issue Type: Improvement
          Components: indexer
    Affects Versions: 1.8
            Reporter: Sebastian Nagel
            Priority: Trivial
             Fix For: 1.9
         Attachments: NUTCH-1775-trunk.patch

Only the fetch datum from the segment is passed to IndexingFilters; the datum from the CrawlDb is not available to them. This should be documented because there may be subtle differences between the fetch datum and the db datum (e.g., fetch time).
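
To illustrate why this matters, here is a minimal sketch in plain Java. It does not use the actual Nutch classes: `Datum` and `filter` are hypothetical stand-ins for a CrawlDatum and an indexing filter, and both timestamps are invented for illustration. It only shows that a filter which receives a single datum will index whatever fetch time that datum carries, so the value depends on where the datum came from.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FetchDatumSketch {

    /** Hypothetical stand-in for a CrawlDatum; it carries only a fetch time. */
    static final class Datum {
        final long fetchTime;

        Datum(long fetchTime) {
            this.fetchTime = fetchTime;
        }
    }

    /**
     * Hypothetical stand-in for an indexing filter. It sees exactly one datum
     * and has no way to look up another one for the same URL.
     */
    static Map<String, Object> filter(String url, Datum datum) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("url", url);
        doc.put("fetchTime", datum.fetchTime);
        return doc;
    }

    public static void main(String[] args) {
        // Assumption: made-up timestamps for the same URL in two places.
        Datum fetchDatum = new Datum(1_400_000_000_000L); // as stored in the segment
        Datum dbDatum = new Datum(1_402_592_000_000L);    // as stored in the CrawlDb

        // Only the fetch datum is handed to the filter.
        Map<String, Object> doc = filter("http://example.com/", fetchDatum);

        System.out.println("indexed fetchTime: " + doc.get("fetchTime"));
        System.out.println("db datum fetchTime (never seen by the filter): " + dbDatum.fetchTime);
    }
}
```

A filter written under the assumption that it receives the db datum would silently index a different value, which is the kind of subtle difference the documentation should warn about.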



--
This message was sent by Atlassian JIRA
(v6.2#6252)