Posted to dev@nutch.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2014/08/23 01:22:11 UTC

[jira] [Commented] (NUTCH-1775) IndexingFilter: document origin of passed CrawlDatum

    [ https://issues.apache.org/jira/browse/NUTCH-1775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107688#comment-14107688 ] 

Hudson commented on NUTCH-1775:
-------------------------------

SUCCESS: Integrated in Nutch-trunk #2749 (See [https://builds.apache.org/job/Nutch-trunk/2749/])
NUTCH-1775 IndexingFilter: document origin of passed CrawlDatum (snagel: http://svn.apache.org/viewvc/nutch/trunk/?view=rev&rev=1619944)
* /nutch/trunk/CHANGES.txt
* /nutch/trunk/src/java/org/apache/nutch/indexer/IndexingFilter.java


> IndexingFilter: document origin of passed CrawlDatum
> ----------------------------------------------------
>
>                 Key: NUTCH-1775
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1775
>             Project: Nutch
>          Issue Type: Improvement
>          Components: indexer
>    Affects Versions: 1.8
>            Reporter: Sebastian Nagel
>            Priority: Trivial
>             Fix For: 1.10
>
>         Attachments: NUTCH-1775-trunk.patch
>
>
> Only the fetch datum from the segment is passed to IndexingFilters; the datum from CrawlDb is not available to them. This should be documented because there may be subtle differences between the fetch datum and the db datum (e.g., fetch time).



--
This message was sent by Atlassian JIRA
(v6.2#6252)