Posted to commits@nutch.apache.org by le...@apache.org on 2018/08/01 18:26:08 UTC
[nutch] branch 2.x updated: NUTCH-2222 re-fetch deletes all metadata except _csh_ and _rs_
This is an automated email from the ASF dual-hosted git repository.
lewismc pushed a commit to branch 2.x
in repository https://gitbox.apache.org/repos/asf/nutch.git
The following commit(s) were added to refs/heads/2.x by this push:
new c43c2c8 NUTCH-2222 re-fetch deletes all metadata except _csh_ and _rs_
c43c2c8 is described below
commit c43c2c85874295ef94982694fc28c068d5447234
Author: Lewis John McGibbney <le...@gmail.com>
AuthorDate: Wed Aug 1 11:26:04 2018 -0700
NUTCH-2222 re-fetch deletes all metadata except _csh_ and _rs_
---
src/java/org/apache/nutch/fetcher/FetcherJob.java | 1 +
1 file changed, 1 insertion(+)
diff --git a/src/java/org/apache/nutch/fetcher/FetcherJob.java b/src/java/org/apache/nutch/fetcher/FetcherJob.java
index 82e7a12..f4b97cb 100644
--- a/src/java/org/apache/nutch/fetcher/FetcherJob.java
+++ b/src/java/org/apache/nutch/fetcher/FetcherJob.java
@@ -75,6 +75,7 @@ public class FetcherJob extends NutchTool implements Tool {
     FIELDS.add(WebPage.Field.MARKERS);
     FIELDS.add(WebPage.Field.REPR_URL);
     FIELDS.add(WebPage.Field.FETCH_TIME);
+    FIELDS.add(WebPage.Field.METADATA);
   }
 
   /**