Posted to commits@nutch.apache.org by sn...@apache.org on 2017/09/29 11:46:44 UTC
[nutch] 01/01: Merge pull request #224 from maborec/NUTCH-2433
This is an automated email from the ASF dual-hosted git repository.
snagel pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/nutch.git
commit 777e759ada24eac84072a5f1722938442432eadc
Merge: da64358 3067753
Author: Sebastian Nagel <sn...@apache.org>
AuthorDate: Fri Sep 29 13:46:40 2017 +0200
Merge pull request #224 from maborec/NUTCH-2433
Nutch 2433 - New configuration for HTML parser to keep the HTML nodes in outlinks metadata
conf/nutch-default.xml | 7 ++++
.../apache/nutch/parse/html/DOMContentUtils.java | 24 ++++++++++++--
.../apache/nutch/parse/tika/DOMContentUtils.java | 37 +++++++++++++++++-----
3 files changed, 58 insertions(+), 10 deletions(-)