Posted to commits@nutch.apache.org by sn...@apache.org on 2017/09/29 11:46:44 UTC

[nutch] 01/01: Merge pull request #224 from maborec/NUTCH-2433

This is an automated email from the ASF dual-hosted git repository.

snagel pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/nutch.git

commit 777e759ada24eac84072a5f1722938442432eadc
Merge: da64358 3067753
Author: Sebastian Nagel <sn...@apache.org>
AuthorDate: Fri Sep 29 13:46:40 2017 +0200

    Merge pull request #224 from maborec/NUTCH-2433
    
    Nutch 2433 - New configuration for HTML parser to keep the HTML nodes in outlinks metadata

 conf/nutch-default.xml                             |  7 ++++
 .../apache/nutch/parse/html/DOMContentUtils.java   | 24 ++++++++++++--
 .../apache/nutch/parse/tika/DOMContentUtils.java   | 37 +++++++++++++++++-----
 3 files changed, 58 insertions(+), 10 deletions(-)
