Posted to commits@nutch.apache.org by sn...@apache.org on 2018/06/28 11:07:43 UTC

[nutch] branch master updated (4c1d94a -> a4b4bf6)

This is an automated email from the ASF dual-hosted git repository.

snagel pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/nutch.git.


    from 4c1d94a  Merge pull request #352 from sebastian-nagel/NUTCH-2607-parsechecker-sc-passscoreafterparsing
     add 26616f5  NUTCH-2547 urlnormalizer-basic fails on special characters in path/query
                  NUTCH-2609 urlnormalizer-basic to normalize path of file: URLs
                  - escape more special characters
                  - escape percent when not followed by a valid escape sequence
                    (two-digit hex number)
                  - escape special characters before normalizing the path
                    so that URI.normalize() can be used on valid URIs
                  - also normalize path '/..'
                  - normalize path on file: URLs
                  - complete unit tests
     new a4b4bf6  Merge pull request #353 from sebastian-nagel/nutch-2547-2609-url-normalizer-basic

The 1 revision listed above as "new" is entirely new to this
repository and will be described in a separate email.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../urlnormalizer/basic/BasicURLNormalizer.java    | 108 +++++++++++++++++----
 .../basic/TestBasicURLNormalizer.java              |  43 ++++++++
 2 files changed, 134 insertions(+), 17 deletions(-)
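
The steps named in the message of 26616f5 (escape a percent sign that is not
followed by a two-digit hex number, escape special characters first so that
URI.normalize() sees a valid URI, and also normalize the path '/..') can be
illustrated roughly as follows. This is only a sketch under stated
assumptions and is not the code from BasicURLNormalizer.java: the class name
PathNormalizeSketch, both method names, and both patterns are hypothetical,
and only the space character is escaped here as a stand-in for "escape more
special characters".

```java
import java.net.URI;
import java.util.regex.Pattern;

// Hypothetical illustration; not taken from BasicURLNormalizer.java.
class PathNormalizeSketch {

  // A '%' that is not followed by two hex digits is not a valid escape sequence.
  private static final Pattern STRAY_PERCENT =
      Pattern.compile("%(?![0-9A-Fa-f]{2})");

  // One or more leading "/.." segments at the start of an absolute path.
  private static final Pattern LEADING_PARENT =
      Pattern.compile("^(?:/\\.\\.)+(?=/|$)");

  // Replace every stray '%' by its escaped form "%25".
  static String escapeStrayPercent(String s) {
    return STRAY_PERCENT.matcher(s).replaceAll("%25");
  }

  // Escape first, then let java.net.URI resolve "." and ".." segments.
  static String normalizePath(String rawPath) {
    // Assumption: only the space is escaped here; the real set is larger.
    String escaped = escapeStrayPercent(rawPath).replace(" ", "%20");
    String normalized = URI.create(escaped).normalize().getRawPath();
    // Drop any ".." segments left at the root, so "/../a" becomes "/a".
    normalized = LEADING_PARENT.matcher(normalized).replaceFirst("");
    return normalized.isEmpty() ? "/" : normalized;
  }
}
```

Escaping before normalizing matters in this sketch because URI.create()
rejects input such as "/50%/x" with an IllegalArgumentException; after
escaping, the same path parses as "/50%25/x".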


[nutch] 01/01: Merge pull request #353 from sebastian-nagel/nutch-2547-2609-url-normalizer-basic

Posted by sn...@apache.org.

snagel pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/nutch.git

commit a4b4bf6853d0b85565831112f5bdc4f901736f4d
Merge: 4c1d94a 26616f5
Author: Sebastian Nagel <sn...@apache.org>
AuthorDate: Thu Jun 28 13:07:40 2018 +0200

    Merge pull request #353 from sebastian-nagel/nutch-2547-2609-url-normalizer-basic
    
    NUTCH-2547 urlnormalizer-basic fails on special characters in path/query
    NUTCH-2609 urlnormalizer-basic to normalize path of file:// URLs

 .../urlnormalizer/basic/BasicURLNormalizer.java    | 108 +++++++++++++++++----
 .../basic/TestBasicURLNormalizer.java              |  43 ++++++++
 2 files changed, 134 insertions(+), 17 deletions(-)