Posted to dev@nutch.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/11/08 12:39:00 UTC

[jira] [Commented] (NUTCH-3017) Allow fast-urlfilter to load from HDFS/S3 and support gzipped input

    [ https://issues.apache.org/jira/browse/NUTCH-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17784024#comment-17784024 ] 

ASF GitHub Bot commented on NUTCH-3017:
---------------------------------------

sebastian-nagel closed pull request #793: [NUTCH-3017] Allow fast-urlfilter to load from HDFS/S3 
URL: https://github.com/apache/nutch/pull/793




> Allow fast-urlfilter to load from HDFS/S3 and support gzipped input
> -------------------------------------------------------------------
>
>                 Key: NUTCH-3017
>                 URL: https://issues.apache.org/jira/browse/NUTCH-3017
>             Project: Nutch
>          Issue Type: Improvement
>          Components: plugin, urlfilter
>    Affects Versions: 1.19
>            Reporter: Julien Nioche
>            Priority: Minor
>             Fix For: 1.20
>
>
> This provides an easier way to refresh the resources, since the jar no longer needs to be rebuilt. The path can point to either HDFS or S3. Additionally, .gz files should be handled automatically.
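
To illustrate the "handled automatically" part of the description, here is a minimal sketch of extension-based gzip detection when reading a rule file. It is not the code from the pull request: the class name RuleFileReader and its methods are hypothetical, and it reads from a local java.nio path only. For HDFS or S3 the input stream would presumably come from a filesystem client rather than from Files.newInputStream, which is not shown here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, for illustration only (not part of Nutch).
public class RuleFileReader {

    /** Opens a file as UTF-8 text, decompressing it when the name ends in ".gz". */
    static BufferedReader open(Path path) throws IOException {
        InputStream in = Files.newInputStream(path);
        try {
            if (path.getFileName().toString().endsWith(".gz")) {
                // Assumption: compression is detected from the file extension only.
                in = new GZIPInputStream(in);
            }
        } catch (IOException e) {
            in.close(); // do not leak the raw stream if the gzip header is invalid
            throw e;
        }
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    /** Reads all lines of a plain or gzipped file. */
    static List<String> readLines(Path path) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = open(path)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

With this approach the caller passes the same kind of path for rules.txt and rules.txt.gz and gets identical lines back, so switching to a compressed file would need no change beyond the configured file name.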



--
This message was sent by Atlassian Jira
(v8.20.10#820010)