Posted to dev@nutch.apache.org by "Markus Jelsma (Updated) (JIRA)" <ji...@apache.org> on 2012/01/24 17:52:41 UTC

[jira] [Updated] (NUTCH-1201) Allow for different FetcherThread impls

     [ https://issues.apache.org/jira/browse/NUTCH-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Jelsma updated NUTCH-1201:
---------------------------------

    Attachment: CustomFetcher.java
                NUTCH-1201-1.5-wip.patch

Here's a WIP that allows us to implement different parts of the fetcher. We use it for two different implementations of FetcherThread, depending on the customer.
I also attached a CustomFetcher. It behaves the same as the current Fetcher and just acts as an example (to myself).

It's ugly but does the trick ;)
                
> Allow for different FetcherThread impls
> ---------------------------------------
>
>                 Key: NUTCH-1201
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1201
>             Project: Nutch
>          Issue Type: New Feature
>          Components: fetcher
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>             Fix For: 1.5
>
>         Attachments: CustomFetcher.java, NUTCH-1201-1.5-wip.patch
>
>
> For certain cases we need to modify parts of FetcherThread, so this makes it pluggable. It introduces a new config directive, fetcher.impl, that takes a FQCN; Fetcher.fetch uses that setting to load the class passed to job.setMapRunnerClass(). The new class has to extend Fetcher and the inner class FetcherThread. This allows overriding methods in FetcherThread, and also methods in Fetcher itself if required.
> A follow-up to this issue would be to refactor parts of FetcherThread so that small sections are easier to override. Right now a small change means copying the entire method body.
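
To illustrate the mechanism described above, here is a rough, self-contained sketch of loading a fetcher implementation from a fetcher.impl setting. It is not the code from the patch. The Fetcher and FetcherThread classes below are simplified stand-ins for the Nutch classes, java.util.Properties stands in for the Hadoop job configuration, and createThread(), CustomFetcherThread and resolveFetcherClass() are hypothetical names made up for this example. In the real Fetcher, the resolved class would be the one handed to job.setMapRunnerClass().

```java
import java.util.Properties;

public class FetcherImplSketch {

    // Simplified stand-in for Nutch's Fetcher (assumption: the real class
    // is a Hadoop map runner and does far more than this).
    public static class Fetcher {
        // Stand-in for the inner FetcherThread class.
        public class FetcherThread {
            public String fetch(String url) {
                return "default:" + url;
            }
        }

        // Hypothetical factory hook so subclasses can swap the thread impl.
        public FetcherThread createThread() {
            return new FetcherThread();
        }
    }

    // Hypothetical custom implementation: extends Fetcher and provides an
    // inner class that extends FetcherThread.
    public static class CustomFetcher extends Fetcher {
        public class CustomFetcherThread extends FetcherThread {
            @Override
            public String fetch(String url) {
                return "custom:" + url;
            }
        }

        @Override
        public FetcherThread createThread() {
            return new CustomFetcherThread();
        }
    }

    // Reads the FQCN from fetcher.impl, falls back to the stock Fetcher,
    // and checks that the loaded class really is a Fetcher subclass.
    public static Class<? extends Fetcher> resolveFetcherClass(Properties conf)
            throws ClassNotFoundException {
        String fqcn = conf.getProperty("fetcher.impl", Fetcher.class.getName());
        return Class.forName(fqcn).asSubclass(Fetcher.class);
    }

    public static void main(String[] args) throws Exception {
        Properties conf = new Properties();
        conf.setProperty("fetcher.impl", CustomFetcher.class.getName());

        Class<? extends Fetcher> impl = resolveFetcherClass(conf);
        Fetcher fetcher = impl.getDeclaredConstructor().newInstance();
        System.out.println(fetcher.createThread().fetch("http://example.org/"));
    }
}
```

Run as is, this prints custom:http://example.org/. With fetcher.impl unset, the stock Fetcher stand-in is used instead. The asSubclass() check is one way to fail early when the configured class does not extend Fetcher; whether the patch does something similar is not stated in the issue.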

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira