Posted to dev@nutch.apache.org by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/07/25 00:57:05 UTC

[jira] [Updated] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X

     [ https://issues.apache.org/jira/browse/NUTCH-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated NUTCH-1936:
----------------------------------------
    Attachment: NUTCH-1939.patch

Preliminary patch which folks can try out.
N.B. Tests currently fail with an IOException caused by a failure to load specific mapred-site.xml properties.
I am not sure that all of the API upgrades are 100% correct; however, this is an effort for us to upgrade to 2.X.
I should admit that I've pegged the dependencies at 2.4.0 simply because this is what EMR uses, and right now we are using EMR for crawls. This is not bias on my part; it is merely my observation that client and server should both be using the same version. I understand that this is not adequate for everyone.
[~mjoyce]

> GSoC 2015 - Move Nutch to Hadoop 2.X
> ------------------------------------
>
>                 Key: NUTCH-1936
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1936
>             Project: Nutch
>          Issue Type: Task
>          Components: build
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>              Labels: gsoc2015
>             Fix For: 2.4, 1.11
>
>         Attachments: NUTCH-1939.patch
>
>
> The Nutch PMC [discussed|http://www.mail-archive.com/dev%40nutch.apache.org/msg16250.html] ideas for a good 2015 GSoC project. Porting the (trunk) codebase to [Hadoop 2.X|http://hadoop.apache.org/docs/stable/] appears to be an attractive option and one which would present an excellent learning experience for a summer student.
> A more comprehensive description of this issue should be included within either a mentor-defined project description or a successful student application.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)