Posted to dev@nutch.apache.org by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2013/12/13 17:59:09 UTC

[jira] [Updated] (NUTCH-1360) Support the storing of IP address connected to when web crawling

     [ https://issues.apache.org/jira/browse/NUTCH-1360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated NUTCH-1360:
----------------------------------------

    Attachment: NUTCH-1360-trunkv2.patch

This patch updates [~wal]'s patch to apply against trunk.
Just to clarify on top of [~wal]'s comments, this patch takes the storing of IP addresses well beyond what we have currently implemented in 2.X.
In 2.X we merely wished to add a WebPage's IP to its own metadata. This patch goes considerably further than that.
It would be nice to hear some comments from those watching this issue, and from anyone else :)
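
For anyone following along without the patch open, here is a minimal sketch of the basic idea of recording the IP actually connected to alongside a page's metadata. It is not taken from the attached patches and does not use Nutch classes: the class name IpCaptureSketch, the method fetchAndRecordIp and the "_ip_" metadata key are all hypothetical, and a plain java.util.Map stands in for whatever metadata container the real code uses. It only relies on java.net.Socket#getInetAddress, which returns the remote address the socket is connected to.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.HashMap;
import java.util.Map;

public class IpCaptureSketch {

    // Hypothetical metadata key; the real patch may use a different name.
    static final String IP_KEY = "_ip_";

    // Connects to host:port and records the IP address of the remote end
    // of the connection in a metadata map (a stand-in for page metadata).
    static Map<String, String> fetchAndRecordIp(InetAddress host, int port)
            throws IOException {
        Map<String, String> metadata = new HashMap<>();
        try (Socket socket = new Socket(host, port)) {
            // getInetAddress() is the address this socket is connected to,
            // i.e. the server's IP as seen by the fetcher.
            metadata.put(IP_KEY, socket.getInetAddress().getHostAddress());
            // A real fetcher would send the request and read the response here.
        }
        return metadata;
    }

    public static void main(String[] args) throws IOException {
        // A throwaway local server so the sketch runs without network access.
        InetAddress loopback = InetAddress.getByName("127.0.0.1");
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Map<String, String> metadata =
                    fetchAndRecordIp(loopback, server.getLocalPort());
            System.out.println(metadata.get(IP_KEY)); // prints 127.0.0.1
        }
    }
}
```

The point of reading the address from the connected socket, rather than resolving the hostname a second time, is that a host with several A records may resolve differently on each lookup; the socket reports the address that was actually used for the fetch.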
 

> Support the storing of IP address connected to when web crawling
> ----------------------------------------------------------------
>
>                 Key: NUTCH-1360
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1360
>             Project: Nutch
>          Issue Type: New Feature
>          Components: protocol
>    Affects Versions: nutchgora, 1.5
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>            Priority: Minor
>             Fix For: 2.3, 1.8
>
>         Attachments: NUTCH-1360-NUTCH-289-nutch-1.5.1.patch, NUTCH-1360-nutchgora-v2.patch, NUTCH-1360-nutchgora.patch, NUTCH-1360-trunk.patch, NUTCH-1360-trunkv2.patch, NUTCH-1360v3.patch, NUTCH-1360v4.patch, NUTCH-1360v5.patch
>
>
> Simple issue enabling us to capture the specific IP address of the host we connect to when fetching a page.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)