Posted to dev@nutch.apache.org by "Yasin Kılınç (JIRA)" <ji...@apache.org> on 2013/11/01 07:40:18 UTC

[jira] [Commented] (NUTCH-1360) Support the storing of IP address connected to when web crawling

    [ https://issues.apache.org/jira/browse/NUTCH-1360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811088#comment-13811088 ] 

Yasin Kılınç commented on NUTCH-1360:
-------------------------------------

1- When the protocol is anything other than HTTP, the property name http.store.ip.adress doesn't make sense. In my opinion it is not a general enough name. Maybe we can name it store.ip.address so that other protocols can implement it too. What do you think?

2- Don't we already have the IP of the connected host once we establish the socket connection? Why do we need to add the IP in the HTTP GET request? Is it because the initially connected host IP and the fetched host IP may be different?
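
To make point 2 concrete, here is a minimal sketch of reading the remote IP straight from an already-connected socket. It assumes a plain java.net.Socket and is not taken from the attached patches or from Nutch's protocol plugins; the class name ConnectedIpSketch and the helper connectedIp are hypothetical, and a local listener stands in for the remote web server.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration only; not part of Nutch or of the NUTCH-1360 patches.
class ConnectedIpSketch {

    // Returns the IP address the socket is connected to,
    // or null if the socket has never been connected.
    static String connectedIp(Socket socket) {
        InetAddress remote = socket.getInetAddress();
        return remote == null ? null : remote.getHostAddress();
    }

    public static void main(String[] args) throws IOException {
        // A listener on the loopback interface stands in for the remote host.
        try (ServerSocket server =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket socket = new Socket()) {
            socket.connect(
                new InetSocketAddress(server.getInetAddress(), server.getLocalPort()),
                5000); // connect timeout in milliseconds
            // No extra request is needed: the address is known after connect().
            System.out.println("connected to " + connectedIp(socket));
        }
    }
}
```

If something like this holds for the HTTP protocol plugin as well, the IP could be recorded at connection time without touching the GET request.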

Thanks 

> Support the storing of IP address connected to when web crawling
> ----------------------------------------------------------------
>
>                 Key: NUTCH-1360
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1360
>             Project: Nutch
>          Issue Type: New Feature
>          Components: protocol
>    Affects Versions: nutchgora, 1.5
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>            Priority: Minor
>             Fix For: 1.8
>
>         Attachments: NUTCH-1360-nutchgora-v2.patch, NUTCH-1360-nutchgora.patch, NUTCH-1360-trunk.patch
>
>
> Simple issue enabling us to capture the specific IP address of the host which we connect to to fetch a page.


