Posted to dev@nutch.apache.org by "Remco Verhoef (JIRA)" <ji...@apache.org> on 2008/04/06 23:18:24 UTC

[jira] Created: (NUTCH-626) fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects

fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects
------------------------------------------------------------------------------------------

                 Key: NUTCH-626
                 URL: https://issues.apache.org/jira/browse/NUTCH-626
             Project: Nutch
          Issue Type: Bug
          Components: fetcher
    Affects Versions: 1.0.0
         Environment: Linux Debian
            Reporter: Remco Verhoef


Fetcher2 breaks out of the db.ignore.external.links directive when encountering a cross-domain redirect. The redirect target URL is followed without checking whether db.ignore.external.links is set and whether the redirect crosses domains.
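
The missing check described above could look roughly like the following. This is only a sketch, not the code from the attached patches (which are not reproduced in this thread): the class RedirectCheck, the method shouldFollowRedirect and the ignoreExternalLinks parameter are hypothetical names, and comparing host names is an assumption about what "cross domain" means here.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

/** Hypothetical helper; not part of Nutch and not taken from the patch. */
final class RedirectCheck {

    /**
     * Decides whether a redirect from fromUrl to toUrl may be followed.
     * ignoreExternalLinks stands in for the db.ignore.external.links setting.
     */
    static boolean shouldFollowRedirect(String fromUrl, String toUrl,
                                        boolean ignoreExternalLinks) {
        if (!ignoreExternalLinks) {
            return true; // no restriction configured, follow any redirect
        }
        try {
            String fromHost = new URI(fromUrl).getHost();
            String toHost = new URI(toUrl).getHost();
            if (fromHost == null || toHost == null) {
                return false; // no usable host, do not follow
            }
            // Assumption: "same domain" means identical host names.
            return fromHost.toLowerCase(Locale.ROOT)
                    .equals(toHost.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            return false; // unparseable URL, do not follow
        }
    }

    public static void main(String[] args) {
        // Same host, setting enabled: followed.
        System.out.println(shouldFollowRedirect(
                "http://example.com/a", "http://example.com/b", true));
        // Different host, setting enabled: rejected.
        System.out.println(shouldFollowRedirect(
                "http://example.com/a", "http://other.org/", true));
    }
}
```

A fetcher would call such a check on every redirect target before queueing or fetching it, in the same way outlinks are filtered when db.ignore.external.links is set.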

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (NUTCH-626) fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects

Posted by "Doğacan Güney (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doğacan Güney updated NUTCH-626:
--------------------------------

    Attachment: NUTCH-626_v2.patch

I updated your patch so that it applies and compiles against the latest trunk.

I am not committing this patch since I don't want to mess with Todd's
Fetcher work. For now :D

> fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects
> ------------------------------------------------------------------------------------------
>
>                 Key: NUTCH-626
>                 URL: https://issues.apache.org/jira/browse/NUTCH-626
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 1.0.0
>         Environment: Linux Debian
>            Reporter: Remco Verhoef
>            Assignee: Doğacan Güney
>             Fix For: 1.0.0
>
>         Attachments: fetcher2.diff, NUTCH-626_v2.patch
>
>
> Fetcher2 breaks out of the db.ignore.external.links directive when encountering a cross-domain redirect. The redirect target URL is followed without checking whether db.ignore.external.links is set and whether the redirect crosses domains.



[jira] Updated: (NUTCH-626) fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Otis Gospodnetic updated NUTCH-626:
-----------------------------------

    Fix Version/s: 1.0.0

> fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects
> ------------------------------------------------------------------------------------------
>
>                 Key: NUTCH-626
>                 URL: https://issues.apache.org/jira/browse/NUTCH-626
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 1.0.0
>         Environment: Linux Debian
>            Reporter: Remco Verhoef
>            Assignee: Doğacan Güney
>             Fix For: 1.0.0
>
>         Attachments: fetcher2.diff
>
>
> Fetcher2 breaks out of the db.ignore.external.links directive when encountering a cross-domain redirect. The redirect target URL is followed without checking whether db.ignore.external.links is set and whether the redirect crosses domains.



[jira] Assigned: (NUTCH-626) fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects

Posted by "Doğacan Güney (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doğacan Güney reassigned NUTCH-626:
-----------------------------------

    Assignee: Doğacan Güney

> fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects
> ------------------------------------------------------------------------------------------
>
>                 Key: NUTCH-626
>                 URL: https://issues.apache.org/jira/browse/NUTCH-626
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 1.0.0
>         Environment: Linux Debian
>            Reporter: Remco Verhoef
>            Assignee: Doğacan Güney
>         Attachments: fetcher2.diff
>
>
> Fetcher2 breaks out of the db.ignore.external.links directive when encountering a cross-domain redirect. The redirect target URL is followed without checking whether db.ignore.external.links is set and whether the redirect crosses domains.



[jira] Updated: (NUTCH-626) fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects

Posted by "Remco Verhoef (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Remco Verhoef updated NUTCH-626:
--------------------------------

    Attachment: fetcher2.diff

This patch also fixes another issue with redirects.

> fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects
> ------------------------------------------------------------------------------------------
>
>                 Key: NUTCH-626
>                 URL: https://issues.apache.org/jira/browse/NUTCH-626
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 1.0.0
>         Environment: Linux Debian
>            Reporter: Remco Verhoef
>         Attachments: fetcher2.diff
>
>
> Fetcher2 breaks out of the db.ignore.external.links directive when encountering a cross-domain redirect. The redirect target URL is followed without checking whether db.ignore.external.links is set and whether the redirect crosses domains.



[jira] Resolved: (NUTCH-626) fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects

Posted by "Sami Siren (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sami Siren resolved NUTCH-626.
------------------------------

    Resolution: Fixed
      Assignee: Sami Siren  (was: Doğacan Güney)

Committed.

> fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects
> ------------------------------------------------------------------------------------------
>
>                 Key: NUTCH-626
>                 URL: https://issues.apache.org/jira/browse/NUTCH-626
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 1.0.0
>         Environment: Linux Debian
>            Reporter: Remco Verhoef
>            Assignee: Sami Siren
>             Fix For: 1.0.0
>
>         Attachments: fetcher2.diff, NUTCH-626_v2.patch
>
>
> Fetcher2 breaks out of the db.ignore.external.links directive when encountering a cross-domain redirect. The redirect target URL is followed without checking whether db.ignore.external.links is set and whether the redirect crosses domains.



[jira] Commented: (NUTCH-626) fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676496#action_12676496 ] 

Hudson commented on NUTCH-626:
------------------------------

Integrated in Nutch-trunk #735 (See [http://hudson.zones.apache.org/hudson/job/Nutch-trunk/735/])
     - Fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects, contributed by Remco Verhoef, dogacan


> fetcher2 breaks out the domain with db.ignore.external.links set at cross domain redirects
> ------------------------------------------------------------------------------------------
>
>                 Key: NUTCH-626
>                 URL: https://issues.apache.org/jira/browse/NUTCH-626
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 1.0.0
>         Environment: Linux Debian
>            Reporter: Remco Verhoef
>            Assignee: Sami Siren
>             Fix For: 1.0.0
>
>         Attachments: fetcher2.diff, NUTCH-626_v2.patch
>
>
> Fetcher2 breaks out of the db.ignore.external.links directive when encountering a cross-domain redirect. The redirect target URL is followed without checking whether db.ignore.external.links is set and whether the redirect crosses domains.
