Posted to user@nutch.apache.org by Murali Parth <mu...@gmail.com> on 2014/11/29 20:49:29 UTC

302

Hello,
        We are using Nutch 1.7 and are having the following issue.

1) When a site responds with 302, Nutch does not follow the redirect. We
have set the number of redirection property to 5.

2) How do we put Nutch in debug mode so that we can see where the call is
failing?

Any help will be appreciated.

Thanks
Murali

Re: 302

Posted by Murali Parth <mu...@gmail.com>.
Hi Sebastian,
                     Thanks for the reply.

We are going to check the URL filters; that had not occurred to us.

Thanks for your help
 Murali


On Sun, Nov 30, 2014 at 4:28 AM, Sebastian Nagel <wastl.nagel@googlemail.com
> wrote:

> Hi Murali,
>
> > We have set the number of redirection property to 5.
> By http.redirect.max = 5, right?
>
> Just edit $NUTCH_HOME/conf/log4j.properties :
>   log4j.logger.org.apache.nutch.fetcher.Fetcher=DEBUG,cmdstdout
>
> Redirects are then logged by Fetcher.
>
> Btw., even with http.redirect.max == 0, redirects are followed,
> but they are treated the same as ordinary links: they are fetched
> in the next crawler cycle.
>
> Do the redirect targets pass the URL filters?
>
> Sebastian
>
>

Re: 302

Posted by Sebastian Nagel <wa...@googlemail.com>.
Hi Murali,

> We have set the number of redirection property to 5.
By http.redirect.max = 5, right?

Just edit $NUTCH_HOME/conf/log4j.properties :
  log4j.logger.org.apache.nutch.fetcher.Fetcher=DEBUG,cmdstdout

Redirects are then logged by Fetcher.

Btw., even with http.redirect.max == 0, redirects are followed,
but they are treated the same as ordinary links: they are fetched
in the next crawler cycle.

Do the redirect targets pass the URL filters?

Sebastian
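
Editorial sketch (not part of Sebastian's message): to make the URL-filter
question concrete, the Python snippet below imitates how a rule file in the
style of conf/regex-urlfilter.txt is commonly understood to work. Each rule is
a '+' (accept) or '-' (reject) followed by a regex, the first rule that
matches decides, and a URL that matches no rule is rejected. The rules, host
names, and function names are hypothetical, and Python's regex dialect differs
from Java's in places, so treat this as an approximation and not as Nutch's
actual implementation.

```python
import re

# Hypothetical rules written in the style of conf/regex-urlfilter.txt.
# '+' means accept, '-' means reject; '#' starts a comment line.
RULES_TEXT = r"""
# skip file:, ftp:, and mailto: URLs
-^(file|ftp|mailto):
# skip URLs containing characters that usually indicate a query
-[?*!@=]
# accept everything on the (assumed) site being crawled
+^https?://www\.example\.com/
"""


def parse_rules(text):
    """Turn rule lines into a list of (accept, compiled_regex) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # ignore blank lines and comments
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with '+' or '-': " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def passes(url, rules):
    """Return True if the first matching rule accepts the URL.

    A URL that matches no rule at all is rejected.
    """
    for accept, regex in rules:
        if regex.search(url):
            return accept
    return False


if __name__ == "__main__":
    rules = parse_rules(RULES_TEXT)
    # Hypothetical redirect targets a 302 response might point to.
    for target in (
        "http://www.example.com/new-location.html",
        "http://www.example.com/login?next=/home",
        "http://other.example.org/",
    ):
        print(target, "->", "accepted" if passes(target, rules) else "rejected")
```

If a redirect target is rejected by a check like this (for example because it
carries a query string, or because it lives on a different host than the
accept rule covers), the 302 would appear never to be followed even though
http.redirect.max is set.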

