Posted to user@nutch.apache.org by Vincent Slot <vi...@openindex.io> on 2016/09/12 09:19:23 UTC

Problem using authentication with Nutch

Hello everyone,

I'm having a problem using Nutch to crawl a website that requires authentication, https://www.romnetwerk.nl/.

I am using Nutch 1.11. Following the "start out as simple as possible" advice, my httpclient-auth.xml looks like this:

  <credentials username=[username] password=[password]>
    <default/>
  </credentials>

I followed the debugging steps from http://wiki.apache.org/nutch/HttpAuthenticationSchemes?highlight=%28%28HttpPostAuthentication%29%29

The first five steps are OK, up until the credentials showing up in the debug logs, but the following line is NOT showing up: "auth.AuthChallengeProcessor - basic authentication scheme selected" (or similar). The tutorial says that this is a server side problem. Is there a way to make this work from my side?

Thanks in advance,
Vincent

Re: Problem using authentication with Nutch

Posted by Vincent Slot <vi...@openindex.io>.
Never mind, I think I didn't make the distinction well enough between this and HTTP POST authentication. Sorry!
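
For readers who run into the same confusion: the credentials in httpclient-auth.xml are only used when the server itself issues an HTTP authentication challenge (a 401 response with a WWW-Authenticate header). A site that logs users in through an HTML form never sends such a challenge, which would explain why the "authentication scheme selected" line never appears. The following is a minimal sketch, not part of Nutch, for checking which case applies. The function names classify_auth and probe are hypothetical, and the sketch assumes plain Python 3 with only the standard library.

```python
import urllib.error
import urllib.request
from typing import Mapping


def classify_auth(status: int, headers: Mapping[str, str]) -> str:
    """Guess the authentication style implied by a response (hypothetical helper)."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    parts = lowered.get("www-authenticate", "").split()
    if status == 401 and parts:
        # The first token of the challenge names the scheme (Basic, Digest, NTLM, ...).
        return "http-" + parts[0].lower()
    # No challenge: either no authentication, or a form/POST based login.
    return "no-http-challenge"


def probe(url: str) -> str:
    """Fetch a URL without credentials and classify the response."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return classify_auth(resp.status, dict(resp.headers.items()))
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 401; it still carries the status and headers.
        return classify_auth(err.code, dict(err.headers.items()))
```

Calling probe() on the protected URL and getting "no-http-challenge" back suggests the site uses a login form, in which case the HTTP POST authentication section of the wiki page is likely the relevant one rather than the <default/> credentials setup.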