Posted to user@nutch.apache.org by Vincent Slot <vi...@openindex.io> on 2016/09/12 09:19:23 UTC
Problem using authentication with Nutch
Hello everyone,
I'm having a problem using Nutch to crawl a website that requires authentication: https://www.romnetwerk.nl/.
I am using Nutch 1.11. Following the "start out as simple as possible" advice, I reduced my httpclient-auth.xml to this:
<credentials username=[username] password=[password]>
<default/>
</credentials>
I followed the debug-steps from http://wiki.apache.org/nutch/HttpAuthenticationSchemes?highlight=%28%28HttpPostAuthentication%29%29
The first five steps work, up to and including the credentials appearing in the debug logs, but the following line does NOT appear: "auth.AuthChallengeProcessor - basic authentication scheme selected" (or similar). The tutorial says that this indicates a server-side problem. Is there a way to make this work from my side?
Thanks in advance,
Vincent
Re: Problem using authentication with Nutch
Posted by Vincent Slot <vi...@openindex.io>.
Never mind, I think I didn't distinguish clearly enough between this kind of authentication (the HTTP authentication schemes that httpclient-auth.xml configures) and HTTP POST authentication. Sorry!
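
For readers who hit the same symptom, here is a minimal sketch of how the two cases can be told apart. It assumes that a server using an HTTP authentication scheme answers with status 401 and a WWW-Authenticate header, while a site with form-based (POST) login usually sends no such challenge. The function names are hypothetical and not part of Nutch; the sketch only inspects a status code and header list that you have already obtained by some other means.

```python
def offered_auth_schemes(status, headers):
    """Return the HTTP authentication schemes a response challenges with.

    status  -- integer HTTP status code
    headers -- list of (name, value) header pairs
    """
    if status != 401:
        # No challenge was issued, so no scheme can be selected.
        return []
    schemes = []
    for name, value in headers:
        if name.lower() == "www-authenticate" and value.strip():
            # The scheme is the first token, e.g. 'Basic realm="x"' -> 'basic'
            schemes.append(value.split()[0].lower())
    return schemes


def describe(status, headers):
    """Summarise which kind of login the response points to (a heuristic)."""
    schemes = offered_auth_schemes(status, headers)
    if schemes:
        return "HTTP authentication challenge: " + ", ".join(schemes)
    return "no HTTP authentication challenge (login may be a form submitted via POST)"


if __name__ == "__main__":
    # Made-up responses for illustration only.
    print(describe(401, [("WWW-Authenticate", 'Basic realm="example"')]))
    print(describe(200, [("Content-Type", "text/html")]))
```

If the second case applies, a missing "authentication scheme selected" log line would be expected rather than a fault, since no challenge is ever sent.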