Posted to user@nutch.apache.org by Pan Zhiwei <zh...@theadventus.com> on 2011/03/23 10:58:45 UTC

Problem configuring Nutch to crawl an NTLM-protected website

Dear all, 


I am having trouble getting Nutch to authenticate with NTLM against a 
SharePoint 2010 setup. The username/password works without problems 
from a Firefox browser (on the same CentOS machine as Nutch), but when we run 
Nutch, we get: 

hadoop.log 
2011-03-17 17:21:16,383 INFO auth.AuthChallengeProcessor - ntlm authentication scheme selected 
2011-03-17 17:21:16,423 INFO httpclient.HttpMethodDirector - Failure authenticating with NTLM <any realm>@myhomenet:80 
2011-03-17 17:21:16,430 INFO auth.AuthChallengeProcessor - ntlm authentication scheme selected 
2011-03-17 17:21:16,442 INFO httpclient.HttpMethodDirector - Failure authenticating with NTLM <any realm>@myhomenet:80 
2011-03-17 17:21:16,520 INFO fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=0 

We are running Nutch 1.1. Here is an extract from my conf/httpclient-auth.xml: 

<?xml version="1.0"?> 
<auth-configuration> 
  <credentials username="zhiwei" password="zhiwei123"> 
    <default scheme="ntlm" realm="myhomenet"/> 
  </credentials> 
</auth-configuration> 

I would really appreciate any help on this!