Posted to user@nutch.apache.org by "Campbell, John" <Jo...@VerizonWireless.com> on 2010/09/21 22:21:57 UTC

Httpclient Authentication Failure authenticating with NTLM

We are running Nutch 1.1 and are attempting to crawl pages that are
behind Siteminder (NTLM). However, we're getting an error that we can't
seem to get around. Here is our setup:

httpclient-auth.xml:
<auth-configuration>
        <credentials username="user" password="pass">
                <default />
        </credentials>
</auth-configuration>

The plugin is enabled, and http.agent.host is set to our server's IP.

Here is some relevant log info:

2010-09-21 16:03:22,954 INFO  httpclient.Http - http.proxy.host = null
2010-09-21 16:03:22,955 INFO  httpclient.Http - http.proxy.port = 8080
2010-09-21 16:03:22,955 INFO  httpclient.Http - http.timeout = 20000
2010-09-21 16:03:22,955 INFO  httpclient.Http - http.content.limit = 65536
2010-09-21 16:03:22,955 INFO  httpclient.Http - http.agent = nutch-solr-integration/Nutch-1.1
2010-09-21 16:03:22,955 INFO  httpclient.Http - http.accept.language = en-us,en-gb,en;q=0.7,*;q=0.3
2010-09-21 16:03:22,956 INFO  httpclient.Http - protocol.plugin.check.blocking = false
2010-09-21 16:03:22,956 INFO  httpclient.Http - protocol.plugin.check.robots = false
2010-09-21 16:03:24,450 DEBUG auth.AuthChallengeProcessor - Supported authentication schemes in the order of preference: [ntlm, digest, basic]
2010-09-21 16:03:24,451 INFO  auth.AuthChallengeProcessor - ntlm authentication scheme selected
2010-09-21 16:03:24,451 DEBUG auth.AuthChallengeProcessor - Using authentication scheme: ntlm
2010-09-21 16:03:24,452 DEBUG auth.AuthChallengeProcessor - Authorization challenge processed
2010-09-21 16:03:24,579 DEBUG auth.AuthChallengeProcessor - Using authentication scheme: ntlm
2010-09-21 16:03:24,579 DEBUG auth.AuthChallengeProcessor - Authorization challenge processed
2010-09-21 16:03:25,006 DEBUG auth.AuthChallengeProcessor - Using authentication scheme: ntlm
2010-09-21 16:03:25,007 DEBUG auth.AuthChallengeProcessor - Authorization challenge processed
2010-09-21 16:03:25,007 INFO  httpclient.HttpMethodDirector - Failure authenticating with NTLM <any realm>@oursiteminderip:port

I noticed that our log doesn't contain any "Credentials - username
someuser; set .." lines, which makes me think it's not reading the
credentials correctly from httpclient-auth.xml. However, Siteminder
locks out our username after a number of failed attempts, and we have
been getting locked out, so it does seem to be trying to authenticate.
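
One way to double-check that observation against the full log, rather
than the excerpt above, is a small script along these lines. This is
only a rough sketch: the two marker strings are copied from this
message, while the function names and the idea of counting occurrences
are assumptions made for illustration, not anything Nutch provides.

```python
# Marker strings copied from the message and log excerpt above.
CREDENTIALS_MARKER = "Credentials - username"
NTLM_FAILURE_MARKER = "Failure authenticating with NTLM"


def summarize_auth_log(text):
    """Count credential-setup and NTLM-failure messages in log text.

    Whitespace is collapsed first, so a message that was wrapped
    across two lines is still matched.
    """
    flat = " ".join(text.split())
    return {
        "credentials_lines": flat.count(CREDENTIALS_MARKER),
        "ntlm_failures": flat.count(NTLM_FAILURE_MARKER),
    }


def summarize_log_file(path):
    """Read a log file (hypothetical helper) and summarize it."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return summarize_auth_log(handle.read())
```

Calling summarize_log_file() on the crawl log (the path depends on the
installation) would show whether any "Credentials - username" line was
ever written and how many NTLM failures were recorded. A zero count
only means the message was not logged; it may also simply be emitted
at a log level that is not enabled.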

Thanks for any help