Posted to user@nutch.apache.org by Alexis Votta <al...@gmail.com> on 2007/09/25 19:01:28 UTC

Does authentication work?

I am setting up Nutch for our intranet. It has sites protected by
an authentication box. I was advised to use protocol-httpclient and
the authentication-related patches in JIRA. Neither
protocol-httpclient alone nor protocol-httpclient with the
authentication patch works. I have defined all the properties
carefully after including protocol-httpclient in the properties.

<property>
<name>http.auth.ntlm.username</name>
<value>myusername</value>
<description>
username for http ntlm auth
</description>
</property>

<property>
<name>http.auth.ntlm.password</name>
<value>mypassword</value>
<description>
password for http ntlm auth
</description>
</property>

<property>
<name>http.auth.ntlm.domain</name>
<value>lan.gmu.edu</value>
<description>
domain for http ntlm auth
</description>
</property>

<property>
<name>http.auth.ntlm.host</name>
<value>gmusearch</value>
<description>
host for http ntlm auth
</description>
</property>

<property>
<name>http.auth.basic.username</name>
<value>myusername</value>
<description>
username for http basic auth
</description>
</property>

<property>
<name>http.auth.basic.password</name>
<value>mypassword</value>
<description>
password for http basic auth
</description>
</property>

I can't find any useful information in the debug logs. In a past
discussion on Aug 8, I read that basic authentication works, but
unfortunately it doesn't work for me. Is there anything else that
needs to be done?
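
For readers unfamiliar with the phrase "including protocol-httpclient in
properties": in a typical Nutch setup this usually means the
plugin.includes property in conf/nutch-site.xml. The fragment below is
only a sketch. The other plugin names in the value are assumed defaults
and are not taken from this thread, so adjust them to whatever the
installation already lists.

```xml
<!-- Sketch only: swaps protocol-http for protocol-httpclient.
     Every plugin name other than protocol-httpclient is an assumption. -->
<property>
<name>plugin.includes</name>
<value>protocol-httpclient|urlfilter-regex|parse-(text|html|js)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
<description>
plugins to load; protocol-httpclient replaces protocol-http here
</description>
</property>
```

If protocol-http and protocol-httpclient are both matched by the value,
it may be unclear which one handles http URLs, so listing only one of
them is probably the safer choice.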

Re: Does authentication work?

Posted by Susam Pal <su...@gmail.com>.
It seems you are using the basic authentication patch, which has some
hard-coded AuthScope parameters that might be creating problems.

Please try NUTCH-559 <https://issues.apache.org/jira/browse/NUTCH-559>
and set these properties.

http.auth.username
http.auth.password
http.auth.realm
http.auth.host

This should work fine. I'll be revising this patch as per the
suggestions of Doğacan in order to reduce the 'diff'.
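
The four properties named above could be written in the same style as
the configuration posted at the start of the thread, for example in
conf/nutch-site.xml. This is a sketch rather than the patch's documented
configuration: all values are placeholders, and the meaning expected for
the realm and host values is an assumption that should be checked
against the notes attached to NUTCH-559.

```xml
<!-- Sketch only: property names come from the reply above,
     all values are placeholders. -->
<property>
<name>http.auth.username</name>
<value>myusername</value>
<description>
username for http auth (placeholder)
</description>
</property>

<property>
<name>http.auth.password</name>
<value>mypassword</value>
<description>
password for http auth (placeholder)
</description>
</property>

<property>
<name>http.auth.realm</name>
<value>myrealm</value>
<description>
realm or domain for http auth (placeholder, meaning assumed)
</description>
</property>

<property>
<name>http.auth.host</name>
<value>gmusearch</value>
<description>
host for http auth (placeholder, meaning assumed)
</description>
</property>
```

When trying these, it is probably worth removing the older
http.auth.basic.* and http.auth.ntlm.* entries so that only one set of
authentication properties is in play.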

Regards,
Susam Pal
http://susam.in/

On 9/26/07, Alexis Votta <al...@gmail.com> wrote:
> I tried the new properties but they don't work. I don't know where the
> new properties come from but the instructions along with the patch
> talked only about http.auth.basic.* properties. I do not find any
> documentation for http.auth.ntlm.* properties. Are http.auth.ntlm.*
> and http.auth.* valid properties? I am new to Nutch and I can't see
> the bigger picture of how authentication works. So I am unable to find
> any clue to debug the problem. Any help to resolve this problem would
> be highly appreciated. Please let me know what logs and debug messages
> you require.
>
> - Alexis
>
> On 9/25/07, Susam Pal <su...@gmail.com> wrote:
> > The properties you are trying were meant for the original
> > protocol-httpclient which doesn't work for NTLM authentication due to
> > a bug. The patch I have submitted uses these properties:-
> >
> > http.auth.username
> > http.auth.password
> > http.auth.realm
> > http.auth.host
> >
> > Please try these.
> >
> > Regards,
> > Susam Pal
> > http://susam.in/
> >
> > On 9/25/07, Alexis Votta <al...@gmail.com> wrote:
> > > I am setting up Nutch for our intranet. It has sites protected by
> > > authentication box. I was advised to use protocol-httpclient and
> > > authentication related patches in JIRA. Neither the
> > > protocol-httpclient alone nor protocol-httpclient with authentication
> > > patch works. I have defined all the properties carefully after
> > > including protocol-httpclient in properties.
> > >
> > > <property>
> > > <name>http.auth.ntlm.username</name>
> > > <value>myusername</value>
> > > <description>
> > > username for http ntlm auth
> > > </description>
> > > </property>
> > >
> > > <property>
> > > <name>http.auth.ntlm.password</name>
> > > <value>mypassword</value>
> > > <description>
> > > password for http ntlm auth
> > > </description>
> > > </property>
> > >
> > > <property>
> > > <name>http.auth.ntlm.domain</name>
> > > <value>lan.gmu.edu</value>
> > > <description>
> > > password for http ntlm auth
> > > </description>
> > > </property>
> > >
> > > <property>
> > > <name>http.auth.ntlm.host</name>
> > > <value>gmusearch</value>
> > > <description>
> > > password for http ntlm auth
> > > </description>
> > > </property>
> > >
> > > <property>
> > > <name>http.auth.basic.username</name>
> > > <value>myusername</value>
> > > <description>
> > > username for http basic auth
> > > </description>
> > > </property>
> > >
> > > <property>
> > > <name>http.auth.basic.password</name>
> > > <value>mypassword</value>
> > > <description>
> > > password for http basic auth
> > > </description>
> > > </property>
> > >
> > > I can't find any useful information in the debug logs. In a past
> > > discussion on Aug 8, I read basic authentication to work. But
> > > unfortunately it doesn't work for me. Is there anything else that
> > > needs to be done?
> > >
> >
>

Re: Does authentication work?

Posted by Alexis Votta <al...@gmail.com>.
I tried the new properties but they don't work. I don't know where the
new properties come from but the instructions along with the patch
talked only about http.auth.basic.* properties. I do not find any
documentation for http.auth.ntlm.* properties. Are http.auth.ntlm.*
and http.auth.* valid properties? I am new to Nutch and I can't see
the bigger picture of how authentication works. So I am unable to find
any clue to debug the problem. Any help to resolve this problem would
be highly appreciated. Please let me know what logs and debug messages
you require.

- Alexis

On 9/25/07, Susam Pal <su...@gmail.com> wrote:
> The properties you are trying were meant for the original
> protocol-httpclient which doesn't work for NTLM authentication due to
> a bug. The patch I have submitted uses these properties:-
>
> http.auth.username
> http.auth.password
> http.auth.realm
> http.auth.host
>
> Please try these.
>
> Regards,
> Susam Pal
> http://susam.in/
>
> On 9/25/07, Alexis Votta <al...@gmail.com> wrote:
> > I am setting up Nutch for our intranet. It has sites protected by
> > authentication box. I was advised to use protocol-httpclient and
> > authentication related patches in JIRA. Neither the
> > protocol-httpclient alone nor protocol-httpclient with authentication
> > patch works. I have defined all the properties carefully after
> > including protocol-httpclient in properties.
> >
> > <property>
> > <name>http.auth.ntlm.username</name>
> > <value>myusername</value>
> > <description>
> > username for http ntlm auth
> > </description>
> > </property>
> >
> > <property>
> > <name>http.auth.ntlm.password</name>
> > <value>mypassword</value>
> > <description>
> > password for http ntlm auth
> > </description>
> > </property>
> >
> > <property>
> > <name>http.auth.ntlm.domain</name>
> > <value>lan.gmu.edu</value>
> > <description>
> > password for http ntlm auth
> > </description>
> > </property>
> >
> > <property>
> > <name>http.auth.ntlm.host</name>
> > <value>gmusearch</value>
> > <description>
> > password for http ntlm auth
> > </description>
> > </property>
> >
> > <property>
> > <name>http.auth.basic.username</name>
> > <value>myusername</value>
> > <description>
> > username for http basic auth
> > </description>
> > </property>
> >
> > <property>
> > <name>http.auth.basic.password</name>
> > <value>mypassword</value>
> > <description>
> > password for http basic auth
> > </description>
> > </property>
> >
> > I can't find any useful information in the debug logs. In a past
> > discussion on Aug 8, I read basic authentication to work. But
> > unfortunately it doesn't work for me. Is there anything else that
> > needs to be done?
> >
>

Re: Does authentication work?

Posted by Susam Pal <su...@gmail.com>.
The properties you are trying were meant for the original
protocol-httpclient which doesn't work for NTLM authentication due to
a bug. The patch I have submitted uses these properties:-

http.auth.username
http.auth.password
http.auth.realm
http.auth.host

Please try these.

Regards,
Susam Pal
http://susam.in/

On 9/25/07, Alexis Votta <al...@gmail.com> wrote:
> I am setting up Nutch for our intranet. It has sites protected by
> authentication box. I was advised to use protocol-httpclient and
> authentication related patches in JIRA. Neither the
> protocol-httpclient alone nor protocol-httpclient with authentication
> patch works. I have defined all the properties carefully after
> including protocol-httpclient in properties.
>
> <property>
> <name>http.auth.ntlm.username</name>
> <value>myusername</value>
> <description>
> username for http ntlm auth
> </description>
> </property>
>
> <property>
> <name>http.auth.ntlm.password</name>
> <value>mypassword</value>
> <description>
> password for http ntlm auth
> </description>
> </property>
>
> <property>
> <name>http.auth.ntlm.domain</name>
> <value>lan.gmu.edu</value>
> <description>
> password for http ntlm auth
> </description>
> </property>
>
> <property>
> <name>http.auth.ntlm.host</name>
> <value>gmusearch</value>
> <description>
> password for http ntlm auth
> </description>
> </property>
>
> <property>
> <name>http.auth.basic.username</name>
> <value>myusername</value>
> <description>
> username for http basic auth
> </description>
> </property>
>
> <property>
> <name>http.auth.basic.password</name>
> <value>mypassword</value>
> <description>
> password for http basic auth
> </description>
> </property>
>
> I can't find any useful information in the debug logs. In a past
> discussion on Aug 8, I read basic authentication to work. But
> unfortunately it doesn't work for me. Is there anything else that
> needs to be done?
>