Posted to user@nutch.apache.org by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2011/10/25 02:12:18 UTC

Re: Request for help with setting up authenticated crawling

Fwd'ing...

Praveen -- you need to send an email to user-subscribe@nutch.apache.org and then 
follow the instructions you receive to join the list. After that, send your messages to user@nutch.apache.org 
to reach the ML (your original email was sent to user-owner@nutch.apache.org).

Cheers,
Chris

On Oct 24, 2011, at 2:22 PM, Praveen Adivi wrote:

> Hi Guys,
>               I am new to Nutch and am trying to configure it to crawl a secured site, but I am unable to fetch the secured content. I followed this document to set it up: http://wiki.apache.org/nutch/HttpAuthenticationSchemes. However, there seems to be a problem with my setup. I hope you can help me out.
> 
> Please find the relevant files as an attachment. Thank you in advance. 
> 
> 
> 
> -- 
> Thanks and regards,
> 
> Praveen Adivi
> Java Developer
> Yaskawa America
> Ext: 7232
> 
> 
> <httpclient-auth.xml><log4j.properties><nutch-site.xml><hadoop.log>


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Re: Request for help with setting up authenticated crawling

Posted by lewis john mcgibbney <le...@gmail.com>.
Hi Praveen,

In the current 1.4 trunk, protocol-httpclient is unstable. There is an
intention to rewrite this plugin entirely; however, at the moment I'm not
sure whether any ongoing work specifically addresses this issue.

Please check the Nutch mailing list archives for recent correspondence
on the topic [1].

[1] http://www.mail-archive.com/user%40nutch.apache.org/



-- 
*Lewis*