Posted to user@nutch.apache.org by iacueva <ia...@utpl.edu.ec> on 2011/03/28 21:09:00 UTC

Why Nutch is more accurate than Regain?

Hello, do you know why Nutch gives more accurate search results than
Regain? Does it depend on its plugins, or on Tika?
Both work with Lucene.

http://regain.sourceforge.net/
http://regain.sourceforge.net/details.php (Regain uses a different crawler;
it only uses Lucene)

Re: How do i upgrade httpclient 3.1 to httpclient 4 for NUTCH

Posted by Pan Zhiwei <zh...@theadventus.com>.
Hi Julien, 

Thanks for the quick reply. So does this mean that, so far, 
we cannot run Nutch with httpclient 4 unless we create a new plugin? 

Has anyone done this before? 

Regards, 
Zhiwei 

----- Original Message ----- 
From: "Julien Nioche" <li...@gmail.com> 
To: user@nutch.apache.org 
Sent: Wednesday, March 30, 2011 6:39:01 PM GMT +08:00 Beijing / Chongqing / Hong Kong / Urumqi 
Subject: Re: How do i upgrade httpclient 3.1 to httpclient 4 for NUTCH 

See https://issues.apache.org/jira/browse/NUTCH-751 

You'll need to write a brand new plugin for it, as the code has changed a lot 
since 3.1; it is a substantial task, which is probably why it hasn't been 
done yet. 

There was a discussion about doing that as part of 
http://code.google.com/p/crawler-commons/. 

Julien 

On 30 March 2011 11:24, Pan Zhiwei <zh...@theadventus.com> wrote: 

> Hi, 
> 
> How do I upgrade httpclient 3.1 to httpclient 4 for Nutch? 
> 
> Has anyone managed to run Nutch with httpclient 4 or 4.x? 
> 
> -- 
> The Adventus Consultants Pte Ltd 
> 1100 Lower Delta Road, #02-04 EPL Building 
> S169206 
> Tel: +65 6738 9416 (ext 9105) 
> Fax: +65 6738 9415 
> Website: www.theadventus.com 


-- 
Open Source Solutions for Text Engineering 

http://digitalpebble.blogspot.com/ 
http://www.digitalpebble.com 



Re: How do i upgrade httpclient 3.1 to httpclient 4 for NUTCH

Posted by Julien Nioche <li...@gmail.com>.
See https://issues.apache.org/jira/browse/NUTCH-751

You'll need to write a brand new plugin for it, as the code has changed a lot
since 3.1; it is a substantial task, which is probably why it hasn't been
done yet.

There was a discussion about doing that as part of
http://code.google.com/p/crawler-commons/.

Julien
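
A minimal sketch of why this is a rewrite rather than a version bump, assuming
nothing about Nutch's actual plugin interfaces. HttpClient 3.1 lives in
org.apache.commons.httpclient and fetches with
HttpClient.executeMethod(new GetMethod(url)), while 4.x lives in
org.apache.http and fetches with HttpClient.execute(new HttpGet(url)), so no
call site carries over unchanged. One way to contain that is to keep the
library behind a small interface. PageFetcher, FetchResult and CannedFetcher
below are hypothetical names for illustration only; the canned implementation
stands in for a real client so the example runs without either library.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical types for illustration; none of them exist in Nutch.

/** What the crawler needs back from any HTTP library. */
final class FetchResult {
    final int status;
    final byte[] body;

    FetchResult(int status, byte[] body) {
        this.status = status;
        this.body = body;
    }
}

/** The only HTTP-facing type the rest of the crawler would depend on. */
interface PageFetcher {
    FetchResult fetch(String url) throws Exception;
}

/**
 * Stand-in implementation that serves pre-registered pages from memory.
 * A 3.1-backed class would wrap HttpClient.executeMethod(new GetMethod(url));
 * a 4.x-backed class would wrap HttpClient.execute(new HttpGet(url)).
 */
final class CannedFetcher implements PageFetcher {
    private final Map<String, String> pages = new HashMap<>();

    CannedFetcher put(String url, String html) {
        pages.put(url, html);
        return this;
    }

    @Override
    public FetchResult fetch(String url) {
        String html = pages.get(url);
        if (html == null) {
            // Unknown URL: report "not found" with an empty body.
            return new FetchResult(404, new byte[0]);
        }
        return new FetchResult(200, html.getBytes(StandardCharsets.UTF_8));
    }
}

final class FetcherDemo {
    public static void main(String[] args) throws Exception {
        PageFetcher fetcher =
            new CannedFetcher().put("http://example.org/", "<html>ok</html>");
        FetchResult r = fetcher.fetch("http://example.org/");
        System.out.println(r.status + " " + r.body.length + " bytes");
    }
}
```

With this shape, moving to 4.x would mean adding one new PageFetcher
implementation instead of touching every caller; how that maps onto a real
Nutch protocol plugin is not covered here.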


How do i upgrade httpclient 3.1 to httpclient 4 for NUTCH

Posted by Pan Zhiwei <zh...@theadventus.com>.
Hi, 

How do I upgrade httpclient 3.1 to httpclient 4 for Nutch? 

Has anyone managed to run Nutch with httpclient 4 or 4.x? 
