Posted to dev@nutch.apache.org by Stefan Neufeind <ap...@stefan-neufeind.de> on 2006/07/25 01:37:29 UTC

Why was "prune" removed in 0.8?

Hi,

I might be bringing up an old discussion (sorry if so), but while
discussing segread/readseg I wondered why "prune" is missing from bin/nutch.
It still works when you give the full class name by hand. Could
it be (re)added to bin/nutch as well?


Regards,
  Stefan

Re: Why was "prune" removed in 0.8?

Posted by Andrzej Bialecki <ab...@getopt.org>.
Stefan Neufeind wrote:
> Hi,
>
> I might be bringing up an old discussion (sorry if so), but while
> discussing segread/readseg I wondered why "prune" is missing from bin/nutch.
> It still works when you give the full class name by hand. Could
> it be (re)added to bin/nutch as well?

I think the command-line usage of PruneIndexTool is not fully compatible
with the current layout of indexes. In 0.8, indexes are no longer
created inside each segment directory, and a single output index
consists of as many parts as there were reduce tasks. So some
fiddling with paths and arguments will be necessary to fix it.
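
To illustrate the layout point, here is a minimal Python sketch (not part of Nutch) that collects the per-reduce-task parts of one output index so that each could be handed to a tool separately. It assumes the parts are subdirectories named with the usual Hadoop "part-" prefix; the helper name find_index_parts and whatever directory you pass in are hypothetical and only for illustration.

```python
from pathlib import Path


def find_index_parts(indexes_dir):
    """Return the part-* subdirectories of indexes_dir, sorted by name.

    Assumption: each reduce task wrote its share of the index into its
    own subdirectory whose name starts with "part-".
    """
    root = Path(indexes_dir)
    if not root.is_dir():
        # Missing or non-directory path: nothing to collect.
        return []
    return sorted(
        p for p in root.iterdir()
        if p.is_dir() and p.name.startswith("part-")
    )
```

A caller would loop over the returned paths and treat each one as a separate index, instead of expecting one index per segment directory.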

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com