Posted to user@nutch.apache.org by Marcin Okraszewski <ok...@o2.pl> on 2007/08/16 20:41:21 UTC

What is the proper way of deleting segments?

Hi,
What is the proper way of removing a segment in the context of Hadoop? Recrawl scripts tend to use the normal "rm" command for this (e.g. http://wiki.apache.org/nutch/IntranetRecrawl). Does that work with Nutch distributed across several computers? Shouldn't they use "bin/hadoop fs -rmr" instead?

I did try to find the answer in the archive. I couldn't find a question like this, though some examples used Hadoop commands instead of the regular ones.

Thanks,
Marcin

Re: What is the proper way of deleting segments?

Posted by Andrzej Bialecki <ab...@getopt.org>.
Marcin Okraszewski wrote:
> Hi, What is the proper way of removing segment in context of hadoop?
> Recrawl scripts tend to use normal "rm" command for this (eg.:
> http://wiki.apache.org/nutch/IntranetRecrawl). Does it work with
> nutch distributed on several computers? Shouldn't it use "bin/hadoop
> fs -rmr"?


They should use the Hadoop version; the normal /bin/rm doesn't work on HDFS.
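
For readers adapting a recrawl script, here is a minimal sketch of what the switch from "rm" to the Hadoop command might look like. It is not taken from the wiki script: the function name remove_segment, the HADOOP_CMD variable, and the segment path in the usage note are all hypothetical, and only the "bin/hadoop fs -rmr" invocation comes from this thread.

```shell
#!/bin/sh
# Hypothetical helper, not part of Nutch or the IntranetRecrawl script.
# HADOOP_CMD is an assumed variable; it defaults to the launcher named
# in this thread, relative to the Nutch/Hadoop install directory.
HADOOP_CMD="${HADOOP_CMD:-bin/hadoop}"

remove_segment() {
    segment="${1:-}"
    if [ -z "$segment" ]; then
        echo "usage: remove_segment <segment-path>" >&2
        return 1
    fi
    # "fs -rmr" deletes recursively on the filesystem Hadoop is
    # configured to use, which is why it reaches segments stored on
    # HDFS where a plain /bin/rm would only see the local disk.
    "$HADOOP_CMD" fs -rmr "$segment"
}
```

A call might look like: remove_segment crawl/segments/20070816204121 (a made-up segment name). Newer Hadoop releases deprecate "-rmr" in favour of "fs -rm -r", so the flag may need adjusting depending on the version in use.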


-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com