Posted to user@nutch.apache.org by Alaak <al...@gmx.de> on 2012/09/08 10:43:49 UTC
Problem with corrupted index "Input path does not exist:"
Hi,
I had to abort a crawl this morning, and it seems my crawl directory
is now corrupted. I get the error message: "Input path does not
exist: file:/home/user/Apache Nutch/crawl/segments/20120908095131/parse_data".
Is there any way to delete the data already created by the unfinished
crawl, so that I can clean up the crawl directory?
The solution I found on Stack Overflow was to delete the whole crawl db,
which I would like to avoid since it already contains one week of data.
Thanks.
Re: Problem with corrupted index "Input path does not exist:"
Posted by Lewis John Mcgibbney <le...@gmail.com>.
http://wiki.apache.org/nutch/FAQ#How_can_I_recover_an_aborted_fetch_process.3F
hth
Lewis
Re: Problem with corrupted index "Input path does not exist:"
Posted by Alaak <al...@gmx.de>.
Hi,
Yeah. That helped. Thank you.
On 08.09.2012 11:03, remi tassing wrote:
> deleting that specific segment directory [0] should fix the problem
> but it depends on what you're attempting to do.
>
> Remi
>
> [0]: /home/user/Apache Nutch/crawl/segments/20120908095131/
Re: Problem with corrupted index "Input path does not exist:"
Posted by remi tassing <ta...@gmail.com>.
Deleting that specific segment directory [0] should fix the problem, but it
depends on what you're attempting to do.
Remi
[0]: /home/user/Apache Nutch/crawl/segments/20120908095131/
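
For anyone who ends up with more than one half-written segment, the manual deletion suggested above could be scripted roughly as follows. This is only a sketch under stated assumptions: it treats a segment as incomplete when it lacks a parse_data subdirectory (the one named in the error message), and the function names find_incomplete_segments and remove_segments are hypothetical helpers, not part of Nutch. It defaults to a dry run so nothing is deleted until the list has been checked.

```python
import shutil
from pathlib import Path


def find_incomplete_segments(segments_dir, required=("parse_data",)):
    """Return segment directories that lack any of the required subdirectories.

    Assumption: a missing parse_data directory marks an unfinished segment,
    as in the error message from this thread.
    """
    incomplete = []
    for segment in sorted(Path(segments_dir).iterdir()):
        if not segment.is_dir():
            continue  # ignore stray files next to the segment directories
        missing = [name for name in required if not (segment / name).is_dir()]
        if missing:
            incomplete.append(segment)
    return incomplete


def remove_segments(segments, dry_run=True):
    """Delete the given segment directories; with dry_run=True only list them."""
    for segment in segments:
        if dry_run:
            print(f"would remove {segment}")
        else:
            shutil.rmtree(segment)
            print(f"removed {segment}")
```

With the paths from this thread, calling remove_segments(find_incomplete_segments("/home/user/Apache Nutch/crawl/segments")) would only print the candidates; passing dry_run=False performs the deletion. A segment that a running crawl has not parsed yet would also match, so this should only be run while no crawl is in progress.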