Posted to user@nutch.apache.org by Amit Sela <am...@infolinks.com> on 2013/02/19 14:24:24 UTC

Is there a bug in the crawl script coming with nutch 1.6 ?

I get an InvalidInputException when the script runs this step:

$bin/nutch solrindex $SOLRURL $CRAWL_PATH/crawldb -linkdb $CRAWL_PATH/linkdb $SEGMENT

I think it's because $SEGMENT holds only the segment name, so the segment is
looked up in runtime/local/ instead of under the crawl/segments path.

I changed it to:

$bin/nutch solrindex $SOLRURL $CRAWL_PATH/crawldb -linkdb $CRAWL_PATH/linkdb $CRAWL_PATH/segments/$SEGMENT

and it seems to work now.
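
The difference between the two commands can be reproduced without Nutch. The following is only a sketch under assumptions: the temporary CRAWL_PATH, the two timestamped segment names, and the SEGMENT_DIR variable are invented for illustration, and it assumes that $SEGMENT holds a bare directory name, as the fix above suggests.

```shell
#!/bin/sh
# Sketch only: CRAWL_PATH and the segment names are made up for illustration.
CRAWL_PATH=$(mktemp -d)
mkdir -p "$CRAWL_PATH/segments/20130219140000" \
         "$CRAWL_PATH/segments/20130219142424"

# Bare name of the newest segment (timestamped names sort chronologically).
SEGMENT=$(ls "$CRAWL_PATH/segments" | sort -n | tail -n 1)

# A bare name is resolved against the current working directory, so this
# lookup fails unless the shell happens to be inside $CRAWL_PATH/segments.
[ -d "$SEGMENT" ] || echo "not found: $SEGMENT"

# Prefixing the segments directory gives a path that is valid from anywhere.
SEGMENT_DIR="$CRAWL_PATH/segments/$SEGMENT"
[ -d "$SEGMENT_DIR" ] && echo "found: $SEGMENT_DIR"
```

Passing the full directory, as in the changed command, avoids depending on the directory the script is started from.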

Or maybe this only happens because I'm running in standalone mode?

Anyway, I hope this helps anyone who gets the same error.

Cheers.

Re: Is there a bug in the crawl script coming with nutch 1.6 ?

Posted by Sebastian Nagel <wa...@googlemail.com>.
Hi Amit, hi Lewis,

See NUTCH-1500 for details.

You can take the crawl script from trunk at
 http://svn.apache.org/repos/asf/nutch/trunk/src/bin/crawl
and use it to replace (runtime/local/)bin/crawl of 1.6. It should work.
Thanks, anyway!
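
The replacement can be done by hand. As a rough sketch, assuming the trunk script has already been saved to a local file: the function name replace_crawl_script, its two arguments, and the .orig backup suffix are hypothetical and not part of Nutch.

```shell
#!/bin/sh
# Hypothetical helper, not part of Nutch: swap in a new bin/crawl and keep
# a backup of the script that shipped with the release.
#   $1 = path to the downloaded trunk script
#   $2 = Nutch runtime directory (for example runtime/local)
replace_crawl_script() {
  new_script=$1
  target="$2/bin/crawl"

  # Refuse to continue if either file is missing.
  [ -f "$new_script" ] || { echo "missing: $new_script" >&2; return 1; }
  [ -f "$target" ] || { echo "missing: $target" >&2; return 1; }

  cp "$target" "$target.orig"   # keep the 1.6 version for reference
  cp "$new_script" "$target"
  chmod +x "$target"            # the script must stay executable
}

# Example call (paths are assumptions):
#   replace_crawl_script /tmp/crawl runtime/local
```

Keeping the .orig copy makes it easy to diff the two versions and see what changed for NUTCH-1500.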

Sebastian



Re: Is there a bug in the crawl script coming with nutch 1.6 ?

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Hi Amit,
I think Seb fixed this in trunk.
Thanks for reporting.
Lewis