Posted to user@nutch.apache.org by Beats <ta...@yahoo.com> on 2009/07/18 10:32:02 UTC

error in using generate command

Hi,

I'm getting this weird error (weird to me, at least).

I'm trying to crawl some web pages. With the normal crawl command I'm able
to crawl and index with no problem at all.

But when I use each command separately (inject, generate, ...),
I get this error:

Generator: 0 records selected for fetching, exiting ...

The command I'm using is:

bin/nutch inject test.crawl/crawldb urls/seed.txt

This successfully injects the URLs.

Then when I use this,

bin/nutch generate test.crawl/crawldb test.crawl/segments

it gives:

Generator: Selecting best-scoring urls due for fetch.
Generator: starting
Generator: segment: monster.crawl/segments/20090718135110
Generator: filtering: true
Generator: jobtracker is 'local', generating exactly one partition.
Generator: reached
Generator: 0 records selected for fetching, exiting ...



Whereas when I use the crawl command, it gives the correct result.

I'm using the inject and generate commands on a fresh crawl dir.

Please help!

With regards,

Tarun 
-- 
View this message in context: http://www.nabble.com/error-in-using-generate-command-tp24545711p24545711.html
Sent from the Nutch - User mailing list archive at Nabble.com.


Re: error in using generate command

Posted by Beats <ta...@yahoo.com>.
Sorry for the error; it is just a typing error in the email.

Thanks for replying.

alexmc wrote:
> 
> Why does your example say both monster.crawl and test.crawl ?
> 
> Are you perhaps entering the command wrong or is this just an error in
> the email?
> 
> Alex
> 



Re: error in using generate command

Posted by Alex McLintock <al...@gmail.com>.
Why does your example say both monster.crawl and test.crawl ?

Are you perhaps entering the command wrong or is this just an error in
the email?

Alex
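
The mismatch asked about above can be checked mechanically. The following is only a minimal sketch, not anything shipped with Nutch: `crawl_root` is a hypothetical helper, and the two strings are copied from the original post (the segments directory given to `bin/nutch generate` and the `Generator: segment:` log line). It compares the top-level crawl directory in each.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): print the first path component.
crawl_root() {
  # Strip everything from the first "/" onward: test.crawl/segments -> test.crawl
  printf '%s\n' "${1%%/*}"
}

# Values copied from the original post.
segments_arg="test.crawl/segments"   # segments dir passed to bin/nutch generate
log_line="Generator: segment: monster.crawl/segments/20090718135110"

# Drop the fixed "Generator: segment: " prefix to get the path from the log.
logged_path="${log_line#Generator: segment: }"

if [ "$(crawl_root "$segments_arg")" = "$(crawl_root "$logged_path")" ]; then
  result="match"
else
  result="mismatch: command used $(crawl_root "$segments_arg"), log shows $(crawl_root "$logged_path")"
fi
echo "$result"
```

With the values from the post this reports a mismatch between test.crawl and monster.crawl, which is the discrepancy the question points at. Whether that reflects the command actually run or only a typo in the email is something only the poster can confirm.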

