Posted to user@nutch.apache.org by Beats <ta...@yahoo.com> on 2009/07/18 10:32:02 UTC
error in using generate command
Hi,
I'm getting this weird error (at least it is weird to me).
I'm trying to crawl some web pages.
With the normal crawl command I'm able to crawl and index with no problem at all.
But when I use each command separately (inject, generate, ...),
I get this error:
Generator: 0 records selected for fetching, exiting ...
The command I'm using is:
bin/nutch inject test.crawl/crawldb urls/seed.txt
This successfully inserts the URLs.
Then when I use this,
bin/nutch generate test.crawl/crawldb test.crawl/segments
it gives:
Generator: Selecting best-scoring urls due for fetch.
Generator: starting
Generator: segment: monster.crawl/segments/20090718135110
Generator: filtering: true
Generator: jobtracker is 'local', generating exactly one partition.
Generator: reached
Generator: 0 records selected for fetching, exiting ...
However, when I use the crawl command, it gives the correct result.
I'm using the inject and generate commands on a fresh crawl dir.
Please help!
With regards,
Tarun
--
View this message in context: http://www.nabble.com/error-in-using-generate-command-tp24545711p24545711.html
Sent from the Nutch - User mailing list archive at Nabble.com.
Re: error in using generate command
Posted by Beats <ta...@yahoo.com>.
Sorry for the error.
It was just a typing error.
Thanks for replying.
alexmc wrote:
>
> Why does your example say both monster.crawl and test.crawl?
>
> Are you perhaps entering the command wrong or is this just an error in
> the email?
>
> Alex
Re: error in using generate command
Posted by Alex McLintock <al...@gmail.com>.
Why does your example say both monster.crawl and test.crawl?
Are you perhaps entering the command wrong or is this just an error in
the email?
Alex
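The mismatch raised here can be checked mechanically by comparing the segment path printed in the Generator log with the segments directory given on the command line. The following is only a minimal sketch, assuming a POSIX shell; check_segment_dir is a hypothetical helper written for illustration and is not part of Nutch. The paths are the ones quoted in the original post.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): report whether the segment path
# printed in a "Generator: segment: ..." log line lies under the segments
# directory that was passed to `bin/nutch generate`.
check_segment_dir() {
    expected_dir=$1   # segments directory given on the command line
    log_line=$2       # the "Generator: segment: ..." line from the output
    # Strip the log prefix so only the segment path remains.
    segment_path=${log_line#"Generator: segment: "}
    case $segment_path in
        "$expected_dir"/*) echo "ok: $segment_path" ;;
        *) echo "mismatch: $segment_path is not under $expected_dir" ;;
    esac
}

# Values taken from the original post.
check_segment_dir "test.crawl/segments" \
    "Generator: segment: monster.crawl/segments/20090718135110"
```

With the values from the post this reports a mismatch, which would point to either a different crawl directory on the real command line or a copy-and-paste slip in the email.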