Posted to user@nutch.apache.org by A Laxmi <a....@gmail.com> on 2013/07/12 17:09:36 UTC

Nutch 2.2.1 - scripts "crawl" and "nutch"

Hello,

I have installed Nutch 2.2.1 without any issues. However, I found two
scripts, "crawl" and "nutch", instead of the single "nutch" script that
earlier releases had.

Could anyone tell me why we have two scripts? What is the advantage of
using one over the other?

Thanks for your help!

Re: Nutch 2.2.1 - scripts "crawl" and "nutch"

Posted by "H. Coskun Gunduz" <co...@agmlab.com>.
Hello

./crawl <SEED_DIR> <CRAWL_ID> <SOLR_URL> <DEPTH>

Please see section 3.3, "Using the crawl script", at
http://wiki.apache.org/nutch/NutchTutorial
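
As a rough illustration of how the four arguments line up, here is a small sketch. All values are hypothetical placeholders (the seed directory, crawl ID, Solr URL and depth are assumptions, not taken from any real setup), and the snippet only prints the command line instead of running it:

```shell
# Hypothetical values; substitute your own seed directory, crawl ID and Solr URL.
SEED_DIR=urls
CRAWL_ID=my_crawl
SOLR_URL=http://localhost:8983/solr
DEPTH=2

# Assemble the command from the four positional arguments, in the order
# <SEED_DIR> <CRAWL_ID> <SOLR_URL> <DEPTH>.
CRAWL_CMD="./crawl $SEED_DIR $CRAWL_ID $SOLR_URL $DEPTH"

# Print it for inspection; run it yourself from the Nutch bin directory.
echo "$CRAWL_CMD"
```

Once the printed line looks right for your environment, it can be run as-is from the directory that contains the crawl script.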

Happy crawling.

coskun...


On 07/31/2013 11:35 PM, A Laxmi wrote:
> What is the syntax for the "bin/crawl" command?

-- 
H. Coşkun Gündüz
Software Team Leader

AGMLAB
Gülbahar Mah. Avnidilligil sok.
çelik iş merkezi b blok kat:4 no:18
Mecidiyeköy/istanbul
Tel: 	0212 347 64 42
Fax: 0212 347 64 43


Re: Nutch 2.2.1 - scripts "crawl" and "nutch"

Posted by A Laxmi <a....@gmail.com>.
What is the syntax for the "bin/crawl" command?


On Fri, Jul 12, 2013 at 12:09 PM, Tejas Patil <te...@gmail.com> wrote:

> bin/nutch : lets you run individual commands separately.
> bin/crawl : calls the "bin/nutch" script to invoke the Nutch commands
> required for a typical Nutch crawl cycle. This makes life easy for users,
> as you can run a crawl without knowing Nutch's internal phases (and thus
> the commands).

Re: Nutch 2.2.1 - scripts "crawl" and "nutch"

Posted by Tejas Patil <te...@gmail.com>.
bin/nutch : lets you run individual commands separately.
bin/crawl : calls the "bin/nutch" script to invoke the Nutch commands
required for a typical Nutch crawl cycle. This makes life easy for users,
as you can run a crawl without knowing Nutch's internal phases (and thus
the commands).
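
To make the relationship concrete, here is a minimal dry-run sketch of the kind of sequence bin/crawl wraps. It only prints the bin/nutch calls and executes nothing. The phase order, the flags (-crawlId, -all) and the example arguments are assumptions written from memory, not copied from the actual script, so check the real bin/crawl and the usage message that bin/nutch prints before relying on any of them. The function name print_crawl_cycle is made up for this illustration.

```shell
# Dry-run sketch: print the bin/nutch calls that one bin/crawl run roughly
# corresponds to. Nothing is executed. Flags and phase order are assumptions.
# The body runs in a subshell ( ... ) so its variables stay local.
print_crawl_cycle() (
  seed_dir="$1"
  crawl_id="$2"
  solr_url="$3"
  rounds="$4"

  # Seed URLs are injected once, before the first round.
  echo "bin/nutch inject $seed_dir -crawlId $crawl_id"

  # Each round repeats the generate/fetch/parse/update/index phases.
  round=1
  while [ "$round" -le "$rounds" ]; do
    echo "bin/nutch generate -crawlId $crawl_id"
    echo "bin/nutch fetch -all -crawlId $crawl_id"
    echo "bin/nutch parse -all -crawlId $crawl_id"
    echo "bin/nutch updatedb -crawlId $crawl_id"
    echo "bin/nutch solrindex $solr_url -all -crawlId $crawl_id"
    round=$((round + 1))
  done
)

# Example with hypothetical arguments: two rounds for crawl ID "my_crawl".
print_crawl_cycle urls my_crawl http://localhost:8983/solr 2
```

With bin/nutch you would type each of these phases yourself, which is useful when you want to rerun or debug a single step; bin/crawl saves you from doing that by hand.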

