Posted to dev@oodt.apache.org by Bill Rideout <br...@haystack.mit.edu> on 2013/06/07 23:15:17 UTC

Problem setting up crawler service

I'm setting up the Catalog and Archive Crawling Framework as described in http://oodt.apache.org/components/maven/crawler/user/ .  The service started fine, but when I ran the test command:

./crawler_launcher \
               --crawlerId StdProductCrawler \
               --productPath /data/test \
               --filemgrUrl http://localhost:9000/ \
               --failureDir /tmp \
               --actionIds DeleteDataFile MoveDataFileToFailureDir Unique \
               --metFileExtension met \
               --clientTransferer org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory

I got the following error:

ERROR: Invalid option: 'crawlerId'

Indeed, running "./crawler_launcher -h" makes no reference to crawlerId.  Leaving out that option gives the error:

ERROR: Must specify an action option!

I tried reading the help directly, but the options/suboptions were a bit daunting, so I'm posting to the list.  

Thanks again,

Bill Rideout
brideout@haystack.mit.edu


Re: Problem setting up crawler service

Posted by Cameron Goodale <go...@apache.org>.
Bill,

To get you moving forward again, check out this wiki page:

https://cwiki.apache.org/confluence/display/OODT/OODT+Crawler+Help

Many of the Maven-generated docs are stale since a lot of documentation
efforts have been moved to the wiki.  This makes it easier for users and
devs to collaborate on documentation, but also requires the additional step
of flushing changes back into Maven.

I will create a JIRA issue to address the disconnect for the crawler
component.

Thanks for the emails to the list.  Keep them coming.


Cheers,


Cameron


On Fri, Jun 7, 2013 at 2:15 PM, Bill Rideout <br...@haystack.mit.edu> wrote:

> I'm setting up the Catalog and Archive Crawling Framework as described in
> http://oodt.apache.org/components/maven/crawler/user/ .  The service
> started fine, but when I ran the test command:
>
> ./crawler_launcher \
>                --crawlerId StdProductCrawler \
>                --productPath /data/test \
>                --filemgrUrl http://localhost:9000/ \
>                --failureDir /tmp \
>                --actionIds DeleteDataFile MoveDataFileToFailureDir Unique \
>                --metFileExtension met \
>                --clientTransferer org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory
>
> I got the following error:
>
> ERROR: Invalid option: 'crawlerId'
>
> Indeed, running "./crawler_launcher -h" makes no reference to crawlerId.
> Leaving out that option gives the error:
>
> ERROR: Must specify an action option!
>
> I tried reading the help directly, but the options/suboptions were a bit
> daunting, so I'm posting to the list.
>
> Thanks again,
>
> Bill Rideout
> brideout@haystack.mit.edu
>
>

Re: Problem setting up crawler service

Posted by "Mattmann, Chris A (398J)" <ch...@jpl.nasa.gov>.
Hey Bill,

Sorry that the website docs are out of date. See:

https://cwiki.apache.org/confluence/display/OODT/OODT+Crawler+Help


When in doubt, the wiki is probably the canonical source. If you
have the time, energy, or desire to help us update our website
docs, their source (in XDOC), e.g., for the crawler, lives here:

http://svn.apache.org/repos/asf/oodt/trunk/crawler/src/site/xdoc/user/index.xml


Other component XDOCs are shipped along with the rest of the components
in a similar structure.

If you'd like to submit a patch to help us update them the process would
go like:

0. Create JIRA issue at https://issues.apache.org/jira/browse/OODT
1. svn co http://svn.apache.org/repos/asf/oodt/trunk/crawler/src/site/xdoc/user
2. cd user
3. edit index.xml in your favorite editor
4. svn status (make sure the file shows up as changed)
5. svn diff > OODT-xxx.brideout.yyMMdd.patch.txt where OODT-xxx is the JIRA
issue ID from #0.
6. attach patch from #5 to the JIRA issue from #0.
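
The checkout, edit, and diff steps above could be scripted roughly as
follows. This is only a sketch: the function names patch_name and
make_doc_patch are made up for illustration and are not part of OODT, and
OODT-123 in the usage comment is a placeholder issue ID. It assumes svn is
on the PATH and that $EDITOR is set (falling back to vi).

```shell
#!/bin/sh
# Sketch of the checkout/edit/diff steps; helper names are hypothetical.

# Build the patch file name: <issue>.<user>.<yyMMdd>.patch.txt
# The date defaults to today but can be passed as a third argument.
patch_name() {
    printf '%s.%s.%s.patch.txt\n' "$1" "$2" "${3:-$(date +%y%m%d)}"
}

# Check out the crawler user docs, open index.xml for editing, then
# write the diff to a patch file inside the checked-out "user" directory.
# The body runs in a subshell so the caller's working directory is unchanged.
make_doc_patch() (
    issue="$1"
    user="$2"
    svn co http://svn.apache.org/repos/asf/oodt/trunk/crawler/src/site/xdoc/user &&
    cd user &&
    ${EDITOR:-vi} index.xml &&
    svn status &&
    svn diff > "$(patch_name "$issue" "$user")"
)

# Example (OODT-123 is a placeholder issue ID):
#   make_doc_patch OODT-123 brideout
```

Nothing runs until make_doc_patch is called, so the file can be sourced
safely; the resulting patch ends up in the user directory, ready to attach
to the JIRA issue.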

Done!

Cheers,
Chris


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: Bill Rideout <br...@haystack.mit.edu>
Reply-To: "user@oodt.apache.org" <us...@oodt.apache.org>
Date: Friday, June 7, 2013 2:15 PM
To: "user@oodt.apache.org" <us...@oodt.apache.org>
Subject: Problem setting up crawler service

>I'm setting up the Catalog and Archive Crawling Framework as described in
>http://oodt.apache.org/components/maven/crawler/user/ .  The service
>started fine, but when I ran the test command:
>
>./crawler_launcher \
>               --crawlerId StdProductCrawler \
>               --productPath /data/test \
>               --filemgrUrl http://localhost:9000/ \
>               --failureDir /tmp \
>               --actionIds DeleteDataFile MoveDataFileToFailureDir Unique \
>               --metFileExtension met \
>               --clientTransferer org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory
>
>I got the following error:
>
>ERROR: Invalid option: 'crawlerId'
>
>Indeed, running "./crawler_launcher -h" makes no reference to crawlerId.
>Leaving out that option gives the error:
>
>ERROR: Must specify an action option!
>
>I tried reading the help directly, but the options/suboptions were a bit
>daunting, so I'm posting to the list.
>
>Thanks again,
>
>Bill Rideout
>brideout@haystack.mit.edu
>

