Posted to dev@oodt.apache.org by Chris Mattmann <ch...@gmail.com> on 2014/11/20 12:11:26 UTC

Re: OODT Crawler Setup

You need to prefix the crawler config with file: and then it
should work :)
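
For illustration, here is a minimal Python sketch of why the prefix matters. It assumes the config is loaded by something that behaves like Spring's FileSystemXmlApplicationContext, which documents that plain paths are taken relative to the working directory even when they start with a slash, and that an explicit file: prefix enforces an absolute path. Whether the crawler uses exactly that loader is an assumption here, and resolve_config_location is a hypothetical helper, not OODT code.

```python
import posixpath


def resolve_config_location(location, cwd):
    """Model the ASSUMED lookup rule (hypothetical helper, not OODT code)."""
    if location.startswith("file:"):
        # Explicit file: prefix: an absolute path is used as written,
        # a relative one is resolved against the working directory.
        path = location[len("file:"):]
        return posixpath.normpath(posixpath.join(cwd, path))
    # Plain path: assumed to be treated as relative to the working
    # directory, even if it starts with a slash.
    return posixpath.normpath(posixpath.join(cwd, location.lstrip("/")))


cwd = "/usr/local/cms/deploy/crawler/bin"
config = "/usr/local/cms/deploy/crawler/policy/crawler-config.xml"

# Without the prefix, the path ends up nested under the working directory.
print(resolve_config_location(config, cwd))
# With the prefix, the path is used as given.
print(resolve_config_location("file:" + config, cwd))
# A plain relative path from the bin directory also reaches the policy file.
print(resolve_config_location("../policy/crawler-config.xml", cwd))
```

Under that assumption, the first call yields a path nested under bin/, the same shape as the path in the error message, while the other two point at the policy directory.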

------------------------
Chris Mattmann
chris.mattmann@gmail.com




-----Original Message-----
From: "Mallder, Valerie" <Va...@jhuapl.edu>
Reply-To: <us...@oodt.apache.org>
Date: Thursday, November 20, 2014 at 6:10 PM
To: "user@oodt.apache.org" <us...@oodt.apache.org>, "Roberts, Joe T (398H)"
<Jo...@jpl.nasa.gov>
Subject: RE: OODT Crawler Setup

>I got it to work by setting the directory to be relative to workflow/bin.
>Mine is currently set to:
>../../crawler/policy/crawler-config.xml.
> 
>You might try that and see what happens.
> 
>Val
> 
> 
> 
>Valerie A. Mallder
>New Horizons Deputy Mission System Engineer
>Johns Hopkins University/Applied Physics Laboratory
>
> 
>From: Verma, Rishi (398M) [mailto:Rishi.Verma@jpl.nasa.gov]
>Sent: Thursday, November 20, 2014 11:59 AM
>To: Roberts, Joe T (398H)
>Cc: user@oodt.apache.org
>Subject: Re: OODT Crawler Setup
>
>
> 
>Hi Joe,
>
> 
>
>Looks like a bad file path reference to crawler-config.xml within the
>crawler_launcher script. You should have "/usr" instead of "usr" in the
>path at the very least.
>
> 
>
>Are you using the 0.8-SNAPSHOT version of OODT? Are you using RADiX or
>just the Crawler component downloaded separately?
>
> 
>
>Can you try the following:
>
>1. Open up crawler_launcher script, and find the line that has the string
>"crawler-config.xml"
>
>2. Check to see if the path looks good (i.e. does it have a "file:"
>prefix like the other XML config files listed below it?) or even try
>printing the path to see what it resolves to.
>
> 
>
>By the way, I'm CCing your question to the dev mailing list in case
>others could help. Feel free to CC the mailing list on general questions
>in the future too!
>
> 
>
>Thanks,
>
>Rishi
>
> 
>On Nov 19, 2014, at 4:42 PM, Roberts, Joe T (398H) <Jo...@jpl.nasa.gov> wrote:
>
>
>
>
>Hi Rishi,
>
> 
>
>I've finally gotten around to configuring Crawler for the first time on
>cmsun-dev (new dev instance of cmsun).  I'm trying to test out a simple
>ingest but am seeing this exception:
>
> 
>
>[cms@cmsun-dev bin]$ pwd
>/usr/local/cms/deploy/crawler/bin
>
>[cms@cmsun-dev bin]$ ./crawler_launcher -op --launchStdCrawler --productPath /usr/local/cms/test/ --metFileExtension met --filemgrUrl http://localhost:9000
>Setting property 'StdProductCrawler.metFileExtension'
>Setting property 'DeleteMetadataFile.fileExtension'
>Setting property 'MoveMetadataFileToBackupDir.fileExtension'
>Setting property 'MoveMetadataFileToFailureDir.fileExtension'
>Setting property 'StdProductCrawler.productPath'
>Setting property 'MetExtractorProductCrawler.productPath'
>Setting property 'AutoDetectProductCrawler.productPath'
>Setting property 'StdProductCrawler.filemgrUrl'
>Setting property 'MetExtractorProductCrawler.filemgrUrl'
>Setting property 'AutoDetectProductCrawler.filemgrUrl'
>ERROR: IOException parsing XML document from file [/data/local/cms/deploy/crawler/bin/usr/local/cms/deploy/crawler/policy/crawler-config.xml]; nested exception is java.io.FileNotFoundException: usr/local/cms/deploy/crawler/policy/crawler-config.xml (No such file or directory)
>
>
> 
>
> 
>
>It looks like it can't find the proper location of crawler-config.xml.  I
>can't seem to figure out where it is configured to look. Any suggestions?
>
> 
>
>Thanks,
>
>Joe
>
>
>
> 
>---
>
>Rishi Verma
>
>NASA Jet Propulsion Laboratory
>
>California Institute of Technology
>
>
>
> 
>
>
>