Posted to solr-user@lucene.apache.org by mi...@gxs.com on 2012/03/09 19:44:11 UTC

DIH - FileListEntityProcessor reading from Multiple Disk Directories

All,

I have an application that has RDF files in multiple subdirectories under a root directory. I'm using the DIH with a FileListEntityProcessor to load the index. All worked fine when the files were in a single directory, but I can't seem to figure out how to make a single data-config.xml read multiple directories.

The baseDir attribute seems to allow only a single absolute path. I tried multiple "document" elements with a different baseDir for each FileListEntityProcessor, but it only executed the first one.

Is there an easy way to do this, short of running multiple imports and changing baseDir for each?

Thanks,

Mike


Mike Rawlins
Sr. Software Engineer
Chair, ASC X12 Technical Assessment Subcommittee
18111 Preston Road, Suite 600
Dallas, TX 75252
+1 972.643.3101 direct
mike.rawlins@gxs.com
www.gxs.com
GXS Blog


RE: DIH - FileListEntityProcessor reading from Multiple Disk Directories

Posted by mi...@gxs.com.
I knew there had to be an easy way. That was it. Thanks for the tip!




-----Original Message-----
From: Dyer, James [mailto:James.Dyer@ingrambook.com] 
Sent: Friday, March 09, 2012 1:14 PM
To: solr-user@lucene.apache.org
Subject: RE: DIH - FileListEntityProcessor reading from Multiple Disk Directories

Did you try setting "baseDir" to the root directory and "recursive" to true? (See http://wiki.apache.org/solr/DataImportHandler#FileListEntityProcessor for more information.)

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311



RE: DIH - FileListEntityProcessor reading from Multiple Disk Directories

Posted by "Dyer, James" <Ja...@ingrambook.com>.
Did you try setting "baseDir" to the root directory and "recursive" to true? (See http://wiki.apache.org/solr/DataImportHandler#FileListEntityProcessor for more information.)

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311
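
For readers landing on this thread, here is a minimal sketch of what the suggestion above might look like in data-config.xml. It is not taken from either poster's configuration: the root path /data/rdf, the entity name "files", and the .rdf file-name pattern are assumptions for illustration only. The inner entity that actually parses each file is left as a placeholder, since the original post does not show it.

```xml
<dataConfig>
  <document>
    <!-- Outer entity: lists files under a single root directory.
         baseDir is a hypothetical path; point it at the common root
         that contains all of the subdirectories. -->
    <entity name="files"
            processor="FileListEntityProcessor"
            baseDir="/data/rdf"
            fileName=".*\.rdf"
            recursive="true"
            rootEntity="false"
            dataSource="null">
      <!-- recursive="true" makes the processor descend into every
           subdirectory of baseDir, so one entity covers them all. -->

      <!-- Placeholder: the existing inner entity that parses each file
           goes here unchanged. It can refer to the current file as
           ${files.fileAbsolutePath}. -->
    </entity>
  </document>
</dataConfig>
```

With this shape, a single import run should pick up matching files at any depth below the root, which avoids running one import per directory. If only some subdirectories should be indexed, the fileName pattern (a regular expression) or the excludes attribute may help narrow the match.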
