Posted to solr-user@lucene.apache.org by Karan Saini <ma...@gmail.com> on 2017/12/12 11:30:01 UTC
Solr - Achieve Delta-Import with the FileListEntityProcessor for PDF Files
Solr version: 6.6.1
I am using Solr to index PDF files, and it is working as expected.
Now I have a requirement to support delta-import for the PDF files.
I have not been able to find an example of implementing delta-import with
FileListEntityProcessor.
Please suggest.
My *data-config.xml* file looks like this:
<dataConfig>
  <dataSource type="BinFileDataSource"/>
  <document>
    <entity name="K1FileEntity" processor="FileListEntityProcessor"
            dataSource="null"
            recursive="true"
            baseDir="\\CLD02\RemoteDepot"
            fileName=".*pdf" rootEntity="false">
      <field column="file" name="id"/>
      <!--<field column="fileAbsolutePath" name="path" />
      <field column="fileSize" name="size" />-->
      <field column="fileLastModified" name="lastmodified" />
      <entity name="pdf" processor="TikaEntityProcessor"
              onError="skip"
              url="${K1FileEntity.fileAbsolutePath}" format="text">
        <field column="title" name="title" meta="true"/>
        <field column="dc:format" name="format" meta="true"/>
        <field column="text" name="text"/>
      </entity>
    </entity>
  </document>
</dataConfig>
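One possible direction, sketched here as an untested assumption rather than a confirmed answer: FileListEntityProcessor accepts a newerThan attribute, and DIH exposes ${dataimporter.last_index_time}, so the file listing could be limited to files modified since the last import. The import would then be run with command=full-import&clean=false instead of delta-import, so that the existing index is kept. This would not remove documents for PDFs that have been deleted, and it has not been verified against this setup.

```xml
<dataConfig>
  <dataSource type="BinFileDataSource"/>
  <document>
    <!-- Assumption: newerThan restricts the listing to files modified
         after the last recorded import time. -->
    <entity name="K1FileEntity" processor="FileListEntityProcessor"
            dataSource="null"
            recursive="true"
            baseDir="\\CLD02\RemoteDepot"
            fileName=".*pdf" rootEntity="false"
            newerThan="${dataimporter.last_index_time}">
      <field column="file" name="id"/>
      <field column="fileLastModified" name="lastmodified" />
      <entity name="pdf" processor="TikaEntityProcessor"
              onError="skip"
              url="${K1FileEntity.fileAbsolutePath}" format="text">
        <field column="title" name="title" meta="true"/>
        <field column="dc:format" name="format" meta="true"/>
        <field column="text" name="text"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

With clean=false, unchanged files would stay in the index and only the newer files would be re-read by Tika.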
Thanks,
Karan