Posted to solr-user@lucene.apache.org by Jayendra Patil <ja...@gmail.com> on 2011/03/07 19:33:45 UTC

Solr Cell & DataImport Tika handler broken - fails to index Zip file contents

I am working with the latest Solr trunk code, and it seems the Tika handlers
for Solr Cell (ExtractingDocumentLoader.java) and the Data Import Handler
(TikaEntityProcessor.java) once again fail to index the contents of zip files.
Only the names of the files inside the archive are indexed.
This issue was addressed late last year, but it seems to
have reappeared with the latest code.
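
To make the symptom concrete, here is a minimal sketch that uses only
java.util.zip. It is not the Solr or Tika code, and the class and method
names (ZipContentsSketch, buildZip, extract) are made up for illustration.
It only shows the difference between walking a zip and collecting entry
names versus also reading the bytes of each entry, which is the text an
indexer would need.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration only: none of these names exist in Solr or Tika.
class ZipContentsSketch {

    // Builds a small in-memory zip so the example needs no files on disk.
    static byte[] buildZip(Map<String, String> files) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, String> file : files.entrySet()) {
                zip.putNextEntry(new ZipEntry(file.getKey()));
                zip.write(file.getValue().getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Maps each entry name to its text. When readContents is false, the
    // entry bodies are skipped and every value is the empty string, which
    // mirrors the "only file names are indexed" behaviour.
    static Map<String, String> extract(byte[] zipBytes, boolean readContents) {
        Map<String, String> result = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String text = "";
                if (readContents) {
                    // read() returns -1 at the end of the current entry.
                    ByteArrayOutputStream body = new ByteArrayOutputStream();
                    byte[] buffer = new byte[4096];
                    int n;
                    while ((n = zip.read(buffer)) != -1) {
                        body.write(buffer, 0, n);
                    }
                    text = new String(body.toByteArray(), StandardCharsets.UTF_8);
                }
                result.put(entry.getName(), text);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("notes.txt", "hello from inside the zip");
        byte[] zip = buildZip(files);

        System.out.println("names only:    " + extract(zip, false));
        System.out.println("with contents: " + extract(zip, true));
    }
}
```

In the names-only run, a search for a word that appears inside notes.txt
would find nothing, because only the entry name ever reaches the result.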

I had raised a JIRA issue for the Data Import Handler part, with a patch
and a test case: https://issues.apache.org/jira/browse/SOLR-2332.
The same fix is needed for Solr Cell as well.

I can raise a separate JIRA issue and provide a patch for Solr Cell, if the
above patch seems good enough.

Regards,
Jayendra