Posted to solr-user@lucene.apache.org by Reeza Edah Tally <re...@nova-hub.com> on 2011/04/14 12:15:11 UTC
Does EntityProcessor honor onError=skip when nextRow() fails?
Hi,
The document that I am trying to index with DIH contains an entity with
fields queried from a DB and an entity with the content of a file extracted
with TikaEntityProcessor. I was testing the onError="skip" option with
TikaEntityProcessor and found that it does not work: it behaves like
onError="continue", i.e. the document still ends up in my index with the
DB fields but no file content. This is a problem because it leaves my index
inconsistent with my business data.
It seems that the issue lies in EntityProcessorWrapper which swallows
exceptions from nextRow() unless onError="abort". So is it safe to say that
this option just does not work? Can somebody please suggest an alternative
that would enable me to import all or nothing?
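To make the expected behaviour concrete, here is a minimal sketch of the three onError modes as I understand them from the DIH documentation. It is a toy model, not DIH code: OnErrorModel, buildDocument and the "content" field name are hypothetical names made up for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.Callable;

// Toy model only: none of these names exist in Solr or DIH.
public class OnErrorModel {
    enum OnError { ABORT, SKIP, CONTINUE }

    // Builds one document from DB fields plus file content that may fail to extract.
    static Optional<Map<String, Object>> buildDocument(
            Map<String, Object> dbFields,
            Callable<String> fileContent,
            OnError onError) {
        Map<String, Object> doc = new LinkedHashMap<>(dbFields);
        try {
            doc.put("content", fileContent.call());
        } catch (Exception e) {
            switch (onError) {
                case ABORT:
                    // Stop the whole import.
                    throw new RuntimeException("import aborted", e);
                case SKIP:
                    // Drop the whole document: all or nothing.
                    return Optional.empty();
                case CONTINUE:
                    // Keep the partial document (DB fields only).
                    break;
            }
        }
        return Optional.of(doc);
    }

    public static void main(String[] args) {
        Map<String, Object> db = new LinkedHashMap<>();
        db.put("id", "42");
        Callable<String> failing = () -> { throw new java.io.IOException("extraction failed"); };
        System.out.println(buildDocument(db, failing, OnError.SKIP).isPresent());
        System.out.println(buildDocument(db, failing, OnError.CONTINUE).get());
    }
}
```

In this model, SKIP yields no document at all when extraction fails, which is the behaviour I expected. What I observe instead matches the CONTINUE branch.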
One more observation: TikaEntityProcessor line 132 does not close the
InputStream in a finally clause; if parsing fails it remains open.
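The pattern I have in mind is sketched below. This is not the actual TikaEntityProcessor code: the Parser interface is a hypothetical stand-in for whatever parse call is made, and TrackingStream exists only to show that close() is reached when parsing throws.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Illustration only: these names are hypothetical, not Solr or Tika classes.
public class CloseInFinally {
    /** Stand-in for a parse call that may fail. */
    interface Parser {
        void parse(InputStream in) throws IOException;
    }

    /** Records whether close() was called, for demonstration purposes. */
    static class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Parses the stream and closes it whether or not parsing succeeds. */
    static void parseAndClose(InputStream in, Parser parser) throws IOException {
        try {
            parser.parse(in);
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) {
        TrackingStream stream = new TrackingStream(new byte[] {1, 2, 3});
        try {
            parseAndClose(stream, in -> { throw new IOException("parse failed"); });
        } catch (IOException e) {
            System.out.println("closed after failure: " + stream.closed);
        }
    }
}
```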
Thanks,
Reeza