Posted to dev@lucene.apache.org by "Erick Erickson (JIRA)" <ji...@apache.org> on 2016/07/14 15:52:20 UTC

[jira] [Commented] (SOLR-9307) DIH detect all corrupt files: even if i mentionned at tika_data_config onError="skip" solr stop indexing at first corrupt file found

    [ https://issues.apache.org/jira/browse/SOLR-9307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15377153#comment-15377153 ] 

Erick Erickson commented on SOLR-9307:
--------------------------------------

Please raise questions like this on the users' list first. Far more people will see it there, and we try to reserve JIRAs for known code issues rather than usage questions.

Unless, of course, you have a patch or an approach to changing the code...

> DIH detect all corrupt files: even if i mentionned at tika_data_config onError="skip" solr  stop indexing at first corrupt file found
> -------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-9307
>                 URL: https://issues.apache.org/jira/browse/SOLR-9307
>             Project: Solr
>          Issue Type: Wish
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: contrib - DataImportHandler
>    Affects Versions: 5.4.1
>         Environment: windows
>            Reporter: kostali
>              Labels: tika
>
> I am trying to index many MS Word and PDF files using solr-5.4.1.
> The Solr log shows only the description of the ERROR, not the file that caused it.
> How can I get a list of the corrupt files that Tika cannot index? And when Solr tries to index a corrupt file and fails, how can I force it to continue with the next file? In the DIH handler I set onError="skip" or onError="continue" in tika_data_config.xml, but neither works: indexing stops at the first corrupt file Tika encounters.


