Posted to dev@manifoldcf.apache.org by "Karl Wright (JIRA)" <ji...@apache.org> on 2014/10/23 12:06:33 UTC

[jira] [Commented] (CONNECTORS-1079) the parsing in TikaExtractor always return empty result

    [ https://issues.apache.org/jira/browse/CONNECTORS-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181188#comment-14181188 ] 

Karl Wright commented on CONNECTORS-1079:
-----------------------------------------

That's odd, but I don't see why we can't have tika-core in two places rather than just one.


> the parsing in TikaExtractor always return empty result
> -------------------------------------------------------
>
>                 Key: CONNECTORS-1079
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1079
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Tika extractor
>    Affects Versions: ManifoldCF 2.0
>            Reporter: Mingchun Zhao
>
> When I use the latest trunk source (2.0) to try the Tika content extractor, it does not return the expected results.
> Using a debugger, I found that the parser of the Tika content extractor does not return any data.
> I then tried moving lib/tika-core-1.6.jar into connector-lib/,
> and the Tika content extractor returned data as expected.
> My configuration is as follows:
> ==
> Transformation:
>  Type: Tika content extractor
> Output:
>  Type: Solr (Use extract update handler=false)
> Repository:
>  type: Web
> Job:
>  1.type: repository
>  2.type: transformation
>  3.type: output
> ==
> Maybe it is related to CONNECTORS-1074(?).
> It looks like the location of tika-core-1.6.jar affects the result of TikaExtractor.
>  
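
One way to check whether jar placement is behind the behaviour described above is to print which class loader and which jar each Tika class is loaded from. The following is only a minimal diagnostic sketch using the JDK alone. The class name ClassOrigin is hypothetical, and the two Tika class names in main are assumptions about what is on the classpath, not something taken from the ManifoldCF source.

```java
import java.net.URL;
import java.security.CodeSource;

final class ClassOrigin {

    /** Describes which class loader and code location a class comes from. */
    static String describe(String className) {
        try {
            Class<?> cls = Class.forName(className);
            // getClassLoader() returns null for classes loaded by the bootstrap loader.
            ClassLoader loader = cls.getClassLoader();
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            URL location = (source == null) ? null : source.getLocation();
            return className
                + " loader=" + (loader == null ? "bootstrap" : loader.getClass().getName())
                + " location=" + (location == null ? "unknown" : location.toString());
        } catch (ClassNotFoundException e) {
            // The class is not visible from the loader that loaded ClassOrigin.
            return className + " not visible from this class loader";
        }
    }

    public static void main(String[] args) {
        // Assumed class names: one from tika-core, one from tika-parsers.
        System.out.println(describe("org.apache.tika.parser.AutoDetectParser"));
        System.out.println(describe("org.apache.tika.parser.pdf.PDFParser"));
    }
}
```

If the two classes report different loaders or jar directories (for example lib/ versus connector-lib/), that would be consistent with the parser implementations not being visible to the loader that owns tika-core. The sketch only reports where classes come from; it does not prove that this is the cause.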



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)