Posted to dev@manifoldcf.apache.org by "Mingchun Zhao (JIRA)" <ji...@apache.org> on 2014/10/23 10:18:34 UTC

[jira] [Created] (CONNECTORS-1079) the parsing in TikaExtractor always return empty result

Mingchun Zhao created CONNECTORS-1079:
-----------------------------------------

             Summary: the parsing in TikaExtractor always return empty result
                 Key: CONNECTORS-1079
                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1079
             Project: ManifoldCF
          Issue Type: Bug
          Components: Tika extractor
    Affects Versions: ManifoldCF 2.0
            Reporter: Mingchun Zhao


When I tried the Tika content extractor with the latest trunk source (2.0), it did not return the expected results.
Stepping through it with a debugger, I found that the parser of the Tika content extractor does not return any data.
I then tried moving lib/tika-core-1.6.jar into connector-lib/,
and after that the Tika content extractor returned data as expected.

My configurations are as below:
==
Transformation:
 Type: Tika content extractor
Output:
 Type: Solr (Use extract update handler=false)
Repository:
 type: Web
Job:
 1.type: repository
 2.type: transformation
 3.type: output
==

This may be related to CONNECTORS-1074(?).
It looks like the location of tika-core-1.6.jar affects the result of TikaExtractor.
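
One way to check a jar-location effect like this is to print where a class was loaded from and by which classloader. The sketch below uses only JDK calls; the class name ClassOrigin and the helper describe are hypothetical and are not part of ManifoldCF or Tika. It assumes it is called from inside the connector code, so that it sees the same classloader as the extractor.

```java
import java.security.CodeSource;

public class ClassOrigin {
    // Report the jar (code source) and classloader that supplied a class.
    static String describe(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        ClassLoader loader = cls.getClassLoader();
        // JDK core classes usually have no code source and a null classloader.
        String location = (src == null || src.getLocation() == null)
                ? "(bootstrap or unknown)"
                : src.getLocation().toString();
        String loaderName = (loader == null)
                ? "(bootstrap)"
                : loader.getClass().getName();
        return cls.getName() + " <- " + location + " via " + loaderName;
    }

    public static void main(String[] args) throws Exception {
        // Pass class names to inspect, for example a Tika class such as
        // org.apache.tika.parser.AutoDetectParser; defaults are placeholders.
        String[] names = args.length > 0
                ? args
                : new String[] {"java.lang.String", ClassOrigin.class.getName()};
        for (String name : names) {
            System.out.println(describe(Class.forName(name)));
        }
    }
}
```

Comparing the output with tika-core-1.6.jar in lib/ and then in connector-lib/ would show whether the Tika classes end up under different classloaders in the two layouts.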
 




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)