Posted to dev@lucene.apache.org by "Tim Allison (JIRA)" <ji...@apache.org> on 2016/05/31 19:28:13 UTC

[jira] [Comment Edited] (SOLR-8981) Upgrade to Tika 1.13 when it is available

    [ https://issues.apache.org/jira/browse/SOLR-8981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308355#comment-15308355 ] 

Tim Allison edited comment on SOLR-8981 at 5/31/16 7:27 PM:
------------------------------------------------------------

I'm getting a failure on that test too, even though I get exactly the same output from the standalone Tika 1.7 and 1.13 apps on the test file...argh...

For some reason, it looks like Tika is now emitting two body elements, one nested inside the other. If you double the body step in the XPath expressions in both tests, this now works:
{noformat}
ExtractingParams.XPATH_EXPRESSION, "/xhtml:html/xhtml:body/xhtml:body/xhtml:a/descendant::node()",
{noformat}
{noformat}
"xpath", "/xhtml:html/xhtml:body/xhtml:body/xhtml:div//node()",
{noformat}
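
To illustrate why doubling the body step would matter, here is a minimal sketch. It assumes the extracted XHTML really does contain one body nested inside another, which is only my guess from the behaviour above. The XML string and the class name NestedBodyXPathDemo are made up for illustration, and the code uses the JDK's javax.xml.xpath rather than the XPath matcher the extraction handler actually uses, so treat it as an analogy, not a reproduction of the test.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NestedBodyXPathDemo {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Hypothetical stand-in for the extracted output: a body nested in a body.
    static final String NESTED =
        "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body><body>"
        + "<div>hello</div></body></body></html>";

    // Counts the nodes selected by the expression, with the "xhtml" prefix
    // bound to the XHTML namespace.
    static int countMatches(String xml, String expression) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                return "xhtml".equals(prefix) ? XHTML_NS : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String uri) { return null; }
            public Iterator<String> getPrefixes(String uri) { return null; }
        });

        NodeList nodes =
            (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Single body step: the div is not a direct child of the outer body.
        System.out.println(countMatches(NESTED,
            "/xhtml:html/xhtml:body/xhtml:div//node()"));
        // Doubled body step: reaches the div and selects its text node.
        System.out.println(countMatches(NESTED,
            "/xhtml:html/xhtml:body/xhtml:body/xhtml:div//node()"));
    }
}
```

With a nested body, the single-step path selects nothing because the div sits one level deeper than the expression expects, which would be consistent with the test failure.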


was (Author: tallison@mitre.org):
I'm getting a failure on that test too.  I can't figure out what's going on.  I'm getting exactly the same output with the standalone Tika 1.7 and 1.13 apps on the test file...argh...

> Upgrade to Tika 1.13 when it is available
> -----------------------------------------
>
>                 Key: SOLR-8981
>                 URL: https://issues.apache.org/jira/browse/SOLR-8981
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Minor
>
> Tika 1.13 should be out within a month.  This includes PDFBox 2.0.0 and a number of other upgrades and improvements.  
> If there are any showstoppers in 1.13 from Solr's side or requests before we roll 1.13, let us know.


