Posted to dev@tika.apache.org by "Matt Sheppard (JIRA)" <ji...@apache.org> on 2015/09/07 06:30:45 UTC
[jira] [Updated] (TIKA-1730) Excel to HTML filtering seems to produce some font setting gibberish in output
[ https://issues.apache.org/jira/browse/TIKA-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matt Sheppard updated TIKA-1730:
--------------------------------
Description:
Noticed while upgrading from Tika 1.8 to 1.10 - An .xls file linked below, which used to filter normally, now produces the following...
{noformat}
<div class="outside">&C&"Arial,Bold"&11&F</div>
{noformat}
...seemingly at the end of the first sheet's output when filtered with {{java -jar tika-app-1.10.jar funnelback-claim-form-with-expense-codes.xls}}.
It looks like styling information that should not be displayed as text here.
Would be nice if that could be fixed in some future version.
was:
Noticed while upgrading from Tika 1.8 to 1.10 - An .xls file I can provide, which used to filter normally, now produces the following...
{noformat}
<div class="outside">&C&"Arial,Bold"&11&F</div>
{noformat}
...seemingly at the end of the first sheet's output when filtered with {{java -jar tika-app-1.10.jar funnelback-claim-form-with-expense-codes.xls}}.
It looks like styling information that should not be displayed as text here.
Would be nice if that could be fixed in some future version.
> Excel to HTML filtering seems to produce some font setting gibberish in output
> ------------------------------------------------------------------------------
>
> Key: TIKA-1730
> URL: https://issues.apache.org/jira/browse/TIKA-1730
> Project: Tika
> Issue Type: Bug
> Reporter: Matt Sheppard
>
> Noticed while upgrading from Tika 1.8 to 1.10 - An .xls file linked below, which used to filter normally, now produces the following...
> {noformat}
> <div class="outside">&C&"Arial,Bold"&11&F</div>
> {noformat}
> ...seemingly at the end of the first sheet's output when filtered with {{java -jar tika-app-1.10.jar funnelback-claim-form-with-expense-codes.xls}}.
> It looks like styling information that should not be displayed as text here.
> Would be nice if that could be fixed in some future version.
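For context, the stray text in the report appears to be an Excel header/footer format string: &C selects the center section, &"Arial,Bold" sets the font, &11 sets the font size, and &F stands for the file name. The sketch below is only an illustration of how such codes could be stripped from a string before it is emitted as text. It is not Tika's or POI's actual code; the class name HeaderFooterCodes and the method stripCodes are hypothetical, and the pattern handles only the code shapes seen in this report plus the && escape for a literal ampersand.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tika or POI.
final class HeaderFooterCodes {

    // One "&" followed by: another "&" (escaped ampersand), a quoted font
    // spec such as "Arial,Bold", a run of digits (font size), or a single
    // letter code such as C (center section) or F (file name).
    private static final Pattern CODE =
            Pattern.compile("&(&|\"[^\"]*\"|\\d+|[A-Za-z])");

    // Removes the formatting and field codes, keeping the literal text.
    // Field codes such as &F are dropped rather than expanded.
    public static String stripCodes(String raw) {
        Matcher m = CODE.matcher(raw);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // "&&" becomes a single "&"; every other code becomes empty.
            String replacement = m.group(1).equals("&") ? "&" : "";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The string from the report consists only of codes.
        String reported = "&C&\"Arial,Bold\"&11&F";
        System.out.println("[" + stripCodes(reported) + "]");
    }
}
```

Under these assumptions the string from the report reduces to an empty string, which matches the expectation that nothing from it should be shown as text. Codes that take longer arguments are not covered by this pattern.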