Posted to dev@tika.apache.org by "Tyler Palsulich (JIRA)" <ji...@apache.org> on 2015/03/14 00:35:38 UTC

[jira] [Closed] (TIKA-1067) Tika extracts non-existent asterisks (*) from .ppt files

     [ https://issues.apache.org/jira/browse/TIKA-1067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tyler Palsulich closed TIKA-1067.
---------------------------------
    Resolution: Cannot Reproduce

I tried this on a recent PPT and didn't see the issue, so I'm closing this as Cannot Reproduce. Please reopen if you have a file that still triggers this!

> Tika extracts non-existent asterisks (*) from .ppt files
> --------------------------------------------------------
>
>                 Key: TIKA-1067
>                 URL: https://issues.apache.org/jira/browse/TIKA-1067
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>            Reporter: Michael McCandless
>
> I created a new blank presentation, put in title + subtitle, saved it as .ppt, and then ran TikaCLI -t:
> {noformat}
> <body><div class="slideShow"><div class="slide"><p class="slide-master-content">*<br/>
> *<br/>
> </p>
> <p class="slide-content">Testing<br/>
> testing<br/>
> </p>
> </div>
> </div>
> <div class="slideNotes"/>
> {noformat}
> The two extra *'s seem to be coming from the master slide, but I'm not sure which text runs they are and how to stop them ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)