Posted to dev@tika.apache.org by "Hudson (Jira)" <ji...@apache.org> on 2021/04/16 16:07:00 UTC
[jira] [Commented] (TIKA-3359) Extract swf from PDFs
[ https://issues.apache.org/jira/browse/TIKA-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17323918#comment-17323918 ]
Hudson commented on TIKA-3359:
------------------------------
SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #201 (See [https://ci-builds.apache.org/job/Tika/job/tika-main-jdk8/201/])
TIKA-3359 -- extract rich media from PDFs (tallison: [https://github.com/apache/tika/commit/601cfff8762e0bf69a6e08f2cdf09590a6dc311b])
* (edit) tika-parsers/tika-parsers-classic/tika-parsers-classic-modules/tika-parser-pdf-module/src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java
* (add) tika-parsers/tika-parsers-classic/tika-parsers-classic-modules/tika-parser-pdf-module/src/test/resources/test-documents/testFlashInPDF.pdf
* (edit) tika-parsers/tika-parsers-classic/tika-parsers-classic-modules/tika-parser-pdf-module/src/main/java/org/apache/tika/parser/pdf/AbstractPDF2XHTML.java
> Extract swf from PDFs
> ---------------------
>
> Key: TIKA-3359
> URL: https://issues.apache.org/jira/browse/TIKA-3359
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Priority: Major
> Fix For: 2.0.0
>
>
> On Twitter, @terminalboredom and Tyler Thorsted shared examples of PDF files with embedded Flash. I ran -z on tika-app, and we're not extracting these files. I suspect they're in a structure we're not currently checking.
> https://twitter.com/CHLThor/status/1382888365767360513?s=20
> https://twitter.com/sonicstacey/status/1382956466332573701?s=20
> Many thanks to @beet_keeper for putting us in touch.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)