Posted to dev@tika.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2017/03/27 15:04:41 UTC

[jira] [Commented] (TIKA-2302) Make handling of macros equivalent btwn VBA in MSOffice and JS in PDFs

    [ https://issues.apache.org/jira/browse/TIKA-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15943416#comment-15943416 ] 

Hudson commented on TIKA-2302:
------------------------------

SUCCESS: Integrated in Jenkins build Tika-trunk #1233 (See [https://builds.apache.org/job/Tika-trunk/1233/])
TIKA-2302 -- make extraction of macros optional in OfficeParsers and set (tallison: [https://github.com/apache/tika/commit/19c0e916982174da20ee98196db840c7465471eb])
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/OfficeParser.java
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ExcelParserTest.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/OfficeParserConfig.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/OOXMLExtractorFactory.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/AbstractOOXMLExtractor.java
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/microsoft/WordParserTest.java
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ooxml/SXWPFExtractorTest.java
* (add) tika-parsers/src/test/resources/org/apache/tika/parser/microsoft/tika-config-macros.xml
* (edit) CHANGES.txt
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParserTest.java
* (add) tika-parsers/src/test/resources/org/apache/tika/parser/microsoft/ooxml/tika-config-sax-macros.xml
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ooxml/SXSLFExtractorTest.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/AbstractOfficeParser.java
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/microsoft/PowerPointParserTest.java
* (add) tika-parsers/src/test/resources/org/apache/tika/parser/microsoft/ooxml/tika-config-dom-macros.xml
TIKA-2302 -- make extraction of macros optional in OfficeParsers and set (tallison: [https://github.com/apache/tika/commit/5877c4c8702a10a76b6c3ee59fbae7daf3c9b062])
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParserTest.java
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ExcelParserTest.java
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ooxml/SXSLFExtractorTest.java
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ooxml/SXWPFExtractorTest.java
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/microsoft/WordParserTest.java


> Make handling of macros equivalent btwn VBA in MSOffice and JS in PDFs
> ----------------------------------------------------------------------
>
>                 Key: TIKA-2302
>                 URL: https://issues.apache.org/jira/browse/TIKA-2302
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Minor
>
> The current default behavior is to extract VBA macros from MSOffice files but not to extract JS from PDFs.  Now that we have a config for MSOffice files, I propose changing the default behavior to NOT extract VBA macros from MSOffice files.  Users can opt in to extraction of macros via configuration.
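
For readers wondering what "opt in ... via configuration" might look like, here is a minimal sketch of a tika-config file that turns macro extraction back on for the OLE2 OfficeParser. It is an illustration under assumptions, not a copy of the committed test resource: the parameter name "extractMacros" and the "bool" type are assumed to mirror the setter on OfficeParserConfig, and the tika-config-macros.xml file added in the commits above is the authoritative example. The OOXML parser would presumably need a similar entry.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the parameter name and type are assumptions; check
     tika-config-macros.xml in the commit for the exact form. -->
<properties>
  <parsers>
    <!-- Keep every default parser except the one reconfigured below. -->
    <parser class="org.apache.tika.parser.DefaultParser">
      <parser-exclude class="org.apache.tika.parser.microsoft.OfficeParser"/>
    </parser>
    <!-- Opt in to VBA macro extraction for OLE2 Office files. -->
    <parser class="org.apache.tika.parser.microsoft.OfficeParser">
      <params>
        <param name="extractMacros" type="bool">true</param>
      </params>
    </parser>
  </parsers>
</properties>
```

With the proposed default in place, leaving this parameter out would mean macros are not extracted.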



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)