Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2022/04/26 17:22:00 UTC
[jira] [Comment Edited] (TIKA-3730) New ExternalParser doesn't work on Windows
[ https://issues.apache.org/jira/browse/TIKA-3730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17528315#comment-17528315 ]
Tim Allison edited comment on TIKA-3730 at 4/26/22 5:21 PM:
------------------------------------------------------------
Thank you, [~tilman]! I've gotten mostly clean builds on my Windows laptop. Confirmation is welcome!
What I can't figure out is that one build failed because Maven's resources plugin complained that droste.zip was a malicious file, but then I resumed with -rf from the pkg module, and it worked. Who knows...
was (Author: tallison@mitre.org):
Thank you, [~tilman]! I've gotten mostly clean builds on my Windows laptop. Confirmation is welcome!
What I can't figure out is that a build failed because Maven's resources plugin complained that droste.zip was a malicious file, but then I resumed with -rf from the pkg module, and it worked. Who knows...
> New ExternalParser doesn't work on Windows
> ------------------------------------------
>
> Key: TIKA-3730
> URL: https://issues.apache.org/jira/browse/TIKA-3730
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Priority: Trivial
> Fix For: 2.4.0
>
>
> [~tilman] noted that external2.ExternalParser calls "replaceAll" with a regex and a file path as the replacement, and that this does not work on Windows: the replaceAll strips the file separators. I admit that I cannot figure out why this is happening. I've tried a couple of combinations of backslash escaping, etc., but nothing works. I even tried Pattern.quote(), and that doesn't work on Windows either.
> If we back off to use "replace" with a string, everything seems to work.
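A likely explanation, going by the documented behavior of String.replaceAll: the replacement string is not taken literally. A backslash in it escapes the next character and is itself dropped, so the separators in a Windows path disappear. Pattern.quote() only quotes the regex pattern, not the replacement, which would explain why it did not help; Matcher.quoteReplacement() is the call that quotes a replacement. The sketch below is a minimal illustration, not the ExternalParser code: the class name, the INPUT_TOKEN placeholder, and the sample path are all hypothetical.

```java
import java.util.regex.Matcher;

public class ReplacementPathDemo {

    // Hypothetical placeholder token, chosen so the regex needs no escaping.
    static final String TOKEN = "INPUT_TOKEN";

    // Regex replacement: each '\' in the replacement escapes the following
    // character and is dropped, so Windows separators are lost.
    static String withReplaceAll(String template, String path) {
        return template.replaceAll(TOKEN, path);
    }

    // Regex replacement with the replacement quoted, so '\' and '$' are literal.
    static String withQuotedReplacement(String template, String path) {
        return template.replaceAll(TOKEN, Matcher.quoteReplacement(path));
    }

    // Literal replacement: no regex and no replacement-string escapes involved.
    static String withReplace(String template, String path) {
        return template.replace(TOKEN, path);
    }

    public static void main(String[] args) {
        String template = "run INPUT_TOKEN";
        String path = "C:\\tmp\\in.txt"; // the string C:\tmp\in.txt

        System.out.println(withReplaceAll(template, path));        // run C:tmpin.txt
        System.out.println(withQuotedReplacement(template, path)); // run C:\tmp\in.txt
        System.out.println(withReplace(template, path));           // run C:\tmp\in.txt
    }
}
```

If the token never needs regex features, the plain replace() variant is the simpler choice, which matches the fix described above.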