Posted to issues@nifi.apache.org by "Nissim Shiman (Jira)" <ji...@apache.org> on 2023/03/03 20:20:00 UTC

[jira] [Updated] (NIFI-4718) IdentifyMimeType: increase priority for FFv3

     [ https://issues.apache.org/jira/browse/NIFI-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nissim Shiman updated NIFI-4718:
--------------------------------
    Status: Patch Available  (was: Open)

> IdentifyMimeType: increase priority for FFv3
> --------------------------------------------
>
>                 Key: NIFI-4718
>                 URL: https://issues.apache.org/jira/browse/NIFI-4718
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>            Reporter: Brandon Rhys DeVries
>            Assignee: Nissim Shiman
>            Priority: Minor
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> IdentifyMimeType uses Tika configured with a custom-mimetypes.xml [1] to specify (among others) the flowfile-v* MIME types.  However, these entries do not specify priorities.  As a result, a NiFi FlowFile V3 package whose payload contains, for example, HTML including the string:
> {code}
> <html xmlns=
> {code}
> will be identified as "application/xhtml+xml" [2].  That type does match the pattern, but application/flowfile-v3 is the more accurate identification.  To fix this, I believe we need to specify a higher priority for the FlowFile V3 "magic"...
> [1] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/resources/org/apache/tika/mime/custom-mimetypes.xml#L26-L31
> [2] https://gitbox.apache.org/repos/asf?p=tika.git;a=blob;f=tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml;hb=refs/heads/master
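
For illustration only, the change proposed in the description might look roughly like the fragment below. It is a sketch, not the attached patch: it assumes the FlowFile V3 entry matches the string "NiFiFF3" at offset 0, that the competing xhtml rule sits at Tika's default magic priority of 50, and the value 70 is an arbitrary choice above that default.

{code:xml}
<?xml version="1.0" encoding="UTF-8"?>
<mime-info>
  <mime-type type="application/flowfile-v3">
    <_comment>NiFi FlowFile V3</_comment>
    <!-- Assumption: 70 is an illustrative value; anything above the
         priority of the xhtml magic should make this entry win. -->
    <magic priority="70">
      <!-- Assumption: the V3 package starts with the bytes "NiFiFF3". -->
      <match value="NiFiFF3" type="string" offset="0"/>
    </magic>
  </mime-type>
</mime-info>
{code}

With a fragment along these lines, content that matches both the FlowFile V3 magic and the "<html xmlns=" pattern would be expected to resolve to application/flowfile-v3, since Tika prefers the match with the higher priority.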



--
This message was sent by Atlassian Jira
(v8.20.10#820010)