Posted to dev@tika.apache.org by "pdwalker (JIRA)" <ji...@apache.org> on 2018/03/16 10:20:00 UTC
[jira] [Updated] (TIKA-2608) tika matlab parser incorrectly identifies content type of minified javascript file
[ https://issues.apache.org/jira/browse/TIKA-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
pdwalker updated TIKA-2608:
---------------------------
Description:
When Tika "detects" the following file, it returns the wrong content type:
{{$ curl -I [https://wiki.charltonslaw.com/xwiki/webjars/wiki%3Ait/mxgraph-editor/3.7.2/mxGraphEditor.min.js]}}
{{HTTP/1.1 200 OK}}
{{Server: nginx/1.10.3 (Ubuntu)}}
{{Date: Fri, 16 Mar 2018 10:09:54 GMT}}
{{Content-Type: text/x-matlab}}
{{ [snip]}}
{{X-Frame-Options: SAMEORIGIN}}
However, the unminified version of the same file returns the correct type:
{{$ curl -I [https://wiki.charltonslaw.com/xwiki/webjars/wiki%3Ait/mxgraph-editor/3.7.2/mxGraphEditor.js]}}
{{HTTP/1.1 200 OK}}
{{Server: nginx/1.10.3 (Ubuntu)}}
{{Date: Fri, 16 Mar 2018 10:10:25 GMT}}
{{Content-Type: application/javascript}}
{{ [snip]}}
{{X-Frame-Options: SAMEORIGIN}}
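The same comparison can be made without curl. The following is a minimal sketch, assuming Python 3 and only the standard library; the helper names {{media_type}} and {{head_content_type}} are hypothetical and not part of Tika or xwiki. It sends a HEAD request, as {{curl -I}} does, and returns the Content-Type with any parameters stripped.

```python
import urllib.request


def media_type(content_type_header):
    """Return the bare media type of a Content-Type header value.

    Parameters such as "; charset=UTF-8" are dropped and the result is
    lowercased. Returns None when the header is missing or empty.
    """
    if not content_type_header:
        return None
    return content_type_header.split(";", 1)[0].strip().lower()


def head_content_type(url, timeout=10):
    """Send a HEAD request (like `curl -I`) and return the media type served."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return media_type(response.headers.get("Content-Type"))
```

Calling {{head_content_type}} once with the mxGraphEditor.min.js URL and once with the mxGraphEditor.js URL should show the same mismatch as the two curl outputs above, provided the server still behaves as it did at the time of this report.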
This causes a problem when my xwiki installation is behind an SSL proxy (nginx) and I enable strict MIME type checking with {{add_header X-Content-Type-Options nosniff;}}.
Modern browsers then refuse to run the script and report the following error:
{quote}Refused to execute script from '[https://wiki.charltonslaw.com/xwiki/webjars/wiki%3Ait/mxgraph-editor/3.7.2/mxGraphEditor.min.js|https://wiki.proxy.domain/xwiki/webjars/wiki%3Ait/mxgraph-editor/3.7.2/mxGraphEditor.min.js]' because its MIME type ('text/x-matlab') is not executable, and strict MIME type checking is enabled.
{quote}
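To illustrate why the browser rejects the file, here is a simplified model of the check in Python. It is a sketch under stated assumptions, not the browser's actual implementation: the function name {{script_allowed}} is hypothetical, the list of accepted types is the JavaScript MIME type list from the WHATWG MIME Sniffing standard, and the case without nosniff is treated as "always allowed", which is a simplification.

```python
# JavaScript MIME types as listed in the WHATWG MIME Sniffing standard.
JAVASCRIPT_MIME_TYPES = frozenset({
    "application/ecmascript",
    "application/javascript",
    "application/x-ecmascript",
    "application/x-javascript",
    "text/ecmascript",
    "text/javascript",
    "text/javascript1.0",
    "text/javascript1.1",
    "text/javascript1.2",
    "text/javascript1.3",
    "text/javascript1.4",
    "text/javascript1.5",
    "text/jscript",
    "text/livescript",
    "text/x-ecmascript",
    "text/x-javascript",
})


def script_allowed(content_type, nosniff):
    """Model whether a browser would execute a script response.

    content_type: value of the Content-Type response header.
    nosniff: True when the response carries X-Content-Type-Options: nosniff.
    """
    # Compare only the bare media type, ignoring parameters and case.
    essence = content_type.split(";", 1)[0].strip().lower()
    if not nosniff:
        # Simplification: real browsers still block a few types here.
        return True
    return essence in JAVASCRIPT_MIME_TYPES
```

Under this model, {{text/x-matlab}} with nosniff enabled is refused while {{application/javascript}} is accepted, which matches the behaviour described in this report.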
My "solution" is to disable strict MIME type checking in the SSL proxy, but I don't think that is ideal. It'd be better if the matlab parser didn't claim random minified js files as its own.
was:
When Tika "detects" the following file, it returns the wrong content type:
{{$ curl -I https://wiki.charltonslaw.com/xwiki/webjars/wiki%3Ait/mxgraph-editor/3.7.2/mxGraphEditor.min.js}}
{{HTTP/1.1 200 OK}}
{{Server: nginx/1.10.3 (Ubuntu)}}
{{Date: Fri, 16 Mar 2018 10:09:54 GMT}}
{{Content-Type: text/x-matlab}}
{{Connection: keep-alive}}
{{Access-Control-Allow-Origin: *}}
{{Set-Cookie: JSESSIONID=B1FD2399240BB7BEA6EC83095806491F; Path=/xwiki/; HttpOnly}}
{{Cache-Control: public}}
{{Expires: Sat, 16 Mar 2019 10:09:54 GMT}}
{{Last-Modified: Fri, 16 Mar 2018 10:09:54 GMT}}
{{Strict-Transport-Security: max-age=31536000; includeSubdomains}}
{{X-Frame-Options: SAMEORIGIN}}
However, the unminified version of the same file returns the correct type:
{{$ curl -I https://wiki.charltonslaw.com/xwiki/webjars/wiki%3Ait/mxgraph-editor/3.7.2/mxGraphEditor.js}}
{{HTTP/1.1 200 OK}}
{{Server: nginx/1.10.3 (Ubuntu)}}
{{Date: Fri, 16 Mar 2018 10:10:25 GMT}}
{{Content-Type: application/javascript}}
{{Connection: keep-alive}}
{{Access-Control-Allow-Origin: *}}
{{Set-Cookie: JSESSIONID=604F281C24DFD6C8897F0BEBDD123339; Path=/xwiki/; HttpOnly}}
{{Cache-Control: public}}
{{Expires: Sat, 16 Mar 2019 10:10:25 GMT}}
{{Last-Modified: Fri, 16 Mar 2018 10:10:25 GMT}}
{{Vary: Accept-Encoding}}
{{Strict-Transport-Security: max-age=31536000; includeSubdomains}}
{{X-Frame-Options: SAMEORIGIN}}
This causes a problem when my xwiki installation is behind an SSL proxy (nginx) and I enable strict MIME type checking with {{add_header X-Content-Type-Options nosniff;}}.
Modern browsers then refuse to run the script and report the following error:
{quote}Refused to execute script from '[https://wiki.proxy.domain/xwiki/webjars/wiki%3Ait/mxgraph-editor/3.7.2/mxGraphEditor.min.js]' because its MIME type ('text/x-matlab') is not executable, and strict MIME type checking is enabled.
{quote}
My "solution" is to disable strict MIME type checking in the SSL proxy, but I don't think that is ideal. It'd be better if the matlab parser didn't claim random minified js files as its own.
> tika matlab parser incorrectly identifies content type of minified javascript file
> ----------------------------------------------------------------------------------
>
> Key: TIKA-2608
> URL: https://issues.apache.org/jira/browse/TIKA-2608
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.17
> Environment: * xwiki 10.1,
> * Tomcat 8 (8.0.32-1ubuntu1)
> * Ubuntu 16.04.4 LTS
> * Oracle Java 1.8.0_161-b12
> Reporter: pdwalker
> Priority: Minor