Posted to dev@tika.apache.org by "Boris Slobodin (JIRA)" <ji...@apache.org> on 2016/02/03 16:29:40 UTC

[jira] [Created] (TIKA-1850) Tika erroneously detects some versions of jQuery as "text/html"

Boris Slobodin created TIKA-1850:
------------------------------------

             Summary: Tika erroneously detects some versions of jQuery as "text/html"
                 Key: TIKA-1850
                 URL: https://issues.apache.org/jira/browse/TIKA-1850
             Project: Tika
          Issue Type: Bug
          Components: detector
    Affects Versions: 1.11
         Environment: {code}
ProductName:	Mac OS X
ProductVersion:	10.11.3
BuildVersion:	15D21
{code}
            Reporter: Boris Slobodin


Because of this misdetection, the wrong {{Content-Type}} is set on S3 (for example, when using s3_website), which breaks the script in some browsers such as IE.

{code}
➜  wget https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js -O jquery-1.7.1.min.js
--2016-02-02 15:21:33--  https://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js
Resolving ajax.googleapis.com... 216.58.193.106, 2607:f8b0:400a:801::200a
Connecting to ajax.googleapis.com|216.58.193.106|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/javascript]
Saving to: 'jquery-1.7.1.min.js'

jquery-1.7.1.min.js                        [  <=>                                                                      ]  91.67K   323KB/s    in 0.3s

2016-02-02 15:21:33 (323 KB/s) - 'jquery-1.7.1.min.js' saved [93868]
{code}
{code}
➜  wget https://ajax.googleapis.com/ajax/libs/jquery/1.12.0/jquery.min.js -O jquery-1.12.0.min.js
--2016-02-02 15:22:10--  https://ajax.googleapis.com/ajax/libs/jquery/1.12.0/jquery.min.js
Resolving ajax.googleapis.com... 216.58.193.106, 2607:f8b0:400a:801::200a
Connecting to ajax.googleapis.com|216.58.193.106|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/javascript]
Saving to: 'jquery-1.12.0.min.js'

jquery-1.12.0.min.js                       [ <=>                                                                       ]  95.08K  --.-KB/s    in 0.03s

2016-02-02 15:22:10 (3.30 MB/s) - 'jquery-1.12.0.min.js' saved [97362]
{code}
{code}
➜  wget https://ajax.googleapis.com/ajax/libs/jquery/2.2.0/jquery.min.js -O jquery-2.2.0.min.js
--2016-02-02 15:22:24--  https://ajax.googleapis.com/ajax/libs/jquery/2.2.0/jquery.min.js
Resolving ajax.googleapis.com... 216.58.193.106, 2607:f8b0:400a:801::200a
Connecting to ajax.googleapis.com|216.58.193.106|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/javascript]
Saving to: 'jquery-2.2.0.min.js'

jquery-2.2.0.min.js                        [ <=>                                                                       ]  83.58K  --.-KB/s    in 0.02s

2016-02-02 15:22:24 (3.39 MB/s) - 'jquery-2.2.0.min.js' saved [85589]
{code}

{color:red}{{jquery-1.7.1.min.js}}{color}
{code}
➜  java -jar tika-app-1.11.jar --detect jquery-1.7.1.min.js
text/html
{code}
{color:green}{{jquery-1.12.0.min.js}}{color}
{code}
➜  java -jar tika-app-1.11.jar --detect jquery-1.12.0.min.js
application/javascript
{code}
{color:green}{{jquery-2.2.0.min.js}}{color}
{code}
➜  java -jar tika-app-1.11.jar --detect jquery-2.2.0.min.js
application/javascript
{code}
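
Until detection is fixed, one possible workaround is to choose the type from the file extension for {{.js}} files and only fall back to Tika for everything else. This is a minimal sketch, not something verified against s3_website: the function name {{content_type_for}} is hypothetical, and the jar is assumed to sit in the current directory as in the commands above.

```shell
#!/bin/sh
# Hypothetical helper: pick a Content-Type for a file before uploading it.
# Assumption: tika-app-1.11.jar is in the current directory.
content_type_for() {
  case "$1" in
    # Trust the extension for JavaScript so content sniffing
    # cannot turn the file into text/html.
    *.js) echo "application/javascript" ;;
    # Everything else still goes through Tika's detector.
    *) java -jar tika-app-1.11.jar --detect "$1" ;;
  esac
}

content_type_for jquery-1.7.1.min.js
```

The extension check runs first, so Tika is never invoked for {{.js}} files and the 1.7.1 file gets the same type as the other two.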

Thank you.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)