Posted to dev@tika.apache.org by "Keith R. Bennett (JIRA)" <ji...@apache.org> on 2007/10/17 23:00:51 UTC

[jira] Created: (TIKA-75) Provide a MimeUtils.getType(URL) method that will determine MIME type based on the stream and, if necessary, the name.

Provide a MimeUtils.getType(URL) method that will determine MIME type based on the stream and, if necessary, the name.
----------------------------------------------------------------------------------------------------------------------

                 Key: TIKA-75
                 URL: https://issues.apache.org/jira/browse/TIKA-75
             Project: Tika
          Issue Type: Improvement
          Components: general
    Affects Versions: 0.1-incubator
            Reporter: Keith R. Bennett
            Priority: Minor
             Fix For: 0.1-incubator


We have a MimeUtils method that returns a MIME type based solely on the name.  It would be very helpful to also have a method that examines the stream header.  I've added a method (patch coming) that does this.  It opens a stream from the URL, reads the header, closes the stream, and then calls the existing method.

This may not be usable in the course of parsing, since it violates our decision to read a stream only once.  However, it is very useful as a way to test our MIME type determination, and as a non-parse service to our users (as recently discussed on the forum).
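
For readers who have not seen the patch, the flow described above might look roughly like the sketch below. It is not the code from the patch: the class name UrlTypeSketch, the helpers typeFromHeader and typeFromName, the HEADER_LENGTH constant, and the two magic prefixes are all hypothetical stand-ins for the existing MimeUtils and MimeTypes logic. It also assumes Java 9 or later for InputStream.readNBytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;

public class UrlTypeSketch {

    // Hypothetical header size; a real registry would report how many bytes it needs.
    static final int HEADER_LENGTH = 8;

    /** Determines a type from the stream header and, if necessary, the name. */
    static String getType(URL url) throws IOException {
        byte[] header = new byte[HEADER_LENGTH];
        int count;
        // Open the stream, read the header, and close the stream again.
        try (InputStream in = url.openStream()) {
            // readNBytes (Java 9+) keeps reading until the buffer is full or the stream ends.
            count = in.readNBytes(header, 0, header.length);
        }
        String type = typeFromHeader(Arrays.copyOf(header, count));
        // Fall back on the name only when the header was inconclusive.
        return type != null ? type : typeFromName(url.getPath());
    }

    /** Hypothetical magic-byte check covering just two formats. */
    static String typeFromHeader(byte[] header) {
        String text = new String(header, StandardCharsets.ISO_8859_1);
        if (text.startsWith("%PDF-")) {
            return "application/pdf";
        }
        if (text.startsWith("GIF8")) {
            return "image/gif";
        }
        return null;
    }

    /** Hypothetical name-based lookup standing in for the existing method. */
    static String typeFromName(String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".txt")) {
            return "text/plain";
        }
        if (lower.endsWith(".html")) {
            return "text/html";
        }
        return "application/octet-stream";
    }
}
```

Because the stream is opened and closed inside getType, a caller that later parses the same URL has to open it a second time, which is the read-once concern raised above.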


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (TIKA-75) Provide a MimeUtils.getType(URL) method that will determine MIME type based on the stream and, if necessary, the name.

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-75?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting updated TIKA-75:
------------------------------

    Attachment: TIKA-75.jukka.patch

Here's a slightly modified patch with the following improvements:

1) Uses MimeTypes.getMinLength() instead of a hardcoded constant

2) Works correctly even if InputStream.read(byte[]) doesn't fill the whole buffer

3) Doesn't use new File("/test-documents/" + filename).toString() to avoid problems on Windows
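
To illustrate point 2: InputStream.read(byte[]) is allowed to return fewer bytes than the buffer holds, so a single call can leave the header incomplete. A minimal sketch of the usual loop follows. The class name ReadFullySketch and the helpers readPrefix and trickle are hypothetical and not taken from the patch; trickle exists only to simulate a stream that returns one byte per call.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class ReadFullySketch {

    /** Reads up to length bytes, looping until the buffer is full or the stream ends. */
    static byte[] readPrefix(InputStream in, int length) throws IOException {
        byte[] buffer = new byte[length];
        int total = 0;
        while (total < length) {
            int n = in.read(buffer, total, length - total);
            if (n == -1) {
                break; // end of stream reached before the buffer filled
            }
            total += n;
        }
        // Trim the result so callers never see unfilled trailing zeros.
        return Arrays.copyOf(buffer, total);
    }

    /** Test stream that hands out at most one byte per read call. */
    static InputStream trickle(byte[] data) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 1));
            }
        };
    }
}
```

With the trickle stream, a single read(byte[]) call returns only one byte, while readPrefix still collects the whole prefix.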




[jira] Updated: (TIKA-75) Provide a MimeUtils.getType(URL) method that will determine MIME type based on the stream and, if necessary, the name.

Posted by "Keith R. Bennett (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-75?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keith R. Bennett updated TIKA-75:
---------------------------------

    Attachment: tika-75.patch

This patch adds MimeUtils.getType(URL) and unit tests that verify that it works correctly.




[jira] Resolved: (TIKA-75) Provide a MimeUtils.getType(URL) method that will determine MIME type based on the stream and, if necessary, the name.

Posted by "Keith R. Bennett (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-75?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keith R. Bennett resolved TIKA-75.
----------------------------------

    Resolution: Fixed
      Assignee: Keith R. Bennett

Applied Jukka's patch.

