Posted to dev@tika.apache.org by "Tyler Palsulich (JIRA)" <ji...@apache.org> on 2014/06/10 16:46:02 UTC

[jira] [Commented] (TIKA-411) Generate list of supported and detected types automatically

    [ https://issues.apache.org/jira/browse/TIKA-411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026518#comment-14026518 ] 

Tyler Palsulich commented on TIKA-411:
--------------------------------------

I'm interested in working on this. Should the list be generated on the fly (if so, how?), or once, statically, when the website is generated for each new version? Getting the list of types isn't difficult (cat tika-mimetypes.xml | grep "mime-type type=" | awk -F '"' '{ print $2 }' > types), but there are 1441 different types, so we can't just publish one big, unwieldy list. The list should be browsable and searchable, including with the browser's in-page find (Ctrl+F). One idea for the full list is to have expandable sublists, one per top-level type: application, audio, etc. Any ideas?
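
As a rough illustration of the sublist idea, here is a minimal Python sketch that parses the XML and groups the types by top-level type (application, audio, ...). It is only a sketch under assumptions: the inline SAMPLE is a made-up stand-in for tika-mimetypes.xml, the function names group_types and render are hypothetical, and it assumes the <mime-type> elements carry no XML namespace.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, heavily shortened stand-in for tika-mimetypes.xml.
# Assumption: <mime-type> elements are not in an XML namespace.
SAMPLE = """<mime-info>
  <mime-type type="application/pdf">
    <glob pattern="*.pdf"/>
  </mime-type>
  <mime-type type="application/zip"/>
  <mime-type type="audio/mpeg"/>
  <mime-type type="text/plain"/>
</mime-info>
"""


def group_types(xml_text):
    """Return {top-level type: sorted list of full type names}."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for elem in root.iter("mime-type"):
        full = elem.get("type")
        # Skip entries without a usable "type/subtype" attribute.
        if not full or "/" not in full:
            continue
        top, _subtype = full.split("/", 1)
        groups[top].append(full)
    # Sort both the groups and the names inside each group.
    return {top: sorted(names) for top, names in sorted(groups.items())}


def render(groups):
    """Render the groups as an indented plain-text outline."""
    lines = []
    for top, names in groups.items():
        lines.append(f"{top} ({len(names)})")
        lines.extend(f"  {name}" for name in names)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(group_types(SAMPLE)))
```

To try it on the real file, its contents could be read into a string and passed to group_types in place of SAMPLE. The outline from render could then be turned into one expandable section per top-level type when the site is built.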

> Generate list of supported and detected types automatically
> -----------------------------------------------------------
>
>                 Key: TIKA-411
>                 URL: https://issues.apache.org/jira/browse/TIKA-411
>             Project: Tika
>          Issue Type: Improvement
>          Components: documentation
>            Reporter: Jukka Zitting
>            Priority: Minor
>
> Currently we edit the list of supported types (http://lucene.apache.org/tika/0.7/formats.html) manually, which is bound to leave the list outdated and incomplete. It would be better if the list were generated automatically from the tika-mimetypes.xml file and the getSupportedTypes() response of the AutoDetectParser class.



--
This message was sent by Atlassian JIRA
(v6.2#6252)