Posted to dev@tika.apache.org by "Andreas Meier (JIRA)" <ji...@apache.org> on 2018/03/16 14:29:00 UTC
[jira] [Created] (TIKA-2609) Refine Emacs Lisp file recognition (.elc)
Andreas Meier created TIKA-2609:
-----------------------------------
Summary: Refine Emacs Lisp file recognition (.elc)
Key: TIKA-2609
URL: https://issues.apache.org/jira/browse/TIKA-2609
Project: Tika
Issue Type: Improvement
Components: core
Reporter: Andreas Meier
Some newer .elc files are not recognized properly by the current matcher.
(Tested with Emacs 24.4 files from [https://github.com/jwiegley/emacs-release/tree/master/lisp])
I created a regex match that should handle these files, similar to the Linux file magic entry:
{code:java}
# Emacs 18 - this is always correct, but not very magical.
0 string \012( Emacs v18 byte-compiled Lisp data
!:mime application/x-elc
# Emacs 19+ - ver. recognition added by Ian Springer
# Also applies to XEmacs 19+ .elc files; could tell them apart with regexs
# - Chris Chittleborough <cc...@yahoo.com.au>
0 string ;ELC
>4 byte >18
>4 byte <32 Emacs/XEmacs v%d byte-compiled Lisp data
!:mime application/x-elc{code}
{code:xml}
<mime-type type="application/x-elc">
<_comment>Emacs Lisp bytecode</_comment>
<magic priority="50">
<!-- Emacs 18 -->
<match value="\012(" type="string" offset="0" />
<!-- Emacs 19 -->
<match value=";ELC" type="string" offset="0" >
<match value="[\\x13-\\x1F]" type="regex" offset="4"/>
</match>
</magic>
<glob pattern="*.elc"/>
</mime-type>
{code}
Please verify the hex values before committing.
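As a rough aid for that check, here is a minimal Java sketch of the same logic as the magic rules above: either a newline followed by "(" (Emacs 18), or ";ELC" followed by a version byte greater than 18 and less than 32, i.e. 0x13 to 0x1F. The class and method names (ElcMagicSketch, looksLikeElc) are hypothetical and not part of Tika, and the sample version byte 0x17 is only an illustrative value.

```java
import java.nio.charset.StandardCharsets;

public class ElcMagicSketch {

    // Hypothetical helper (not a Tika API): mirrors the two magic rules
    // from the issue on the first bytes of a file.
    static boolean looksLikeElc(byte[] head) {
        if (head == null) {
            return false;
        }
        // Emacs 18: "\012(" at offset 0, i.e. a newline followed by '('.
        if (head.length >= 2 && head[0] == 0x0A && head[1] == '(') {
            return true;
        }
        // Emacs 19+: ";ELC" at offset 0, then a version byte at offset 4.
        byte[] magic = ";ELC".getBytes(StandardCharsets.US_ASCII);
        if (head.length < magic.length + 1) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (head[i] != magic[i]) {
                return false;
            }
        }
        // ">4 byte >18" and ">4 byte <32" correspond to 0x13..0x1F.
        int version = head[4] & 0xFF;
        return version > 18 && version < 32;
    }

    public static void main(String[] args) {
        // Illustrative header only; 0x17 is an assumed sample version byte.
        byte[] sample = {';', 'E', 'L', 'C', 0x17};
        System.out.println(looksLikeElc(sample));
    }
}
```

The boundary values are the part worth checking by hand: 0x13 (19) and 0x1F (31) should match, while 0x12 (18) and 0x20 (32) should not.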
Regards
Andreas