Posted to dev@tika.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/03/21 23:01:25 UTC

[jira] [Commented] (TIKA-774) ExifTool Parser

    [ https://issues.apache.org/jira/browse/TIKA-774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205245#comment-15205245 ] 

ASF GitHub Bot commented on TIKA-774:
-------------------------------------

GitHub user rgauss opened a pull request:

    https://github.com/apache/tika/pull/92

    TIKA-774: ExifTool Parser

    Contribution of tika-exiftool for review

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Alfresco/tika tika-exiftool

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tika/pull/92.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #92
    
----
commit 8eb474b06e1463ca172128b59b713782eb4bece8
Author: rgauss <rg...@rgauss.com>
Date:   2016-03-19T20:37:37Z

    Initial commit of tika-exiftool as is

commit 5ff139d68bebd39382d5ed9626bff42797ece01d
Author: rgauss <rg...@rgauss.com>
Date:   2016-03-19T22:44:00Z

    Added git ignore of properties override

commit c8f4fb062ce809661527c91df89b230da95f592c
Author: rgauss <rg...@rgauss.com>
Date:   2016-03-21T18:49:38Z

    Merge branch 'master' into tika-exiftool

commit e8a2fa30b16f8b947d118b61ca12476420e9bee0
Author: rgauss <rg...@rgauss.com>
Date:   2016-03-21T21:24:29Z

    TIKA-774: ExifTool Parser
      - Moved tika-exiftool from separate project to parsers
      - Updated license headers
      - Removed author Javadoc
      - Fixed a few forbiddenapi violations

commit 37aae337c5ca3b5a45c2e45804e3768e08a8bbb6
Author: rgauss <rg...@rgauss.com>
Date:   2016-03-21T21:31:31Z

    TIKA-774: ExifTool Parser
      - Removed more author Javadocs

commit 90f8550c03aa873a81975dfa10cfd77aa557fc6f
Author: rgauss <rg...@rgauss.com>
Date:   2016-03-21T22:00:00Z

    TIKA-774: ExifTool Parser
      - Renamed ExecutableUtils to ExiftoolExecutableUtils
      - Changed ExifToolImageParserTest to skip when exiftool is not available

----
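
The last commit above mentions skipping a test when exiftool is not available. A minimal sketch of one way such a check could look follows. The class name ExecutableCheck and the method isAvailable are hypothetical and are not taken from the pull request; the sketch assumes Java 9 or later (for Redirect.DISCARD) and that `exiftool -ver`, which prints the ExifTool version, exits with status 0 when the tool is installed.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not the ExiftoolExecutableUtils class from the pull request.
class ExecutableCheck {

    /**
     * Returns true if the given command can be started and exits with
     * status 0 within ten seconds, false otherwise.
     */
    public static boolean isAvailable(String... command) {
        try {
            Process process = new ProcessBuilder(command)
                    // Discard output so the child cannot block on a full pipe.
                    .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                    .redirectError(ProcessBuilder.Redirect.DISCARD)
                    .start();
            if (!process.waitFor(10, TimeUnit.SECONDS)) {
                process.destroyForcibly();
                return false;
            }
            return process.exitValue() == 0;
        } catch (IOException e) {
            // Typically thrown when the executable is not on the PATH.
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

In a JUnit 4 test, passing the result of ExecutableCheck.isAvailable("exiftool", "-ver") to org.junit.Assume.assumeTrue would report the test as skipped rather than failed on machines without ExifTool.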


> ExifTool Parser
> ---------------
>
>                 Key: TIKA-774
>                 URL: https://issues.apache.org/jira/browse/TIKA-774
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>    Affects Versions: 1.0
>         Environment: Requires ExifTool to be installed (http://www.sno.phy.queensu.ca/~phil/exiftool/)
>            Reporter: Ray Gauss II
>              Labels: features, new-parser, newbie, patch
>             Fix For: 1.13
>
>         Attachments: testJPEG_IPTC_EXT.jpg, tika-core-exiftool-parser-patch.txt, tika-parsers-exiftool-parser-patch.txt
>
>
> Adds an external parser that calls ExifTool to extract extended metadata fields from images and other content types.
> In the core project:
> An ExifTool interface is added which contains Property objects that define the metadata fields available.
> An additional Property constructor is added for the internalTextBag type.
> In the parsers project:
> An ExiftoolMetadataExtractor is added which does the work of calling ExifTool on the command line and mapping the response to Tika metadata fields. This extractor could be called instead of, or in addition to, the existing ImageMetadataExtractor and JempboxExtractor under TiffParser and/or JpegParser, but those have not been changed at this time.
> An ExiftoolParser is added which calls only the ExiftoolMetadataExtractor.
> An ExiftoolTikaMapper is added which is responsible for mapping the ExifTool metadata fields to existing Tika and Drew Noakes metadata fields, if enabled.
> An ElementRdfBagMetadataHandler is added for extracting multi-valued RDF Bag implementations in XML files.
> An ExifToolParserTest is added which tests several expected XMP and IPTC metadata values in testJPEG_IPTC_EXT.jpg.
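
The extract-then-map flow described above can be illustrated with a rough sketch. It is not the code from the pull request: the class ExiftoolSketch, its method names, and the example target field names are hypothetical, plain Maps stand in for Tika's Metadata and Property types, and the sketch assumes ExifTool's `-s` option (which prints `TagName : value` lines), whereas the contributed extractor may use a different output format.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; names do not come from the pull request.
class ExiftoolSketch {

    /** Runs "exiftool -s <path>" and returns its output lines. */
    public static List<String> runExiftool(String path)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder("exiftool", "-s", path)
                .redirectErrorStream(true)
                .start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        process.waitFor();
        return lines;
    }

    /** Parses "TagName : value" lines, splitting on the first colon. */
    public static Map<String, String> parseShortOutput(List<String> lines) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : lines) {
            int separator = line.indexOf(':');
            if (separator <= 0) {
                continue; // not a tag/value line
            }
            fields.put(line.substring(0, separator).trim(),
                    line.substring(separator + 1).trim());
        }
        return fields;
    }

    /** Renames ExifTool tags to target field names; unmapped tags are dropped. */
    public static Map<String, String> mapFields(Map<String, String> exiftoolFields,
                                                Map<String, String> mapping) {
        Map<String, String> mapped = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : exiftoolFields.entrySet()) {
            String target = mapping.get(entry.getKey());
            if (target != null) {
                mapped.put(target, entry.getValue());
            }
        }
        return mapped;
    }
}
```

Keeping the parsing and mapping steps separate from the process call means they can be exercised with canned output on machines where ExifTool is not installed.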



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)