Posted to dev@tika.apache.org by "Jukka Zitting (Created) (JIRA)" <ji...@apache.org> on 2011/10/18 20:41:11 UTC

[jira] [Created] (TIKA-756) XMP output from Tika CLI

XMP output from Tika CLI
------------------------

                 Key: TIKA-756
                 URL: https://issues.apache.org/jira/browse/TIKA-756
             Project: Tika
          Issue Type: New Feature
          Components: cli, metadata
            Reporter: Jukka Zitting
            Assignee: Jukka Zitting


It would be great if the Tika CLI could output metadata also in the XMP format.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (TIKA-756) XMP output from Tika CLI

Posted by "Jörg Ehrlich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402968#comment-13402968 ] 

Jörg Ehrlich commented on TIKA-756:
-----------------------------------

The tika-xmp module provided by the patches uses the XMPCore library, which is available in the Maven Central repository. Unfortunately, the current version, 5.1.0, was compiled for JDK 1.7, which is not compatible with Tika. We are in the process of uploading an update, 5.1.1, which will solve that problem. The patch can only be applied once XMPCore 5.1.1 is available.
                
> XMP output from Tika CLI
> ------------------------
>
>                 Key: TIKA-756
>                 URL: https://issues.apache.org/jira/browse/TIKA-756
>             Project: Tika
>          Issue Type: New Feature
>          Components: cli, metadata
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>              Labels: metadata, xmp
>         Attachments: tika-xmp.patch, tika-xmp_dependsOn_TIKA929changes.patch
>
>
> It would be great if the Tika CLI could output metadata also in the XMP format.


       

[jira] [Updated] (TIKA-756) XMP output from Tika CLI

Posted by "Jörg Ehrlich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jörg Ehrlich updated TIKA-756:
------------------------------

    Attachment:     (was: tika-xmp_dependsOn_TIKA929changes.patch)
    
> XMP output from Tika CLI
> ------------------------
>
>                 Key: TIKA-756
>                 URL: https://issues.apache.org/jira/browse/TIKA-756
>             Project: Tika
>          Issue Type: New Feature
>          Components: cli, metadata
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>              Labels: metadata, xmp
>         Attachments: tika-xmp.patch, tika-xmp_dependsOn_TIKA929changes.patch
>
>
> It would be great if the Tika CLI could output metadata also in the XMP format.


       

[jira] [Commented] (TIKA-756) XMP output from Tika CLI

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406405#comment-13406405 ] 

Jukka Zitting commented on TIKA-756:
------------------------------------

Thanks! I applied the style and header patch in revision 1357199.
                
> XMP output from Tika CLI
> ------------------------
>
>                 Key: TIKA-756
>                 URL: https://issues.apache.org/jira/browse/TIKA-756
>             Project: Tika
>          Issue Type: New Feature
>          Components: cli, metadata
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>              Labels: metadata, xmp
>         Attachments: tika-xmp.patch, tika-xmp_styleAndHeader.patch
>
>
> It would be great if the Tika CLI could output metadata also in the XMP format.


        

[jira] [Updated] (TIKA-756) XMP output from Tika CLI

Posted by "Jörg Ehrlich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jörg Ehrlich updated TIKA-756:
------------------------------

    Attachment: tika-xmp_dependsOn_TIKA929changes.patch

Uploading an update to the patch which depends on TIKA-929, which fixes a test.
                
> XMP output from Tika CLI
> ------------------------
>
>                 Key: TIKA-756
>                 URL: https://issues.apache.org/jira/browse/TIKA-756
>             Project: Tika
>          Issue Type: New Feature
>          Components: cli, metadata
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>              Labels: metadata, xmp
>         Attachments: tika-xmp.patch, tika-xmp_dependsOn_TIKA929changes.patch
>
>
> It would be great if the Tika CLI could output metadata also in the XMP format.


       

[jira] [Commented] (TIKA-756) XMP output from Tika CLI

Posted by "Jörg Ehrlich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406378#comment-13406378 ] 

Jörg Ehrlich commented on TIKA-756:
-----------------------------------

Thanks Jukka,

I copied the IPTC header accidentally, and the Converter idea is good.
I will provide a patch with the style/header changes first and then work on another one for the Converter idea.
                
> XMP output from Tika CLI
> ------------------------
>
>                 Key: TIKA-756
>                 URL: https://issues.apache.org/jira/browse/TIKA-756
>             Project: Tika
>          Issue Type: New Feature
>          Components: cli, metadata
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>              Labels: metadata, xmp
>         Attachments: tika-xmp.patch
>
>
> It would be great if the Tika CLI could output metadata also in the XMP format.


       

[jira] [Updated] (TIKA-756) XMP output from Tika CLI

Posted by "Jörg Ehrlich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jörg Ehrlich updated TIKA-756:
------------------------------

    Attachment:     (was: tika-xmp_dependsOn_TIKA929changes.patch)
    
> XMP output from Tika CLI
> ------------------------
>
>                 Key: TIKA-756
>                 URL: https://issues.apache.org/jira/browse/TIKA-756
>             Project: Tika
>          Issue Type: New Feature
>          Components: cli, metadata
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>              Labels: metadata, xmp
>
> It would be great if the Tika CLI could output metadata also in the XMP format.


       

[jira] [Commented] (TIKA-756) XMP output from Tika CLI

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13405033#comment-13405033 ] 

Jukka Zitting commented on TIKA-756:
------------------------------------

Nice work! I committed the latest patch in revision 1356202.

There are a few minor issues, such as the use of tabs instead of spaces for indentation and required updates to our licensing details. I can take care of those in a minute.
                
> XMP output from Tika CLI
> ------------------------
>
>                 Key: TIKA-756
>                 URL: https://issues.apache.org/jira/browse/TIKA-756
>             Project: Tika
>          Issue Type: New Feature
>          Components: cli, metadata
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>              Labels: metadata, xmp
>         Attachments: tika-xmp.patch
>
>
> It would be great if the Tika CLI could output metadata also in the XMP format.


        

[jira] [Updated] (TIKA-756) XMP output from Tika CLI

Posted by "Jörg Ehrlich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jörg Ehrlich updated TIKA-756:
------------------------------

    Attachment: tika-xmp_dependsOn_TIKA929changes.patch

The tika-xmp_dependsOn_TIKA929changes.patch contains the same tika-xmp module as offered by the other patch, but depends on the patch from TIKA-929 being applied first.

The recommendation is to use this one instead of tika-xmp.patch.
                
> XMP output from Tika CLI
> ------------------------
>
>                 Key: TIKA-756
>                 URL: https://issues.apache.org/jira/browse/TIKA-756
>             Project: Tika
>          Issue Type: New Feature
>          Components: cli, metadata
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>              Labels: metadata, xmp
>         Attachments: tika-xmp.patch, tika-xmp_dependsOn_TIKA929changes.patch
>
>
> It would be great if the Tika CLI could output metadata also in the XMP format.


       

[jira] [Updated] (TIKA-756) XMP output from Tika CLI

Posted by "Jörg Ehrlich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jörg Ehrlich updated TIKA-756:
------------------------------

    Attachment: tika-xmp_styleAndHeader.patch

Added patch with style changes and adjusted licence header. No functional changes.
                
> XMP output from Tika CLI
> ------------------------
>
>                 Key: TIKA-756
>                 URL: https://issues.apache.org/jira/browse/TIKA-756
>             Project: Tika
>          Issue Type: New Feature
>          Components: cli, metadata
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>              Labels: metadata, xmp
>         Attachments: tika-xmp.patch, tika-xmp_styleAndHeader.patch
>
>
> It would be great if the Tika CLI could output metadata also in the XMP format.


       

[jira] [Commented] (TIKA-756) XMP output from Tika CLI

Posted by "Jukka Zitting (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13129957#comment-13129957 ] 

Jukka Zitting commented on TIKA-756:
------------------------------------

Rough first version committed in revision 1185805.
                
> XMP output from Tika CLI
> ------------------------
>
>                 Key: TIKA-756
>                 URL: https://issues.apache.org/jira/browse/TIKA-756
>             Project: Tika
>          Issue Type: New Feature
>          Components: cli, metadata
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>              Labels: metadata, xmp
>
> It would be great if the Tika CLI could output metadata also in the XMP format.


        

[jira] [Updated] (TIKA-756) XMP output from Tika CLI

Posted by "Jörg Ehrlich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jörg Ehrlich updated TIKA-756:
------------------------------

    Attachment:     (was: tika-xmp.patch)
    
> XMP output from Tika CLI
> ------------------------
>
>                 Key: TIKA-756
>                 URL: https://issues.apache.org/jira/browse/TIKA-756
>             Project: Tika
>          Issue Type: New Feature
>          Components: cli, metadata
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>              Labels: metadata, xmp
>
> It would be great if the Tika CLI could output metadata also in the XMP format.


       

[jira] [Commented] (TIKA-756) XMP output from Tika CLI

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13405078#comment-13405078 ] 

Jukka Zitting commented on TIKA-756:
------------------------------------

OK, see my followup commits for the minor adjustments. There are a few things remaining:

* I didn't change the formatting of code inside tika-xmp/src to avoid breaking any pending changes you may have. It would be good, however, to unify the formatting of that code with the rest of Tika, where we normally use four spaces (no tabs) for indentation and don't put an opening brace on a separate line.

* The copyright headers mention the IPTC Photo Metadata standard. I didn't notice specific IPTC metadata descriptions being included in the relevant source files, so can we drop that extra copyright notice?

* The tika-xmp component currently has a dependency on tika-parsers just to get the list of media types supported by relevant parser components. Could we instead make tika-parsers depend on tika-xmp and provide the {{Converter}} classes as parts of the relevant o.a.t.parser.* packages? The {{TikaToXMP}} class could then access them using the same {{ServiceLoader}} mechanism that tika-core uses for {{Detector}} and {{Parser}} implementations.
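
The ServiceLoader lookup suggested in the last point could look roughly like the sketch below. This is only an illustration: the XmpConverter interface and its supportedTypes() method are hypothetical stand-ins and are not taken from the tika-xmp patch. Only java.util.ServiceLoader is a real JDK API. Implementations would be registered in a META-INF/services file named after the interface.

```java
import java.util.Optional;
import java.util.ServiceLoader;
import java.util.Set;

public class ConverterLookupSketch {

    /** Hypothetical service interface; the real tika-xmp converter API may differ. */
    public interface XmpConverter {
        /** Media types this converter can handle, e.g. "application/pdf". */
        Set<String> supportedTypes();
    }

    /** Returns the first candidate that declares support for the given media type. */
    public static Optional<XmpConverter> findConverter(
            Iterable<XmpConverter> candidates, String mediaType) {
        for (XmpConverter converter : candidates) {
            if (converter.supportedTypes().contains(mediaType)) {
                return Optional.of(converter);
            }
        }
        return Optional.empty();
    }

    /** Discovers converters registered under META-INF/services on the classpath. */
    public static Optional<XmpConverter> findConverter(String mediaType) {
        return findConverter(ServiceLoader.load(XmpConverter.class), mediaType);
    }

    public static void main(String[] args) {
        // Prints "false" when no provider is registered on the classpath.
        System.out.println(findConverter("application/pdf").isPresent());
    }
}
```

With this arrangement the lookup code needs no compile-time knowledge of the parser modules; whichever jar ships a converter also ships its registration file.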
                
> XMP output from Tika CLI
> ------------------------
>
>                 Key: TIKA-756
>                 URL: https://issues.apache.org/jira/browse/TIKA-756
>             Project: Tika
>          Issue Type: New Feature
>          Components: cli, metadata
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>              Labels: metadata, xmp
>         Attachments: tika-xmp.patch
>
>
> It would be great if the Tika CLI could output metadata also in the XMP format.


        

[jira] [Updated] (TIKA-756) XMP output from Tika CLI

Posted by "Jörg Ehrlich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jörg Ehrlich updated TIKA-756:
------------------------------

    Attachment: tika-xmp.patch

As TIKA-929 has already been resolved, I have deleted the previous two patches and am uploading a new one, which also contains adjustments for the latest tika-cli changes.
                
> XMP output from Tika CLI
> ------------------------
>
>                 Key: TIKA-756
>                 URL: https://issues.apache.org/jira/browse/TIKA-756
>             Project: Tika
>          Issue Type: New Feature
>          Components: cli, metadata
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>              Labels: metadata, xmp
>         Attachments: tika-xmp.patch
>
>
> It would be great if the Tika CLI could output metadata also in the XMP format.


       

[jira] [Updated] (TIKA-756) XMP output from Tika CLI

Posted by "Jörg Ehrlich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jörg Ehrlich updated TIKA-756:
------------------------------

    Assignee: Jörg Ehrlich  (was: Jukka Zitting)
    
> XMP output from Tika CLI
> ------------------------
>
>                 Key: TIKA-756
>                 URL: https://issues.apache.org/jira/browse/TIKA-756
>             Project: Tika
>          Issue Type: New Feature
>          Components: cli, metadata
>            Reporter: Jukka Zitting
>            Assignee: Jörg Ehrlich
>              Labels: metadata, xmp
>         Attachments: tika-xmp.patch, tika-xmp_styleAndHeader.patch
>
>
> It would be great if the Tika CLI could output metadata also in the XMP format.


[jira] [Updated] (TIKA-756) XMP output from Tika CLI

Posted by "Jörg Ehrlich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/TIKA-756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jörg Ehrlich updated TIKA-756:
------------------------------

    Attachment: tika-xmp.patch

The tika-xmp.patch provides an extra Tika module that converts Tika Metadata to the XMP data model. It also integrates the module with the "-y" output option of tika-app, thereby providing XMP output for the Tika CLI.

The API extends the tika-core Metadata class but also offers the possibility of working directly with the XMP data model.
The Metadata information from Tika can be converted either by mimetype-specific converters, which convert everything for their respective file format, or by a generic converter, which only converts fully qualified properties that use prefixes from registered namespaces.
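
To illustrate the generic-converter rule, here is a minimal sketch. It assumes that metadata keys have the form prefix:localName and that registered namespaces are kept in a prefix-to-URI map. The class and method names are made up for this example and do not come from the patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class GenericConverterSketch {

    /**
     * Keeps only entries whose key is "prefix:localName" with a registered prefix.
     * Everything else (for example "Content-Type") is skipped.
     */
    public static Map<String, String> selectQualified(
            Map<String, String> metadata, Map<String, String> registeredNamespaces) {
        Map<String, String> selected = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            String key = entry.getKey();
            int colon = key.indexOf(':');
            // Need a non-empty prefix and a non-empty local name.
            if (colon <= 0 || colon == key.length() - 1) {
                continue;
            }
            String prefix = key.substring(0, colon);
            if (registeredNamespaces.containsKey(prefix)) {
                selected.put(key, entry.getValue());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Illustrative registry: prefix -> namespace URI.
        Map<String, String> namespaces = new LinkedHashMap<>();
        namespaces.put("dc", "http://purl.org/dc/elements/1.1/");
        namespaces.put("xmp", "http://ns.adobe.com/xap/1.0/");

        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("dc:title", "Report");
        metadata.put("Content-Type", "application/pdf"); // not qualified
        metadata.put("foo:bar", "x");                    // unregistered prefix

        System.out.println(selectQualified(metadata, namespaces)); // {dc:title=Report}
    }
}
```

A mimetype-specific converter would not need this filter, since it knows how to map each property of its own file format.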
                
> XMP output from Tika CLI
> ------------------------
>
>                 Key: TIKA-756
>                 URL: https://issues.apache.org/jira/browse/TIKA-756
>             Project: Tika
>          Issue Type: New Feature
>          Components: cli, metadata
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>              Labels: metadata, xmp
>         Attachments: tika-xmp.patch
>
>
> It would be great if the Tika CLI could output metadata also in the XMP format.


       

[jira] [Commented] (TIKA-756) XMP output from Tika CLI

Posted by "Jörg Ehrlich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/TIKA-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13403799#comment-13403799 ] 

Jörg Ehrlich commented on TIKA-756:
-----------------------------------

Version 5.1.1 of the XMPCore library, which is compatible with JDK 1.5/1.6, is now available on Maven Central.
                
> XMP output from Tika CLI
> ------------------------
>
>                 Key: TIKA-756
>                 URL: https://issues.apache.org/jira/browse/TIKA-756
>             Project: Tika
>          Issue Type: New Feature
>          Components: cli, metadata
>            Reporter: Jukka Zitting
>            Assignee: Jukka Zitting
>              Labels: metadata, xmp
>         Attachments: tika-xmp.patch, tika-xmp_dependsOn_TIKA929changes.patch
>
>
> It would be great if the Tika CLI could output metadata also in the XMP format.
