Posted to commits@stanbol.apache.org by "Rupert Westenthaler (JIRA)" <ji...@apache.org> on 2013/02/20 14:03:14 UTC

[jira] [Resolved] (STANBOL-947) Allow the TikaEngine to add unmapped properties to the Metadata of the processed ContentItem

     [ https://issues.apache.org/jira/browse/STANBOL-947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rupert Westenthaler resolved STANBOL-947.
-----------------------------------------

    Resolution: Fixed

implemented with http://svn.apache.org/r1448128

NOTE: for now, only Tika properties that follow the {prefix}:{localname} syntax are considered; all others are ignored. If someone would also like to have those, please feel free to open a new issue.
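
To illustrate the {prefix}:{localname} restriction above, here is a minimal sketch of such a check. The class and method names are hypothetical, and the rule used (exactly that a non-empty prefix and a non-empty local name are separated by the first ':') is an assumption; the actual check in r1448128 may differ.

```java
// Sketch only: the check used by the TikaEngine is not shown in this message.
// PrefixedNameCheck and isPrefixed are hypothetical names for illustration.
final class PrefixedNameCheck {

    /** Returns true if the name looks like {prefix}:{localname}, both parts non-empty. */
    static boolean isPrefixed(String name) {
        if (name == null) {
            return false;
        }
        int idx = name.indexOf(':');
        // the first ':' must be neither the first nor the last character
        return idx > 0 && idx < name.length() - 1;
    }

    public static void main(String[] args) {
        String[] names = {"dc:title", "xmpTPg:NPages", "Content-Type"};
        for (String name : names) {
            System.out.println(name + " -> " + (isPrefixed(name) ? "considered" : "ignored"));
        }
    }
}
```

Under this assumption a name such as "dc:title" would be considered, while "Content-Type" would be ignored.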
                
> Allow the TikaEngine to add unmapped properties to the Metadata of the processed ContentItem
> --------------------------------------------------------------------------------------------
>
>                 Key: STANBOL-947
>                 URL: https://issues.apache.org/jira/browse/STANBOL-947
>             Project: Stanbol
>          Issue Type: New Feature
>          Components: Engine - Tika
>            Reporter: Rupert Westenthaler
>            Assignee: Rupert Westenthaler
>
> Currently the Tika Engine can only add information provided by the Tika Metadata to the Enhancement Metadata if an explicit Ontology mapping for it is available and activated. All other metadata are not accessible.
> This issue will add a Configuration option to the TikaEngine that allows unmapped properties to be added to the Enhancement graph.
> Properties will be written by using
> * the ContentItem URI as subject
> * urn:tika.apache.org:{property-name} as property
> * the value of the property as Object.
> That means that values of unmapped properties will be accessible by using
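>     ContentItem ci; //the content item
>     String property; //the name of the unmapped Tika property
>     //select all triples with the ContentItem URI as subject and
>     //urn:tika.apache.org:{property-name} as property
>     Iterator<Triple> it = ci.getMetadata().filter(
>         ci.getUri(), new UriRef("urn:tika.apache.org:"+property), null);
>     while(it.hasNext()){
>         Resource value = it.next().getObject(); //the value of the property
>     }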
> By default this feature will be deactivated. Users that want to have unmapped properties present need to set "stanbol.engine.tika.mapping.unmapped" to true.
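
As a rough illustration of the description above, the following sketch shows how the configuration flag and the property URI could be handled in plain Java. Only the key "stanbol.engine.tika.mapping.unmapped" and the "urn:tika.apache.org:" prefix come from the issue; the class name, the method names, and the use of a Dictionary for the configuration are assumptions made for this example, not the TikaEngine implementation.

```java
import java.util.Dictionary;
import java.util.Hashtable;

// Sketch only: hypothetical helper, not the actual TikaEngine code.
final class UnmappedPropertySketch {

    // configuration key and URI prefix as stated in the issue description
    static final String UNMAPPED_KEY = "stanbol.engine.tika.mapping.unmapped";
    static final String URN_PREFIX = "urn:tika.apache.org:";

    /** The feature is deactivated unless the key is explicitly set to true. */
    static boolean isUnmappedEnabled(Dictionary<String, Object> config) {
        Object value = config.get(UNMAPPED_KEY);
        return value != null && Boolean.parseBoolean(value.toString());
    }

    /** Builds the property URI used for an unmapped Tika property. */
    static String propertyUri(String tikaProperty) {
        return URN_PREFIX + tikaProperty;
    }

    public static void main(String[] args) {
        Dictionary<String, Object> config = new Hashtable<>();
        System.out.println("default: " + isUnmappedEnabled(config));
        config.put(UNMAPPED_KEY, "true");
        System.out.println("after activation: " + isUnmappedEnabled(config));
        System.out.println(propertyUri("dc:title"));
    }
}
```

The string returned by propertyUri is what would be passed to the UriRef constructor when filtering the metadata of the ContentItem.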

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira