Posted to dev@manifoldcf.apache.org by "Julien Massiera (Jira)" <ji...@apache.org> on 2022/03/15 16:35:00 UTC

[jira] [Resolved] (CONNECTORS-1700) TikaServiceRmeta: Add options to filter out metadata based on size

     [ https://issues.apache.org/jira/browse/CONNECTORS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julien Massiera resolved CONNECTORS-1700.
-----------------------------------------
    Fix Version/s: ManifoldCF next
       Resolution: Fixed

r1898949

> TikaServiceRmeta: Add options to filter out metadata based on size
> ------------------------------------------------------------------
>
>                 Key: CONNECTORS-1700
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1700
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: Tika service connector
>    Affects Versions: ManifoldCF 2.21
>            Reporter: Julien Massiera
>            Assignee: Julien Massiera
>            Priority: Major
>             Fix For: ManifoldCF next
>
>
> Some files may contain abnormally large metadata (several MB, whether in individual metadata values or in the total amount of metadata), which can cause excessive memory consumption in the connector.
> To avoid this, we can provide job configuration options for the TikaServiceRmetaConnector that set limits on both the size of individual metadata values and the total amount of metadata, and exclude any metadata that exceeds those limits.
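
For illustration only, the size-based filtering described above could be sketched as follows. This is not the code committed for this issue: the class name MetadataSizeFilter, the method filter, and both limit parameters are hypothetical, and the way the total limit is enforced (skipping any entry that would push the running total over the limit) is an assumption, since the issue does not say how that case is handled.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of ManifoldCF: keeps only the metadata
// entries that fit within a per-value limit and a global limit.
final class MetadataSizeFilter {

    private MetadataSizeFilter() {
    }

    // maxValueBytes: assumed limit on the UTF-8 size of a single value.
    // maxTotalBytes: assumed limit on the cumulative UTF-8 size of all
    // kept names and values.
    static Map<String, String> filter(Map<String, String> metadata,
                                      long maxValueBytes,
                                      long maxTotalBytes) {
        Map<String, String> kept = new LinkedHashMap<>();
        long totalBytes = 0;
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            long nameBytes = entry.getKey().getBytes(StandardCharsets.UTF_8).length;
            long valueBytes = entry.getValue().getBytes(StandardCharsets.UTF_8).length;

            // Exclude any single value that exceeds the per-value limit.
            if (valueBytes > maxValueBytes) {
                continue;
            }
            // Assumption: exclude an entry that would push the running
            // total past the global limit, then keep scanning so that
            // smaller later entries can still be kept.
            if (totalBytes + nameBytes + valueBytes > maxTotalBytes) {
                continue;
            }
            kept.put(entry.getKey(), entry.getValue());
            totalBytes += nameBytes + valueBytes;
        }
        return kept;
    }
}
```

Measuring sizes in UTF-8 bytes rather than characters is a design choice made here because the concern raised in the issue is memory consumption; the actual connector may measure sizes differently.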



--
This message was sent by Atlassian Jira
(v8.20.1#820001)