Posted to dev@manifoldcf.apache.org by "Markus Schuch (Jira)" <ji...@apache.org> on 2022/04/30 15:40:00 UTC

[jira] [Updated] (CONNECTORS-1700) TikaServiceRmeta: Add options to filter out metadata based on size

     [ https://issues.apache.org/jira/browse/CONNECTORS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Markus Schuch updated CONNECTORS-1700:
--------------------------------------
    Fix Version/s: ManifoldCF 2.22
                       (was: ManifoldCF next)

> TikaServiceRmeta: Add options to filter out metadata based on size
> ------------------------------------------------------------------
>
>                 Key: CONNECTORS-1700
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1700
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: Tika service connector
>    Affects Versions: ManifoldCF 2.21
>            Reporter: Julien Massiera
>            Assignee: Julien Massiera
>            Priority: Major
>             Fix For: ManifoldCF 2.22
>
>
> Some files contain abnormally large metadata: individual metadata values, or the metadata as a whole, can reach several MB, which can be a problem for the memory consumption of the connector.
> To avoid this, we can provide job configuration options for the TikaServiceRmetaConnector that set limits on both the size of individual metadata values and the total amount of metadata, and exclude any metadata that exceeds those limits.
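
As an illustration of the two limits described in the issue, here is a minimal sketch in Java. It is not the connector's actual implementation: the class name MetadataSizeFilter, the method filter, and the parameters maxValueBytes and maxTotalBytes are hypothetical, and sizes are assumed to be measured in UTF-8 bytes. It also assumes that a value which would push the running total over the global limit is simply skipped, which is only one possible reading of "exclude metadata that exceed the limits".

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, NOT the TikaServiceRmetaConnector code.
final class MetadataSizeFilter {
    private MetadataSizeFilter() {}

    /**
     * Returns a copy of the metadata without values larger than maxValueBytes
     * and without values that would push the running total past maxTotalBytes.
     * Sizes are counted in UTF-8 bytes (an assumption of this sketch).
     */
    static Map<String, List<String>> filter(Map<String, List<String>> metadata,
                                            long maxValueBytes,
                                            long maxTotalBytes) {
        Map<String, List<String>> kept = new LinkedHashMap<>();
        long total = 0;
        for (Map.Entry<String, List<String>> entry : metadata.entrySet()) {
            List<String> values = new ArrayList<>();
            for (String value : entry.getValue()) {
                long size = value.getBytes(StandardCharsets.UTF_8).length;
                if (size > maxValueBytes) {
                    continue; // single value exceeds the per-value limit
                }
                if (total + size > maxTotalBytes) {
                    continue; // keeping it would exceed the global limit
                }
                values.add(value);
                total += size;
            }
            // Drop the metadata name entirely if none of its values survived.
            if (!values.isEmpty()) {
                kept.put(entry.getKey(), values);
            }
        }
        return kept;
    }
}
```

With a per-value limit of 8 bytes and a total limit of 6 bytes, for example, a 16-byte value is dropped outright, and smaller values are kept only while the running total stays within 6 bytes. A real connector would more likely apply such limits while reading the Tika server response, so that oversized values are never fully held in memory.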



--
This message was sent by Atlassian Jira
(v8.20.7#820007)