Posted to user@tika.apache.org by ju...@francelabs.com on 2021/03/16 09:41:34 UTC

content limiter header

Hi Tim,

 

I am using the “rmeta” endpoint of a Tika server to extract data from
documents, and I was wondering whether there is a header I can add to my
requests to tell Tika to limit the number of bytes extracted for the
content. I did not manage to find such a header in the documentation. It
would be very useful, like the “maxEmbeddedResources” header, which limits
the depth of the recursive data extraction!
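
To make the request concrete, here is a minimal Python sketch of the kind of
call described above, together with a client-side stand-in for the missing
limit. The server URL (a local tika-server on port 9998, using the
"/rmeta/text" variant), the function names, and the byte cap are assumptions
for illustration only; "maxEmbeddedResources" is the header mentioned above,
and "X-TIKA:content" is the key under which rmeta returns the extracted text.
The truncation happens after extraction, so it only bounds what the client
keeps, not the work done by the server.

```python
import json
import urllib.request

# Assumed values: a local tika-server and the text variant of the rmeta endpoint.
TIKA_RMETA_URL = "http://localhost:9998/rmeta/text"
CONTENT_KEY = "X-TIKA:content"


def build_rmeta_request(document_bytes, max_embedded_resources):
    """Build (but do not send) a PUT request to the rmeta endpoint."""
    return urllib.request.Request(
        TIKA_RMETA_URL,
        data=document_bytes,
        headers={
            "Accept": "application/json",
            "maxEmbeddedResources": str(max_embedded_resources),
        },
        method="PUT",
    )


def truncate_content(metadata_list, max_bytes):
    """Hypothetical helper: cap each content field at max_bytes of UTF-8."""
    for metadata in metadata_list:
        content = metadata.get(CONTENT_KEY)
        if isinstance(content, str):
            clipped = content.encode("utf-8")[:max_bytes]
            # errors="ignore" drops a multi-byte character cut at the boundary.
            metadata[CONTENT_KEY] = clipped.decode("utf-8", errors="ignore")
    return metadata_list


def fetch_limited(document_bytes, max_embedded_resources, max_bytes):
    """Send the document to the server and truncate the returned content."""
    request = build_rmeta_request(document_bytes, max_embedded_resources)
    with urllib.request.urlopen(request) as response:
        return truncate_content(json.load(response), max_bytes)
```

Only fetch_limited needs a running server; the other two functions can be
exercised offline.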

 

Regards,

 

Julien Massiera
Product Manager

France Labs – Makers of Datafari Enterprise Search
<https://www.datafari.com/en>
Winner of the Jury Trophy at IMAgineDAY 2021
<https://www.ima-dt.org/ima/event/detail.html/idConf/938>



 


Re: content limiter header

Posted by Tim Allison <ta...@apache.org>.
Hi Julien,
  That seems reasonable.  Please open an issue, and we should have time to
get that in by 1.26.

     Cheers,

              Tim
