Posted to user@tika.apache.org by Archan Ranade <ar...@opshub.com> on 2021/09/23 04:14:17 UTC

Side effects of setting WriteOutContentHandler write limit as -1 are unknown

I have a question about WriteOutContentHandler. It has a constructor that takes a writeLimit parameter. The default appears to be 100,000 characters, and passing -1 disables the limit. What are the side effects of leaving it at -1? Can it cause performance problems, an OutOfMemoryError, or other memory-related issues? I would also like to understand why the write limit was introduced in the first place: was there a particular use case or user scenario behind it?


Thanks,
Archan Ranade

Re: Side effects of setting WriteOutContentHandler write limit as -1 are unknown

Posted by Tim Allison <ta...@apache.org>.
Sorry, but it depends(TM).  Yes, you can run out of memory on very large
files, especially if you are running multithreaded and/or do not have much
memory available.

If you are processing lots and lots of untrusted content, you need to set
limits on just about everything.  The write limit is one that makes sense
to set in order to avoid OOMs.
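
To make the trade-off concrete, here is a minimal sketch of how a character cap bounds memory. BoundedTextHandler and retainedAfterFeeding are hypothetical names invented for illustration. This is not Tika's WriteOutContentHandler, only a stand-in built on the JDK's SAX classes that mimics the behaviour described in this thread (a cap on buffered characters, with -1 meaning no cap). Treating every negative value as "unlimited" is an assumption of the sketch.

```java
import java.util.Arrays;

import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/**
 * Hypothetical stand-in for a write-limited content handler.
 * NOT Tika's implementation; it only illustrates the idea of a character cap.
 */
public class BoundedTextHandler extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private final int writeLimit; // negative (e.g. -1) means "no limit" in this sketch
    private boolean limitReached = false;

    public BoundedTextHandler(int writeLimit) {
        this.writeLimit = writeLimit;
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // Unlimited, or the whole chunk still fits: buffer everything.
        if (writeLimit < 0 || buffer.length() + length <= writeLimit) {
            buffer.append(ch, start, length);
            return;
        }
        // Keep only what still fits, then tell the caller to stop parsing.
        buffer.append(ch, start, writeLimit - buffer.length());
        limitReached = true;
        throw new SAXException("write limit of " + writeLimit + " characters reached");
    }

    public boolean isLimitReached() {
        return limitReached;
    }

    @Override
    public String toString() {
        return buffer.toString();
    }

    /** Feeds the given number of 1,000-character chunks; returns how many characters were kept. */
    public static int retainedAfterFeeding(int writeLimit, int chunks) {
        BoundedTextHandler handler = new BoundedTextHandler(writeLimit);
        char[] chunk = new char[1000];
        Arrays.fill(chunk, 'x');
        try {
            for (int i = 0; i < chunks; i++) {
                handler.characters(chunk, 0, chunk.length);
            }
        } catch (SAXException e) {
            // Raised once the cap is hit; the caller keeps the partial text.
        }
        return handler.toString().length();
    }

    public static void main(String[] args) {
        System.out.println(retainedAfterFeeding(100_000, 500)); // capped
        System.out.println(retainedAfterFeeding(-1, 500));      // grows with the input
    }
}
```

With a cap, the buffer never grows past the limit no matter how large the input is. With -1, it grows with the extracted text, and that growth is multiplied by the number of threads parsing at the same time.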
