Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2021/04/26 21:31:00 UTC

[jira] [Comment Edited] (TIKA-3372) Fix writelimit in recursiveparserhandler

    [ https://issues.apache.org/jira/browse/TIKA-3372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17332759#comment-17332759 ] 

Tim Allison edited comment on TIKA-3372 at 4/26/21, 9:30 PM:
-------------------------------------------------------------

While I'm fixing this, [~julienFL] and fellow devs: with the fix, the parse stops as soon as the content length hits the write limit.  This means that users will lose metadata about any embedded objects that have not been parsed yet.  For example, if the parser hits the write limit on attachment 3, it will not process attachments 4-n at all.  Is this what we want, or do we want the parser to stop writing content but still gather the metadata (including embedded file type)?
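
To make the two options in the question concrete, here is a minimal sketch in plain Java. It is not Tika code: WriteLimitSketch, Doc, Result, and parse are hypothetical names invented for illustration, and each object's "metadata" is reduced to a single type string. It assumes one character budget shared by the container and all embedded objects, and a flag that switches between stopping the parse at the limit and only stopping content writes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names are Tika APIs.
class WriteLimitSketch {

    /** Stand-in for one parsed object (the container or an attachment). */
    static final class Doc {
        final String type;
        final String text;
        Doc(String type, String text) { this.type = type; this.text = text; }
    }

    /** Stand-in for the per-object output: type metadata plus the content kept. */
    static final class Result {
        final String type;
        final String content;
        Result(String type, String content) { this.type = type; this.content = content; }
    }

    /**
     * Walks the objects in order against ONE shared character budget.
     * stopAtLimit == true : stop parsing once the budget is used up, so later
     *                       objects produce no result at all.
     * stopAtLimit == false: keep visiting objects and record their type, but
     *                       write no further content once the budget is used up.
     */
    static List<Result> parse(List<Doc> docs, int writeLimit, boolean stopAtLimit) {
        List<Result> results = new ArrayList<>();
        int remaining = writeLimit;
        for (Doc doc : docs) {
            if (stopAtLimit && remaining <= 0) {
                break; // option 1: the rest of the attachments are never processed
            }
            int n = Math.min(remaining, doc.text.length());
            results.add(new Result(doc.type, doc.text.substring(0, n)));
            remaining -= n;
        }
        return results;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc("application/zip", "aaaa"),   // container
                new Doc("text/plain", "bbbb"),        // attachment 1
                new Doc("application/pdf", "cccc"),   // attachment 2
                new Doc("image/png", "dddd"));        // attachment 3
        for (boolean stop : new boolean[] {true, false}) {
            System.out.println("stopAtLimit=" + stop);
            for (Result r : parse(docs, 6, stop)) {
                System.out.println("  " + r.type + " -> \"" + r.content + "\"");
            }
        }
    }
}
```

With stopAtLimit set to true, objects after the one that exhausts the budget are missing from the output entirely, which is the metadata loss described above. With it set to false, every object still appears with its type, but with empty content once the budget is gone.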


was (Author: tallison@mitre.org):
While I'm fixing this, [~julienFL] and fellow devs.  The fixed behavior is that the parse is stopped when the content length hits the limit.  This means that users will lose metadata about embedded objects.  If the parser hits the write limit on attachment 3, it will not process attachments 4-n at all.  Is this what we want or do we just want the parser to stop writing to the content but still gather the metadata (including embedded file type)?

> Fix writelimit in recursiveparserhandler
> ----------------------------------------
>
>                 Key: TIKA-3372
>                 URL: https://issues.apache.org/jira/browse/TIKA-3372
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>
> On the dev list, [~julienFL] noted surprising behavior with the new write limit in the /rmeta handler.  I wasn't able to replicate it, but there is clearly a bug in how the write limiting is working. The upshot is that we're still effectively applying the write limit per object rather than across the full container doc and its embedded objects.


