Posted to dev@daffodil.apache.org by "Steve Lawrence (Jira)" <ji...@apache.org> on 2019/08/28 13:22:00 UTC

[jira] [Created] (DAFFODIL-2194) buffered data output stream has a chunk limit of 2GB

Steve Lawrence created DAFFODIL-2194:
----------------------------------------

             Summary: buffered data output stream has a chunk limit of 2GB
                 Key: DAFFODIL-2194
                 URL: https://issues.apache.org/jira/browse/DAFFODIL-2194
             Project: Daffodil
          Issue Type: Bug
          Components: Back End
            Reporter: Steve Lawrence
             Fix For: 2.5.0


A buffered data output stream is backed by a growable ByteArrayOutputStream, whose backing array cannot exceed roughly Integer.MAX_VALUE bytes (~2GB). So if we ever try to write more than 2GB to a buffered output stream during unparse (very possible with large blobs), we'll get an OutOfMemoryError.

One potential solution is to track the size of a ByteArrayOutputStream when buffering output and automatically create a split when it reaches 2GB in size. This will still require a ton of memory since we're buffering in memory, but we'll at least be able to unparse more than 2GB of contiguous data.

Note that we can already unparse more than 2GB of data in total, as long as no single buffer grows beyond 2GB.
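The splitting idea above could be sketched roughly as follows. This is a hypothetical illustration, not Daffodil's actual buffering code: the class name, the small 4-byte chunk limit (standing in for the real ~2GB limit), and the helper methods are all invented for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: an OutputStream that splits its backing
// ByteArrayOutputStream into a fresh chunk once the current chunk
// reaches a size limit, so no single byte array approaches 2GB.
public class ChunkedOutputStream extends OutputStream {
    private final int chunkLimit; // would be near Integer.MAX_VALUE in practice
    private final List<ByteArrayOutputStream> chunks = new ArrayList<>();
    private ByteArrayOutputStream current;

    public ChunkedOutputStream(int chunkLimit) {
        this.chunkLimit = chunkLimit;
        this.current = new ByteArrayOutputStream();
        chunks.add(current);
    }

    // Start a new chunk if the current one has hit the limit.
    private void maybeSplit() {
        if (current.size() >= chunkLimit) {
            current = new ByteArrayOutputStream();
            chunks.add(current);
        }
    }

    @Override
    public void write(int b) {
        maybeSplit();
        current.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) {
        // Fill the current chunk up to its limit, then roll over.
        while (len > 0) {
            maybeSplit();
            int n = Math.min(chunkLimit - current.size(), len);
            current.write(b, off, n);
            off += n;
            len -= n;
        }
    }

    public int chunkCount() { return chunks.size(); }

    public long totalSize() {
        long total = 0;
        for (ByteArrayOutputStream c : chunks) total += c.size();
        return total;
    }

    public static void main(String[] args) {
        // Tiny 4-byte limit for illustration only.
        ChunkedOutputStream out = new ChunkedOutputStream(4);
        out.write(new byte[10], 0, 10);
        // 10 bytes split across chunks of at most 4 bytes -> 3 chunks.
        System.out.println(out.chunkCount() + " chunks, " + out.totalSize() + " bytes");
    }
}
```

The real fix would also need to handle bit-level positions and the merge/collapse logic Daffodil applies when buffered streams are finalized, which this sketch ignores.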



--
This message was sent by Atlassian Jira
(v8.3.2#803003)