You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@daffodil.apache.org by GitBox <gi...@apache.org> on 2019/10/31 16:58:36 UTC

[GitHub] [incubator-daffodil] stevedlawrence commented on a change in pull request #281: Remove heap sized limit for parsing

stevedlawrence commented on a change in pull request #281: Remove heap sized limit for parsing
URL: https://github.com/apache/incubator-daffodil/pull/281#discussion_r341250601
 
 

 ##########
 File path: daffodil-io/src/main/scala/org/apache/daffodil/io/InputSource.scala
 ##########
 @@ -340,6 +352,9 @@ class BucketingInputSource(inputStream: java.io.InputStream, bucketSize: Int = 1
       while (bytesStillToGet > 0) {
         val bytesToGetFromCurrentBucket = Math.min(bucketSize - byteIndex, bytesStillToGet).toInt
 
+        if (buckets(bucketIndex.toInt) == null)
 
 Review comment:
   I'm wondering if it's ever possible for bucketIndex to be negative causing an out of bounds exception?
   
   The case I'm thinking:
   1. User creates an InputSourceDataInputStream and eventually backtracks beyond the cache. No error occurs yet, but curBytePostion is set to something before the oldest bucket.
   2. User eventually tries to read and we hit one of these new exceptions and throws it.
   3  We convert that exception to an SDE and user handles SDE appropriately
   4. User tries to start a new parse where the previous one ended using the same InputSourceDataInputStream.
   5. Starting the new parse will cause compact() to be called. I *think* this will ultimately cause bucketIndex to become negative since curBytePosition0b will be less than headBucketBytePosition.
   6. THat would result in an index out of bounds exception here.
   
   So the question, if that's correct, how to handle it? Do we throw the same exception if bucketIndex is also negative? Do we somehow mark the this input stream as invalid and if something tries a new parse we throw a different exception? Something else?
   
   Would be interesting to add a unit test for this case to see if this is right.
   
   I guess more generically, what should happen if there's an SDE and the user tries to use the same InputStream for another parse? In that case, where the parse ended is sort of meaningless and the user shouldn't be able to continue parsing?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services