Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2020/02/18 19:57:44 UTC

[GitHub] [accumulo] dan-blum opened a new issue #1520: UnsynchronizedBuffer max array size is too large

dan-blum opened a new issue #1520: UnsynchronizedBuffer max array size is too large
URL: https://github.com/apache/accumulo/issues/1520
 
 
   When the current array size is greater than 2^30, nextArraySize returns Integer.MAX_VALUE. While this is the theoretical maximum size for a Java array, a given JVM can have a smaller maximum size. For example, HotSpot (at least the build I tested) has a maximum size 2 bytes less than this, so when creating a large Mutation, UnsynchronizedBuffer.reserve can throw "java.lang.OutOfMemoryError: Requested array size exceeds VM limit".
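   A minimal sketch of the kind of change discussed in this thread, not the actual Accumulo code: the class name ArraySizing and the constant MAX_ARRAY_SIZE are hypothetical, and the cap of Integer.MAX_VALUE - 8 is the value mentioned later in the thread as the one ArrayList assumes. Sizes up to 2^30 round up to the next power of two; anything larger gets the capped value instead of Integer.MAX_VALUE.

```java
// Hypothetical helper; illustrates capping array growth below Integer.MAX_VALUE.
final class ArraySizing {
    // Assumption: leave the same headroom that ArrayList hard-codes.
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Returns a capacity >= requested: the next power of two up to 2^30,
    // and the capped maximum beyond that.
    static int nextArraySize(int requested) {
        if (requested < 0) {
            throw new IllegalArgumentException("negative size: " + requested);
        }
        if (requested > MAX_ARRAY_SIZE) {
            throw new IllegalArgumentException("size exceeds cap: " + requested);
        }
        if (requested > (1 << 30)) {
            // The next power of two (2^31) does not fit in an int, so use the cap.
            return MAX_ARRAY_SIZE;
        }
        if (requested == 0) {
            return 1;
        }
        int high = Integer.highestOneBit(requested);
        // Already a power of two: keep it. Otherwise round up to the next one.
        return high == requested ? high : high << 1;
    }
}
```

   With this sketch, a request for 2^30 + 1 bytes yields Integer.MAX_VALUE - 8 rather than Integer.MAX_VALUE. Whether a particular JVM and heap can actually satisfy an allocation that large is a separate question.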

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [accumulo] ctubbsii commented on issue #1520: UnsynchronizedBuffer max array size is too large

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on issue #1520: UnsynchronizedBuffer max array size is too large
URL: https://github.com/apache/accumulo/issues/1520#issuecomment-587955713
 
 
   > The code for ArrayList assumes Integer.MAX_VALUE-8 is the maximum array size, which seems a reasonable number to use - I don't know how to get at it programmatically.
   
   I'd be fine with updating UnsynchronizedBuffer to use that instead, since it's hard-coded in ArrayList, and likely a stable enough number to use. Though... anybody creating mutations big enough to hit this will still be asking for trouble elsewhere. Although we don't impose a limit, I'd subjectively say that anything larger than low 10s of MBs for a mutation is probably too large.


[GitHub] [accumulo] mjwall commented on issue #1520: UnsynchronizedBuffer max array size is too large

Posted by GitBox <gi...@apache.org>.
mjwall commented on issue #1520: UnsynchronizedBuffer max array size is too large
URL: https://github.com/apache/accumulo/issues/1520#issuecomment-588274532
 
 
   > It's not handled in 2.0 - when the buffer first gets beyond 2^30 bytes it will throw an exception, long before the overall Mutation size reaches the limit.
   
   Hey @dan-blum, sorry, I didn't mean to imply this issue is fixed in 2.0, only that the OOM appears to be handled *better* in 2.0, and to link the existing tickets for that change.


[GitHub] [accumulo] mjwall commented on issue #1520: UnsynchronizedBuffer max array size is too large

Posted by GitBox <gi...@apache.org>.
mjwall commented on issue #1520: UnsynchronizedBuffer max array size is too large
URL: https://github.com/apache/accumulo/issues/1520#issuecomment-587957931
 
 
   This is handled in 2.0, I think; see https://issues.apache.org/jira/browse/ACCUMULO-4709 and https://issues.apache.org/jira/browse/ACCUMULO-4708


[GitHub] [accumulo] ctubbsii commented on issue #1520: UnsynchronizedBuffer max array size is too large

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on issue #1520: UnsynchronizedBuffer max array size is too large
URL: https://github.com/apache/accumulo/issues/1520#issuecomment-587945569
 
 
   @dan-blum Creating a mutation that large seems like a bad idea. The exception seems like a reasonable one to throw under circumstances involving such a huge mutation. At the very least, it seems better to let the JVM throw that exception than to try to accommodate specific JVM behaviors.
   
   Is there a way to determine programmatically the actual JVM limit? Or, do you have an alternate solution for limiting the mutation buffer size that doesn't involve hard-coding specific "known" JVM behaviors?
   


[GitHub] [accumulo] asfgit closed issue #1520: UnsynchronizedBuffer max array size is too large

Posted by GitBox <gi...@apache.org>.
asfgit closed issue #1520: UnsynchronizedBuffer max array size is too large
URL: https://github.com/apache/accumulo/issues/1520
 
 
   


[GitHub] [accumulo] ctubbsii commented on issue #1520: UnsynchronizedBuffer max array size is too large

Posted by GitBox <gi...@apache.org>.
ctubbsii commented on issue #1520: UnsynchronizedBuffer max array size is too large
URL: https://github.com/apache/accumulo/issues/1520#issuecomment-587953484
 
 
   > Of course if you end up putting all of the Mutation in a single byte array you will run into the same problem at that point 
   
   That would almost certainly happen later, when the mutation is put on the wire by Thrift.


[GitHub] [accumulo] dan-blum commented on issue #1520: UnsynchronizedBuffer max array size is too large

Posted by GitBox <gi...@apache.org>.
dan-blum commented on issue #1520: UnsynchronizedBuffer max array size is too large
URL: https://github.com/apache/accumulo/issues/1520#issuecomment-588262981
 
 
   I agree that creating very large Mutations is probably not a great idea.
   
   > > The code for ArrayList assumes Integer.MAX_VALUE-8 is the maximum array size, which seems a reasonable number to use - I don't know how to get at it programmatically.
   > 
   > I'd be fine with updating UnsynchronizedBuffer to use that instead, since it's hard-coded in ArrayList, and likely a stable enough number to use. Though... anybody creating mutations big enough to hit this will still be asking for trouble elsewhere. Although we don't impose a limit, I'd subjectively say that anything larger than low 10s of MBs for a mutation is probably too large.
   
   I agree, but note that there's no way (at least as of 1.9.3, I didn't check 2.0) for the caller to see just how large the Mutation has gotten.


[GitHub] [accumulo] dan-blum commented on issue #1520: UnsynchronizedBuffer max array size is too large

Posted by GitBox <gi...@apache.org>.
dan-blum commented on issue #1520: UnsynchronizedBuffer max array size is too large
URL: https://github.com/apache/accumulo/issues/1520#issuecomment-588262260
 
 
   It's not handled in 2.0 - when the buffer first gets beyond 2^30 bytes it will throw an exception, long before the overall Mutation size reaches the limit.


[GitHub] [accumulo] dan-blum commented on issue #1520: UnsynchronizedBuffer max array size is too large

Posted by GitBox <gi...@apache.org>.
dan-blum commented on issue #1520: UnsynchronizedBuffer max array size is too large
URL: https://github.com/apache/accumulo/issues/1520#issuecomment-587952353
 
 
   I agree, but you'd have to convince the people actually using it.
   
   In 2.0.0 Mutation will in fact throw a good exception when you're about to put data that will make the overall Mutation size grow too large, which I think is fine.
   
   Note that this is the overall mutation size, not just this buffer. So in fact if this buffer ever got close to MAX_VALUE in size the Mutation would fail anyway, regardless of JVM. So the simplest solution would be to have Mutation tell UnsynchronizedBuffer the maximum size to use, which subtracts everything else from the Mutation maximum size.
   
   Of course if you end up putting all of the Mutation in a single byte array you will run into the same problem at that point (I didn't check this part of the code). In this case MAX_MUTATION_SIZE should be lower. The code for ArrayList assumes Integer.MAX_VALUE-8 is the maximum array size, which seems a reasonable number to use - I don't know how to get at it programmatically.
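   A rough sketch of the "have Mutation tell the buffer its maximum size" idea, under stated assumptions: BoundedBuffer and its methods are hypothetical names, not Accumulo's UnsynchronizedBuffer API, and the limit passed in stands for whatever remains of the overall Mutation budget. The buffer doubles as it grows but never allocates past the caller's limit, and it fails with a descriptive exception instead of relying on the JVM's array limit.

```java
// Hypothetical buffer whose growth is bounded by a caller-supplied maximum.
final class BoundedBuffer {
    private final int maxSize;
    private byte[] data;
    private int offset;

    BoundedBuffer(int initialCapacity, int maxSize) {
        if (initialCapacity < 0 || maxSize < initialCapacity) {
            throw new IllegalArgumentException("invalid capacity or maximum");
        }
        this.maxSize = maxSize;
        this.data = new byte[initialCapacity];
    }

    // Appends bytes, growing the backing array if needed.
    void add(byte[] bytes) {
        reserve(bytes.length);
        System.arraycopy(bytes, 0, data, offset, bytes.length);
        offset += bytes.length;
    }

    // Ensures room for 'extra' more bytes without exceeding maxSize.
    private void reserve(int extra) {
        long needed = (long) offset + extra; // long arithmetic avoids int overflow
        if (needed > maxSize) {
            throw new IllegalArgumentException(
                "buffer would grow to " + needed + " bytes; limit is " + maxSize);
        }
        if (needed > data.length) {
            // Double the capacity, but clamp to the caller's limit.
            long doubled = Math.max(needed, 2L * data.length);
            data = java.util.Arrays.copyOf(data, (int) Math.min(doubled, maxSize));
        }
    }

    int size() {
        return offset;
    }

    int capacity() {
        return data.length;
    }
}
```

   For example, with a limit of 10 bytes, three 3-byte writes succeed and leave the capacity clamped at 10, and a fourth write is rejected with an IllegalArgumentException.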


[GitHub] [accumulo] ctubbsii edited a comment on issue #1520: UnsynchronizedBuffer max array size is too large

Posted by GitBox <gi...@apache.org>.
ctubbsii edited a comment on issue #1520: UnsynchronizedBuffer max array size is too large
URL: https://github.com/apache/accumulo/issues/1520#issuecomment-587945569
 
 
   @dan-blum Creating a mutation large enough to trigger this seems like a bad idea. So, the exception seems like a reasonable one to throw under circumstances involving such a huge mutation. At the very least, it seems better to let the JVM throw that exception than to try to accommodate specific JVM behaviors.
   
   Is there a way to determine programmatically the actual JVM limit? Or, do you have an alternate solution for limiting the mutation buffer size that doesn't involve hard-coding specific "known" JVM behaviors?
   


[GitHub] [accumulo] dan-blum commented on issue #1520: UnsynchronizedBuffer max array size is too large

Posted by GitBox <gi...@apache.org>.
dan-blum commented on issue #1520: UnsynchronizedBuffer max array size is too large
URL: https://github.com/apache/accumulo/issues/1520#issuecomment-587952695
 
 
   Note, in case it's not clear: the OutOfMemoryError gets thrown when the buffer exceeds half the maximum size. That is admittedly large, but it's not pushing the limit.
