Posted to issues@hbase.apache.org by "Kannan Muthukkaruppan (JIRA)" <ji...@apache.org> on 2010/11/05 19:36:42 UTC

[jira] Created: (HBASE-3199) large response handling: some fixups and cleanups

large response handling: some fixups and cleanups
-------------------------------------------------

                 Key: HBASE-3199
                 URL: https://issues.apache.org/jira/browse/HBASE-3199
             Project: HBase
          Issue Type: Bug
            Reporter: Kannan Muthukkaruppan
            Assignee: Kannan Muthukkaruppan


This may not be common for many use cases, but it would be good to put in a couple of safety nets, as well as some logging, to protect against large responses.

(i) Aravind and I were trying to track down why JVM memory usage was oscillating so much when dealing with very large buffers, rather than OOM'ing or hitting some IndexOutOfBounds-type exception, and this is what we found.

java.io.ByteArrayOutputStream grows its internal buffer by doubling it, and it is supposed to be able to handle "int"-sized buffers (up to 2G). The code which handles "write" (in JDK 1.6) is along the lines of:

{code}
    public synchronized void write(byte b[], int off, int len) {
        if ((off < 0) || (off > b.length) || (len < 0) ||
            ((off + len) > b.length) || ((off + len) < 0)) {
            throw new IndexOutOfBoundsException();
        } else if (len == 0) {
            return;
        }
        int newcount = count + len;
        if (newcount > buf.length) {
            buf = Arrays.copyOf(buf, Math.max(buf.length << 1, newcount));
        }
        System.arraycopy(b, off, buf, count, len);
        count = newcount;
    }
{code}

The "buf.length << 1" will start producing -ve values when buf.length reaches 1G, and "newcount" will instead dictate the size of the buffer allocated. At this point, all attempts to write to the buffer will grow linearly, and the buffer will be resized by only the required amount on each write. Effectively, each write will allocate a new 1G buffer + reqd size buffer, copy the contents, and so on. This will put the process in heavy GC mode (with jvm heap oscillating by several GBs rapidly), and render it practically unusable.

(ii) When serializing a Result, the writeArray method doesn't assert that the resultant size does not overflow an "int".

{code}
    int bufLen = 0;
    for(Result result : results) {
      bufLen += Bytes.SIZEOF_INT;
      if(result == null || result.isEmpty()) {
        continue;
      }
      for(KeyValue key : result.raw()) {
        bufLen += key.getLength() + Bytes.SIZEOF_INT;
      }
    }
{code}

We should do the math in "long" and assert on bufLen values > Integer.MAX_VALUE.
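A minimal sketch of that change, reusing the names from the snippet above (the exact exception to throw is an open question; IOException is just an assumption here):

{code}
long bufLen = 0;  // accumulate in a long so the sum cannot wrap
for (Result result : results) {
  bufLen += Bytes.SIZEOF_INT;
  if (result == null || result.isEmpty()) {
    continue;
  }
  for (KeyValue key : result.raw()) {
    bufLen += key.getLength() + Bytes.SIZEOF_INT;
  }
}
if (bufLen > Integer.MAX_VALUE) {
  // Fail fast with a clear error instead of silently overflowing an int.
  throw new IOException("Result list too large to serialize: " + bufLen + " bytes");
}
{code}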

(iii) In HBaseServer.java, we could add some logging for RPC responses above a certain size threshold.
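A hedged sketch of what such logging could look like (the threshold value, its name, and the message are illustrative only, not the actual HBaseServer code):

{code}
// Hypothetical threshold; in practice this would come from the configuration.
private static final int WARN_RESPONSE_SIZE = 8 * 1024 * 1024;  // 8MB

if (buf.size() > WARN_RESPONSE_SIZE) {
  LOG.warn("Large response of " + buf.size() + " bytes for call: " + call);
}
{code}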

(iv) Increase the buffer size threshold for buffers that are reused by RPC handlers, and make it configurable. Currently, any response buffer above 16k is not reused for the next response (HBaseServer.java).
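As a rough sketch of (iv), the reuse decision could be driven by a configuration value; the key, default, and field names below are made up for illustration and may not match the actual patch:

{code}
// Hypothetical configuration key and default.
private final int maxResponseBufferReuseSize =
    conf.getInt("hbase.ipc.server.response.buffer.reuse.max", 1024 * 1024);

// After a response is sent, keep the buffer for reuse only if it is small enough.
if (response.capacity() > maxResponseBufferReuseSize) {
  response = ByteBuffer.allocate(initialResponseBufferSize);  // drop the oversized buffer
}
{code}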







[jira] Updated: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ryan rawson updated HBASE-3199:
-------------------------------

    Attachment: HBASE-3199-3.txt




[jira] Updated: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HBASE-3199:
---------------------------------

    Attachment: trunkThrottleImage2.patch

After talking to Konstantin, we decided to create the throttler on the fly for each file transfer. This patch does this.




[jira] Updated: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ryan rawson updated HBASE-3199:
-------------------------------

    Attachment: HBASE-3199-2.txt

here is a new version which returns exceptions instead of weirdly failing for really large Result buffers.

For completeness, this patch also needs to have WritableWithSize applied to the multi-result object.




[jira] Updated: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ryan rawson updated HBASE-3199:
-------------------------------

    Attachment: HBASE-3199.txt

Here is my base patch; I'll merge in the other one in a moment.




[jira] Commented: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "Kannan Muthukkaruppan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929451#action_12929451 ] 

Kannan Muthukkaruppan commented on HBASE-3199:
----------------------------------------------

#3 is mainly for debuggability. If there are some bad queries, this will help figure out what the RPC and its params were.

#2. In checkSizeAndGrow(): 

{code}
     ByteBuffer newBuf = ByteBuffer.allocate(buf.capacity() * 2);    <<<<<<<  int overflow possible when we reach 1G range.
{code}         

When buf.capacity() reaches, say, 1G + something, buf.capacity() * 2 will become negative. It is not clear exactly what ByteBuffer.allocate() will do with a negative capacity (presumably throw an IllegalArgumentException), but we don't want this case to cause an exception, since 1G + something is still within an "int". At that point we should resize to 2G, since we can't go past 2G anyway.

So, how about something like:

{code}
 ByteBuffer newBuf = ByteBuffer.allocate((int) Math.min((long) buf.capacity() * 2,
                                                         (long) Integer.MAX_VALUE));
{code}








[jira] Commented: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "Kannan Muthukkaruppan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930159#action_12930159 ] 

Kannan Muthukkaruppan commented on HBASE-3199:
----------------------------------------------

Stack: Please note there were two more int overflow type fixes that Aravind & Hairong pointed out. Have communicated that to Ryan. He'll upload a revised patch.




[jira] Commented: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "Kannan Muthukkaruppan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928741#action_12928741 ] 

Kannan Muthukkaruppan commented on HBASE-3199:
----------------------------------------------

For (i), the thought was to subclass ByteArrayOutputStream, and override the write method to something like:

{code}
  public synchronized void write(byte b[], int off, int len) {
    if ((off < 0) || (off > b.length) || (len < 0) ||
        ((off + len) > b.length) || ((off + len) < 0)) {
      throw new IndexOutOfBoundsException();
    } else if (len == 0) {
      return;
    }

    int newcount = count + len;
    if (newcount > buf.length) {
      int newSize = (int)Math.min((((long)buf.length) << 1),                <<<<<<<< proposed change.
                                  (long)(Integer.MAX_VALUE));      
      buf = Arrays.copyOf(buf, Math.max(newSize, newcount));
    }
    System.arraycopy(b, off, buf, count, len);
    count = newcount;
  }
{code}
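For context, that override would live in a small ByteArrayOutputStream subclass, roughly like the sketch below (the class name is made up here; buf and count are the protected fields inherited from ByteArrayOutputStream):

{code}
// Hypothetical class name; the actual patch may name and place this differently.
public class SafeByteArrayOutputStream extends java.io.ByteArrayOutputStream {
  public SafeByteArrayOutputStream(int initialSize) {
    super(initialSize);
  }
  // ... the write(byte b[], int off, int len) override shown above goes here ...
}
{code}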




[jira] Commented: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "Kannan Muthukkaruppan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929723#action_12929723 ] 

Kannan Muthukkaruppan commented on HBASE-3199:
----------------------------------------------

Ryan: 

In my testing, we are still going into the resize (buffer-doubling) code even for the Result "write". I think there is still some off-by-one error which is causing one more doubling/copy than necessary.

Looking....




[jira] Updated: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3199:
-------------------------

    Fix Version/s: 0.90.0
     Hadoop Flags: [Reviewed]
           Status: Patch Available  (was: Open)

Marking patch available and bringing into 0.90.0.  Important bug fix.




[jira] Updated: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HBASE-3199:
---------------------------------

    Attachment:     (was: trunkThrottleImage2.patch)




[jira] Commented: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928775#action_12928775 ] 

ryan rawson commented on HBASE-3199:
------------------------------------

I have a fix for all that and more. I'll post it in 2 hours.





[jira] Commented: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "Kannan Muthukkaruppan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929712#action_12929712 ] 

Kannan Muthukkaruppan commented on HBASE-3199:
----------------------------------------------

I think we need something like this at the beginning of HbaseObjectWritable.java:getWritableSize()

{code}
    if (instance == null) {
      return 0L; // no hint is the default
    }
{code}




[jira] Assigned: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "Kannan Muthukkaruppan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kannan Muthukkaruppan reassigned HBASE-3199:
--------------------------------------------

    Assignee: ryan rawson  (was: Kannan Muthukkaruppan)

Reassigning to you since it is pretty much going to be your patch.




[jira] Commented: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "Kannan Muthukkaruppan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929708#action_12929708 ] 

Kannan Muthukkaruppan commented on HBASE-3199:
----------------------------------------------

Seeing this in testing:
{code}
2010-11-08 11:42:10,380 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 15 on 62469 caught: java.lang.NullPointerException
        at org.apache.hadoop.hbase.io.HbaseObjectWritable.getWritableSize(HbaseObjectWritable.java:301)
        at org.apache.hadoop.hbase.io.HbaseObjectWritable.getWritableSize(HbaseObjectWritable.java:235)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:933)
{code}

Seems like "instance" inside HBaseObjectWritable can be null, and the code is currently not safely handling that case.





[jira] Commented: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "Kannan Muthukkaruppan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929793#action_12929793 ] 

Kannan Muthukkaruppan commented on HBASE-3199:
----------------------------------------------

+1 on the latest patch. The off-by-one errors seem gone. I'm running the unit tests as well right now.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928871#action_12928871 ] 

ryan rawson commented on HBASE-3199:
------------------------------------

Looks like my patch does nearly everything yours does except handle extremely large replies. I'm going to go through that and use longs, and also add some exceptions so that those calls fail with IOEs instead of choking the regionserver.
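For what it's worth, a rough sketch of that kind of check (the method name is invented; the loop mirrors the writeArray snippet quoted in the description, with the total computed in a long and oversized responses rejected with an IOException):

{code}
// Sketch only -- not the actual patch.
static int checkedSerializedSize(Result[] results) throws IOException {
  long bufLen = 0;
  for (Result result : results) {
    bufLen += Bytes.SIZEOF_INT;                    // per-result KeyValue count
    if (result == null || result.isEmpty()) {
      continue;
    }
    for (KeyValue kv : result.raw()) {
      bufLen += kv.getLength() + Bytes.SIZEOF_INT; // length prefix + kv bytes
    }
  }
  if (bufLen > Integer.MAX_VALUE) {
    throw new IOException("Result[] too large to serialize: " + bufLen + " bytes");
  }
  return (int) bufLen;
}
{code}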

> large response handling: some fixups and cleanups
> -------------------------------------------------
>
>                 Key: HBASE-3199
>                 URL: https://issues.apache.org/jira/browse/HBASE-3199
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Kannan Muthukkaruppan
>            Assignee: Kannan Muthukkaruppan
>         Attachments: HBASE-3199.txt, HBASE-3199_prelim.txt


[jira] Commented: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928776#action_12928776 ] 

Hairong Kuang commented on HBASE-3199:
--------------------------------------

newcount calculation should guard against overflow too.
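In other words, a short sketch of the guard being suggested (assuming the same count/len fields as the quoted JDK code; the exact exception type is a matter of taste, anything explicit beats silent wraparound):

{code}
long newcount = (long) count + len;  // widen before adding so the sum cannot wrap
if (newcount > Integer.MAX_VALUE) {
  throw new OutOfMemoryError("Requested buffer size exceeds 2GB: " + newcount);
}
{code}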

> large response handling: some fixups and cleanups
> -------------------------------------------------
>
>                 Key: HBASE-3199
>                 URL: https://issues.apache.org/jira/browse/HBASE-3199
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Kannan Muthukkaruppan
>            Assignee: Kannan Muthukkaruppan


[jira] Updated: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ryan rawson updated HBASE-3199:
-------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Resolved; thanks for the reviews and suggestions. All feedback has been rolled into the committed patch (which was not published to JIRA).

> large response handling: some fixups and cleanups
> -------------------------------------------------
>
>                 Key: HBASE-3199
>                 URL: https://issues.apache.org/jira/browse/HBASE-3199
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Kannan Muthukkaruppan
>            Assignee: ryan rawson
>             Fix For: 0.90.0
>
>         Attachments: HBASE-3199-2.txt, HBASE-3199-3.txt, HBASE-3199-4.txt, HBASE-3199.txt, HBASE-3199_prelim.txt


[jira] Commented: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "Kannan Muthukkaruppan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929087#action_12929087 ] 

Kannan Muthukkaruppan commented on HBASE-3199:
----------------------------------------------

My first comment above didn't format correctly...

1) In the size computation for the Result.write() case, the "int" per kv is already accounted for in the loop, so it seems like we are double counting those.


> large response handling: some fixups and cleanups
> -------------------------------------------------
>
>                 Key: HBASE-3199
>                 URL: https://issues.apache.org/jira/browse/HBASE-3199
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Kannan Muthukkaruppan
>            Assignee: Kannan Muthukkaruppan
>         Attachments: HBASE-3199-2.txt, HBASE-3199.txt, HBASE-3199_prelim.txt


[jira] Commented: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929446#action_12929446 ] 

ryan rawson commented on HBASE-3199:
------------------------------------

1. you are totally correct here!  I fixed that... I had also fixed it in a subsequent patch in my git.

2. We won't need explicit protection; we get it from ByteBuffer.allocate() (see the small illustration after this list).

3. What is the motivation for this feature? I could add it, but I'm not seeing the use case personally.
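A small illustration of the built-in protection referred to in 2.: ByteBuffer.allocate rejects a negative (i.e. overflowed) capacity up front.

{code}
import java.nio.ByteBuffer;

public class AllocateGuardDemo {
  public static void main(String[] args) {
    ByteBuffer ok = ByteBuffer.allocate(16);   // fine
    ByteBuffer bad = ByteBuffer.allocate(-1);  // throws IllegalArgumentException
  }
}
{code}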

> large response handling: some fixups and cleanups
> -------------------------------------------------
>
>                 Key: HBASE-3199
>                 URL: https://issues.apache.org/jira/browse/HBASE-3199
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Kannan Muthukkaruppan
>            Assignee: Kannan Muthukkaruppan
>         Attachments: HBASE-3199-2.txt, HBASE-3199.txt, HBASE-3199_prelim.txt


[jira] Commented: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "Kannan Muthukkaruppan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929084#action_12929084 ] 

Kannan Muthukkaruppan commented on HBASE-3199:
----------------------------------------------

Ryan,

Looks awesome. Nice work eliminating a bunch of copies. A few minor comments:

1) In the size computation (a corrected sketch follows these comments):
{code}
+    long size = kvs.length;                                           <<<< The "int" per key is accounted for in the loop below.
+    size *= Bytes.SIZEOF_INT; // 1 int for each kv.   <<<< So this seems unnecessary.
+
+    size += Bytes.SIZEOF_INT;
+
+    for (KeyValue kv : kvs) {
+      size += kv.getLength() + Bytes.SIZEOF_INT;
+    }
{code}

2) checkSizeAndGrow() could probably use some protection against Integer.MAX_VALUE overflow.  (Similar to what my patch does in the HBaseByteArrayOutputStream:write() methods).

3) Will you be adding the "log responses above a configurable threshold" changes from my patch?
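Re 1), with the double-counted per-kv int dropped the computation would look roughly like this (a sketch against the snippet quoted above, not the committed code; the standalone int is whatever single extra field the original snippet accounts for, e.g. the kv count):

{code}
long size = Bytes.SIZEOF_INT;                  // the one extra int from the original snippet
for (KeyValue kv : kvs) {
  size += kv.getLength() + Bytes.SIZEOF_INT;   // per kv: its int length prefix + the kv bytes
}
{code}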




> large response handling: some fixups and cleanups
> -------------------------------------------------
>
>                 Key: HBASE-3199
>                 URL: https://issues.apache.org/jira/browse/HBASE-3199
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Kannan Muthukkaruppan
>            Assignee: Kannan Muthukkaruppan
>         Attachments: HBASE-3199-2.txt, HBASE-3199.txt, HBASE-3199_prelim.txt


[jira] Updated: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "Kannan Muthukkaruppan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kannan Muthukkaruppan updated HBASE-3199:
-----------------------------------------

    Attachment: HBASE-3199_prelim.txt

Prelim patch for review/merge with Ryan's work.

Chatted with Ryan on IRC; he has a more in-depth fix that avoids the whole "double the buffer as you grow" approach and replaces it with "precompute the size of the buffer in one pass, then allocate exactly what you need". So a good portion of my patch might be superseded by his. Still submitting my patch so that Ryan can merge in the parts of this patch that are orthogonal to his changes.
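Roughly, the shape of that "size it in one pass, then allocate exactly once" approach (a sketch only, reusing the hypothetical checkedSerializedSize from the earlier sketch in this archive; not the actual patch):

{code}
// One sizing pass, then a single exactly-sized allocation, instead of growing a
// ByteArrayOutputStream by doubling while serializing.
int size = checkedSerializedSize(results);   // long math + 2GB check, as sketched earlier
ByteBuffer bb = ByteBuffer.allocate(size);   // one allocation, no re-copying as data is appended
// ... the writer then serializes each Result's KeyValues straight into bb ...
{code}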

> large response handling: some fixups and cleanups
> -------------------------------------------------
>
>                 Key: HBASE-3199
>                 URL: https://issues.apache.org/jira/browse/HBASE-3199
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Kannan Muthukkaruppan
>            Assignee: Kannan Muthukkaruppan
>         Attachments: HBASE-3199_prelim.txt


[jira] Commented: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928774#action_12928774 ] 

Hairong Kuang commented on HBASE-3199:
--------------------------------------

Oops, commented on the wrong jira. I wish that I could delete the previous comment.

> large response handling: some fixups and cleanups
> -------------------------------------------------
>
>                 Key: HBASE-3199
>                 URL: https://issues.apache.org/jira/browse/HBASE-3199
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Kannan Muthukkaruppan
>            Assignee: Kannan Muthukkaruppan


[jira] Updated: (HBASE-3199) large response handling: some fixups and cleanups

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ryan rawson updated HBASE-3199:
-------------------------------

    Attachment: HBASE-3199-4.txt

A patch with fixes; implements all of Kannan's suggestions.

> large response handling: some fixups and cleanups
> -------------------------------------------------
>
>                 Key: HBASE-3199
>                 URL: https://issues.apache.org/jira/browse/HBASE-3199
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Kannan Muthukkaruppan
>            Assignee: ryan rawson
>         Attachments: HBASE-3199-2.txt, HBASE-3199-3.txt, HBASE-3199-4.txt, HBASE-3199.txt, HBASE-3199_prelim.txt
-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.