You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@geode.apache.org by "Bill Burcham (Jira)" <ji...@apache.org> on 2021/11/19 02:35:00 UTC

[jira] [Updated] (GEODE-9825) Disparate socket-buffer-size Results in "Unknown header byte" Exceptions and Hangs

     [ https://issues.apache.org/jira/browse/GEODE-9825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Burcham updated GEODE-9825:
--------------------------------
    Description: 
GEODE-9141 introduced a bug that causes hangs

In {{{}Connection.processInputBuffer(){}}}. When that method has read all the messages it can from the current input buffer, it then considers whether the buffer needs expansion. If it does then:
{code:java}
inputBuffer = inputSharing.expandReadBufferIfNeeded(allocSize); {code}
Is executed and the method returns. The caller then expects to be able to _write_ bytes into {{{}inputBuffer{}}}.

The problem, it seems, is that {{ByteBufferSharingInternalImpl.expandReadBufferIfNeeded()}} does not leave the the {{ByteBuffer}} in the proper state. It leaves the buffer ready to be _read_ not written.

The line of code referenced above used to be this method in {{Connection}} (which has since been removed):
{code:java}
private void compactOrResizeBuffer(int messageLength) {
  final int oldBufferSize = inputBuffer.capacity();
  int allocSize = messageLength + MSG_HEADER_BYTES;
  if (oldBufferSize < allocSize) {
    // need a bigger buffer
    logger.info("Allocating larger network read buffer, new size is {} old size was {}.",
        allocSize, oldBufferSize);
    ByteBuffer oldBuffer = inputBuffer;
    inputBuffer = getBufferPool().acquireDirectReceiveBuffer(allocSize);

    if (oldBuffer != null) {
      int oldByteCount = oldBuffer.remaining();
      inputBuffer.put(oldBuffer);
      inputBuffer.position(oldByteCount);
      getBufferPool().releaseReceiveBuffer(oldBuffer);
    }
  } else {
    if (inputBuffer.position() != 0) {
      inputBuffer.compact();
    } else {
      inputBuffer.position(inputBuffer.limit());
      inputBuffer.limit(inputBuffer.capacity());
    }
  }
} {code}
Notice how this method leaves {{inputBuffer}} ready to be _written_ to.

It's not sufficient to simply call {{flip()}} on the inputBuffer before returning it (I tried it and it didn't fix the bug). More work is needed.

To reproduce this bug, modify {{P2PMessagingConcurrencyDUnitTest}} so that sender and locator and receiver use different configuration parameters. Set {{socket-buffer-size}} to 212992 for the sender and 32 * 1024 for the receiver. Oh and just skip the call to {{{}securityProperties(){}}}—we want to induce the "Unknown header byte" exception—we don't want the TLS framework throwing exceptions.

When this ticket is complete {{P2PMessagingConcurrencyDUnitTest}} will be enhanced to test these combinations:

[security, sender/locator socket-buffer-size, receiver socket-buffer-size]

[TLS, (default), (default)]  this is what the test currently does
[no TLS, 212992, 32 * 1024] *new: this illustrates this bug*
[no TLS, (default), (default)] *new*

We might want to mix in conserve-sockets true/false in there too while we're at it (the test currently holds it at true).

  was:
GEODE-9141 introduced a bug in {{{}Connection.processInputBuffer(){}}}. When that method has read all the messages it can from the current input buffer, it then considers whether the buffer needs expansion. If it does then:

 
{code:java}
inputBuffer = inputSharing.expandReadBufferIfNeeded(allocSize); {code}
Is executed and the method returns. The caller then expects to be able to _write_ bytes into {{{}inputBuffer{}}}.

 

The problem, it seems, is that {{ByteBufferSharingInternalImpl.expandReadBufferIfNeeded()}} does not leave the the {{ByteBuffer}} in the proper state. It leaves the buffer ready to be _read_ not written.

The line of code referenced above used to be this method in {{Connection}} (which has since been removed):

 
{code:java}
private void compactOrResizeBuffer(int messageLength) {
  final int oldBufferSize = inputBuffer.capacity();
  int allocSize = messageLength + MSG_HEADER_BYTES;
  if (oldBufferSize < allocSize) {
    // need a bigger buffer
    logger.info("Allocating larger network read buffer, new size is {} old size was {}.",
        allocSize, oldBufferSize);
    ByteBuffer oldBuffer = inputBuffer;
    inputBuffer = getBufferPool().acquireDirectReceiveBuffer(allocSize);

    if (oldBuffer != null) {
      int oldByteCount = oldBuffer.remaining();
      inputBuffer.put(oldBuffer);
      inputBuffer.position(oldByteCount);
      getBufferPool().releaseReceiveBuffer(oldBuffer);
    }
  } else {
    if (inputBuffer.position() != 0) {
      inputBuffer.compact();
    } else {
      inputBuffer.position(inputBuffer.limit());
      inputBuffer.limit(inputBuffer.capacity());
    }
  }
} {code}
Notice how this method leaves {{inputBuffer}} ready to be _written_ to.

 

It's not sufficient to simply call {{flip()}} on the inputBuffer before returning it (I tried it and it didn't fix the bug). More work is needed.

To reproduce this bug, modify {{P2PMessagingConcurrencyDUnitTest}} so that sender and locator and receiver use different configuration parameters. Set {{socket-buffer-size}} to 212992 for the sender and 32 * 1024 for the receiver. Oh and just skip the call to {{{}securityProperties(){}}}—we want to induce the "Unknown header byte" exception—we don't want the TLS framework throwing exceptions.

When this ticket is complete {{P2PMessagingConcurrencyDUnitTest}} will be enhanced to test these combinations:

[security, sender/locator socket-buffer-size, receiver socket-buffer-size]

[TLS, (default), (default)]  this is what the test currently does
[no TLS, 212992, 32 * 1024] *new: this illustrates this bug*
[no TLS, (default), (default)] *new*

We might want to mix in conserve-sockets true/false in there too while we're at it (the test currently holds it at true).


> Disparate socket-buffer-size Results in "Unknown header byte" Exceptions and Hangs
> ----------------------------------------------------------------------------------
>
>                 Key: GEODE-9825
>                 URL: https://issues.apache.org/jira/browse/GEODE-9825
>             Project: Geode
>          Issue Type: Bug
>          Components: messaging
>    Affects Versions: 1.12.4, 1.15.0
>            Reporter: Bill Burcham
>            Priority: Major
>
> GEODE-9141 introduced a bug that causes hangs
> In {{{}Connection.processInputBuffer(){}}}. When that method has read all the messages it can from the current input buffer, it then considers whether the buffer needs expansion. If it does then:
> {code:java}
> inputBuffer = inputSharing.expandReadBufferIfNeeded(allocSize); {code}
> Is executed and the method returns. The caller then expects to be able to _write_ bytes into {{{}inputBuffer{}}}.
> The problem, it seems, is that {{ByteBufferSharingInternalImpl.expandReadBufferIfNeeded()}} does not leave the the {{ByteBuffer}} in the proper state. It leaves the buffer ready to be _read_ not written.
> The line of code referenced above used to be this method in {{Connection}} (which has since been removed):
> {code:java}
> private void compactOrResizeBuffer(int messageLength) {
>   final int oldBufferSize = inputBuffer.capacity();
>   int allocSize = messageLength + MSG_HEADER_BYTES;
>   if (oldBufferSize < allocSize) {
>     // need a bigger buffer
>     logger.info("Allocating larger network read buffer, new size is {} old size was {}.",
>         allocSize, oldBufferSize);
>     ByteBuffer oldBuffer = inputBuffer;
>     inputBuffer = getBufferPool().acquireDirectReceiveBuffer(allocSize);
>     if (oldBuffer != null) {
>       int oldByteCount = oldBuffer.remaining();
>       inputBuffer.put(oldBuffer);
>       inputBuffer.position(oldByteCount);
>       getBufferPool().releaseReceiveBuffer(oldBuffer);
>     }
>   } else {
>     if (inputBuffer.position() != 0) {
>       inputBuffer.compact();
>     } else {
>       inputBuffer.position(inputBuffer.limit());
>       inputBuffer.limit(inputBuffer.capacity());
>     }
>   }
> } {code}
> Notice how this method leaves {{inputBuffer}} ready to be _written_ to.
> It's not sufficient to simply call {{flip()}} on the inputBuffer before returning it (I tried it and it didn't fix the bug). More work is needed.
> To reproduce this bug, modify {{P2PMessagingConcurrencyDUnitTest}} so that sender and locator and receiver use different configuration parameters. Set {{socket-buffer-size}} to 212992 for the sender and 32 * 1024 for the receiver. Oh and just skip the call to {{{}securityProperties(){}}}—we want to induce the "Unknown header byte" exception—we don't want the TLS framework throwing exceptions.
> When this ticket is complete {{P2PMessagingConcurrencyDUnitTest}} will be enhanced to test these combinations:
> [security, sender/locator socket-buffer-size, receiver socket-buffer-size]
> [TLS, (default), (default)]  this is what the test currently does
> [no TLS, 212992, 32 * 1024] *new: this illustrates this bug*
> [no TLS, (default), (default)] *new*
> We might want to mix in conserve-sockets true/false in there too while we're at it (the test currently holds it at true).



--
This message was sent by Atlassian Jira
(v8.20.1#820001)