Posted to issues@trafficserver.apache.org by "Wilson Ho (Created) (JIRA)" <ji...@apache.org> on 2011/12/13 00:59:31 UTC

[jira] [Created] (TS-1049) TS hangs (dead lock) on HTTPS POST requests

TS hangs (dead lock) on HTTPS POST requests
-------------------------------------------

Key: TS-1049
URL: https://issues.apache.org/jira/browse/TS-1049
Project: Traffic Server
Issue Type: Bug
Components: Core, HTTP, SSL
Affects Versions: 3.1.0, 3.1.1, 3.0.2
Environment: RedHat Enterprise Linux 6.0, Intel 32-bit
Reporter: Wilson Ho
Priority: Blocker

A very reproducible bug where the body of an HTTPS POST request is never forwarded to the origin server.

The client submits an HTTPS POST request to TS, which is supposed to forward it to the backend/origin server via HTTP. TS processes the HTTP headers and establishes a connection to the origin server, but the body of the HTTPS POST is never read. The request hangs until the client times out and shuts down the connection.

To reproduce:
1) Client connects to TS using HTTPS (works OK if it is just HTTP).
2) It must be a POST request.
3) TS must use at least 2 worker threads.
4) Easier to reproduce when the connection to the origin server is HTTP (not HTTPS).
5) POST body must be large enough so that the HTTP request headers and POST body do *NOT* fit within the same TCP packet. (2000 bytes is a good size)
6) I can consistently reproduce this problem using 2 separate clients each simultaneously submitting 2 requests back to back (i.e., 2 requests from each client, a total of 4 requests). This gives you a high probability that at least one of the requests would hang.
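
For anyone trying to reproduce steps 2 and 5, the sketch below builds such a request in C++ as two separate buffers, so that a test client can write the headers first and the body in a later write. The names SplitPost and make_split_post, the host, and the path are made up for illustration and are not part of the report; the TLS connection itself is left out.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper type: the two pieces a test client would send separately.
struct SplitPost {
  std::string headers; // request line and headers, terminated by an empty line
  std::string body;    // payload, meant to be written in a later send
};

// Build a POST whose body is body_size bytes of filler.
// host and path are placeholders chosen by the caller, not values from the report.
SplitPost
make_split_post(const std::string &host, const std::string &path, std::size_t body_size)
{
  SplitPost req;
  req.body.assign(body_size, 'x');
  req.headers = "POST " + path + " HTTP/1.1\r\n"
                "Host: " + host + "\r\n"
                "Content-Type: application/octet-stream\r\n"
                "Content-Length: " + std::to_string(body_size) + "\r\n"
                "\r\n";
  return req;
}
```

A client would send req.headers, flush, and then send req.body. Whether the two writes really end up in separate TCP packets depends on the TLS library and the network stack, so this only makes the split likely.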

Observation:
1) Thread A accepted and processed the HTTP headers, and called "UnixNetProcessor::connect_re" to prepare a new connection to the origin server.
2) Thread A must not have read the body of the POST. Otherwise, it works fine.
3) Thread B was assigned the task to handle the origin server connection. If the same thread A was picked, then everything works fine.
4) Apparently, one of the first things that thread B does is to acquire the mutex for reading from the client. (Why does it do that??)
5) While thread B was holding the mutex, thread A proceeded in "SSLNetVConnection::net_read_io", and tried and failed to acquire the mutex. Thread A typically retried "SSLNetVConnection::net_read_io" soon afterwards, but gave up after the second failure. If thread B released the mutex soon enough, thread A could proceed and everything worked.
6) From this point on, the body of the POST is never read from the client, there is nothing to be proxied to the origin server, and neither the consumer nor the producer task is scheduled to run again until the client times out. I tried setting the client-side timeout to as long as 3-5 minutes, and TS does not recover by itself until the client closes the connection.
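
The sequence in observations 4-6 can be pictured with a small, single-threaded toy model. This is only a sketch under assumptions, not Traffic Server code: ToyTryLock, ToyConn, read_io_no_retry and run_once are hypothetical names, and the model assumes that a readiness event is delivered once and is not repeated unless something re-queues the connection.

```cpp
#include <atomic>
#include <cstddef>
#include <deque>
#include <string>

// Minimal try-lock, standing in for the mutex that guards the client read side.
class ToyTryLock {
public:
  bool try_lock()
  {
    bool expected = false;
    return locked_.compare_exchange_strong(expected, true);
  }
  void unlock() { locked_.store(false); }

private:
  std::atomic<bool> locked_{false};
};

// Toy connection with request-body bytes still waiting to be read.
struct ToyConn {
  ToyTryLock vio_lock;     // plays the role of the read-side mutex
  std::string socket_data; // bytes waiting on the "socket"
  std::string consumed;    // bytes already handed to the proxy logic
};

// Read handler that simply returns when the lock is contended.
void
read_io_no_retry(ToyConn &c)
{
  if (!c.vio_lock.try_lock()) {
    return; // nothing re-queues the connection, so the readiness event is lost
  }
  c.consumed += c.socket_data;
  c.socket_data.clear();
  c.vio_lock.unlock();
}

// One pass over the ready list; each entry is removed before its handler runs.
void
run_once(std::deque<ToyConn *> &ready)
{
  const std::size_t n = ready.size();
  for (std::size_t i = 0; i < n; ++i) {
    ToyConn *c = ready.front();
    ready.pop_front();
    read_io_no_retry(*c);
  }
}
```

If another party holds vio_lock while run_once() executes, the ready list ends up empty with socket_data still unread, which matches the stall described above: no later pass will ever look at the connection again.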

This is the first time I have used this bug system. Please let me know how I can provide the configuration files, trace logs, etc. Thanks!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (TS-1049) TS hangs (dead lock) on HTTPS POST requests

Posted by "Wilson Ho (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/TS-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168096#comment-13168096 ] 

Wilson Ho commented on TS-1049:
-------------------------------

Adding a call to "readReschedule()" seems to make the problem go away.  But I have no idea whether this is the right thing to do, or whether there is a better way.  Please advise!

In file SSLNetVConnection.cc:

void
SSLNetVConnection::net_read_io(NetHandler * nh, EThread * lthread)
{
  int ret;
  int64_t r = 0;
  int64_t bytes = 0;
  NetState *s = &this->read;
  MIOBufferAccessor & buf = s->vio.buffer;

  MUTEX_TRY_LOCK_FOR(lock, s->vio.mutex, lthread, s->vio._cont);
  if (!lock) {
    // The try-lock failed: ask the net handler to retry this read later.
    readReschedule(nh);              // <<<<-------------------- added
    return;
  }
  // ... rest of the function not shown ...
}
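
Here is a toy model of why the added call might help, again as a sketch rather than the actual Traffic Server implementation. ToyTryLock, ToyConn and ToyNetHandler are hypothetical names; the only idea taken from the patch above is that a read handler which fails to get the lock puts the connection back on the ready list instead of just returning.

```cpp
#include <atomic>
#include <cstddef>
#include <deque>
#include <string>

// Minimal try-lock, standing in for the mutex that guards the client read side.
class ToyTryLock {
public:
  bool try_lock()
  {
    bool expected = false;
    return locked_.compare_exchange_strong(expected, true);
  }
  void unlock() { locked_.store(false); }

private:
  std::atomic<bool> locked_{false};
};

// Toy connection with request-body bytes still waiting to be read.
struct ToyConn {
  ToyTryLock vio_lock;
  std::string socket_data;
  std::string consumed;
};

// Toy event loop: a ready list plus a read handler that re-queues on contention.
struct ToyNetHandler {
  std::deque<ToyConn *> ready;

  void read_io(ToyConn &c)
  {
    if (!c.vio_lock.try_lock()) {
      ready.push_back(&c); // retry on a later pass instead of dropping the event
      return;
    }
    c.consumed += c.socket_data;
    c.socket_data.clear();
    c.vio_lock.unlock();
  }

  // One pass over the entries queued at entry; re-queued entries wait for the next pass.
  void run_once()
  {
    const std::size_t n = ready.size();
    for (std::size_t i = 0; i < n; ++i) {
      ToyConn *c = ready.front();
      ready.pop_front();
      read_io(*c);
    }
  }
};
```

With the lock held elsewhere, run_once() leaves the connection queued; once the lock is released, the next run_once() reads the pending bytes.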

                
> TS hangs (dead lock) on HTTPS POST requests
> -------------------------------------------
>
>                 Key: TS-1049
>                 URL: https://issues.apache.org/jira/browse/TS-1049
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core, HTTP, SSL
>    Affects Versions: 3.1.1, 3.1.0, 3.0.2
>         Environment: RedHat Enterprise Linux 6.0, Intel 32-bit
>            Reporter: Wilson Ho
>            Priority: Blocker
>         Attachments: records.config

[jira] [Updated] (TS-1049) TS hangs (dead lock) on HTTPS POST requests

Posted by "Igor Galić (Updated JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Igor Galić updated TS-1049:
---------------------------

    Description: 
A very reproducible bug where the body of an HTTPS POST request is never forwarded to the origin server.

The client submits an HTTPS POST request to TS, which is supposed to forward it to the backend/origin server via HTTP.  TS processes the HTTP headers and establishes a connection to the origin server, but the body of the HTTPS POST is never read.  The request hangs until the client times out and shuts down the connection.

To reproduce:
# Client connects to TS using HTTPS (works OK if it is just HTTP).
# It must be a POST request.
# TS must use at least 2 worker threads.
# Easier to reproduce when the connection to the origin server is HTTP (not HTTPS).
# POST body must be large enough so that the HTTP request headers and POST body do *NOT* fit within the same TCP packet. (2000 bytes is a good size)
# I can consistently reproduce this problem using 2 separate clients each simultaneously submitting 2 requests back to back (i.e., 2 requests from each client, a total of 4 requests).  This gives you a high probability that at least one of the requests would hang.

Observation:
# Thread A accepted and processed the HTTP headers, and called "UnixNetProcessor::connect_re" to prepare a new connection to the origin server.
# Thread A must not have read the body of the POST.  Otherwise, it works fine.
# Thread B was assigned the task to handle the origin server connection.  If the same thread A was picked, then everything works fine.
# Apparently, one of the first things that thread B does is to acquire the mutex for reading from the client.  (Why does it do that??)
# While thread B was holding the mutex, thread A proceeded in "SSLNetVConnection::net_read_io", and tried and failed to acquire the mutex.  Thread A typically retried "SSLNetVConnection::net_read_io" soon afterwards, but gave up after the second failure.  If thread B released the mutex soon enough, thread A could proceed and everything worked.
# From this point on, the body of the POST is never read from the client, there is nothing to be proxied to the origin server, and neither the consumer nor the producer task is scheduled to run again until the client times out.  I tried setting the client-side timeout to as long as 3-5 minutes, and TS does not recover by itself until the client closes the connection.

This is the first time I have used this bug system.  Please let me know how I can provide the configuration files, trace logs, etc.  Thanks!



    
> TS hangs (dead lock) on HTTPS POST requests
> -------------------------------------------
>
>                 Key: TS-1049
>                 URL: https://issues.apache.org/jira/browse/TS-1049
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core, HTTP, SSL
>    Affects Versions: 3.1.1, 3.1.0, 3.0.2
>         Environment: RedHat Enterprise Linux 6.0, Intel 32-bit
>            Reporter: Wilson Ho
>            Assignee: Leif Hedstrom
>            Priority: Blocker
>             Fix For: 3.1.2
>
>         Attachments: records.config

[jira] [Reopened] (TS-1049) TS hangs (dead lock) on HTTPS POST requests

Posted by "Igor Galić (Reopened JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Igor Galić reopened TS-1049:
----------------------------


Reopening for backport.
                
> TS hangs (dead lock) on HTTPS POST requests
> -------------------------------------------
>
>                 Key: TS-1049
>                 URL: https://issues.apache.org/jira/browse/TS-1049
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core, HTTP, SSL
>    Affects Versions: 3.1.1, 3.1.0, 3.0.2
>         Environment: RedHat Enterprise Linux 6.0, Intel 32-bit
>            Reporter: Wilson Ho
>            Assignee: Igor Galić
>            Priority: Blocker
>             Fix For: 3.1.2
>
>         Attachments: records.config

[jira] [Updated] (TS-1049) TS hangs (dead lock) on HTTPS POST requests

Posted by "Wilson Ho (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilson Ho updated TS-1049:
--------------------------

    Attachment: records.config
    

[jira] [Commented] (TS-1049) TS hangs (dead lock) on HTTPS POST requests

Posted by "Leif Hedstrom (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/TS-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182309#comment-13182309 ] 

Leif Hedstrom commented on TS-1049:
-----------------------------------

Having read the code for both the normal NetVC and the SSL NetVC, I'm fairly certain your suggestion is correct.

Thanks!
                

[jira] [Updated] (TS-1049) TS hangs (dead lock) on HTTPS POST requests

Posted by "Igor Galić (Updated JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Igor Galić updated TS-1049:
---------------------------

    Backport to Version: 3.0.3
    

[jira] [Updated] (TS-1049) TS hangs (dead lock) on HTTPS POST requests

Posted by "Igor Galić (Updated JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Igor Galić updated TS-1049:
---------------------------

    Backport to Version: 3.0.5  (was: 3.0.3)
    
> TS hangs (dead lock) on HTTPS POST requests
> -------------------------------------------
>
>                 Key: TS-1049
>                 URL: https://issues.apache.org/jira/browse/TS-1049
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core, HTTP, SSL
>    Affects Versions: 3.1.1, 3.1.0, 3.0.2
>         Environment: RedHat Enterprise Linux 6.0, Intel 32-bit
>            Reporter: Wilson Ho
>            Assignee: Igor Galić
>            Priority: Blocker
>             Fix For: 3.1.2
>
>         Attachments: records.config
>
>
> A very reproducible bug where the body of an HTTPS POST request is never forwarded to the origin server.
> The client submits an HTTPS POST request to TS, which is supposed to forward it to the backend/origin server via HTTP.  TS processes the HTTP headers and establishes a connection to the origin server, but the body of the HTTPS POST is never read.  The request hangs until the client times out and shuts down the connection.
> To reproduce:
> 1) Client connects to TS using HTTPS (works OK if it is just HTTP).
> 2) It must be a POST request.
> 3) TS must use at least 2 worker threads.
> 4) Easier to reproduce when the connection to the origin server is HTTP (not HTTPS).
> 5) POST body must be large enough so that the HTTP request headers and POST body do *NOT* fit within the same TCP packet. (2000 bytes is a good size)
> 6) I can consistently reproduce this problem using 2 separate clients each simultaneously submitting 2 requests back to back (i.e., 2 requests from each client, a total of 4 requests).  This gives a high probability that at least one of the requests will hang.
> Observation:
> 1) Thread A accepted and processed the HTTP headers, and called "UnixNetProcessor::connect_re" to prepare a new connection to the origin server.
> 2) Thread A must not have read the body of the POST.  Otherwise, it works fine.
> 3) Thread B was assigned the task of handling the origin server connection.  If the same thread A was picked, then everything works fine.
> 4) Apparently, one of the first things that thread B does is to acquire the mutex for reading from the client.  (Why does it do that??)
> 5) While thread B was holding the mutex, thread A proceeded in "SSLNetVConnection::net_read_io" and tried, but failed, to acquire the mutex.  Thread A typically retried "SSLNetVConnection::net_read_io" soon afterwards, but gave up after the second failure.  If thread B released the mutex soon enough, thread A could proceed and everything worked.
> 6) From this point on, the body of the POST is never read from the client, so there is nothing to be proxied to the origin server, and neither the consumer nor the producer task is scheduled to run again until the client times out.  I tried setting the client-side timeout to as long as 3-5 minutes, and TS does not recover by itself until the client closes the connection.
> This is the first time I have used this bug system.  Please let me know how I can provide the configuration files, trace logs, etc.  Thanks!
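
The reproduction steps above could be scripted roughly as follows. This is a minimal sketch under stated assumptions, not the reporter's actual test client: the function names are hypothetical, "back to back" is interpreted as two sequential connections per client, and the short pause between the two writes is only an attempt to keep the headers and the body out of the same TCP packet (it does not guarantee it).

```python
import socket
import ssl
import threading
import time

BODY_SIZE = 2000  # size suggested in the report


def build_request(host, path, body):
    """Return (header_bytes, body_bytes) for a simple HTTP/1.1 POST."""
    headers = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/octet-stream\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return headers, body


def post_once(host, port, path, timeout=30.0):
    """Send one HTTPS POST; return the status line, or None on timeout."""
    headers, body = build_request(host, path, b"x" * BODY_SIZE)
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # assumption: test proxy uses a
    ctx.verify_mode = ssl.CERT_NONE  # self-signed certificate
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                tls.sendall(headers)  # headers go out first ...
                time.sleep(0.05)      # ... then a pause ...
                tls.sendall(body)     # ... then the body in a separate write
                reply = tls.recv(4096)
    except socket.timeout:
        return None  # the hang described in the report would show up here
    return reply.split(b"\r\n", 1)[0].decode("latin-1") if reply else ""


def client(host, port, path, results, requests_per_client=2):
    """One client submitting its requests one after another."""
    for _ in range(requests_per_client):
        results.append(post_once(host, port, path))


def run(host, port, path="/", clients=2):
    """Start the clients simultaneously and collect one result per request."""
    results = []
    threads = [
        threading.Thread(target=client, args=(host, port, path, results))
        for _ in range(clients)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run("ts.example.test", 443)` (a placeholder host) returns four entries; a `None` entry would indicate a request that timed out waiting for a response.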


[jira] [Resolved] (TS-1049) TS hangs (dead lock) on HTTPS POST requests

Posted by "Leif Hedstrom (Resolved) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Leif Hedstrom resolved TS-1049.
-------------------------------

    Resolution: Fixed

Great job Wilson!
                
> TS hangs (dead lock) on HTTPS POST requests
> -------------------------------------------
>
>                 Key: TS-1049
>                 URL: https://issues.apache.org/jira/browse/TS-1049
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core, HTTP, SSL
>    Affects Versions: 3.1.1, 3.1.0, 3.0.2
>         Environment: RedHat Enterprise Linux 6.0, Intel 32-bit
>            Reporter: Wilson Ho
>            Assignee: Leif Hedstrom
>            Priority: Blocker
>             Fix For: 3.1.2
>
>         Attachments: records.config
>
>


[jira] [Assigned] (TS-1049) TS hangs (dead lock) on HTTPS POST requests

Posted by "Leif Hedstrom (Assigned) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Leif Hedstrom reassigned TS-1049:
---------------------------------

    Assignee: Leif Hedstrom
    
> TS hangs (dead lock) on HTTPS POST requests
> -------------------------------------------
>
>                 Key: TS-1049
>                 URL: https://issues.apache.org/jira/browse/TS-1049
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core, HTTP, SSL
>    Affects Versions: 3.1.1, 3.1.0, 3.0.2
>         Environment: RedHat Enterprise Linux 6.0, Intel 32-bit
>            Reporter: Wilson Ho
>            Assignee: Leif Hedstrom
>            Priority: Blocker
>             Fix For: 3.1.2
>
>         Attachments: records.config
>
>


[jira] [Commented] (TS-1049) TS hangs (dead lock) on HTTPS POST requests

Posted by "Leif Hedstrom (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/TS-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181861#comment-13181861 ] 

Leif Hedstrom commented on TS-1049:
-----------------------------------

Interesting. Did you try turning off sharing of origin connections (or setting it to "2", which creates a connection pool per thread)? Not saying that's a "fix", but I'm curious to hear if it helps (and if it does, it's a viable workaround).

I'll also look at your suggested patch, and see if it makes sense :)

Thanks!

-- leif
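
For reference, the setting mentioned above is presumably the records.config entry shown below. The comment does not name it, so both the record name and its availability in the affected versions are assumptions to verify against the documentation for the installed release.

```
# Assumed record name; 0 = no sharing, 1 = global pool, 2 = per-thread pool
CONFIG proxy.config.http.share_server_sessions INT 2
```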

                
> TS hangs (dead lock) on HTTPS POST requests
> -------------------------------------------
>
>                 Key: TS-1049
>                 URL: https://issues.apache.org/jira/browse/TS-1049
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core, HTTP, SSL
>    Affects Versions: 3.1.1, 3.1.0, 3.0.2
>         Environment: RedHat Enterprise Linux 6.0, Intel 32-bit
>            Reporter: Wilson Ho
>            Assignee: Leif Hedstrom
>            Priority: Blocker
>             Fix For: 3.1.2
>
>         Attachments: records.config
>
>


[jira] [Assigned] (TS-1049) TS hangs (dead lock) on HTTPS POST requests

Posted by "Brian Geffon (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brian Geffon reassigned TS-1049:
--------------------------------

    Assignee: Brian Geffon  (was: Igor Galić)
    
> TS hangs (dead lock) on HTTPS POST requests
> -------------------------------------------
>
>                 Key: TS-1049
>                 URL: https://issues.apache.org/jira/browse/TS-1049
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core, HTTP, SSL
>    Affects Versions: 3.1.1, 3.1.0, 3.0.2
>         Environment: RedHat Enterprise Linux 6.0, Intel 32-bit
>            Reporter: Wilson Ho
>            Assignee: Brian Geffon
>            Priority: Blocker
>             Fix For: 3.1.2
>
>         Attachments: records.config
>
>


[jira] [Resolved] (TS-1049) TS hangs (dead lock) on HTTPS POST requests

Posted by "Brian Geffon (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brian Geffon resolved TS-1049.
------------------------------

       Resolution: Fixed
    Fix Version/s: 3.0.5

Backport to 3.0.x in cb40d2c285082955f3aeab60186a0dfaf085b7c9
                
> TS hangs (dead lock) on HTTPS POST requests
> -------------------------------------------
>
>                 Key: TS-1049
>                 URL: https://issues.apache.org/jira/browse/TS-1049
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core, HTTP, SSL
>    Affects Versions: 3.1.1, 3.1.0, 3.0.2
>         Environment: RedHat Enterprise Linux 6.0, Intel 32-bit
>            Reporter: Wilson Ho
>            Assignee: Brian Geffon
>            Priority: Blocker
>             Fix For: 3.0.5, 3.1.2
>
>         Attachments: records.config
>
>


[jira] [Updated] (TS-1049) TS hangs (dead lock) on HTTPS POST requests

Posted by "Leif Hedstrom (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Leif Hedstrom updated TS-1049:
------------------------------

    Fix Version/s: 3.1.2
    
> TS hangs (dead lock) on HTTPS POST requests
> -------------------------------------------
>
>                 Key: TS-1049
>                 URL: https://issues.apache.org/jira/browse/TS-1049
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core, HTTP, SSL
>    Affects Versions: 3.1.1, 3.1.0, 3.0.2
>         Environment: RedHat Enterprise Linux 6.0, Intel 32-bit
>            Reporter: Wilson Ho
>            Priority: Blocker
>             Fix For: 3.1.2
>
>         Attachments: records.config
>
>


[jira] [Assigned] (TS-1049) TS hangs (dead lock) on HTTPS POST requests

Posted by "Igor Galić (Assigned JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/TS-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Igor Galić reassigned TS-1049:
------------------------------

    Assignee: Igor Galić  (was: Leif Hedstrom)
    
> TS hangs (dead lock) on HTTPS POST requests
> -------------------------------------------
>
>                 Key: TS-1049
>                 URL: https://issues.apache.org/jira/browse/TS-1049
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Core, HTTP, SSL
>    Affects Versions: 3.1.1, 3.1.0, 3.0.2
>         Environment: RedHat Enterprise Linux 6.0, Intel 32-bit
>            Reporter: Wilson Ho
>            Assignee: Igor Galić
>            Priority: Blocker
>             Fix For: 3.1.2
>
>         Attachments: records.config
>
>
