Posted to dev@trafficserver.apache.org by "Craig Schomburg (craigs)" <cr...@cisco.com> on 2016/06/03 19:43:02 UTC

Question on recommended ATS release and update on SSLNetVConnection IOBufferBlock management issue

Leif and Susan,



Based on your knowledge of releases 6.0.0 and later, which would you recommend as the most stable release so far: 6.1.1 (the last release posted on the web site), or is there a 6.2 release that is considered stable, is already deployed, and would be a better option?



Also, we are just about to FCS a new internal product release based on 6.0.0.  Are there any other known deployments of 6.0.0 that have been reported as stable?



We are still digging into the SSLNetVConnection IOBufferBlock management issue mentioned in our previous threads, but we are considering looking at a later release in parallel.



As an update for Susan and Alan: we are making progress on our SSLNetVConnection IOBufferBlock issue but are still digging for the root cause (slow digging).  It turns out the SSL_ERROR_ZERO_RETURN data point we mentioned previously was on a different session, so it was unrelated to this issue.



What we have determined thus far is:

· We have a client that is generating ~3 SSL sessions in parallel (same source IP, different source ports).

· On the session in error, it issues 3 GET requests serially (each is ~1.6K), and the packet carrying each GET arrives in 2 fragments.

· As each HTTP/1.1 GET is received, we work our way through the first IOBufferBlock, and on the 3rd GET we have insufficient buffer space for the complete packet (buffer_write_avail = 0, b->next = NULL).

· The first part of the 3rd GET is sent off to HTTP for processing and we get PARSE_CONT / SSL_ERROR_NONE.  The problem we see is that net_read_io() is never re-invoked to resume the remainder of the read.
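
To make the buffer arithmetic in the bullets above concrete, here is a minimal C++ sketch of a single fixed-size block being consumed by serial reads without ever being reset.  It is only an illustration under stated assumptions, not ATS code: Block, read_into_chain() and simulate_session() are hypothetical names and do not correspond to the real IOBufferBlock or ssl_read_from_net() implementations.  The 4096-byte block and the 1900/2120-byte reads come from the trace in the earlier message quoted below; the 1600-byte third request is assumed from the "~1.6K" figure.  The append_when_full option loosely mirrors the add_block() hack from that earlier message, with a cap on the chain length added as an assumption.  The sketch does not model the net_read_io() re-invocation problem.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a fixed-size buffer block (NOT the ATS IOBufferBlock).
struct Block {
  std::size_t capacity;
  std::size_t used = 0;         // bytes written so far; never reset in this model
  std::unique_ptr<Block> next;  // next block in the chain, if any

  explicit Block(std::size_t cap) : capacity(cap) {}
  std::size_t write_avail() const { return capacity - used; }
};

// Stores up to `len` bytes in the chain starting at `head` and returns how many
// bytes fit. When the tail block is full, a new block is linked only if
// `append_when_full` is set and the chain is shorter than `max_blocks`.
std::size_t read_into_chain(Block &head, std::size_t len, bool append_when_full,
                            std::size_t max_blocks)
{
  assert(max_blocks >= 1);

  // Walk to the tail block, counting the blocks already in the chain.
  Block *b = &head;
  std::size_t count = 1;
  while (b->next) {
    b = b->next.get();
    ++count;
  }

  std::size_t stored = 0;
  while (stored < len) {
    if (b->write_avail() == 0) {
      if (!append_when_full || count >= max_blocks) {
        break; // no room and no next block: the rest of the request is not read
      }
      b->next = std::make_unique<Block>(head.capacity);
      b = b->next.get();
      ++count;
    }
    std::size_t n = std::min(b->write_avail(), len - stored);
    b->used += n;
    stored += n;
  }
  return stored;
}

// Replays a series of request sizes against one session buffer and returns the
// number of bytes stored for each request.
std::vector<std::size_t> simulate_session(const std::vector<std::size_t> &sizes,
                                          std::size_t block_size,
                                          bool append_when_full,
                                          std::size_t max_blocks)
{
  Block head(block_size);
  std::vector<std::size_t> stored;
  for (std::size_t len : sizes) {
    stored.push_back(read_into_chain(head, len, append_when_full, max_blocks));
  }
  return stored;
}
```

Calling simulate_session({1900, 2120, 1600}, 4096, false, 2) stores 1900, 2120 and then only 76 bytes, which matches the 4096 to 2196 to 76 to 0 write_avail() progression we observed.  With append_when_full set to true, the third request is stored in full across two blocks.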



In our live network environment we can recreate this at will.  One unique item in the live client sequence is that, just before we get the 3rd GET on the initial SSL session, one of the other SSL sessions on this client is terminated.  This was the source of the SSL_ERROR_ZERO_RETURN message we reported on the earlier thread.  So we see the SSL_ERROR_ZERO_RETURN on the unrelated session, followed by the incomplete GET processing on the other session on the same client.



If we use our test harness to do the same without the extra session termination (start an SSL session, send 3 GETs that cause multiple read passes on the 3rd GET), we never see the failure.



Just curious whether any of this sounds like other issues folks may be familiar with in ATS 6.0.0?

Craig S.



On 5/27/16, 11:19 AM, "Susan Hinrichs" <sh...@network-geographics.com> wrote:



>There is a PR that uses the buffer interface instead of the block
>interface which results in simpler code.  We are running this code
>internally in Yahoo.  It fixed a performance problem introduced by a fix
>not yet landed in open source.  Since the current code works, I haven't
>pushed this PR.  But if debugging anything in this area, I'd suggest
>first moving to the buffer interface.
>
>https://github.com/apache/trafficserver/pull/629/files
>
>Alan just rediscovered how additional blocks are added in the existing
>code. I'll let him respond with details on that.
>
>Also, I'd suggest moving up to 6.1 or 6.2.x rather than 6.0.0.  I don't
>think many folks have deployed 6.0.  We are starting to deploy 6.2 and
>we tested a bit with 6.1 (and others have deployed 6.1).
>

>On 5/27/2016 9:46 AM, Craig Schomburg (craigs) wrote:

>> Hey folks,
>>
>> We are encountering an SSLNetVConnection IOBufferBlock buffer management
>> issue in ATS 6.0.0 that we did not see in the earlier ATS 4.0.1 release
>> which we were using.
>>
>> What we see in ssl_read_from_net() is that when we get multiple GET
>> requests on a single SSL session, as each GET is processed and
>> ACK/NACK’ed, the buffer is not reset and its space is not released for
>> reuse.  As a result, the available write_avail() space in the session
>> IOBufferBlock buffer is reduced with each subsequent packet until we have
>> insufficient space to buffer the packet.
>>
>> It also appears that ATS is set up to support a chain of 2 IOBufferBlocks,
>> but since only 1 is allocated we bail out of the read loop in
>> ssl_read_from_net() with an incomplete packet and then drop it.
>>

>> Request                       Response          Txn-ID  VC
>> ----------------------------  ----------------  ------  --------------
>> GET /call/187972?debug=1      200 OK             4      0x560bb93e6420
>>    b->write_avail()=4096, nread=0
>>    b->write_avail()=4096, nread=1900 (2196 left in buffer)
>>    nread=0                PARSE DONE
>>
>> GET /call/widget.jsp...       200 OK             5      0x560bb93e6420
>>    b->write_avail()=2196, nread=0
>>    b->write_avail()=2196, nread=2120 (  76 left in buffer)
>>    nread=0                PARSE DONE
>>
>> GET /call/js/libs/require.js  304 Not Modified   6      0x560bb93e6420
>>    b->write_avail()=76,   nread=0
>>    b->write_avail()=76,   nread=76   (   0 left)
>>    b->next is NULL so ssl_read_from_net() bails on read loop and remainder
>>      of packet is not read

>>

>> We hacked the ssl_read_from_net() code in the SSL_ERROR_NONE case to
>> add_block() if b->next == NULL and block_write_avail == 0, and that
>> “appeared” to get us working again, but I am not convinced that was the
>> correct solution.  I am concerned because it appears that other areas of
>> the code assume there will never be more than 2 buffers in the list, and
>> we did not put a limit on the list length.
>>
>> So my question is: when should the IOBufferBlock _end and _start have
>> been reset() to free the buffer space?  I assumed that since we were seeing
>> a serial Packet(GET), ACK, Packet(GET), ACK, Packet(GET), ACK, the
>> buffer space could/should have been reset after each ACK?
>>
>> Also curious if this is a known issue with ATS 6.0.0 that has been
>> addressed or is known/unaddressed?
>>
>> Continuing to dig through the code in the meantime.  Any feedback, insight,
>> etc. would be appreciated…
>>
>> Thanks,
>>
>> Craig S.
>>
>