You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@trafficserver.apache.org by zwoop <gi...@git.apache.org> on 2017/03/03 16:57:43 UTC

[GitHub] trafficserver issue #1531: Assertion in state_read_server_response_header (v...

GitHub user zwoop opened an issue:

    https://github.com/apache/trafficserver/issues/1531

    Assertion in state_read_server_response_header (v7.1.0)

    Seeing this on docs (infrequently):
    
    ```
    (gdb) bt full
    #0  0x00007ffff57b21d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
            resultvar = 0
            pid = 5856
            selftid = 5862
    #1  0x00007ffff57b38c8 in __GI_abort () at abort.c:90
            save_stage = 2
            act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0 <repeats 16 times>}}, sa_flags = 0, sa_restorer = 0x300a00440}
            sigs = {__val = {32, 0 <repeats 15 times>}}
    #2  0x00007ffff7bb28fe in ink_abort (message_format=0x7ffff7bc9160 "%s:%d: failed assertion `%s`") at ink_error.cc:99
            ap = {{gp_offset = 32, fp_offset = 48, overflow_arg_area = 0x2aaaaaffe960, reg_save_area = 0x2aaaaaffe8a0}}
    #3  0x00007ffff7baffbf in _ink_assert (expression=0x902628 "server_entry->read_vio == (VIO *)data", file=0x9018ec "HttpSM.cc", line=1811) at ink_assert.cc:37
    No locals.
    #4  0x000000000064dbac in HttpSM::state_read_server_response_header (this=0x2aaab2ecc300, event=3, data=0x2aaadfe10fe0) at HttpSM.cc:1811
            __FUNCTION__ = "state_read_server_response_header"
            bytes_used = -1426065664
            vio = 0x0
            state = 10922
    #5  0x0000000000650eca in HttpSM::main_handler (this=0x2aaab2ecc300, event=3, data=0x2aaadfe10fe0) at HttpSM.cc:2663
            jump_point = (int (HttpSM::*)(HttpSM * const, int, void *)) 0x64da58 <HttpSM::state_read_server_response_header(int, void*)>
            __FUNCTION__ = "main_handler"
            vc_entry = 0x2aaab2ecdb08
    #6  0x000000000057472a in Continuation::handleEvent (this=0x2aaab2ecc300, event=3, data=0x2aaadfe10fe0) at /usr/local/src/trafficserver/iocore/eventsystem/I_Continuation.h:153
    No locals.
    #7  0x0000000000806eed in write_signal_and_update (event=3, vc=0x2aaadfe10e40) at UnixNetVConnection.cc:176
            __FUNCTION__ = "write_signal_and_update"
    #8  0x0000000000807124 in write_signal_done (event=3, nh=0x7ffff3009e50, vc=0x2aaadfe10e40) at UnixNetVConnection.cc:218
    No locals.
    #9  0x00000000008071b4 in write_signal_error (nh=0x7ffff3009e50, vc=0x2aaadfe10e40, lerrno=32) at UnixNetVConnection.cc:237
    No locals.
    #10 0x00000000008080d0 in write_to_net_io (nh=0x7ffff3009e50, vc=0x2aaadfe10e40, thread=0x7ffff3006000) at UnixNetVConnection.cc:463
            err = 32
            errlen = 4
            s = 0x2aaadfe10fd8
            mutex = 0x7ffff3512db0
            lock = {m = {m_ptr = 0x2aaab0402ca0}, lock_acquired = true}
            __FUNCTION__ = "write_to_net_io"
            ntodo = 910
            buf = <error reading variable>
            towrite = 46913388875328
            signalled = 0
            needs = 32
            total_written = 4
            r = 140737294027776
    #11 0x0000000000807ead in write_to_net (nh=0x7ffff3009e50, vc=0x2aaadfe10e40, thread=0x7ffff3006000) at UnixNetVConnection.cc:435
            mutex = 0x7ffff3512db0
    #12 0x00000000007ff2d8 in NetHandler::mainNetEvent (this=0x7ffff3009e50, event=5, e=0x2aaaabc05ca0) at UnixNet.cc:526
            epd = 0x2aaadfe10da8
            poll_timeout = 0
            pd = 0x7ffff2f1d000
            vc = 0x2aaadfe10e40
            __FUNCTION__ = "mainNetEvent"
    #13 0x000000000057472a in Continuation::handleEvent (this=0x7ffff3009e50, event=5, data=0x2aaaabc05ca0) at /usr/local/src/trafficserver/iocore/eventsystem/I_Continuation.h:153
    No locals.
    #14 0x000000000082b522 in EThread::process_event (this=0x7ffff3006000, e=0x2aaaabc05ca0, calling_code=5) at UnixEThread.cc:143
            c_temp = 0x7ffff3009e50
            lock = {m = {m_ptr = 0x7ffff3512480}, lock_acquired = true}
            __FUNCTION__ = "process_event"
    #15 0x000000000082bb01 in EThread::execute (this=0x7ffff3006000) at UnixEThread.cc:270
            done_one = false
            e = 0x2aaaabc05ca0
            NegativeQueue = {<DLL<Event, Event::Link_link>> = {head = 0x2aaaabc04f80}, tail = 0x2aaaabc041a0}
            next_time = 1488388269045403248
            __FUNCTION__ = "execute"
    #16 0x000000000082ab3c in spawn_thread_internal (a=0x7ffff4539fa0) at Thread.cc:84
            p = 0x7ffff4539fa0
    #17 0x00007ffff604bdc5 in start_thread (arg=0x2aaaaafff700) at pthread_create.c:308
            __res = <optimized out>
            pd = 0x2aaaaafff700
            now = <optimized out>
            unwind_buf = {cancel_jmp_buf = {{jmp_buf = {46912501708544, 8358616470746199031, 0, 140737488348672, 46912501708544, 0, 2786198315644908535, -8358635357929704457}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
            not_first_call = <optimized out>
            pagesize_m1 = <optimized out>
            sp = <optimized out>
            freesize = <optimized out>
    #18 0x00007ffff587473d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
    No locals.
    ```

----

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver issue #1531: Assertion in state_read_server_response_header (v...

Posted by shinrich <gi...@git.apache.org>.
Github user shinrich commented on the issue:

    https://github.com/apache/trafficserver/issues/1531
  
    The server_entry->read_vio is NULL.  From what I can tell this value only gets set due to do_io_read() calls in attach_server_session() and setup_server_read_response_header().  Presumably one of these do_io_read operations failed. 
    
    Looking at the stack, we had a failure which is getting trigger via the write path (errno 32 broken pipe).
    
    I'm guessing that we should have started handling the failure where the do_io_read failed presumably in setup_server_read_response_header().


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver issue #1531: Assertion in state_read_server_response_header (v...

Posted by scw00 <gi...@git.apache.org>.
Github user scw00 commented on the issue:

    https://github.com/apache/trafficserver/issues/1531
  
    Based on #1522 we need to ensure that all the handler should handle read error event and write error event at same time.
    
    I'll track it !


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver issue #1531: Assertion in state_read_server_response_header (v...

Posted by oknet <gi...@git.apache.org>.
Github user oknet commented on the issue:

    https://github.com/apache/trafficserver/issues/1531
  
    before #947, the EPOLLERR event only set write.triggered to 1, iocore only callback SM if write.enabled==1 and write.triggered==1.
    after #947 , the EPOLLERR set write.error to 1, iocore could callback SM if write.error==1 and write.triggered==1 event write.enable==0. (the assert you met in this isssue).
    
    @zwoop , please try to fix this issue by merge #1522 .


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver issue #1531: Assertion in state_read_server_response_header (v...

Posted by zwoop <gi...@git.apache.org>.
Github user zwoop commented on the issue:

    https://github.com/apache/trafficserver/issues/1531
  
    Hmmm, I see
    
    ```
    (gdb) frame 4
    #4  0x000000000064dbac in HttpSM::state_read_server_response_header (this=0x2aaab2ecc300, event=3, data=0x2aaadfe10fe0) at HttpSM.cc:1811
    1811	  ink_assert(server_entry->read_vio == (VIO *)data);
    (gdb) p *server_entry
    $5 = {vc = 0x2aaae0604c60, read_buffer = 0x0, write_buffer = 0x0, read_vio = 0x2aaadfe10f70, write_vio = 0x2aaadfe10fe0, vc_handler = (int (HttpSM::*)(HttpSM * const, int, void *)) 0x64da58 <HttpSM::state_read_server_response_header(int, void*)>, vc_type = HTTP_SERVER_VC, eos = true, in_tunnel = false}
    (gdb) p (VIO*)data
    $6 = (VIO *) 0x2aaadfe10fe0
    ```
    
    So, as far as I can tell, "data" is set to the write_vio, but the assert checks against the read_vio.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver issue #1531: Assertion in state_read_server_response_header (v...

Posted by oknet <gi...@git.apache.org>.
Github user oknet commented on the issue:

    https://github.com/apache/trafficserver/issues/1531
  
    I created a PR #1559 to resolve this issue.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver issue #1531: Assertion in state_read_server_response_header (v...

Posted by zwoop <gi...@git.apache.org>.
GitHub user zwoop reopened an issue:

    https://github.com/apache/trafficserver/issues/1531

    Assertion in state_read_server_response_header (v7.1.0)

    Seeing this on docs (infrequently):
    
    ```
    (gdb) bt full
    #0  0x00007ffff57b21d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
            resultvar = 0
            pid = 5856
            selftid = 5862
    #1  0x00007ffff57b38c8 in __GI_abort () at abort.c:90
            save_stage = 2
            act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {0 <repeats 16 times>}}, sa_flags = 0, sa_restorer = 0x300a00440}
            sigs = {__val = {32, 0 <repeats 15 times>}}
    #2  0x00007ffff7bb28fe in ink_abort (message_format=0x7ffff7bc9160 "%s:%d: failed assertion `%s`") at ink_error.cc:99
            ap = {{gp_offset = 32, fp_offset = 48, overflow_arg_area = 0x2aaaaaffe960, reg_save_area = 0x2aaaaaffe8a0}}
    #3  0x00007ffff7baffbf in _ink_assert (expression=0x902628 "server_entry->read_vio == (VIO *)data", file=0x9018ec "HttpSM.cc", line=1811) at ink_assert.cc:37
    No locals.
    #4  0x000000000064dbac in HttpSM::state_read_server_response_header (this=0x2aaab2ecc300, event=3, data=0x2aaadfe10fe0) at HttpSM.cc:1811
            __FUNCTION__ = "state_read_server_response_header"
            bytes_used = -1426065664
            vio = 0x0
            state = 10922
    #5  0x0000000000650eca in HttpSM::main_handler (this=0x2aaab2ecc300, event=3, data=0x2aaadfe10fe0) at HttpSM.cc:2663
            jump_point = (int (HttpSM::*)(HttpSM * const, int, void *)) 0x64da58 <HttpSM::state_read_server_response_header(int, void*)>
            __FUNCTION__ = "main_handler"
            vc_entry = 0x2aaab2ecdb08
    #6  0x000000000057472a in Continuation::handleEvent (this=0x2aaab2ecc300, event=3, data=0x2aaadfe10fe0) at /usr/local/src/trafficserver/iocore/eventsystem/I_Continuation.h:153
    No locals.
    #7  0x0000000000806eed in write_signal_and_update (event=3, vc=0x2aaadfe10e40) at UnixNetVConnection.cc:176
            __FUNCTION__ = "write_signal_and_update"
    #8  0x0000000000807124 in write_signal_done (event=3, nh=0x7ffff3009e50, vc=0x2aaadfe10e40) at UnixNetVConnection.cc:218
    No locals.
    #9  0x00000000008071b4 in write_signal_error (nh=0x7ffff3009e50, vc=0x2aaadfe10e40, lerrno=32) at UnixNetVConnection.cc:237
    No locals.
    #10 0x00000000008080d0 in write_to_net_io (nh=0x7ffff3009e50, vc=0x2aaadfe10e40, thread=0x7ffff3006000) at UnixNetVConnection.cc:463
            err = 32
            errlen = 4
            s = 0x2aaadfe10fd8
            mutex = 0x7ffff3512db0
            lock = {m = {m_ptr = 0x2aaab0402ca0}, lock_acquired = true}
            __FUNCTION__ = "write_to_net_io"
            ntodo = 910
            buf = <error reading variable>
            towrite = 46913388875328
            signalled = 0
            needs = 32
            total_written = 4
            r = 140737294027776
    #11 0x0000000000807ead in write_to_net (nh=0x7ffff3009e50, vc=0x2aaadfe10e40, thread=0x7ffff3006000) at UnixNetVConnection.cc:435
            mutex = 0x7ffff3512db0
    #12 0x00000000007ff2d8 in NetHandler::mainNetEvent (this=0x7ffff3009e50, event=5, e=0x2aaaabc05ca0) at UnixNet.cc:526
            epd = 0x2aaadfe10da8
            poll_timeout = 0
            pd = 0x7ffff2f1d000
            vc = 0x2aaadfe10e40
            __FUNCTION__ = "mainNetEvent"
    #13 0x000000000057472a in Continuation::handleEvent (this=0x7ffff3009e50, event=5, data=0x2aaaabc05ca0) at /usr/local/src/trafficserver/iocore/eventsystem/I_Continuation.h:153
    No locals.
    #14 0x000000000082b522 in EThread::process_event (this=0x7ffff3006000, e=0x2aaaabc05ca0, calling_code=5) at UnixEThread.cc:143
            c_temp = 0x7ffff3009e50
            lock = {m = {m_ptr = 0x7ffff3512480}, lock_acquired = true}
            __FUNCTION__ = "process_event"
    #15 0x000000000082bb01 in EThread::execute (this=0x7ffff3006000) at UnixEThread.cc:270
            done_one = false
            e = 0x2aaaabc05ca0
            NegativeQueue = {<DLL<Event, Event::Link_link>> = {head = 0x2aaaabc04f80}, tail = 0x2aaaabc041a0}
            next_time = 1488388269045403248
            __FUNCTION__ = "execute"
    #16 0x000000000082ab3c in spawn_thread_internal (a=0x7ffff4539fa0) at Thread.cc:84
            p = 0x7ffff4539fa0
    #17 0x00007ffff604bdc5 in start_thread (arg=0x2aaaaafff700) at pthread_create.c:308
            __res = <optimized out>
            pd = 0x2aaaaafff700
            now = <optimized out>
            unwind_buf = {cancel_jmp_buf = {{jmp_buf = {46912501708544, 8358616470746199031, 0, 140737488348672, 46912501708544, 0, 2786198315644908535, -8358635357929704457}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
            not_first_call = <optimized out>
            pagesize_m1 = <optimized out>
            sp = <optimized out>
            freesize = <optimized out>
    #18 0x00007ffff587473d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
    No locals.
    ```

----

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver issue #1531: Assertion in state_read_server_response_header (v...

Posted by shinrich <gi...@git.apache.org>.
Github user shinrich commented on the issue:

    https://github.com/apache/trafficserver/issues/1531
  
    Oh, nevermind.  I saw vio = NULL in the stack, but that is the local variable.
    
    That makes sense that the event would be passing up the write via.  It is a write event we are processing.  But it looks like the handler is only expecting a read event.   I guess a broken pipe shows up as a write event.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver issue #1531: Assertion in state_read_server_response_header (v...

Posted by zwoop <gi...@git.apache.org>.
Github user zwoop commented on the issue:

    https://github.com/apache/trafficserver/issues/1531
  
    I got this again on Docs.
    
    ```
    (gdb) bt full
    #0  0x00002b5b8bd881d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
            resultvar = 0
            pid = 24448
            selftid = 24454
    #1  0x00002b5b8bd89a08 in __GI_abort () at abort.c:119
            act = {__sigaction_handler = {sa_handler = 0x0, sa_sigaction = 0x0}, sa_mask = {__val = {18446744073709551615 <repeats 16 times>}}, sa_flags = 0, sa_restorer = 0x0}
            sigs = {__val = {32, 0 <repeats 15 times>}}
    #2  0x00002b5b89af48fe in ink_abort (message_format=0x2b5b89b0b160 "%s:%d: failed assertion `%s`") at ink_error.cc:99
            ap = {{gp_offset = 32, fp_offset = 48, overflow_arg_area = 0x2aaaab5fe960, reg_save_area = 0x2aaaab5fe8a0}}
    #3  0x00002b5b89af1fbf in _ink_assert (expression=0x902748 "server_entry->read_vio == (VIO *)data", file=0x901a0c "HttpSM.cc", line=1811) at ink_assert.cc:37
    No locals.
    #4  0x000000000064dcc8 in HttpSM::state_read_server_response_header (this=0x2aaab14ee380, event=3, data=0x2aaaeea0a910) at HttpSM.cc:1811
            __FUNCTION__ = "state_read_server_response_header"
            bytes_used = -1419774208
            vio = 0x0
            state = 10922
    #5  0x0000000000650fe6 in HttpSM::main_handler (this=0x2aaab14ee380, event=3, data=0x2aaaeea0a910) at HttpSM.cc:2663
            jump_point = (int (HttpSM::*)(HttpSM * const, int, void *)) 0x64db74 <HttpSM::state_read_server_response_header(int, void*)>
            __FUNCTION__ = "main_handler"
            vc_entry = 0x2aaab14efb88
    #6  0x0000000000574752 in Continuation::handleEvent (this=0x2aaab14ee380, event=3, data=0x2aaaeea0a910) at /usr/local/src/trafficserver/iocore/eventsystem/I_Continuation.h:153
    No locals.
    #7  0x0000000000807027 in write_signal_and_update (event=3, vc=0x2aaaeea0a770) at UnixNetVConnection.cc:176
            __FUNCTION__ = "write_signal_and_update"
    #8  0x000000000080725e in write_signal_done (event=3, nh=0x2b5b8e409e50, vc=0x2aaaeea0a770) at UnixNetVConnection.cc:218
    No locals.
    #9  0x00000000008072ee in write_signal_error (nh=0x2b5b8e409e50, vc=0x2aaaeea0a770, lerrno=32) at UnixNetVConnection.cc:237
    No locals.
    #10 0x00000000008081de in write_to_net_io (nh=0x2b5b8e409e50, vc=0x2aaaeea0a770, thread=0x2b5b8e406000) at UnixNetVConnection.cc:468
            err = 32
            errlen = 4
            s = 0x2aaaeea0a908
            mutex = 0x2b5b8d914c60
            lock = {m = {m_ptr = 0x2aaac4e02ed0}, lock_acquired = true}
            __FUNCTION__ = "write_to_net_io"
            ntodo = 959
            buf = <error reading variable>
            towrite = 46913636312944
            signalled = 0
            needs = 32
            total_written = 4
            r = 47672214753344
    #11 0x0000000000808023 in write_to_net (nh=0x2b5b8e409e50, vc=0x2aaaeea0a770, thread=0x2b5b8e406000) at UnixNetVConnection.cc:443
            mutex = 0x2b5b8d914c60
    #12 0x00000000007ff412 in NetHandler::mainNetEvent (this=0x2b5b8e409e50, event=5, e=0x2aaaabc05a60) at UnixNet.cc:526
            epd = 0x2aaaeea0ca18
            poll_timeout = 0
            pd = 0x2b5b8ea99000
            vc = 0x2aaaeea0a770
            __FUNCTION__ = "mainNetEvent"
    #13 0x0000000000574752 in Continuation::handleEvent (this=0x2b5b8e409e50, event=5, data=0x2aaaabc05a60) at /usr/local/src/trafficserver/iocore/eventsystem/I_Continuation.h:153
    No locals.
    #14 0x000000000082b640 in EThread::process_event (this=0x2b5b8e406000, e=0x2aaaabc05a60, calling_code=5) at UnixEThread.cc:143
            c_temp = 0x2b5b8e409e50
            lock = {m = {m_ptr = 0x2b5b8d9141e0}, lock_acquired = true}
            __FUNCTION__ = "process_event"
    #15 0x000000000082bc1f in EThread::execute (this=0x2b5b8e406000) at UnixEThread.cc:270
            done_one = false
            e = 0x2aaaabc05a60
            NegativeQueue = {<DLL<Event, Event::Link_link>> = {head = 0x2aaaabc04c80}, tail = 0x2aaaabc04080}
            next_time = 1490586447220437269
            __FUNCTION__ = "execute"
    #16 0x000000000082ac5a in spawn_thread_internal (a=0x2b5b8d6aea90) at Thread.cc:84
            p = 0x2b5b8d6aea90
    #17 0x00002b5b8b638dc5 in start_thread (arg=0x2aaaab5ff700) at pthread_create.c:308
            __res = <optimized out>
            pd = 0x2aaaab5ff700
            now = <optimized out>
            unwind_buf = {cancel_jmp_buf = {{jmp_buf = {46912508000000, -4231554947831992312, 0, 140736391318576, 46912508000000, 0, -8065056817632415736, -7786325772008460280}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
            not_first_call = <optimized out>
            pagesize_m1 = <optimized out>
            sp = <optimized out>
            freesize = <optimized out>
    #18 0x00002b5b8be4a73d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
    No locals.
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver issue #1531: Assertion in state_read_server_response_header (v...

Posted by oknet <gi...@git.apache.org>.
Github user oknet commented on the issue:

    https://github.com/apache/trafficserver/issues/1531
  
    @zwoop @shinrich It could run into state_read_server_response_header twice if the data is point to write.vio and the handler function doest not do_io_close on vc while the 1st run into it.
    
    after #947 , iocore set read.error and write.error to 1 if EPOLLERR that means for that VC iocore will callback read._cont with EVENT_ERROR first and then callback write._cont with EVENT_ERROR if the VC is not disabled or closed in the first callback.
    
    I think the SM can not receive EVENT_ERROR twice just like it can not receive EVENT_TIMEOUT twice (UnixNetVConnection::mainEvent will ignore the write VIO callback if its _cont is the same as read VIO).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver issue #1531: Assertion in state_read_server_response_header (v...

Posted by scw00 <gi...@git.apache.org>.
Github user scw00 commented on the issue:

    https://github.com/apache/trafficserver/issues/1531
  
    There are reasons for this creash:
    1\u3001ATS try to connect the origin server, but receive RST package. So ATS try again until retry_times == 0. It will clean the server_entry (mark the VC as closed, remove this VC from write_ready_list and read_ready_list, then close it) in every times except the last one(It is easy to re-product), so it will call both read sm and write sm in the last times, and the write sm causes crash.
    2\u3001Epoll trigger EPOLLERR without EVENTIO_READ(hard to re-product).
    
    In my opinion, the EPOLLERR always with read.enabled or write.enabled(otherwise, there is no sm could handle this event, we just ignore). So when EPOLLERR happens, we could push the vc into read_ready_list if the read.enabled == 1, otherwise push it into write_ready_list.
    
    By the way, there is my program to re-product this crash:
    ```
    package main
    
    import (
    	"net"
    	"os"
    	"fmt"
    )
    
    func main()  {
    
    	ls, err := net.Listen("tcp", os.Args[1])
    	if err != nil {
    		fmt.Println(err)
    		os.Exit(1)
    	}
    
    	for {
    		conn, err := ls.Accept()
    		if err != nil {
    			fmt.Println(err)
    			os.Exit(1)
    		}
    		go handleConnection(conn)
    	}
    }
    
    func handleConnection(con net.Conn)  {
    	con.Close()
    }
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver issue #1531: Assertion in state_read_server_response_header (v...

Posted by zwoop <gi...@git.apache.org>.
Github user zwoop closed the issue at:

    https://github.com/apache/trafficserver/issues/1531


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver issue #1531: Assertion in state_read_server_response_header (v...

Posted by scw00 <gi...@git.apache.org>.
Github user scw00 commented on the issue:

    https://github.com/apache/trafficserver/issues/1531
  
    I think, it never call handler with read.enabled = 0(or write.enabled). It means  the handler don not want this type of event .It may never handle this event and just assert!
    [https://github.com/apache/trafficserver/pull/1522](url)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---