Posted to dev@zookeeper.apache.org by "Michael Lee (Updated) (JIRA)" <ji...@apache.org> on 2011/11/24 08:32:40 UTC

[jira] [Updated] (ZOOKEEPER-1306) hang in zookeeper_close()

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Lee updated ZOOKEEPER-1306:
-----------------------------------

    Attachment: close_hang.patch
    
> hang in zookeeper_close()
> -------------------------
>
>                 Key: ZOOKEEPER-1306
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1306
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.3.3
>            Reporter: helei
>         Attachments: close_hang.patch
>
>
> With patch ZOOKEEPER-981 applied, I saw another problem: a hang in zookeeper_close() again. Here is the stack:
> (gdb) bt
> #0 0x000000302b80adfb in __lll_mutex_lock_wait () from /lib64/tls/libpthread.so.0
> #1 0x000000302b1307a8 in main_arena () from /lib64/tls/libc.so.6
> #2 0x000000302b910230 in stack_used () from /lib64/tls/libpthread.so.0
> #3 0x000000302b808dde in pthread_cond_broadcast@@GLIBC_2.3.2 () from /lib64/tls/libpthread.so.0
> #4 0x00000000006b4ce8 in adaptor_finish (zh=0x6902060) at src/mt_adaptor.c:217
> #5 0x00000000006b0fd0 in zookeeper_close (zh=0x6902060) at src/zookeeper.c:2297
> (gdb) p zh->ref_counter
> $5 = 1
> (gdb) p zh->close_requested
> $6 = 1
> (gdb) p *zh
> $7 = {fd = 110112576, hostname = 0x6903620 "", addrs = 0x0, addrs_count = 1,
> watcher = 0x62e5dc <doris::meta_register_mgr_t::register_mgr_watcher(_zhandle*, int, int, char const*, void*)>, last_recv = {tv_sec = 1321510694, tv_usec = 552835}, last_send = {tv_sec = 1321510694, tv_usec = 552886}, last_ping = {tv_sec = 1321510685, tv_usec = 774869}, next_deadline = { tv_sec = 1321510704, tv_usec = 547831}, recv_timeout = 30000, input_buffer = 0x0, to_process = {head = 0x0, last = 0x0, lock = {__m_reserved = 0,
> __m_count = 0, __m_owner = 0x0, __m_kind = 0, __m_lock = {__status = 0, __spinlock = 0}}}, to_send = {head = 0x0, last = 0x0, lock = {
> __m_reserved = 0, __m_count = 0, __m_owner = 0x0, __m_kind = 1, __m_lock = {__status = 0, __spinlock = 0}}}, sent_requests = {head = 0x0, last = 0x0,
> cond = {__c_lock = {__status = 1, __spinlock = -1}, __c_waiting = 0x0, __padding = '\0' <repeats 15 times>, __align = 0}, lock = {__m_reserved = 0,
> __m_count = 0, __m_owner = 0x0, __m_kind = 0, __m_lock = {__status = 0, __spinlock = 0}}}, completions_to_process = {head = 0x2aefbff800,
> last = 0x2af0e05f40, cond = {__c_lock = {__status = 592705486850, __spinlock = -1}, __c_waiting = 0x45,
> __padding = "E\000\000\000\000\000\000\000\220\006\000\000\000", __align = 296352743424}, lock = {__m_reserved = 1, __m_count = 0,
> __m_owner = 0x1000026ca, __m_kind = 0, __m_lock = {__status = 0, __spinlock = 0}}}, connect_index = 0, client_id = {client_id = 86551148676999146, passwd = "G懵擀\233\213\f闬202筴\002錪\034"}, last_zxid = 82057372, outstanding_sync = 0, primer_buffer = {buffer = 0x6902290 "", len = 40, curr_offset = 44, next = 0x0}, primer_storage = {len = 36, protocolVersion = 0, timeOut = 30000, sessionId = 86551148676999146, passwd_len = 16, passwd = "G懵擀\233\213\f闬202筴\002錪\034"},
> primer_storage_buffer = "\000\000\000$\000\000\000\000\000\000u0\0013}惜薵闬000\000\000\020G懵擀\233\213\f闬202筴\002錪\034", state = 0, context = 0x0,
> auth_h = {auth = 0x0, lock = {__m_reserved = 0, __m_count = 0, __m_owner = 0x0, __m_kind = 0, __m_lock = {__status = 0, __spinlock = 0}}},
> ref_counter = 1, close_requested = 1, adaptor_priv = 0x0, socket_readable = {tv_sec = 0, tv_usec = 0}, active_node_watchers = 0x6901520,
> active_exist_watchers = 0x69015d0, active_child_watchers = 0x6902ef0, chroot = 0x0}
> I think ref_counter is supposed to be 2, 3, or 4 at this point, so the value 1 looks wrong. Maybe we should increment ref_counter before we set zh->close_requested = 1; otherwise the do_io thread and the do_completion thread may release the handle just after we set zh->close_requested and before we increment zh->ref_counter. A minimal sketch of the ordering I mean is below. Thanks again
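> To make the proposed ordering concrete, here is a minimal toy sketch, not the real client code: the names toy_handle, worker, release_ref, and toy_close are made up for illustration, and in the real client the fields are zh->ref_counter and zh->close_requested with the close path running through zookeeper_close() and adaptor_finish().
>
> #include <pthread.h>
> #include <sched.h>
> #include <stdio.h>
> #include <stdlib.h>
>
> /* Toy stand-in for zhandle_t: just the two fields involved in the race. */
> typedef struct toy_handle {
>     pthread_mutex_t lock;
>     int ref_counter;     /* threads still using the handle */
>     int close_requested; /* set once shutdown has been asked for */
> } toy_handle;
>
> /* Drop one reference; the thread that drops the last one frees the handle. */
> static void release_ref(toy_handle *zh) {
>     pthread_mutex_lock(&zh->lock);
>     int remaining = --zh->ref_counter;
>     pthread_mutex_unlock(&zh->lock);
>     if (remaining == 0) {
>         pthread_mutex_destroy(&zh->lock);
>         free(zh);
>     }
> }
>
> /* Stand-in for do_io/do_completion: loop until close is requested,
>  * then drop our reference and exit. */
> static void *worker(void *arg) {
>     toy_handle *zh = arg;
>     for (;;) {
>         pthread_mutex_lock(&zh->lock);
>         int closing = zh->close_requested;
>         pthread_mutex_unlock(&zh->lock);
>         if (closing) {
>             release_ref(zh); /* we never touch zh after this */
>             return NULL;
>         }
>         sched_yield(); /* the real threads would service the socket here */
>     }
> }
>
> /* The proposed ordering: pin the handle with our own reference BEFORE
>  * publishing close_requested. If the flag were set first, the workers
>  * could observe it and drop the last references, freeing zh while the
>  * closing thread is still about to use it. */
> static void toy_close(toy_handle *zh) {
>     pthread_mutex_lock(&zh->lock);
>     zh->ref_counter++;       /* step 1: take our own reference */
>     zh->close_requested = 1; /* step 2: only now tell workers to exit */
>     pthread_mutex_unlock(&zh->lock);
>     /* ... signal/join the worker threads (adaptor_finish in the real code) ... */
>     release_ref(zh); /* drop our pin; last one out frees the handle */
> }
>
> int main(void) {
>     toy_handle *zh = calloc(1, sizeof *zh);
>     pthread_mutex_init(&zh->lock, NULL);
>     zh->ref_counter = 2; /* one reference per worker thread */
>     pthread_t io, completion;
>     pthread_create(&io, NULL, worker, zh);
>     pthread_create(&completion, NULL, worker, zh);
>     toy_close(zh);
>     pthread_join(io, NULL);
>     pthread_join(completion, NULL);
>     puts("shut down without use-after-free");
>     return 0;
> }
>
> (Compiles with cc -pthread. The point is only steps 1 and 2 in toy_close: taking the closer's reference before publishing the flag means the handle cannot reach refcount 0 while zookeeper_close() is still using it.)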

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira