Posted to dev@zookeeper.apache.org by "Chris Darroch (JIRA)" <ji...@apache.org> on 2009/02/17 23:48:59 UTC

[jira] Created: (ZOOKEEPER-320) call auth completion in free_completions()

call auth completion in free_completions()
------------------------------------------

                 Key: ZOOKEEPER-320
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-320
             Project: Zookeeper
          Issue Type: Bug
          Components: c client
    Affects Versions: 3.1.0, 3.0.1, 3.0.0
            Reporter: Chris Darroch
             Fix For: 3.1.1, 3.2.0


If a client calls zoo_add_auth() with an invalid scheme (e.g., "foo"), the ZooKeeper server will mark their session expired and close the connection.  However, by then the C client has already returned ZOK to the caller, immediately after queuing the new auth data to be sent.

If the client then waits for their auth completion function to be called, they can wait forever, as no session event is ever delivered to that completion function.  All other completion functions are notified of session events by free_completions(), which is called by cleanup_bufs() in handle_error() in handle_socket_error_msg().
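
Until the auth completion is reliably called, a caller can at least bound its wait.  The following is a minimal sketch of that idea, not part of the attached patches.  All names in it (auth_wait_sketch, auth_wait_complete, auth_wait_timed) are hypothetical; it assumes the caller passes auth_wait_complete and a pointer to the state as the completion and data arguments of zoo_add_auth().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical state shared between the waiting caller and the completion. */
struct auth_wait_sketch {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    int done;   /* set to 1 once the completion has run */
    int rc;     /* return code reported to the completion */
};

void auth_wait_init(struct auth_wait_sketch *w)
{
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->cond, NULL);
    w->done = 0;
    w->rc = 0;
}

/* Completion with the (int rc, const void *data) shape used for auth
 * completions; records rc and wakes the waiter. */
void auth_wait_complete(int rc, const void *data)
{
    struct auth_wait_sketch *w = (struct auth_wait_sketch *)data;
    assert(w != NULL);
    pthread_mutex_lock(&w->lock);
    w->rc = rc;
    w->done = 1;
    pthread_cond_signal(&w->cond);
    pthread_mutex_unlock(&w->lock);
}

/* Wait up to timeout_sec seconds for the completion.
 * Returns 0 if it ran, or the pthread_cond_timedwait() error (normally
 * ETIMEDOUT) if it did not. */
int auth_wait_timed(struct auth_wait_sketch *w, int timeout_sec)
{
    struct timespec deadline;
    int err = 0;
    int done;

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_sec;

    pthread_mutex_lock(&w->lock);
    while (!w->done && err == 0)
        err = pthread_cond_timedwait(&w->cond, &w->lock, &deadline);
    done = w->done;
    pthread_mutex_unlock(&w->lock);

    return done ? 0 : err;
}
```

A timeout here only tells the caller that no completion arrived; it does not say why, so the caller would still need its session watcher to learn about the expiry.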

In practice, what can happen (about 50% of the time, for me) is that the next call by the IO thread to flush_send_queue() calls send() from within send_buffer(), and receives a SIGPIPE signal during this send() call.  Because the ZooKeeper C API is a library, it properly does not catch that signal.  If the user's code is not catching that signal either, they experience an abort caused by an untrapped signal.  If they are ignoring the signal -- which is common in the context I'm working in, the Apache httpd server -- then flush_send_queue()'s error return code is EPIPE, which is logged by handle_socket_error_msg(), and all non-auth completion functions are notified of a session event.  However, if the caller is waiting for their auth completion function, they wait forever while the IO thread tries repeatedly to reconnect and is rejected by the server as having an expired session.

So, first of all, it would be useful to document in the C API portion of the programmer's guide that trapping or ignoring SIGPIPE is important, as this signal may be generated by the C API.
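
As a minimal sketch of what such documentation might show, the code below ignores SIGPIPE with sigaction() and then demonstrates the resulting EPIPE error on a local socket pair.  It uses only POSIX calls and no ZooKeeper code; the helper names ignore_sigpipe and send_to_closed_peer are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Ignore SIGPIPE process-wide, so that send() on a connection closed by
 * the peer fails with EPIPE instead of terminating the process.
 * Returns 0 on success, -1 on failure (as sigaction() does). */
int ignore_sigpipe(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = SIG_IGN;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGPIPE, &sa, NULL);
}

/* Demonstration only: send on a stream socket whose peer has already
 * closed.  Returns the errno reported by send(), or 0 if send() succeeded.
 * Call ignore_sigpipe() first, or the default SIGPIPE action applies. */
int send_to_closed_peer(void)
{
    int fds[2];
    int err;
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return errno;
    close(fds[1]);                      /* the peer goes away */
    n = send(fds[0], "x", 1, 0);
    err = (n < 0) ? errno : 0;
    close(fds[0]);
    return err;
}
```

An application would call ignore_sigpipe() once at startup, before creating its ZooKeeper handle.  Installing a real handler instead of SIG_IGN would also work; the point is only that the default action must not be left in place.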

Next, the two attached patches call the auth completion function, if any, in free_completions(), which fixes this problem for me.  The second attached patch includes auth lock/unlock functions, as per ZOOKEEPER-319.
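
The idea behind the patches can be sketched with a simplified model.  This is not the patch itself and not the real client data structures: handle_sketch, pending_sketch, auth_sketch, notify_completions_sketch and record_rc are hypothetical names, and the sketch only notifies callers without freeing anything.

```c
#include <assert.h>
#include <stddef.h>

/* Same (int rc, const void *data) shape as the client's auth completion. */
typedef void (*completion_fn)(int rc, const void *data);

/* Hypothetical stand-in for one queued (non-auth) completion. */
struct pending_sketch {
    completion_fn completion;
    const void *data;
    struct pending_sketch *next;
};

/* Hypothetical stand-in for the auth data kept on the handle. */
struct auth_sketch {
    completion_fn completion;   /* NULL if no auth request is outstanding */
    const void *data;
};

struct handle_sketch {
    struct pending_sketch *pending;
    struct auth_sketch auth;
};

/* Example completion: stores rc into the int that data points to. */
void record_rc(int rc, const void *data)
{
    *(int *)data = rc;
}

/* Deliver `reason` to every outstanding completion, including the auth
 * completion.  Returns the number of completions called. */
int notify_completions_sketch(struct handle_sketch *h, int reason)
{
    struct pending_sketch *p;
    int notified = 0;

    assert(h != NULL);

    /* Existing behaviour: drain the queue of ordinary completions. */
    p = h->pending;
    h->pending = NULL;
    while (p != NULL) {
        struct pending_sketch *next = p->next;
        if (p->completion != NULL) {
            p->completion(reason, p->data);
            notified++;
        }
        p = next;
    }

    /* The addition: also notify the auth completion, exactly once. */
    if (h->auth.completion != NULL) {
        completion_fn fn = h->auth.completion;
        h->auth.completion = NULL;
        fn(reason, h->auth.data);
        notified++;
    }
    return notified;
}
```

Clearing the auth completion pointer before calling it keeps a second cleanup pass from notifying the same caller twice.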

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (ZOOKEEPER-320) call auth completion in free_completions()

Posted by "Chris Darroch (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677143#action_12677143 ] 

cdarroch edited comment on ZOOKEEPER-320 at 2/26/09 2:41 PM:
------------------------------------------------------------------

Updated with a NULL initialization as per the comment on ZOOKEEPER-319#action_12676824

Out of interest --- what compiler gives these errors?  My gcc 4.1.2 with -Wall doesn't report any troubles.

      was (Author: cdarroch):
    Updated with a NULL initialization as per the comment on ZOOKEEPER-319:
 https://issues.apache.org/jira/browse/ZOOKEEPER-319?focusedCommentId=12676824#action_12676824

Out of interest -- what compiler gives these errors?  My gcc 4.1.2 with -Wall doesn't report any troubles.
  
> call auth completion in free_completions()
> ------------------------------------------
>
>                 Key: ZOOKEEPER-320
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-320
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.0.0, 3.0.1, 3.1.0
>            Reporter: Chris Darroch
>            Assignee: Chris Darroch
>             Fix For: 3.1.1, 3.2.0
>
>         Attachments: ZOOKEEPER-320-319.patch, ZOOKEEPER-320-319.patch, ZOOKEEPER-320.patch
>
>


[jira] Updated: (ZOOKEEPER-320) call auth completion in free_completions()

Posted by "Chris Darroch (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Darroch updated ZOOKEEPER-320:
------------------------------------

    Attachment: ZOOKEEPER-320.patch

This patch does not include locking for the auth data, as per ZOOKEEPER-319.

> call auth completion in free_completions()
> ------------------------------------------
>
>                 Key: ZOOKEEPER-320
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-320
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.0.0, 3.0.1, 3.1.0
>            Reporter: Chris Darroch
>             Fix For: 3.1.1, 3.2.0
>
>         Attachments: ZOOKEEPER-320.patch
>
>


[jira] Issue Comment Edited: (ZOOKEEPER-320) call auth completion in free_completions()

Posted by "Chris Darroch (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677143#action_12677143 ] 

cdarroch edited comment on ZOOKEEPER-320 at 2/26/09 2:45 PM:
------------------------------------------------------------------

Updated with a NULL initialization as per the comment on [ZOOKEEPER-319#action_12676824].

Out of interest --- what compiler gives these errors?  My gcc 4.1.2 with -Wall doesn't report any troubles.

      was (Author: cdarroch):
    Updated with a NULL initialization as per the comment on ZOOKEEPER-319#action_12676824

Out of interest --- what compiler gives these errors?  My gcc 4.1.2 with -Wall doesn't report any troubles.
  
> call auth completion in free_completions()
> ------------------------------------------
>
>                 Key: ZOOKEEPER-320
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-320
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.0.0, 3.0.1, 3.1.0
>            Reporter: Chris Darroch
>            Assignee: Chris Darroch
>             Fix For: 3.1.1, 3.2.0
>
>         Attachments: ZOOKEEPER-320-319.patch, ZOOKEEPER-320-319.patch, ZOOKEEPER-320.patch
>
>


[jira] Updated: (ZOOKEEPER-320) call auth completion in free_completions()

Posted by "Chris Darroch (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Darroch updated ZOOKEEPER-320:
------------------------------------

    Attachment: ZOOKEEPER-320-319.patch

This version avoids holding the auth lock while calling the user's auth completion function (which may run for a long time; we don't know).
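
The pattern is roughly the following minimal sketch: copy what is needed while holding the lock, release the lock, and only then call the user's function.  The names here (locked_auth_sketch, call_auth_completion_unlocked, probe_lock) are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef void (*locked_auth_fn)(int rc, const void *data);

/* Hypothetical auth state protected by a mutex. */
struct locked_auth_sketch {
    pthread_mutex_t lock;
    locked_auth_fn completion;   /* NULL if nothing is outstanding */
    const void *data;
};

/* Call the auth completion, if any, without holding the lock while user
 * code runs.  Returns 1 if a completion was called, 0 otherwise. */
int call_auth_completion_unlocked(struct locked_auth_sketch *a, int rc)
{
    locked_auth_fn fn;
    const void *data;

    assert(a != NULL);

    pthread_mutex_lock(&a->lock);
    fn = a->completion;          /* snapshot under the lock */
    data = a->data;
    a->completion = NULL;        /* make sure it is called only once */
    pthread_mutex_unlock(&a->lock);

    if (fn == NULL)
        return 0;
    fn(rc, data);                /* user code runs with the lock released */
    return 1;
}

/* Demonstration completion: records whether the lock could be taken from
 * inside the callback (0 means the lock was free at that point). */
struct probe_sketch {
    pthread_mutex_t *lock;
    int trylock_result;
};

void probe_lock(int rc, const void *data)
{
    struct probe_sketch *p = (struct probe_sketch *)data;

    (void)rc;
    p->trylock_result = pthread_mutex_trylock(p->lock);
    if (p->trylock_result == 0)
        pthread_mutex_unlock(p->lock);
}
```

Because the lock is free during the callback, a completion that itself adds new auth data would not deadlock on the auth lock.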

> call auth completion in free_completions()
> ------------------------------------------
>
>                 Key: ZOOKEEPER-320
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-320
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.0.0, 3.0.1, 3.1.0
>            Reporter: Chris Darroch
>             Fix For: 3.1.1, 3.2.0
>
>         Attachments: ZOOKEEPER-320-319.patch, ZOOKEEPER-320.patch
>
>


[jira] Commented: (ZOOKEEPER-320) call auth completion in free_completions()

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677686#action_12677686 ] 

Hudson commented on ZOOKEEPER-320:
----------------------------------

Integrated in ZooKeeper-trunk #243 (See [http://hudson.zones.apache.org/hudson/job/ZooKeeper-trunk/243/])
    ZOOKEEPER-320. call auth completion in free_completions(). (chris darroch via mahadev)


> call auth completion in free_completions()
> ------------------------------------------
>
>                 Key: ZOOKEEPER-320
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-320
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.0.0, 3.0.1, 3.1.0
>            Reporter: Chris Darroch
>            Assignee: Chris Darroch
>             Fix For: 3.1.1, 3.2.0
>
>         Attachments: ZOOKEEPER-320-319.patch, ZOOKEEPER-320-319.patch, ZOOKEEPER-320.patch
>
>


[jira] Updated: (ZOOKEEPER-320) call auth completion in free_completions()

Posted by "Chris Darroch (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Darroch updated ZOOKEEPER-320:
------------------------------------

    Attachment: ZOOKEEPER-320-319.patch

This patch includes auth data locking as per ZOOKEEPER-319.

> call auth completion in free_completions()
> ------------------------------------------
>
>                 Key: ZOOKEEPER-320
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-320
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.0.0, 3.0.1, 3.1.0
>            Reporter: Chris Darroch
>             Fix For: 3.1.1, 3.2.0
>
>         Attachments: ZOOKEEPER-320-319.patch, ZOOKEEPER-320.patch
>
>


[jira] Updated: (ZOOKEEPER-320) call auth completion in free_completions()

Posted by "Chris Darroch (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Darroch updated ZOOKEEPER-320:
------------------------------------

    Attachment: ZOOKEEPER-320-319.patch

Updated with a NULL initialization as per the comment on ZOOKEEPER-319:
 https://issues.apache.org/jira/browse/ZOOKEEPER-319?focusedCommentId=12676824#action_12676824

Out of interest -- what compiler gives these errors?  My gcc 4.1.2 with -Wall doesn't report any troubles.

> call auth completion in free_completions()
> ------------------------------------------
>
>                 Key: ZOOKEEPER-320
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-320
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.0.0, 3.0.1, 3.1.0
>            Reporter: Chris Darroch
>            Assignee: Chris Darroch
>             Fix For: 3.1.1, 3.2.0
>
>         Attachments: ZOOKEEPER-320-319.patch, ZOOKEEPER-320-319.patch, ZOOKEEPER-320.patch
>
>


Re: [jira] Updated: (ZOOKEEPER-320) call auth completion in free_completions()

Posted by Mahadev Konar <ma...@yahoo-inc.com>.
Hi Chris,
  Just to mention that if you want your patch reviewed, please mark it
Patch Available.

Here is a link to the process we follow.

http://wiki.apache.org/hadoop/ZooKeeper/HowToContribute

mahadev


On 2/18/09 2:12 PM, "Chris Darroch (JIRA)" <ji...@apache.org> wrote:

> 
>      [ https://issues.apache.org/jira/browse/ZOOKEEPER-320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> 
> Chris Darroch updated ZOOKEEPER-320:
> ------------------------------------
> 
>     Attachment:     (was: ZOOKEEPER-320-319.patch)
> 
>> call auth completion in free_completions()
>> ------------------------------------------
>> 
>>                 Key: ZOOKEEPER-320
>>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-320
>>             Project: Zookeeper
>>          Issue Type: Bug
>>          Components: c client
>>    Affects Versions: 3.0.0, 3.0.1, 3.1.0
>>            Reporter: Chris Darroch
>>             Fix For: 3.1.1, 3.2.0
>> 
>>         Attachments: ZOOKEEPER-320-319.patch, ZOOKEEPER-320.patch
>> 
>> 


[jira] Updated: (ZOOKEEPER-320) call auth completion in free_completions()

Posted by "Chris Darroch (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Darroch updated ZOOKEEPER-320:
------------------------------------

    Attachment:     (was: ZOOKEEPER-320-319.patch)

> call auth completion in free_completions()
> ------------------------------------------
>
>                 Key: ZOOKEEPER-320
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-320
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.0.0, 3.0.1, 3.1.0
>            Reporter: Chris Darroch
>             Fix For: 3.1.1, 3.2.0
>
>         Attachments: ZOOKEEPER-320-319.patch, ZOOKEEPER-320.patch
>
>


[jira] Updated: (ZOOKEEPER-320) call auth completion in free_completions()

Posted by "Chris Darroch (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Darroch updated ZOOKEEPER-320:
------------------------------------

    Status: Patch Available  (was: Open)

> call auth completion in free_completions()
> ------------------------------------------
>
>                 Key: ZOOKEEPER-320
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-320
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.1.0, 3.0.1, 3.0.0
>            Reporter: Chris Darroch
>             Fix For: 3.1.1, 3.2.0
>
>         Attachments: ZOOKEEPER-320-319.patch, ZOOKEEPER-320.patch
>
>



[jira] Commented: (ZOOKEEPER-320) call auth completion in free_completions()

Posted by "Chris Darroch (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674452#action_12674452 ] 

Chris Darroch commented on ZOOKEEPER-320:
-----------------------------------------

I should perhaps clarify that the reason one might want to wait unconditionally on the auth completion after calling zoo_add_auth() is to provide one's own sync version of that function.  The various sync functions such as zoo_wget(), zoo_set2(), etc. are just wrappers around the async versions, followed by an unconditional wait on the relevant completion.  Similarly, in a context which must be exclusively sync-only, zoo_add_auth() needs to be followed by a wait on its completion.
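
To make the pattern concrete, here is a minimal sketch of a sync wrapper built on an async call plus an unconditional wait.  It is not code from the attached patches.  So that it builds without zookeeper.h, the callback type is declared locally in the shape of a (rc, data) completion, and a hypothetical fake_io_thread() stands in for the C client's IO thread.  All the sync_wait_* and fake_* names are mine, for illustration only.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Local stand-in for a (rc, user data) completion callback type. */
typedef void (*completion_fn)(int rc, const void *data);

/* Hypothetical helper: holds the result until the completion fires. */
struct sync_wait {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    int done;
    int rc;
};

/* Completion callback: record the result code and wake the waiter. */
void sync_wait_complete(int rc, const void *data)
{
    struct sync_wait *w = (struct sync_wait *)data;
    assert(w != NULL);
    pthread_mutex_lock(&w->lock);
    w->rc = rc;
    w->done = 1;
    pthread_cond_signal(&w->cond);
    pthread_mutex_unlock(&w->lock);
}

/* Unconditional wait: blocks forever if the completion is never called. */
int sync_wait_block(struct sync_wait *w)
{
    int rc;
    pthread_mutex_lock(&w->lock);
    while (!w->done)
        pthread_cond_wait(&w->cond, &w->lock);
    rc = w->rc;
    pthread_mutex_unlock(&w->lock);
    return rc;
}

/* Hypothetical stand-in for the IO thread delivering a completion. */
struct fake_call {
    completion_fn fn;
    const void *data;
    int rc;
};

void *fake_io_thread(void *arg)
{
    struct fake_call *c = arg;
    c->fn(c->rc, c->data);
    return NULL;
}

/* Sync wrapper: start the "async" call, then wait for its completion.
 * Returns the delivered result code, or -1 if the thread cannot start. */
int fake_add_auth_sync(int rc_to_deliver)
{
    struct sync_wait w;
    struct fake_call call;
    pthread_t tid;
    int rc = -1;

    pthread_mutex_init(&w.lock, NULL);
    pthread_cond_init(&w.cond, NULL);
    w.done = 0;
    w.rc = 0;

    call.fn = sync_wait_complete;
    call.data = &w;
    call.rc = rc_to_deliver;

    if (pthread_create(&tid, NULL, fake_io_thread, &call) == 0) {
        rc = sync_wait_block(&w);
        pthread_join(tid, NULL);
    }
    pthread_cond_destroy(&w.cond);
    pthread_mutex_destroy(&w.lock);
    return rc;
}
```

With a real handle, the equivalent would presumably pass sync_wait_complete and &w as the completion and data arguments of zoo_add_auth().  The loop in sync_wait_block() is where a caller hangs when the auth completion is never invoked on a session event.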

> call auth completion in free_completions()
> ------------------------------------------
>
>                 Key: ZOOKEEPER-320
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-320
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.0.0, 3.0.1, 3.1.0
>            Reporter: Chris Darroch
>             Fix For: 3.1.1, 3.2.0
>
>         Attachments: ZOOKEEPER-320-319.patch, ZOOKEEPER-320.patch
>
>



[jira] Issue Comment Edited: (ZOOKEEPER-320) call auth completion in free_completions()

Posted by "Chris Darroch (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677143#action_12677143 ] 

cdarroch edited comment on ZOOKEEPER-320 at 2/26/09 2:47 PM:
------------------------------------------------------------------

Updated with a NULL initialization as per the comment on ZOOKEEPER-319:

https://issues.apache.org/jira/browse/ZOOKEEPER-319#action_12676824

Out of interest --- what compiler gives these errors?  My gcc 4.1.2 with -Wall doesn't report any troubles.

      was (Author: cdarroch):
    Updated with a NULL initialization as per the comment on [ZOOKEEPER-319#action_12676824].

Out of interest --- what compiler gives these errors?  My gcc 4.1.2 with -Wall doesn't report any troubles.
  
> call auth completion in free_completions()
> ------------------------------------------
>
>                 Key: ZOOKEEPER-320
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-320
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.0.0, 3.0.1, 3.1.0
>            Reporter: Chris Darroch
>            Assignee: Chris Darroch
>             Fix For: 3.1.1, 3.2.0
>
>         Attachments: ZOOKEEPER-320-319.patch, ZOOKEEPER-320-319.patch, ZOOKEEPER-320.patch
>
>



[jira] Commented: (ZOOKEEPER-320) call auth completion in free_completions()

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677169#action_12677169 ] 

Mahadev konar commented on ZOOKEEPER-320:
-----------------------------------------

My compiler is gcc 3.4.4.


> call auth completion in free_completions()
> ------------------------------------------
>
>                 Key: ZOOKEEPER-320
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-320
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.0.0, 3.0.1, 3.1.0
>            Reporter: Chris Darroch
>            Assignee: Chris Darroch
>             Fix For: 3.1.1, 3.2.0
>
>         Attachments: ZOOKEEPER-320-319.patch, ZOOKEEPER-320-319.patch, ZOOKEEPER-320.patch
>
>



[jira] Assigned: (ZOOKEEPER-320) call auth completion in free_completions()

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar reassigned ZOOKEEPER-320:
---------------------------------------

    Assignee: Chris Darroch

> call auth completion in free_completions()
> ------------------------------------------
>
>                 Key: ZOOKEEPER-320
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-320
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.0.0, 3.0.1, 3.1.0
>            Reporter: Chris Darroch
>            Assignee: Chris Darroch
>             Fix For: 3.1.1, 3.2.0
>
>         Attachments: ZOOKEEPER-320-319.patch, ZOOKEEPER-320.patch
>
>



[jira] Updated: (ZOOKEEPER-320) call auth completion in free_completions()

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated ZOOKEEPER-320:
------------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Chris.

> call auth completion in free_completions()
> ------------------------------------------
>
>                 Key: ZOOKEEPER-320
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-320
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.0.0, 3.0.1, 3.1.0
>            Reporter: Chris Darroch
>            Assignee: Chris Darroch
>             Fix For: 3.1.1, 3.2.0
>
>         Attachments: ZOOKEEPER-320-319.patch, ZOOKEEPER-320-319.patch, ZOOKEEPER-320.patch
>
>



[jira] Commented: (ZOOKEEPER-320) call auth completion in free_completions()

Posted by "Chris Darroch (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677144#action_12677144 ] 

Chris Darroch commented on ZOOKEEPER-320:
-----------------------------------------

Also, please note my suggestion that the docs mention the need to catch, ignore, or otherwise be aware of SIGPIPE signals.
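
As a rough illustration of what such a note in the docs could show, here is a minimal sketch of an application ignoring SIGPIPE so that a write to a closed connection fails with EPIPE instead of killing the process.  The function names ignore_sigpipe() and write_to_closed_pipe() are hypothetical and not part of the ZooKeeper C API.  A pipe with its read end closed stands in for a socket the server has closed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Ignore SIGPIPE process-wide.  Returns 0 on success, -1 on failure. */
int ignore_sigpipe(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = SIG_IGN;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGPIPE, &sa, NULL);
}

/* Demonstration only: write one byte to a pipe whose read end is closed.
 * Returns the errno of the failed write, 0 if the write unexpectedly
 * succeeded, or -1 if the pipe could not be created.
 * Call ignore_sigpipe() first, or the default SIGPIPE action will
 * terminate the process inside write(). */
int write_to_closed_pipe(void)
{
    int fds[2];
    ssize_t n;
    int err;

    if (pipe(fds) != 0)
        return -1;
    close(fds[0]);                 /* no reader is left */

    n = write(fds[1], "x", 1);
    assert(n == -1 || n == 1);     /* one byte is written, or the call fails */
    err = (n < 0) ? errno : 0;

    close(fds[1]);
    return err;
}
```

An application would call ignore_sigpipe() once at startup, before creating its ZooKeeper handle.  A signal handler or a blocked signal mask would serve equally well; ignoring is just the shortest to show.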

> call auth completion in free_completions()
> ------------------------------------------
>
>                 Key: ZOOKEEPER-320
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-320
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.0.0, 3.0.1, 3.1.0
>            Reporter: Chris Darroch
>            Assignee: Chris Darroch
>             Fix For: 3.1.1, 3.2.0
>
>         Attachments: ZOOKEEPER-320-319.patch, ZOOKEEPER-320-319.patch, ZOOKEEPER-320.patch
>
>



[jira] Commented: (ZOOKEEPER-320) call auth completion in free_completions()

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676825#action_12676825 ] 

Mahadev konar commented on ZOOKEEPER-320:
-----------------------------------------

+1 for the second patch.

> call auth completion in free_completions()
> ------------------------------------------
>
>                 Key: ZOOKEEPER-320
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-320
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: c client
>    Affects Versions: 3.0.0, 3.0.1, 3.1.0
>            Reporter: Chris Darroch
>             Fix For: 3.1.1, 3.2.0
>
>         Attachments: ZOOKEEPER-320-319.patch, ZOOKEEPER-320.patch
>
>
