Posted to dev@bookkeeper.apache.org by "Sijie Guo (JIRA)" <ji...@apache.org> on 2012/07/23 13:05:34 UTC

[jira] [Created] (BOOKKEEPER-350) Revisit consume interface in Hedwig Client

Sijie Guo created BOOKKEEPER-350:
------------------------------------

             Summary: Revisit consume interface in Hedwig Client
                 Key: BOOKKEEPER-350
                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-350
             Project: Bookkeeper
          Issue Type: Sub-task
          Components: hedwig-client
    Affects Versions: 4.1.0, 4.0.0
            Reporter: Sijie Guo
             Fix For: 4.2.0


This JIRA is used to revisit the consume interface in the Hedwig client and to improve it to meet the JMS provider's requirements.

Comments have been moved here from BOOKKEEPER-311 to make the discussion clearer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (BOOKKEEPER-350) Revisit consume interface in Hedwig Client

Posted by "Sijie Guo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420565#comment-13420565 ] 

Sijie Guo commented on BOOKKEEPER-350:
--------------------------------------

Moved all related discussion on the consume API from BOOKKEEPER-311 to BOOKKEEPER-350 to make it clearer.
                

        

[jira] [Commented] (BOOKKEEPER-350) Revisit consume interface in Hedwig Client

Posted by "Sijie Guo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420560#comment-13420560 ] 

Sijie Guo commented on BOOKKEEPER-350:
--------------------------------------

@Mridul:



bq. 1) The changes break backward compatibility: they change the consume semantics. If existing applications use consume as an async call, this breaks their assumptions. One option is to leave consume's semantics unchanged and add another call that returns only when the consume request has been written to the channel. This new call might be named 'syncConsume', or something better.


When discussing with Ivan, it looked like the existing consume() method was not consistent with the rest of the API exposed by the interface: all async methods were prefixed with 'async' and the sync ones were not.
We assumed the implementation of consume was an oversight (particularly since it did not have an async variant), hence the addition of an explicit asyncConsume.

Having said that, you are right: this is an intentional change in the behavior of the interface.
Note that there are two aspects to the overall change.

Basically, what happens when the user does
* consume() and then close() (no-auto-ack mode) vs
* close() (ack mode - controlled via auto_send_consume_message_enabled=true (default) and consumed_messages_buffer_size (default 5)).

This is in the context of the fact that the Java API implements an option for automatic consume acknowledgement, along with buffering of these consume acknowledgements. (Actually, you can mix ack mode with explicit consume() too, and vice versa, btw - the latter resulting in some interesting issues.)

The consume-related changes to the client code are there to handle issues with these two cases.


In the first case :

If consume remains async, there are two possibilities here:

1) consume() results in the request being sent to the server, so the server does not redeliver the message.
2) The consume() request is still in flight (within netty) when close() is executed, so the socket is closed before the request makes it out.

As you mentioned, we can of course keep consume() async (but with no way to track progress via a future) and introduce a syncConsume(). Other than the inability to track a future, this would be a functionally equivalent way to resolve the bug!

Note that, as you mentioned in your point 2, other than netty 'telling us' about delivery of the request to the server, there is not much else we can do. Practically, this is good enough.
It is a best-effort approach and does not give transactional guarantees.
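
To make the first case concrete, here is a minimal, single-threaded sketch of the race. FakeChannel and ConsumeRaceDemo are hypothetical stand-ins written for this illustration; they are not Hedwig or netty classes, and the explicit flush() call only simulates what an I/O thread would do on its own schedule.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for a netty channel: writes are queued and only
// reach the "server" when the I/O layer flushes them.
class FakeChannel {
    private static final class Pending {
        final long seqId;
        final CompletableFuture<Void> done = new CompletableFuture<>();
        Pending(long seqId) { this.seqId = seqId; }
    }

    private final Queue<Pending> pending = new ArrayDeque<>();
    final List<Long> receivedByServer = new ArrayList<>();
    private boolean open = true;

    // Queue a consume request; the future completes once it is written out.
    CompletableFuture<Void> write(long seqId) {
        Pending p = new Pending(seqId);
        if (!open) {
            p.done.completeExceptionally(new IllegalStateException("channel closed"));
        } else {
            pending.add(p);
        }
        return p.done;
    }

    // Simulates the I/O thread draining the write queue.
    void flush() {
        Pending p;
        while ((p = pending.poll()) != null) {
            receivedByServer.add(p.seqId);
            p.done.complete(null);
        }
    }

    // Closing drops anything still queued: possibility (2) above.
    void close() {
        open = false;
        Pending p;
        while ((p = pending.poll()) != null) {
            p.done.completeExceptionally(new IllegalStateException("closed before write"));
        }
    }
}

class ConsumeRaceDemo {
    // Async consume followed immediately by close(): the request is lost.
    static List<Long> asyncConsumeThenClose() {
        FakeChannel ch = new FakeChannel();
        ch.write(42L);
        ch.close();
        return ch.receivedByServer;
    }

    // "Sync" consume: return only after the write future completes, then close.
    static List<Long> syncConsumeThenClose() {
        FakeChannel ch = new FakeChannel();
        CompletableFuture<Void> written = ch.write(42L);
        ch.flush();      // simulated here; an I/O thread would normally do this
        written.join();  // block until the request has been written out
        ch.close();
        return ch.receivedByServer;
    }
}
```

Even in the sync variant, the completed future only says the request left the client; it says nothing about whether the hub processed it, which is why this remains best effort.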


For the second case (close() with auto-ack) :

The change to *ResponseHandler, the addition of handleChannelClosedExplicitly (rename?), is to handle this case.
It ensures that buffered state (in this case, the consumed seq-id that has not yet been acked to the server) is written before the socket is closed.
Even this is, of course, best effort. Note that this is used for sending the buffered seq-id but could be used for other state too (now or in the future).
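
As a rough illustration of the second case, the sketch below models a client that sends a consume ack only after every bufferSize messages, and optionally flushes the buffered seq-id on close(). BufferedAcker and BufferedAckDemo are hypothetical names; the exact threshold logic in the real client may differ, so treat this as a model under that assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of client-side ack buffering; not the Hedwig classes.
class BufferedAcker {
    private final int bufferSize;
    private final boolean flushOnClose;
    private long lastConsumed = 0;   // highest seq-id the user has consumed
    private long lastAcked = 0;      // highest seq-id sent to the server
    private int unacked = 0;         // messages consumed since the last ack
    final List<Long> acksSentToServer = new ArrayList<>();

    BufferedAcker(int bufferSize, boolean flushOnClose) {
        this.bufferSize = bufferSize;
        this.flushOnClose = flushOnClose;
    }

    // Called after each message is handed to the user.
    void messageConsumed(long seqId) {
        lastConsumed = seqId;
        if (++unacked >= bufferSize) {
            sendAck();
        }
    }

    // With flushOnClose, the buffered seq-id is written before the socket closes.
    void close() {
        if (flushOnClose && unacked > 0) {
            sendAck();
        }
    }

    // Number of messages the server would redeliver on the next subscribe.
    long redeliveredAfterClose() {
        return lastConsumed - lastAcked;
    }

    private void sendAck() {
        acksSentToServer.add(lastConsumed);
        lastAcked = lastConsumed;
        unacked = 0;
    }
}

class BufferedAckDemo {
    static long redelivered(int messages, int bufferSize, boolean flushOnClose) {
        BufferedAcker acker = new BufferedAcker(bufferSize, flushOnClose);
        for (long seqId = 1; seqId <= messages; seqId++) {
            acker.messageConsumed(seqId);
        }
        acker.close();
        return acker.redeliveredAfterClose();
    }
}
```

With 9 messages and a buffer size of 5, closing without a flush leaves 4 consumed messages un-acked; flushing on close leaves none.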







Of the two possibilities above, (1) is the desired behavior, and (2) is actually fairly common. Please note that this was observed and then fixed, not the other way around :-) I was really looking to avoid changing hedwig in any way possible.


a) The sync aspect comes in ONLY when the invocation of the method results in a request being sent to the server.
b) If consume() does not result in a request to the server (due to buffering of consume requests), then the changes to close() handle delivery of the buffered seq-id.






To recap:

The issue we are trying to resolve is this: if a message has been consumed by the user, then, within reasonable guarantees, it must not be sent back to them.
Problem (2) above (close() with auto-ack) meant that up to the last 4 messages might always be sent back to the client.
Problem (1) (consume() followed by close()) meant that consume + close might send the last message (or the last N, if the user consumes in batches!) back to them.
                

        

[jira] [Commented] (BOOKKEEPER-350) Revisit consume interface in Hedwig Client

Posted by "Sijie Guo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420561#comment-13420561 ] 

Sijie Guo commented on BOOKKEEPER-350:
--------------------------------------

sijie:

Thanks to Mridul for explaining the details.

From your explanation, what you actually need is different ack guarantees. So we could categorize the semantics of consume into several groups:

1) Only guarantee that the consume request is handed to the local netty layer: if the channel is closed before netty writes the consume request to the server, the consume request is lost and we get duplicate messages.

2) Only guarantee that the consume request is written to the channel: if the hub server goes down before processing the consume request, we still get duplicate messages.

3) Guarantee that the consume request is processed by the hub server: this needs a response to the consume request. In reality, we might not need such a guarantee; I list it here just for completeness.

So I think the better idea is to extend the 'consume' API to support different levels of consume guarantee, instead of introducing an 'asyncConsume' API. An initial idea is sketched below:

{code}
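enum ConsumeMode {
    // One constant per guarantee level described above.
    // No good names have been chosen for them yet.
}

// Existing method; its semantics stay unchanged.
public void consume(ByteString topic, ByteString subscriberId, MessageSeqId messageSeqId);

// Proposed overload that lets the caller pick the guarantee level.
public void consume(ByteString topic, ByteString subscriberId, MessageSeqId messageSeqId, ConsumeMode mode);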
{code}

The original consume API keeps its current semantics of just sending consume requests to the netty layer, so we don't need to break backward compatibility. It also makes the semantics clearer and more extensible, and it could resolve the issue you mentioned.
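
One way the proposal might look once fleshed out is sketched below. Everything here is an assumption for illustration: the enum constant names (the thread leaves them unnamed), the ConsumeTransport interface, and the simplified String/long parameter types that stand in for the real topic, subscriber and seq-id types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names for the three guarantee levels described above.
enum ConsumeMode {
    QUEUED_LOCALLY,      // 1) handed to the local network layer only
    WRITTEN_TO_CHANNEL,  // 2) written to the channel
    PROCESSED_BY_HUB     // 3) acknowledged by the hub server
}

// Hypothetical transport with one hook per stage of a consume request.
interface ConsumeTransport {
    void enqueue(long seqId);
    void awaitWritten(long seqId);
    void awaitHubAck(long seqId);
}

class ConsumeApiSketch {
    private final ConsumeTransport transport;

    ConsumeApiSketch(ConsumeTransport transport) {
        this.transport = transport;
    }

    // The original signature keeps the original fire-and-forget semantics.
    void consume(String topic, String subscriberId, long seqId) {
        consume(topic, subscriberId, seqId, ConsumeMode.QUEUED_LOCALLY);
    }

    // The overload waits for as many stages as the requested mode demands.
    void consume(String topic, String subscriberId, long seqId, ConsumeMode mode) {
        transport.enqueue(seqId);
        if (mode == ConsumeMode.QUEUED_LOCALLY) {
            return;
        }
        transport.awaitWritten(seqId);
        if (mode == ConsumeMode.WRITTEN_TO_CHANNEL) {
            return;
        }
        transport.awaitHubAck(seqId);
    }
}

// Records which stages were exercised, purely for demonstration.
class RecordingTransport implements ConsumeTransport {
    final List<String> calls = new ArrayList<>();
    public void enqueue(long seqId) { calls.add("enqueue"); }
    public void awaitWritten(long seqId) { calls.add("awaitWritten"); }
    public void awaitHubAck(long seqId) { calls.add("awaitHubAck"); }
}
```

Because the three-argument method delegates with the weakest mode, existing callers would see no behavior change, while new callers can opt into stronger guarantees.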

@Mridul @Ivan, what are your opinions?
                

        

[jira] [Updated] (BOOKKEEPER-350) Revisit consume interface in Hedwig Client

Posted by "Sijie Guo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/BOOKKEEPER-350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sijie Guo updated BOOKKEEPER-350:
---------------------------------

    Fix Version/s:     (was: 4.2.0)
                   4.3.0
    

[jira] [Commented] (BOOKKEEPER-350) Revisit consume interface in Hedwig Client

Posted by "Sijie Guo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420564#comment-13420564 ] 

Sijie Guo commented on BOOKKEEPER-350:
--------------------------------------

sijie:

I don't think the APIs are inconsistent. consume is used to ack the hub server once the client has received and processed a message. If you treat message delivery as a kind of request, consume can be treated as its response, which makes it a server->client operation. In this respect it differs from pub/sub/unsub requests, which are client->server operations. Also, for pub/sub/unsub we have to receive the response even when using the asynchronous API, which is the semantics required for these actions.

{quote}
Having said that, as long as there is reasonable assurance that a best-effort attempt was made to send the consume request to the server, any additional guarantees would be better
{quote}

I agree. But in some proxy-style use cases, there are several long-running servers that use the hedwig client to interact with hub servers. Such a connection is not broken until the server goes down, so netty can handle sending the consume request without blocking the server's other logic. Why not leverage that?

Yes. For auto-consume clients, it is OK to provide as strong a guarantee as possible. But clients that call consume themselves can choose how to ack the hub server, as the original API allows.

 
                

        

[jira] [Commented] (BOOKKEEPER-350) Revisit consume interface in Hedwig Client

Posted by "Sijie Guo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420562#comment-13420562 ] 

Sijie Guo commented on BOOKKEEPER-350:
--------------------------------------

Mridul:

Please note that the guarantees provided must be consistent across the API methods exposed - for example, similar reasoning applies to publish.
(1) is not supported in publish, iirc.
Currently, only (2) is.
(3) can be inferred if the seq-id is returned, but not receiving it does not necessarily mean the message was not published (socket lost post delivery, server death, etc.) - as in, no transactional guarantees.

Which is the reason (not to mention minimizing hedwig changes :) ) we restricted this to (2): it keeps consume in line with the rest of the API while providing reasonable assurance of delivery.
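
The point about (3) for publish can be shown with a tiny model: a returned seq-id proves the hub processed the message, but a missing one proves nothing. FakeHub and PublishOutcomeDemo are hypothetical classes for illustration only, and the responseLost flag simply simulates a socket drop after the hub has done its work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical hub: persists the message first, then the response may be lost.
class FakeHub {
    final List<String> log = new ArrayList<>();

    Optional<Long> publish(String msg, boolean responseLost) {
        log.add(msg);                 // the hub has processed the publish
        long seqId = log.size();      // seq-id assigned by the hub
        return responseLost ? Optional.empty() : Optional.of(seqId);
    }
}

class PublishOutcomeDemo {
    // Returns { clientSawSeqId, messageIsInHubLog }.
    static boolean[] run(boolean responseLost) {
        FakeHub hub = new FakeHub();
        Optional<Long> seqId = hub.publish("m1", responseLost);
        return new boolean[] { seqId.isPresent(), hub.log.contains("m1") };
    }
}
```

When the response is lost, the client sees no seq-id even though the message is in the hub's log, so the client cannot treat a missing response as "not published".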



Having said that, as long as there is reasonable assurance that a best-effort attempt was made to send the consume request to the server, any additional guarantees would be better (but would have a higher cost, which needs to be factored in - an ack from the server, for example)!
                

        

[jira] [Commented] (BOOKKEEPER-350) Revisit consume interface in Hedwig Client

Posted by "Sijie Guo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420609#comment-13420609 ] 

Sijie Guo commented on BOOKKEEPER-350:
--------------------------------------

Thanks for the reminder. Removed the link.
                

        

[jira] [Commented] (BOOKKEEPER-350) Revisit consume interface in Hedwig Client

Posted by "Sijie Guo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420559#comment-13420559 ] 

Sijie Guo commented on BOOKKEEPER-350:
--------------------------------------

@sijie:

For the consume request:

1) The changes break backward compatibility: they change the consume semantics. If existing applications use consume as an async call, this breaks their assumptions. One option is to leave consume's semantics unchanged and add another call that returns only when the consume request has been written to the channel. This new call might be named 'syncConsume', or something better.

2) Even if we have a 'syncConsume'-like method to ensure netty has written the consume request to the server, we still can't guarantee that the consume request is received and processed by the hub server. So adding such a 'syncConsume' would not provide any semantic guarantee for consume behavior.

What kind of semantics for the consume request does the JMS provider need? That the request is written to the channel? That the request is processed by the hub server?
                

        

[jira] [Commented] (BOOKKEEPER-350) Revisit consume interface in Hedwig Client

Posted by "Mridul Muralidharan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/BOOKKEEPER-350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420582#comment-13420582 ] 

Mridul Muralidharan commented on BOOKKEEPER-350:
------------------------------------------------

Please note that this JIRA does not block the JMS provider specifically - it blocks all use of consume api.
The JMS provider can be committed without having to resolve this.
                