Posted to dev@zookeeper.apache.org by "Jimmy Xiang (JIRA)" <ji...@apache.org> on 2012/10/23 19:45:11 UTC

[jira] [Created] (ZOOKEEPER-1568) multi should have a non-transaction version

Jimmy Xiang created ZOOKEEPER-1568:
--------------------------------------

             Summary: multi should have a non-transaction version
                 Key: ZOOKEEPER-1568
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1568
             Project: ZooKeeper
          Issue Type: Improvement
            Reporter: Jimmy Xiang


Currently multi is transactional, i.e. all or none.  However, sometimes we don't want that.  We want all operations to be executed.  Even if some operations fail, that is OK.  We just need to know the result of each operation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Re: [jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version

Posted by Marshall McMullen <ma...@gmail.com>.
I actually think there is a valid use case for this, mostly for performance
reasons. Because a multi is one transaction, it causes less permutation on
the distributed and replicated state of zookeeper than multiple individual
operations not in a multi.

With a Multi:
- You only pay the cost of the RPC overhead once rather than on each
individual operation
- You get one flush of the leader channel rather than multiple ones for
each write operation
- A multi will case one new snapshot/log to be generated rather than
multiple ones for each operation

There are other reasons that make this a good idea too that are not
performance based. E.g., if it makes the programmer's job easier to use a
multi with these semantics, then that's a win.

In other distributed databases I've worked on, we used different
terminology to distinguish between a multi op whose operations all
succeed/fail together vs one where they do not. We used the term "Batch" to
imply we were batching up operations but there was no guarantee they'd all
succeed/fail.

On Wed, Oct 24, 2012 at 10:44 AM, Jimmy Xiang (JIRA) <ji...@apache.org>wrote:

>
>     [
> https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483372#comment-13483372]
>
> Jimmy Xiang commented on ZOOKEEPER-1568:
> ----------------------------------------
>
> Yes, that's what we are using now. It is working fine. I was wondering
> whether there is still room for improvement.

[jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version

Posted by "Marshall McMullen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483626#comment-13483626 ] 

Marshall McMullen commented on ZOOKEEPER-1568:
----------------------------------------------

Yes, I meant 'cause' :).

The existing multi code fills in a list of results for each op. Right now, it aborts on the first op that fails and rolls back the data tree to what it was before it started. And it explicitly marks all ops after that in the results list with a runtime exception. So the mechanism is already there to communicate the errors back to the client.

So I suppose the Multi code would need to take a bool to indicate whether it is all-or-nothing.
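To make the behaviour described above concrete, here is a minimal sketch of a multi that takes such a bool. It is a toy in-memory model, not ZooKeeper's server code: the class `MultiSketch`, the `allOrNothing` parameter, and the result labels (`OK`, `NODE_EXISTS`, `ROLLED_BACK`, `RUNTIME_ERROR`) are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of multi with an all-or-nothing flag; not ZooKeeper code. */
class MultiSketch {
    /** Hypothetical "create" op: fails if the path already exists. */
    static String applyCreate(Map<String, String> tree, String path) {
        if (tree.containsKey(path)) {
            return "NODE_EXISTS";
        }
        tree.put(path, "");
        return "OK";
    }

    /**
     * Applies the creates in order and returns one result per op.
     * allOrNothing=true: the first failure restores the tree and every
     * later op is marked RUNTIME_ERROR (earlier ops are reported as
     * ROLLED_BACK, a label chosen for this sketch only).
     * allOrNothing=false: every op is attempted and reported on its own.
     */
    static List<String> multi(Map<String, String> tree, List<String> paths,
                              boolean allOrNothing) {
        Map<String, String> snapshot = new HashMap<>(tree);
        List<String> results = new ArrayList<>();
        for (int i = 0; i < paths.size(); i++) {
            String result = applyCreate(tree, paths.get(i));
            results.add(result);
            if (allOrNothing && !result.equals("OK")) {
                // Roll the tree back to its state before the multi started.
                tree.clear();
                tree.putAll(snapshot);
                for (int j = 0; j < i; j++) {
                    results.set(j, "ROLLED_BACK");
                }
                for (int j = i + 1; j < paths.size(); j++) {
                    results.add("RUNTIME_ERROR");
                }
                return results;
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, String> tree = new HashMap<>();
        tree.put("/b", "");
        List<String> paths = Arrays.asList("/a", "/b", "/c");
        // prints [ROLLED_BACK, NODE_EXISTS, RUNTIME_ERROR]
        System.out.println(multi(tree, paths, true));
        // prints [OK, NODE_EXISTS, OK]
        System.out.println(multi(tree, paths, false));
    }
}
```

In both modes the caller gets one result per op, which is the mechanism the comment above says is already there; only the handling of the first failure differs.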
                

[jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version

Posted by "Jimmy Xiang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484256#comment-13484256 ] 

Jimmy Xiang commented on ZOOKEEPER-1568:
----------------------------------------

For our use case, there is no dependency issue.  "Batch" is what we want.  Another benefit, which may not be obvious, is that we will get fewer ZK events.  For example, for a bunch of create operations done one by one, once we get a nodeChildrenChanged event we set the watch again, so we get another event for the next create operation.  If the creates are batched, then by the time we get the first nodeChildrenChanged event and set the watch again, most likely the other nodes have already been created, so we get fewer events, which is good.
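The event-count argument can be sketched with a toy model of a one-shot child watch. This is only an illustration under favourable timing assumptions (in the batched case, the handler re-arms the watch only after all creates have landed); the class `OneShotWatchSketch` and its methods are hypothetical and do not use the ZooKeeper client API.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a parent znode with a one-shot children watch. */
class OneShotWatchSketch {
    private final List<String> children = new ArrayList<>();
    private boolean watchArmed = false;
    private int eventsDelivered = 0;

    /** Reads the children and arms the one-shot watch. */
    List<String> getChildrenAndWatch() {
        watchArmed = true;
        return new ArrayList<>(children);
    }

    /** Adds a child; fires the watch at most once until it is re-armed. */
    void create(String name) {
        children.add(name);
        if (watchArmed) {
            watchArmed = false;
            eventsDelivered++;
        }
    }

    int eventsDelivered() {
        return eventsDelivered;
    }

    /** Creates arrive one at a time and the handler re-arms in between. */
    static int oneByOne(int n) {
        OneShotWatchSketch parent = new OneShotWatchSketch();
        parent.getChildrenAndWatch();
        for (int i = 0; i < n; i++) {
            parent.create("child-" + i);
            parent.getChildrenAndWatch(); // handler reacts before the next create
        }
        return parent.eventsDelivered();
    }

    /** All creates land before the handler gets a chance to re-arm. */
    static int batched(int n) {
        OneShotWatchSketch parent = new OneShotWatchSketch();
        parent.getChildrenAndWatch();
        for (int i = 0; i < n; i++) {
            parent.create("child-" + i);
        }
        parent.getChildrenAndWatch(); // one read sees all n children
        return parent.eventsDelivered();
    }

    public static void main(String[] args) {
        System.out.println(oneByOne(5)); // prints 5
        System.out.println(batched(5));  // prints 1
    }
}
```

The two methods are the extremes; real timing would fall somewhere in between, depending on how quickly the handler re-arms the watch.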
                

[jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version

Posted by "Flavio Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483136#comment-13483136 ] 

Flavio Junqueira commented on ZOOKEEPER-1568:
---------------------------------------------

Hi Jimmy, I'm trying to understand why submitting operations asynchronously is not sufficient for your case. Why do you need to use multi in this case?
                

[jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483635#comment-13483635 ] 

Ted Yu commented on ZOOKEEPER-1568:
-----------------------------------

bq. it aborts on the first op that fails and rolls back
Should we allow operations after the failed operation to continue?
The rationale is that the operations in the batch may not have dependencies among them.
                

[jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version

Posted by "Jimmy Xiang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483372#comment-13483372 ] 

Jimmy Xiang commented on ZOOKEEPER-1568:
----------------------------------------

Yes, that's what we are using now. It is working fine. I was wondering whether there is still room for improvement.
                

[jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version

Posted by "Flavio Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505326#comment-13505326 ] 

Flavio Junqueira commented on ZOOKEEPER-1568:
---------------------------------------------

I had a look at the patch, and I'm not sure how you take care of the scenario you mention above of receiving fewer zk events. With this patch, wouldn't you still get as many notifications as watches you had set for the znodes you have manipulated in your batch?

I'm still interested in understanding if the performance difference you claim matters or if it can be fixed some other way. This feature is mainly for performance and convenience, yes?
                

[jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version

Posted by "Jimmy Xiang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505588#comment-13505588 ] 

Jimmy Xiang commented on ZOOKEEPER-1568:
----------------------------------------

I am hacking around this week. I will find out how much performance I can gain this way, if any. Yes, it is mainly for performance and convenience.
As to fewer zk events, it may be just for our use case, assuming our node-children-changed handler doesn't have a chance to reset the watch soon enough. So if we create lots of children for one parent, we may get fewer node-children-changed events, theoretically.
                

[jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version

Posted by "Jimmy Xiang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483311#comment-13483311 ] 

Jimmy Xiang commented on ZOOKEEPER-1568:
----------------------------------------

Hi Flavio, for our use case, we need to create/setData hundreds/thousands of znodes.  When submitting operations asynchronously, we still need to submit them one by one. If we can do it in batches, we can save lots of network trips.
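As a rough illustration of the request-count argument, the sketch below splits a list of operations into fixed-size batches, so that each batch could go out as a single request. The class `BatchPartitionSketch` and the batch size are assumptions for illustration; the sketch counts requests only and says nothing about actual latency or throughput.

```java
import java.util.ArrayList;
import java.util.List;

/** Splits a list of operations into batches of at most batchSize. */
class BatchPartitionSketch {
    static <T> List<List<T>> partition(List<T> ops, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < ops.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ops.size());
            // Copy the view so each batch is independent of the input list.
            batches.add(new ArrayList<>(ops.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            paths.add("/parent/child-" + i);
        }
        // 1000 individual requests become 10 batch requests of 100 ops each.
        System.out.println(partition(paths, 100).size()); // prints 10
    }
}
```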
                

[jira] [Updated] (ZOOKEEPER-1568) multi should have a non-transaction version

Posted by "Jimmy Xiang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang updated ZOOKEEPER-1568:
-----------------------------------

    Attachment: zk-1568_v2.patch

Patch version 2, which supports async batch.
                

[jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version

Posted by "Marshall McMullen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483650#comment-13483650 ] 

Marshall McMullen commented on ZOOKEEPER-1568:
----------------------------------------------

Well, obviously the way it is currently implemented we do not proceed past the first failure. But if we wanted to support a "batch" request wherein the ops are not all or nothing, then yes, I think we'd proceed past the first failure. If later ops depend on earlier ops that failed, then obviously those will fail too.
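A small sketch of what proceeding past the first failure could look like when some ops depend on earlier ones. It is a toy model with hypothetical names (`BatchDependencySketch`, `batchCreate`) and made-up result labels, not the proposed patch: a create fails if the node exists or its parent is missing, and every op is attempted regardless of earlier failures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy batch of creates that continues past failures. */
class BatchDependencySketch {
    /** Returns the parent path, e.g. "/x/y" -> "/x" and "/c" -> "/". */
    static String parentOf(String path) {
        int idx = path.lastIndexOf('/');
        return idx <= 0 ? "/" : path.substring(0, idx);
    }

    /** Attempts every create in order and reports one result per op. */
    static List<String> batchCreate(Set<String> tree, List<String> paths) {
        List<String> results = new ArrayList<>();
        for (String path : paths) {
            if (tree.contains(path)) {
                results.add("NODE_EXISTS");
            } else if (!tree.contains(parentOf(path))) {
                results.add("NO_NODE"); // parent missing
            } else {
                tree.add(path);
                results.add("OK");
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Set<String> tree = new HashSet<>();
        tree.add("/");
        // "/x/y" fails (no "/x"), so the dependent "/x/y/z" fails as well,
        // while the independent "/c" still succeeds.
        List<String> paths = Arrays.asList("/x/y", "/x/y/z", "/c");
        System.out.println(batchCreate(tree, paths)); // prints [NO_NODE, NO_NODE, OK]
    }
}
```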
                

[jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483618#comment-13483618 ] 

Ted Yu commented on ZOOKEEPER-1568:
-----------------------------------

bq. A multi will case one new snapshot/log to be generated
I guess you meant 'cause' above.
bq. but there was no guarantee they'd all succeed/fail.
I think we need to formalize how the success / failure status of individual operations in this new multi API should be delivered back to the client.
                

[jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version

Posted by "Marshall McMullen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483590#comment-13483590 ] 

Marshall McMullen commented on ZOOKEEPER-1568:
----------------------------------------------

I actually think there is a valid use case for this, mostly for performance reasons. Because a multi is one transaction, it causes less permutation on the distributed and replicated state of zookeeper than multiple individual operations not in a multi.

With a Multi:
- You only pay the cost of the RPC overhead once rather than on each individual operation
- You get one flush of the leader channel rather than multiple ones for each write operation
- A multi will case one new snapshot/log to be generated rather than multiple ones for each operation

There are other reasons that make this a good idea too that are not performance based. E.g., if it makes the programmer's job easier to use a multi with these semantics, then that's a win.

In other distributed databases I've worked on, we used different terminology to distinguish between a multi op whose operations all succeed/fail together vs one where they do not. We used the term "Batch" to imply we were batching up operations but there was no guarantee they'd all succeed/fail.
                

[jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version

Posted by "Flavio Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483347#comment-13483347 ] 

Flavio Junqueira commented on ZOOKEEPER-1568:
---------------------------------------------

In my view, the asynchronous API has been designed to address exactly use cases like yours. I don't think you should be suffering any severe penalty by using the asynchronous API. Have you actually tried it and had any issue with it?
                

[jira] [Commented] (ZOOKEEPER-1568) multi should have a non-transaction version

Posted by "Flavio Junqueira (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486471#comment-13486471 ] 

Flavio Junqueira commented on ZOOKEEPER-1568:
---------------------------------------------

Hi guys, here are two concerns I have:

# With this proposal we are now mixing performance and correctness (atomicity) in the multi abstraction. At this point, I'd rather stick only to the correctness aspect.
# The architecture of zookeeper is essentially an execution pipeline, which has been optimized to provide both low latency and high throughput. This proposal goes in the opposite direction and tries to promote the execution of large batches instead of individual operations, at least for some use cases.

In general, if there is an opportunity to improve the performance of the system, then we should pursue it, but at this point it is not even clear how much difference it would actually make, if any. Can we actually make sure that such app-level batching makes a significant difference compared to trunk with respect to performance? And if it does, what exactly is the culprit? Can we fix it without introducing a new API feature?

The point about getChildren capturing fewer events sounds like a "good to have" but not really "must have", but please correct me if I'm wrong. 

  

     
                

[jira] [Updated] (ZOOKEEPER-1568) multi should have a non-transaction version

Posted by "Jimmy Xiang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ZOOKEEPER-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang updated ZOOKEEPER-1568:
-----------------------------------

    Attachment: zk-1568_v1.patch

Attached is the first version of the patch, which is on top of the patches for ZK-1569 and ZK-1592.

This patch supports sync-batch only.  The async version will be in the next patch.
                