Posted to user@cassandra.apache.org by Jan Algermissen <ja...@nordsc.com> on 2014/10/06 01:03:52 UTC

Exploring Simply Queueing

Hi,

I have put together some thoughts on realizing simple queues with Cassandra.

https://github.com/algermissen/cassandra-ruby-queue

The design is inspired by the (much more sophisticated) Netflix approach [1], but is very much reduced.

Given that I am still a C* newbie, I’d be very glad to hear some thoughts on the design path I took.

Jan

[1] https://github.com/Netflix/astyanax/wiki/Message-Queue

Re: Exploring Queueing

Posted by DuyHai Doan <do...@gmail.com>.
Excellent initiative. I'll have a look tomorrow and give some feedback

On Sun, Oct 12, 2014 at 10:25 PM, Jan Algermissen <
jan.algermissen@nordsc.com> wrote:

> Hi all,
>
> thanks again for the comments.
>
> I have created an (improved?) design, this time using dedicated consumers
> per shard and time-based row expiry, hence without immediate deletes.
>
> https://github.com/algermissen/cassandra-ruby-sharded-workers
>
> As before, comments are welcome.
>
> Jan
>
> On 06 Oct 2014, at 22:50, Robert Coli <rc...@eventbrite.com> wrote:
>
> > On Mon, Oct 6, 2014 at 1:40 PM, Jan Algermissen <
> jan.algermissen@nordsc.com> wrote:
> > Hmm, I was under the impression that issues with old queue state
> disappear after gc_grace_seconds, and that the goal is primarily to keep the
> rows ‘short’ enough that the read-performance impact of tombstones is one
> you can live with in a given use case.
> >
> > The design I pasted a link to does not include specifics regarding
> pruning old history. Yes, you can just delete it, if your system design
> doesn't require replay from the start.
> >
> > =Rob
> >
>
>

Re: Exploring Queueing

Posted by Jan Algermissen <ja...@nordsc.com>.
Hi all,

thanks again for the comments.

I have created an (improved?) design, this time using dedicated consumers per shard and time-based row expiry, hence without immediate deletes.

https://github.com/algermissen/cassandra-ruby-sharded-workers
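
The gist, as an untested sketch (table and column names are illustrative,
not the exact schema from the repo): messages live in time-bucketed shards,
each shard has a dedicated consumer, and rows expire via TTL instead of
being deleted:

CREATE TABLE queue (
    shard   int,        -- each shard has its own dedicated consumer
    bucket  timestamp,  -- time bucket; past buckets are never touched again
    id      timeuuid,
    payload text,
    PRIMARY KEY ((shard, bucket), id)
) WITH default_time_to_live = 86400;  -- rows expire after a day, no deletes

INSERT INTO queue (shard, bucket, id, payload)
VALUES (0, '2014-10-12 22:00:00', now(), 'job-1');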

As before, comments are welcome.

Jan

On 06 Oct 2014, at 22:50, Robert Coli <rc...@eventbrite.com> wrote:

> On Mon, Oct 6, 2014 at 1:40 PM, Jan Algermissen <ja...@nordsc.com> wrote:
> Hmm, I was under the impression that issues with old queue state disappear after gc_grace_seconds, and that the goal is primarily to keep the rows ‘short’ enough that the read-performance impact of tombstones is one you can live with in a given use case.
> 
> The design I pasted a link to does not include specifics regarding pruning old history. Yes, you can just delete it, if your system design doesn't require replay from the start.
> 
> =Rob
> 


Re: Exploring Simply Queueing

Posted by Robert Coli <rc...@eventbrite.com>.
On Mon, Oct 6, 2014 at 1:40 PM, Jan Algermissen <ja...@nordsc.com>
wrote:

> Hmm, I was under the impression that issues with old queue state disappear
> after gc_grace_seconds, and that the goal is primarily to keep the rows
> ‘short’ enough that the read-performance impact of tombstones is one you
> can live with in a given use case.
>

The design I pasted a link to does not include specifics regarding
pruning old history. Yes, you can just delete it, if your system design
doesn't require replay from the start.

=Rob

Re: Exploring Simply Queueing

Posted by Jan Algermissen <ja...@nordsc.com>.
Robert,

On 06 Oct 2014, at 17:50, Robert Coli <rc...@eventbrite.com> wrote:

> In theory they can also be designed such that history is not infinite, which mitigates the buildup of old queue state.
> 

Hmm, I was under the impression that issues with old queue state disappear after gc_grace_seconds, and that the goal is primarily to keep the rows ‘short’ enough that the read-performance impact of tombstones is one you can live with in a given use case.
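
(For reference - and purely as an illustration, 'queue' being a placeholder
table name - gc_grace_seconds is a per-table setting, so a table holding
short-lived queue data could lower it from the default of 10 days:

ALTER TABLE queue WITH gc_grace_seconds = 3600;

provided repairs run more often than that window.)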

Is that understanding wrong?

Jan



Re: Exploring Simply Queueing

Posted by Robert Coli <rc...@eventbrite.com>.
On Mon, Oct 6, 2014 at 8:30 AM, Minh Do <md...@netflix.com> wrote:

> Just to let you know: if you base your implementation on Netflix's queue
> recipe, there are many issues with it.
>
> In general, we don't advise people to use that recipe, so I suggest you
> save your time by not going down that same route again.
>

I +1 people who are saying that this is not a strong case for Cassandra. I
also agree that if you want to do this, you should consider other
approaches.

However, depending on the nature of the queue (low total volume, etc.),
things like this can work just fine in practice:

https://engineering.eventbrite.com/replayable-pubsub-queues-with-cassandra-and-zookeeper/

In theory they can also be designed such that history is not infinite,
which mitigates the buildup of old queue state.

=Rob
http://twitter.com/rcolidba

Re: Exploring Simply Queueing

Posted by Minh Do <md...@netflix.com>.
Hi Jan,

Both Chris and Shane say what I believe is the correct thinking.

Just to let you know: if you base your implementation on Netflix's queue
recipe, there are many issues with it.

In general, we don't advise people to use that recipe, so I suggest you
save your time by not going down that same route again.


Minh


On Mon, Oct 6, 2014 at 7:34 AM, Shane Hansen <sh...@gmail.com> wrote:

> Sorry if I'm hijacking the conversation, but why in the world would you
> want
> to implement a queue on top of Cassandra? It seems like using a proper
> queuing service
> would make your life a lot easier.
>
> That being said, there might be a better way to play to the strengths of
> C*. Ideally everything you do
> is append only with few deletes or updates. So an interesting way to
> implement a queue might be
> to do one insert to put the job in the queue and another insert to mark
> the job as done or in process
> or whatever. This would also give you the benefit of being able to replay
> the state of the queue.
>
>
> On Mon, Oct 6, 2014 at 12:57 AM, Jan Algermissen <
> jan.algermissen@nordsc.com> wrote:
>
>> Chris,
>>
>> thanks for taking a look.
>>
>> On 06 Oct 2014, at 04:44, Chris Lohfink <cl...@blackbirdit.com> wrote:
>>
>> > It appears you are aware of the tombstone effect that leads people to
>> label this an anti-pattern.  Without "due" or some time-based value being
>> part of the partition key, you will still get a lot of buildup.  You
>> only have 1 partition per shard, which just linearly decreases the
>> tombstones.  That isn't likely to be enough to really help in a situation
>> of high queue throughput, especially with the default of 4 shards.
>>
>> Yes, dealing with the tombstone effect is the whole point. The workloads
>> I have to deal with are not really high throughput; it is unlikely
>> we’ll ever reach multiple messages per second. The emphasis is also more on
>> coordinating producer and consumer than on high-volume capacity problems.
>>
>> Your comment seems to suggest including larger time frames (e.g. the
>> due-hour) in the partition keys and using the current time to select the
>> active partitions (e.g. the shards of the hour). Once an hour has passed,
>> the corresponding shards will never be touched again.
>>
>> Am I understanding this correctly?
>>
>> >
>> > You may want to consider switching to LCS from the default STCS, since
>> you are re-writing the same partitions a lot. It will still use STCS in L0,
>> so in high write/delete scenarios with a low enough gc_grace, when data
>> never gets higher than L1 the write throughput will be about the same. In
>> scenarios where you get more data, I suspect LCS will shine by reducing the
>> number of obsolete tombstones.  It would be hard to identify the difference
>> in small tests, I think.
>>
>> Thanks, I’ll try to explore the various effects.
>>
>> >
>> > What's the plan to prevent two consumers from reading the same message
>> off of a queue?  You mention in the docs that you will address it at a later
>> point in time, but it's kind of a biggie.  A big lock & batch reads, like
>> the astyanax recipe?
>>
>> I have included a static column per shard to act as a lock (the ’lock’
>> column in the examples) in combination with conditional updates.
>>
>> I must admit, I have not quite understood what Netflix is doing in terms
>> of coordination - but since performance isn’t our concern, CAS should do
>> fine, I guess(?)
>>
>> Thanks again,
>>
>> Jan
>>
>>
>> >
>> > ---
>> > Chris Lohfink
>> >
>> >
>> > On Oct 5, 2014, at 6:03 PM, Jan Algermissen <ja...@nordsc.com>
>> wrote:
>> >
>> >> Hi,
>> >>
>> >> I have put together some thoughts on realizing simple queues with
>> Cassandra.
>> >>
>> >> https://github.com/algermissen/cassandra-ruby-queue
>> >>
>> >> The design is inspired by the (much more sophisticated) Netflix
>> approach [1], but is very much reduced.
>> >>
>> >> Given that I am still a C* newbie, I’d be very glad to hear some
>> thoughts on the design path I took.
>> >>
>> >> Jan
>> >>
>> >> [1] https://github.com/Netflix/astyanax/wiki/Message-Queue
>> >
>>
>>
>

Re: Exploring Simply Queueing

Posted by Ranjib Dey <de...@gmail.com>.
I want to answer the first question, why one might use Cassandra as a
queuing solution:
 - it's the only open-source distributed persistence layer (i.e. no SPOF)
that you can run over a WAN, with LAN/WAN-specific quorum controls
I know it's suboptimal, as the deletions impose additional
compaction/repair penalties, but there is no other solution I am aware of.
Most AMQP solutions are broker-based and clustering is a pain, while things
like Riak only support WAN-based clusters in their commercial offering. I
would love to know about other alternatives.

And thanks for sharing the Ruby-based priority queue prototype; it helps
people like me (sysadmin :-) ) explore these concepts better.

cheers
ranjib

On Mon, Oct 6, 2014 at 1:35 PM, Jan Algermissen <ja...@nordsc.com>
wrote:

> Shane,
>
> On 06 Oct 2014, at 16:34, Shane Hansen <sh...@gmail.com> wrote:
>
> > Sorry if I'm hijacking the conversation, but why in the world would you
> want
> > to implement a queue on top of Cassandra? It seems like using a proper
> queuing service
> > would make your life a lot easier.
>
> Agreed - however, the use case simply does not justify the additional
> operational overhead.
>
> >
> > That being said, there might be a better way to play to the strengths of
> C*. Ideally everything you do
> > is append only with few deletes or updates. So an interesting way to
> implement a queue might be
> > to do one insert to put the job in the queue and another insert to mark
> the job as done or in process
> > or whatever. This would also give you the benefit of being able to
> replay the state of the queue.
>
> Thanks, I’ll try that, too.
>
> Jan
>
>
> >
> >
> > On Mon, Oct 6, 2014 at 12:57 AM, Jan Algermissen <
> jan.algermissen@nordsc.com> wrote:
> > Chris,
> >
> > thanks for taking a look.
> >
> > On 06 Oct 2014, at 04:44, Chris Lohfink <cl...@blackbirdit.com>
> wrote:
> >
> > > It appears you are aware of the tombstone effect that leads people to
> label this an anti-pattern.  Without "due" or some time-based value being
> part of the partition key, you will still get a lot of buildup.  You
> only have 1 partition per shard, which just linearly decreases the
> tombstones.  That isn't likely to be enough to really help in a situation
> of high queue throughput, especially with the default of 4 shards.
> >
> > Yes, dealing with the tombstone effect is the whole point. The workloads
> I have to deal with are not really high throughput; it is unlikely
> we’ll ever reach multiple messages per second. The emphasis is also more on
> coordinating producer and consumer than on high-volume capacity problems.
> >
> > Your comment seems to suggest including larger time frames (e.g. the
> due-hour) in the partition keys and using the current time to select the
> active partitions (e.g. the shards of the hour). Once an hour has passed,
> the corresponding shards will never be touched again.
> >
> > Am I understanding this correctly?
> >
> > >
> > > You may want to consider switching to LCS from the default STCS, since
> you are re-writing the same partitions a lot. It will still use STCS in L0,
> so in high write/delete scenarios with a low enough gc_grace, when data
> never gets higher than L1 the write throughput will be about the same. In
> scenarios where you get more data, I suspect LCS will shine by reducing the
> number of obsolete tombstones.  It would be hard to identify the difference
> in small tests, I think.
> >
> > Thanks, I’ll try to explore the various effects.
> >
> > >
> > > What's the plan to prevent two consumers from reading the same message
> off of a queue?  You mention in the docs that you will address it at a later
> point in time, but it's kind of a biggie.  A big lock & batch reads, like
> the astyanax recipe?
> >
> > I have included a static column per shard to act as a lock (the ’lock’
> column in the examples) in combination with conditional updates.
> >
> > I must admit, I have not quite understood what Netflix is doing in terms
> of coordination - but since performance isn’t our concern, CAS should do
> fine, I guess(?)
> >
> > Thanks again,
> >
> > Jan
> >
> >
> > >
> > > ---
> > > Chris Lohfink
> > >
> > >
> > > On Oct 5, 2014, at 6:03 PM, Jan Algermissen <
> jan.algermissen@nordsc.com> wrote:
> > >
> > >> Hi,
> > >>
> > >> I have put together some thoughts on realizing simple queues with
> Cassandra.
> > >>
> > >> https://github.com/algermissen/cassandra-ruby-queue
> > >>
> > >> The design is inspired by the (much more sophisticated) Netflix
> approach [1], but is very much reduced.
> > >>
> > >> Given that I am still a C* newbie, I’d be very glad to hear some
> thoughts on the design path I took.
> > >>
> > >> Jan
> > >>
> > >> [1] https://github.com/Netflix/astyanax/wiki/Message-Queue
> > >
> >
> >
>
>

Re: Exploring Simply Queueing

Posted by Jan Algermissen <ja...@nordsc.com>.
Shane,

On 06 Oct 2014, at 16:34, Shane Hansen <sh...@gmail.com> wrote:

> Sorry if I'm hijacking the conversation, but why in the world would you want
> to implement a queue on top of Cassandra? It seems like using a proper queuing service
> would make your life a lot easier.

Agreed - however, the use case simply does not justify the additional operational overhead.

> 
> That being said, there might be a better way to play to the strengths of C*. Ideally everything you do
> is append only with few deletes or updates. So an interesting way to implement a queue might be
> to do one insert to put the job in the queue and another insert to mark the job as done or in process
> or whatever. This would also give you the benefit of being able to replay the state of the queue.

Thanks, I’ll try that, too.

Jan


> 
> 
> On Mon, Oct 6, 2014 at 12:57 AM, Jan Algermissen <ja...@nordsc.com> wrote:
> Chris,
> 
> thanks for taking a look.
> 
> On 06 Oct 2014, at 04:44, Chris Lohfink <cl...@blackbirdit.com> wrote:
> 
> > It appears you are aware of the tombstone effect that leads people to label this an anti-pattern.  Without "due" or some time-based value being part of the partition key, you will still get a lot of buildup.  You only have 1 partition per shard, which just linearly decreases the tombstones.  That isn't likely to be enough to really help in a situation of high queue throughput, especially with the default of 4 shards.
> 
> Yes, dealing with the tombstone effect is the whole point. The workloads I have to deal with are not really high throughput; it is unlikely we’ll ever reach multiple messages per second. The emphasis is also more on coordinating producer and consumer than on high-volume capacity problems.
> 
> Your comment seems to suggest including larger time frames (e.g. the due-hour) in the partition keys and using the current time to select the active partitions (e.g. the shards of the hour). Once an hour has passed, the corresponding shards will never be touched again.
> 
> Am I understanding this correctly?
> 
> >
> > You may want to consider switching to LCS from the default STCS, since you are re-writing the same partitions a lot. It will still use STCS in L0, so in high write/delete scenarios with a low enough gc_grace, when data never gets higher than L1 the write throughput will be about the same. In scenarios where you get more data, I suspect LCS will shine by reducing the number of obsolete tombstones.  It would be hard to identify the difference in small tests, I think.
> 
> Thanks, I’ll try to explore the various effects.
> 
> >
> > What's the plan to prevent two consumers from reading the same message off of a queue?  You mention in the docs that you will address it at a later point in time, but it's kind of a biggie.  A big lock & batch reads, like the astyanax recipe?
> 
> I have included a static column per shard to act as a lock (the ’lock’ column in the examples) in combination with conditional updates.
> 
> I must admit, I have not quite understood what Netflix is doing in terms of coordination - but since performance isn’t our concern, CAS should do fine, I guess(?)
> 
> Thanks again,
> 
> Jan
> 
> 
> >
> > ---
> > Chris Lohfink
> >
> >
> > On Oct 5, 2014, at 6:03 PM, Jan Algermissen <ja...@nordsc.com> wrote:
> >
> >> Hi,
> >>
> >> I have put together some thoughts on realizing simple queues with Cassandra.
> >>
> >> https://github.com/algermissen/cassandra-ruby-queue
> >>
> >> The design is inspired by the (much more sophisticated) Netflix approach [1], but is very much reduced.
> >>
> >> Given that I am still a C* newbie, I’d be very glad to hear some thoughts on the design path I took.
> >>
> >> Jan
> >>
> >> [1] https://github.com/Netflix/astyanax/wiki/Message-Queue
> >
> 
> 


Re: Exploring Simply Queueing

Posted by Shane Hansen <sh...@gmail.com>.
Sorry if I'm hijacking the conversation, but why in the world would you want
to implement a queue on top of Cassandra? It seems like using a proper
queuing service
would make your life a lot easier.

That being said, there might be a better way to play to the strengths of
C*. Ideally everything you do
is append only with few deletes or updates. So an interesting way to
implement a queue might be
to do one insert to put the job in the queue and another insert to mark the
job as done or in process
or whatever. This would also give you the benefit of being able to replay
the state of the queue.
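
For illustration, that two-insert idea might look something like this
(made-up names, not code from the thread):

CREATE TABLE job_events (
    queue   text,
    job_id  timeuuid,
    at      timestamp,
    event   text,      -- 'enqueued', 'in_process', 'done', ...
    PRIMARY KEY (queue, job_id, at)
);

-- One insert puts the job in the queue ...
INSERT INTO job_events (queue, job_id, at, event)
VALUES ('q1', now(), dateof(now()), 'enqueued');

-- ... and a second insert with the same job_id (read back by the
-- consumer) marks it done. Nothing is deleted, so the state of the
-- queue stays replayable.
INSERT INTO job_events (queue, job_id, at, event)
VALUES ('q1', 8a9f1e30-4d5a-11e4-916c-0800200c9a66, dateof(now()), 'done');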


On Mon, Oct 6, 2014 at 12:57 AM, Jan Algermissen <jan.algermissen@nordsc.com
> wrote:

> Chris,
>
> thanks for taking a look.
>
> On 06 Oct 2014, at 04:44, Chris Lohfink <cl...@blackbirdit.com> wrote:
>
> > It appears you are aware of the tombstone effect that leads people to
> label this an anti-pattern.  Without "due" or some time-based value being
> part of the partition key, you will still get a lot of buildup.  You
> only have 1 partition per shard, which just linearly decreases the
> tombstones.  That isn't likely to be enough to really help in a situation
> of high queue throughput, especially with the default of 4 shards.
>
> Yes, dealing with the tombstone effect is the whole point. The workloads
> I have to deal with are not really high throughput; it is unlikely we’ll
> ever reach multiple messages per second. The emphasis is also more on
> coordinating producer and consumer than on high-volume capacity problems.
>
> Your comment seems to suggest including larger time frames (e.g. the
> due-hour) in the partition keys and using the current time to select the
> active partitions (e.g. the shards of the hour). Once an hour has passed,
> the corresponding shards will never be touched again.
>
> Am I understanding this correctly?
>
> >
> > You may want to consider switching to LCS from the default STCS, since
> you are re-writing the same partitions a lot. It will still use STCS in L0,
> so in high write/delete scenarios with a low enough gc_grace, when data
> never gets higher than L1 the write throughput will be about the same. In
> scenarios where you get more data, I suspect LCS will shine by reducing the
> number of obsolete tombstones.  It would be hard to identify the difference
> in small tests, I think.
>
> Thanks, I’ll try to explore the various effects.
>
> >
> > What's the plan to prevent two consumers from reading the same message off
> of a queue?  You mention in the docs that you will address it at a later
> point in time, but it's kind of a biggie.  A big lock & batch reads, like
> the astyanax recipe?
>
> I have included a static column per shard to act as a lock (the ’lock’
> column in the examples) in combination with conditional updates.
>
> I must admit, I have not quite understood what Netflix is doing in terms of
> coordination - but since performance isn’t our concern, CAS should do fine,
> I guess(?)
>
> Thanks again,
>
> Jan
>
>
> >
> > ---
> > Chris Lohfink
> >
> >
> > On Oct 5, 2014, at 6:03 PM, Jan Algermissen <ja...@nordsc.com>
> wrote:
> >
> >> Hi,
> >>
> >> I have put together some thoughts on realizing simple queues with
> Cassandra.
> >>
> >> https://github.com/algermissen/cassandra-ruby-queue
> >>
> >> The design is inspired by the (much more sophisticated) Netflix
> approach [1], but is very much reduced.
> >>
> >> Given that I am still a C* newbie, I’d be very glad to hear some
> thoughts on the design path I took.
> >>
> >> Jan
> >>
> >> [1] https://github.com/Netflix/astyanax/wiki/Message-Queue
> >
>
>

Re: Exploring Simply Queueing

Posted by Jan Algermissen <ja...@nordsc.com>.
Chris,

thanks for taking a look.

On 06 Oct 2014, at 04:44, Chris Lohfink <cl...@blackbirdit.com> wrote:

> It appears you are aware of the tombstone effect that leads people to label this an anti-pattern.  Without "due" or some time-based value being part of the partition key, you will still get a lot of buildup.  You only have 1 partition per shard, which just linearly decreases the tombstones.  That isn't likely to be enough to really help in a situation of high queue throughput, especially with the default of 4 shards.

Yes, dealing with the tombstone effect is the whole point. The workloads I have to deal with are not really high throughput; it is unlikely we’ll ever reach multiple messages per second. The emphasis is also more on coordinating producer and consumer than on high-volume capacity problems.

Your comment seems to suggest including larger time frames (e.g. the due-hour) in the partition keys and using the current time to select the active partitions (e.g. the shards of the hour). Once an hour has passed, the corresponding shards will never be touched again.

Am I understanding this correctly?
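
If so, something along these lines is what I picture (an untested sketch,
with made-up names):

CREATE TABLE queue (
    due_hour timestamp,  -- e.g. '2014-10-06 22:00:00', truncated to the hour
    shard    int,        -- 0..3 with the default of 4 shards
    id       timeuuid,
    payload  text,
    PRIMARY KEY ((due_hour, shard), id)
);

-- Consumers use the current time to select only the active partitions;
-- the shards of past hours are never read again.
SELECT id, payload FROM queue
WHERE due_hour = '2014-10-06 22:00:00' AND shard = 0;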

> 
> You may want to consider switching to LCS from the default STCS, since you are re-writing the same partitions a lot. It will still use STCS in L0, so in high write/delete scenarios with a low enough gc_grace, when data never gets higher than L1 the write throughput will be about the same. In scenarios where you get more data, I suspect LCS will shine by reducing the number of obsolete tombstones.  It would be hard to identify the difference in small tests, I think.

Thanks, I’ll try to explore the various effects.

> 
> What's the plan to prevent two consumers from reading the same message off of a queue?  You mention in the docs that you will address it at a later point in time, but it's kind of a biggie.  A big lock & batch reads, like the astyanax recipe?

I have included a static column per shard to act as a lock (the ’lock’ column in the examples) in combination with conditional updates.

I must admit, I have not quite understood what Netflix is doing in terms of coordination - but since performance isn’t our concern, CAS should do fine, I guess(?)
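
Roughly like this (an illustrative sketch, not the exact schema from the
repo - the lock TTL is my addition so a crashed consumer releases the
shard eventually):

CREATE TABLE queue (
    shard   int,
    id      timeuuid,
    payload text,
    lock    text static,  -- one value shared by the whole shard partition
    PRIMARY KEY (shard, id)
);

-- Only one such conditional update (CAS) can win per shard.
UPDATE queue USING TTL 30
SET lock = 'consumer-42'
WHERE shard = 0
IF lock = null;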

Thanks again,

Jan


> 
> ---
> Chris Lohfink
> 
> 
> On Oct 5, 2014, at 6:03 PM, Jan Algermissen <ja...@nordsc.com> wrote:
> 
>> Hi,
>> 
>> I have put together some thoughts on realizing simple queues with Cassandra.
>> 
>> https://github.com/algermissen/cassandra-ruby-queue
>> 
>> The design is inspired by the (much more sophisticated) Netflix approach [1], but is very much reduced.
>> 
>> Given that I am still a C* newbie, I’d be very glad to hear some thoughts on the design path I took.
>> 
>> Jan
>> 
>> [1] https://github.com/Netflix/astyanax/wiki/Message-Queue
> 


Re: Exploring Simply Queueing

Posted by Chris Lohfink <cl...@blackbirdit.com>.
It appears you are aware of the tombstone effect that leads people to label this an anti-pattern.  Without "due" or some time-based value being part of the partition key, you will still get a lot of buildup.  You only have 1 partition per shard, which just linearly decreases the tombstones.  That isn't likely to be enough to really help in a situation of high queue throughput, especially with the default of 4 shards.

You may want to consider switching to LCS from the default STCS, since you are re-writing the same partitions a lot. It will still use STCS in L0, so in high write/delete scenarios with a low enough gc_grace, when data never gets higher than L1 the write throughput will be about the same. In scenarios where you get more data, I suspect LCS will shine by reducing the number of obsolete tombstones.  It would be hard to identify the difference in small tests, I think.
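
For reference, the switch is a one-liner ('queue' being a placeholder table
name):

ALTER TABLE queue
WITH compaction = { 'class' : 'LeveledCompactionStrategy' };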

What's the plan to prevent two consumers from reading the same message off of a queue?  You mention in the docs that you will address it at a later point in time, but it's kind of a biggie.  A big lock & batch reads, like the astyanax recipe?

---
Chris Lohfink


On Oct 5, 2014, at 6:03 PM, Jan Algermissen <ja...@nordsc.com> wrote:

> Hi,
> 
> I have put together some thoughts on realizing simple queues with Cassandra.
> 
> https://github.com/algermissen/cassandra-ruby-queue
> 
> The design is inspired by the (much more sophisticated) Netflix approach [1], but is very much reduced.
> 
> Given that I am still a C* newbie, I’d be very glad to hear some thoughts on the design path I took.
> 
> Jan
> 
> [1] https://github.com/Netflix/astyanax/wiki/Message-Queue