Posted to user@cassandra.apache.org by Francisco Reyes <li...@natserv.net> on 2016/01/28 18:06:09 UTC
Are aggregate functions done in parallel?
Does Cassandra parallelize aggregate functions?
I have a new project with potentially 200 to 300 million rows per month
that I need to run aggregates on. I am wondering whether Cassandra would
be a good match.
Re: Are aggregate functions done in parallel?
Posted by DuyHai Doan <do...@gmail.com>.
You can read this: http://www.doanduyhai.com/blog/?p=1876 and this:
http://www.doanduyhai.com/blog/?p=2015
Long story short, UDF and UDA computation in Cassandra is not distributed.
All the values are first retrieved on the coordinator node (to apply the
last-write-wins reconciliation logic) before any UDF/UDA is applied.
The sweet spot for Cassandra UDA is single-partition operations. If you
need to aggregate across multiple partitions, consider using Apache Spark.
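
To make the coordinator point concrete, here is a toy Python model of "reconcile first, aggregate second". It is only a sketch and not Cassandra code: the row layout (key, value, write timestamp) and the names reconcile and coordinator_aggregate are made up for illustration.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical toy row returned by a replica: (row_key, value, write_timestamp).
ReplicaRow = Tuple[str, int, int]


def reconcile(replica_responses: Iterable[List[ReplicaRow]]) -> Dict[str, int]:
    """Keep, for every row key, the value with the highest write timestamp."""
    latest: Dict[str, Tuple[int, int]] = {}  # row_key -> (timestamp, value)
    for response in replica_responses:
        for row_key, value, timestamp in response:
            if row_key not in latest or timestamp > latest[row_key][0]:
                latest[row_key] = (timestamp, value)
    return {row_key: value for row_key, (_, value) in latest.items()}


def coordinator_aggregate(
    replica_responses: Iterable[List[ReplicaRow]],
    aggregate: Callable[[Iterable[int]], int],
) -> int:
    """Reconcile all replica rows on one node, then apply the aggregate once."""
    rows = reconcile(replica_responses)
    return aggregate(rows.values())


# Two replicas of the same partition; replica_b holds a newer write for row1.
replica_a = [("row1", 10, 100), ("row2", 20, 100)]
replica_b = [("row1", 15, 200), ("row2", 20, 100)]

total = coordinator_aggregate([replica_a, replica_b], sum)
print(total)  # 35: the newer row1 value (15) plus row2 (20)
```

In this model, summing on each replica separately would give 30 and 35, and neither partial result tells the coordinator which row1 value is current. That is why every value has to reach a single node before the aggregate runs, and why the cost grows with the number of rows and partitions touched.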