Posted to user@cassandra.apache.org by Dinesh Shanbhag <di...@isanasystems.com> on 2015/12/18 13:57:12 UTC

Cassandra 3.1 - Aggregation query failure


I am trying out Aggregations in a single node Cassandra 3.1 
installation.  The node has 4GB RAM.  The table being aggregated on 
contains ~450000 rows.  It contains information on US domestic flights 
for a single month (from 
http://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time).

CREATE AGGREGATE flightdata.late_flights(text, decimal)
     SFUNC state_late_flights
     STYPE map<text, frozen<tuple<int, int>>>
     INITCOND {};

The late_flights aggregation function uses a state_late_flights() 
User-Defined Function that maintains a map of uniquecarrier to 
tuple<int,int>.  The first int in the tuple represents delayed flights 
of the corresponding uniquecarrier.  The second int represents total 
flights of the uniquecarrier.

This aggregation query on a subset of the days of the month works:

    cqlsh:flightdata> select late_flights(uniquecarrier, depdel15) from
    flightsbydate where flightdate in ('2015-09-15', '2015-09-16',
    '2015-09-17', '2015-09-18', '2015-09-19', '2015-09-20', '2015-09-21');

      flightdata.late_flights(uniquecarrier, depdel15)
    -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      {'AA': (2395, 17138), 'AS': (234, 3308), 'B6': (703, 4832), 'DL':
    (1452, 17311), 'EV': (1028, 10502), 'F9': (221, 1837), 'HA': (79,
    1414), 'MQ': (892, 4926), 'NK': (535, 2300), 'OO': (1539, 11299),
    'UA': (1422, 9792), 'VX': (181, 1209), 'WN': (3446, 23659)}

    (1 rows)

    Warnings :
    Aggregation query used on multiple partition keys (IN restriction)
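
Each tuple in the result above is (delayed flights, total flights) for one carrier, so the late fraction is the first number divided by the second. A minimal Java sketch of that arithmetic, using three pairs copied from the output; the class and method names are made up for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper, not part of Cassandra or the original post.
final class LateFraction {

    // Fraction of flights counted as late; returns 0 when there are no flights.
    static double fraction(int late, int total) {
        return total == 0 ? 0.0 : (double) late / total;
    }

    public static void main(String[] args) {
        // A few (late, total) pairs copied from the query output above.
        Map<String, int[]> result = new LinkedHashMap<>();
        result.put("AA", new int[] {2395, 17138});
        result.put("DL", new int[] {1452, 17311});
        result.put("WN", new int[] {3446, 23659});

        for (Map.Entry<String, int[]> e : result.entrySet()) {
            int[] counts = e.getValue();
            System.out.printf("%s: %.1f%% late%n",
                    e.getKey(), 100.0 * fraction(counts[0], counts[1]));
        }
    }
}
```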


However, the aggregation on all ~450000 rows always fails, sometimes 
immediately, sometimes after 30-60 seconds:

    cqlsh:flightdata> select late_flights(uniquecarrier, depdel15) from
    flightsbydate;

    Traceback (most recent call last):
       File "CassandraInstall-3.1/bin/cqlsh.py", line 1258, in
    perform_simple_statement
         result = future.result()
       File
    "/home/wpl/CassandraInstall-3.1/bin/../lib/cassandra-driver-internal-only-3.0.0-6af642d.zip/cassandra-driver-3.0.0-6af642d/cassandra/cluster.py",
    line 3122, in result
         raise self._final_exception
    FunctionFailure: code=1400 [User Defined Function failure]
    message="execution of 'flightdata.state_late_flights[map<text,
    frozen<tuple<int, int>>>, text, decimal]' failed:
    java.security.AccessControlException: access denied
    ("java.io.FilePermission"
    "/home/wpl/CassandraInstall-3.1/conf/logback.xml" "read")"


While this query runs, CPU utilization is 100%-120% and peak RAM usage 
is less than 3.5GB.

Just in case it is useful, the state_late_flights User-Defined function:

    cqlsh:flightdata> describe function state_late_flights;

    CREATE FUNCTION flightdata.state_late_flights(state map<text,
    frozen<tuple<int, int>>>, flid text, fldelay decimal)
         CALLED ON NULL INPUT
         RETURNS map<text, frozen<tuple<int, int>>>
         LANGUAGE java
         AS $$com.datastax.driver.core.TupleType tt =
    com.datastax.driver.core.TupleType.of(com.datastax.driver.core.ProtocolVersion.NEWEST_SUPPORTED,
    com.datastax.driver.core.CodecRegistry.DEFAULT_INSTANCE,
    com.datastax.driver.core.DataType.cint(),
    com.datastax.driver.core.DataType.cint());
    com.datastax.driver.core.TupleValue tv = tt.newValue(); tv.setInt(0,
    0); tv.setInt(1, 1); if (flid == null) { state.put("EMPTY", tv);
    return state; } if (state.get(flid) != null) {  tv =
    (com.datastax.driver.core.TupleValue) state.get(flid);  tv.setInt(1,
    tv.getInt(1) + 1); if
    (fldelay.compareTo(java.math.BigDecimal.valueOf(0)) == 1) {
    tv.setInt(0, tv.getInt(0) + 1); } } state.put(flid, tv); return
    state;$$;
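
The accumulation described above can be sketched in plain Java, outside Cassandra and without the driver classes. This is only an illustration under stated assumptions, not the posted function: the tuple is modelled as an int[] of {delayed, total}, the names LateFlightsSketch and update are hypothetical, and two behaviours are assumptions about the intent rather than a copy of the posted code. Here the first flight seen for a carrier also has its delay counted, and a null delay is treated as "not delayed" instead of being dereferenced.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of the per-carrier accumulation; not the Cassandra UDF itself.
final class LateFlightsSketch {

    // state maps carrier code -> {delayedFlights, totalFlights}.
    static Map<String, int[]> update(Map<String, int[]> state, String carrier, BigDecimal delay15) {
        // Like the posted function, rows with a null carrier go under "EMPTY".
        String key = (carrier == null) ? "EMPTY" : carrier;
        int[] counts = state.computeIfAbsent(key, k -> new int[] {0, 0});

        counts[1]++; // every row is one flight

        // Assumption: a null or non-positive delay flag means "not delayed".
        if (delay15 != null && delay15.signum() > 0) {
            counts[0]++;
        }
        return state;
    }

    public static void main(String[] args) {
        Map<String, int[]> state = new HashMap<>();
        update(state, "AA", new BigDecimal("1.00"));
        update(state, "AA", new BigDecimal("0.00"));
        update(state, "DL", null);
        for (Map.Entry<String, int[]> e : state.entrySet()) {
            System.out.println(e.getKey() + ": delayed=" + e.getValue()[0]
                    + " total=" + e.getValue()[1]);
        }
    }
}
```

Because the state holds one small entry per carrier, it stays tiny no matter how many rows are folded into it.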


What should I check to investigate this further?
Thanks,
Dinesh.

RE: Cassandra 3.1 - Aggregation query failure

Posted by SE...@homedepot.com.

It shouldn’t be called an aggregate. That is more like a user defined function. If you are correct, the term “aggregate” will lead people to do “bad things” – just like secondary indexes. I think the dev team needs a naming expert.

Sean Durity – Lead Cassandra Admin

From: Robert Stupp [mailto:snazy@snazy.de]
Sent: Wednesday, December 23, 2015 12:15 PM
To: user@cassandra.apache.org
Cc: dinesh.shanbhag@isanasystems.com
Subject: Re: Cassandra 3.1 - Aggregation query failure

Well, the usual access goal for queries in C* is “one partition per query” - maybe a handful of partitions in some cases.
That does not differ for aggregates since the read path is still the same.

Aggregates in C* are meant to move some computation (for example on the data in a time-frame materialized in a partition) to the coordinator and reduce the amount of data pumped through the wire.

For queries that span huge datasets, Spark is the easiest way to go.

On 23 Dec 2015, at 18:02, <SE...@homedepot.com> wrote:

An aggregate only within a partition? That is rather useless and shouldn’t be called an aggregate.

I am hoping the functionality can be used to support at least “normal” types of aggregates like count, sum, avg, etc.

Sean Durity – Lead Cassandra Admin

From: Jonathan Haddad [mailto:jon@jonhaddad.com]
Sent: Monday, December 21, 2015 2:50 PM
To: user@cassandra.apache.org; dinesh.shanbhag@isanasystems.com
Subject: Re: Cassandra 3.1 - Aggregation query failure

Even if you get this to work for now, I really recommend using a different tool, like Spark.  Personally I wouldn't use UDAs outside of a single partition.

On Mon, Dec 21, 2015 at 1:50 AM Dinesh Shanbhag <di...@isanasystems.com> wrote:

Thanks for the pointers!  I edited jvm.options in
$CASSANDRA_HOME/conf/jvm.options to increase -Xms and -Xmx to 1536M.
The result is the same.

And in $CASSANDRA_HOME/logs/system.log, grep GC system.log produces this
(when jvm.options had not been changed):

INFO  [Service Thread] 2015-12-18 15:26:31,668 GCInspector.java:284 -
ConcurrentMarkSweep GC in 296ms.  CMS Old Gen: 18133664 -> 15589256;
Code Cache: 5650880 -> 8122304; Compressed Class Space: 2530064 ->
3345624; Metaspace: 21314000 -> 28040984; Par Eden Space: 7019256 ->
164070848;
INFO  [Service Thread] 2015-12-18 15:48:39,736 GCInspector.java:284 -
ConcurrentMarkSweep GC in 379ms.  CMS Old Gen: 649257416 -> 84190176;
Code Cache: 20772224 -> 20726848; Par Eden Space: 2191408 -> 52356736;
Par Survivor Space: 2378448 -> 2346840
INFO  [Service Thread] 2015-12-18 15:58:35,118 GCInspector.java:284 -
ConcurrentMarkSweep GC in 406ms.  CMS Old Gen: 648847808 -> 86954856;
Code Cache: 21182080 -> 21188032; Par Eden Space: 1815696 -> 71525744;
Par Survivor Space: 2388648 -> 2364696
INFO  [Service Thread] 2015-12-18 16:13:45,821 GCInspector.java:284 -
ConcurrentMarkSweep GC in 211ms.  CMS Old Gen: 648343768 -> 73135720;
Par Eden Space: 3224880 -> 7957464; Par Survivor Space: 2379912 -> 2414520
INFO  [Service Thread] 2015-12-18 16:32:46,419 GCInspector.java:284 -
ConcurrentMarkSweep GC in 387ms.  CMS Old Gen: 648476072 -> 68888832;
Par Eden Space: 2006624 -> 64263360; Par Survivor Space: 2403792 -> 2387664
INFO  [Service Thread] 2015-12-18 16:42:38,648 GCInspector.java:284 -
ConcurrentMarkSweep GC in 365ms.  CMS Old Gen: 649126336 -> 137359384;
Code Cache: 22972224 -> 22979840; Metaspace: 41374464 -> 41375104; Par
Eden Space: 4286080 -> 154449480; Par Survivor Space: 1575440 -> 2310768
INFO  [Service Thread] 2015-12-18 16:51:57,538 GCInspector.java:284 -
ConcurrentMarkSweep GC in 322ms.  CMS Old Gen: 648338928 -> 79783856;
Par Eden Space: 2058968 -> 56931312; Par Survivor Space: 2342760 -> 2400336
INFO  [Service Thread] 2015-12-18 17:02:49,543 GCInspector.java:284 -
ConcurrentMarkSweep GC in 212ms.  CMS Old Gen: 648702008 -> 122954344;
Par Eden Space: 3269032 -> 61433328; Par Survivor Space: 2395824 -> 3448760
INFO  [Service Thread] 2015-12-18 17:11:54,090 GCInspector.java:284 -
ConcurrentMarkSweep GC in 306ms.  CMS Old Gen: 648748576 -> 70965096;
Par Eden Space: 2174840 -> 27074432; Par Survivor Space: 2365992 -> 2373984
INFO  [Service Thread] 2015-12-18 17:22:28,949 GCInspector.java:284 -
ConcurrentMarkSweep GC in 350ms.  CMS Old Gen: 648243024 -> 90897272;
Par Eden Space: 2150168 -> 43487192; Par Survivor Space: 2401872 -> 2410728
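
The before and after figures in these GCInspector entries are easier to compare once extracted. From the second entry on, CMS Old Gen is reported at about 648 million bytes before each collection and between roughly 69 and 137 million bytes after it. A small Java sketch that pulls those two numbers out of one entry follows; the class name is hypothetical, the regular expression is only an assumption based on the lines shown here, and the input must first have the mail client's line wrapping undone.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative parser for the log lines quoted above; not a Cassandra tool.
final class GcLineSketch {

    // Matches the "CMS Old Gen: <before> -> <after>" part of a GCInspector message.
    private static final Pattern OLD_GEN =
            Pattern.compile("CMS Old Gen: (\\d+) -> (\\d+)");

    // Returns {before, after} in bytes, or null if the line has no CMS Old Gen entry.
    static long[] oldGen(String logLine) {
        Matcher m = OLD_GEN.matcher(logLine);
        if (!m.find()) {
            return null;
        }
        return new long[] {Long.parseLong(m.group(1)), Long.parseLong(m.group(2))};
    }

    public static void main(String[] args) {
        // One entry from the log above, joined back into a single line.
        String line = "INFO  [Service Thread] 2015-12-18 15:48:39,736 GCInspector.java:284 - "
                + "ConcurrentMarkSweep GC in 379ms.  CMS Old Gen: 649257416 -> 84190176; "
                + "Code Cache: 20772224 -> 20726848; Par Eden Space: 2191408 -> 52356736; "
                + "Par Survivor Space: 2378448 -> 2346840";
        long[] g = oldGen(line);
        System.out.printf("old gen before=%d after=%d freed=%d bytes%n",
                g[0], g[1], g[0] - g[1]);
    }
}
```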

After modifying jvm.options to increase -Xms & -Xmx (to 1536M):

INFO  [Service Thread] 2015-12-21 11:39:24,918 GCInspector.java:284 -
ConcurrentMarkSweep GC in 342ms.  CMS Old Gen: 18579136 -> 16305144;
Code Cache: 8600128 -> 10898752; Compressed Class Space: 3431288 ->
3761496; Metaspace: 29551832 -> 33307352; Par Eden Space: 4822000 ->
94853272;
INFO  [Service Thread] 2015-12-21 11:39:30,710 GCInspector.java:284 -
ParNew GC in 206ms.  CMS Old Gen: 22932208 -> 41454520; Par Eden Space:
167772160 -> 0; Par Survivor Space: 13144872 -> 20971520
INFO  [Service Thread] 2015-12-21 13:08:14,922 GCInspector.java:284 -
ConcurrentMarkSweep GC in 468ms.  CMS Old Gen: 21418016 -> 16146528;
Code Cache: 11693888 -> 11744704; Compressed Class Space: 4331224 ->
4344192; Metaspace: 37191144 -> 37249960; Par Eden Space: 146089224 ->
148476848;
INFO  [Service Thread] 2015-12-21 13:08:53,068 GCInspector.java:284 -
ParNew GC in 216ms.  CMS Old Gen: 16146528 -> 26858568; Par Eden Space:
167772160 -> 0;

Earlier the node had OpenJDK 8.  For today's tests I installed and used
Oracle Java 8.

Do the above messages provide any clue? Or is there any debug logging I
can enable to investigate further?
Thanks,
Dinesh.

On 12/18/2015 9:56 PM, Tyler Hobbs wrote:
>
> On Fri, Dec 18, 2015 at 9:17 AM, DuyHai Doan <do...@gmail.com> wrote:
>
>     Cassandra will perform a full table scan and fetch all the data in
>     memory to apply the aggregate function.
>
>
> Just to clarify for others on the list: when executing aggregation
> functions, Cassandra /will/ use paging internally, so at most one page
> worth of data will be held in memory at a time.  However, if your
> aggregation function retains a large amount of data, this may
> contribute to heap pressure.
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
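
The point about paging can be illustrated with a toy fold in Java. This is not Cassandra's internal code, only a sketch of the idea: rows are consumed one page at a time, each page is discarded once folded, and the aggregate state is the only thing that survives from page to page. The in-memory list stands in for the table, and all names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Toy model of page-by-page aggregation; not how Cassandra is implemented.
final class PagedFoldSketch {

    // Folds rows into a state one page at a time.
    static <S, R> S aggregate(List<R> table, int pageSize, S init, BiFunction<S, R, S> stateFn) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        S state = init;
        for (int start = 0; start < table.size(); start += pageSize) {
            // Stand-in for fetching one page of rows from storage.
            int end = Math.min(start + pageSize, table.size());
            List<R> page = new ArrayList<>(table.subList(start, end));
            for (R row : page) {
                state = stateFn.apply(state, row);
            }
            // The page is no longer referenced after this point; only the state is kept.
        }
        return state;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            rows.add(i);
        }
        int sum = aggregate(rows, 3, 0, Integer::sum);
        System.out.println("sum = " + sum);
    }
}
```

With a state function like the per-carrier map in this thread the state stays small; a state function that appended every row to a list would grow with the table regardless of paging.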

________________________________

The information in this Internet Email is confidential and may be legally privileged. It is intended solely for the addressee. Access to this Email by anyone else is unauthorized. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful. When addressed to our clients any opinions or advice contained in this Email are subject to the terms and conditions expressed in any applicable governing The Home Depot terms of business or client engagement letter. The Home Depot disclaims all responsibility and liability for the accuracy and content of this attachment and for any damages or losses arising from any inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other items of a destructive nature, which may be contained in this attachment and shall not be liable for direct, indirect, consequential or special damages in connection with this e-mail message or its attachment.

—
Robert Stupp
@snazy


Re: Cassandra 3.1 - Aggregation query failure

Posted by Robert Stupp <sn...@snazy.de>.

Well, the usual access goal for queries in C* is “one partition per query” - maybe a handful of partitions in some cases.
That does not differ for aggregates since the read path is still the same.

Aggregates in C* are meant to move some computation (for example on the data in a time-frame materialized in a partition) to the coordinator and reduce the amount of data pumped through the wire.

For queries that span huge datasets, Spark is the easiest way to go.
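
For a dataset of modest size, one way to stay within "one partition per query" is to run the aggregate once per partition and combine the per-partition maps on the client. The sketch below shows only that merge step in plain Java. It assumes flightdate is the partition key, which the IN-restriction warning in the original post suggests; it leaves out query execution entirely, and its names and sample numbers are made up.

```java
import java.util.HashMap;
import java.util.Map;

// Client-side merge of per-partition results; an illustration, not a driver API.
final class MergePartitionsSketch {

    // Adds the {late, total} counts of one partition's result into a running total.
    static void mergeInto(Map<String, int[]> total, Map<String, int[]> partition) {
        for (Map.Entry<String, int[]> e : partition.entrySet()) {
            int[] sum = total.computeIfAbsent(e.getKey(), k -> new int[] {0, 0});
            sum[0] += e.getValue()[0];
            sum[1] += e.getValue()[1];
        }
    }

    public static void main(String[] args) {
        // Made-up results for two days, as if each came from a single-partition query.
        Map<String, int[]> day1 = new HashMap<>();
        day1.put("AA", new int[] {300, 2400});
        day1.put("DL", new int[] {200, 2500});

        Map<String, int[]> day2 = new HashMap<>();
        day2.put("AA", new int[] {350, 2450});
        day2.put("WN", new int[] {500, 3400});

        Map<String, int[]> total = new HashMap<>();
        mergeInto(total, day1);
        mergeInto(total, day2);

        for (Map.Entry<String, int[]> e : total.entrySet()) {
            System.out.println(e.getKey() + ": late=" + e.getValue()[0]
                    + " total=" + e.getValue()[1]);
        }
    }
}
```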


—
Robert Stupp
@snazy

RE: Cassandra 3.1 - Aggregation query failure

Posted by SE...@homedepot.com.

An aggregate only within a partition? That is rather useless and shouldn’t be called an aggregate.

I am hoping the functionality can be used to support at least “normal” types of aggregates like count, sum, avg, etc.

Sean Durity – Lead Cassandra Admin


Re: Cassandra 3.1 - Aggregation query failure

Posted by Jonathan Haddad <jo...@jonhaddad.com>.

Even if you get this to work for now, I really recommend using a different
tool, like Spark.  Personally I wouldn't use UDAs outside of a single
partition.


Re: Cassandra 3.1 - Aggregation query failure

Posted by Dinesh Shanbhag <di...@isanasystems.com>.

Thanks for the pointers!  I edited jvm.options in 
$CASSANDRA_HOME/conf/jvm.options to increase -Xms and -Xmx to 1536M.  
The result is the same.

And in $CASSANDRA_HOME/logs/system.log, grep GC system.log produces this 
(when jvm.options had not been changed):

INFO  [Service Thread] 2015-12-18 15:26:31,668 GCInspector.java:284 - 
ConcurrentMarkSweep GC in 296ms.  CMS Old Gen: 18133664 -> 15589256; 
Code Cache: 5650880 -> 8122304; Compressed Class Space: 2530064 -> 
3345624; Metaspace: 21314000 -> 28040984; Par Eden Space: 7019256 -> 
164070848;
INFO  [Service Thread] 2015-12-18 15:48:39,736 GCInspector.java:284 - 
ConcurrentMarkSweep GC in 379ms.  CMS Old Gen: 649257416 -> 84190176; 
Code Cache: 20772224 -> 20726848; Par Eden Space: 2191408 -> 52356736; 
Par Survivor Space: 2378448 -> 2346840
INFO  [Service Thread] 2015-12-18 15:58:35,118 GCInspector.java:284 - 
ConcurrentMarkSweep GC in 406ms.  CMS Old Gen: 648847808 -> 86954856; 
Code Cache: 21182080 -> 21188032; Par Eden Space: 1815696 -> 71525744; 
Par Survivor Space: 2388648 -> 2364696
INFO  [Service Thread] 2015-12-18 16:13:45,821 GCInspector.java:284 - 
ConcurrentMarkSweep GC in 211ms.  CMS Old Gen: 648343768 -> 73135720; 
Par Eden Space: 3224880 -> 7957464; Par Survivor Space: 2379912 -> 2414520
INFO  [Service Thread] 2015-12-18 16:32:46,419 GCInspector.java:284 - 
ConcurrentMarkSweep GC in 387ms.  CMS Old Gen: 648476072 -> 68888832; 
Par Eden Space: 2006624 -> 64263360; Par Survivor Space: 2403792 -> 2387664
INFO  [Service Thread] 2015-12-18 16:42:38,648 GCInspector.java:284 - 
ConcurrentMarkSweep GC in 365ms.  CMS Old Gen: 649126336 -> 137359384; 
Code Cache: 22972224 -> 22979840; Metaspace: 41374464 -> 41375104; Par 
Eden Space: 4286080 -> 154449480; Par Survivor Space: 1575440 -> 2310768
INFO  [Service Thread] 2015-12-18 16:51:57,538 GCInspector.java:284 - 
ConcurrentMarkSweep GC in 322ms.  CMS Old Gen: 648338928 -> 79783856; 
Par Eden Space: 2058968 -> 56931312; Par Survivor Space: 2342760 -> 2400336
INFO  [Service Thread] 2015-12-18 17:02:49,543 GCInspector.java:284 - 
ConcurrentMarkSweep GC in 212ms.  CMS Old Gen: 648702008 -> 122954344; 
Par Eden Space: 3269032 -> 61433328; Par Survivor Space: 2395824 -> 3448760
INFO  [Service Thread] 2015-12-18 17:11:54,090 GCInspector.java:284 - 
ConcurrentMarkSweep GC in 306ms.  CMS Old Gen: 648748576 -> 70965096; 
Par Eden Space: 2174840 -> 27074432; Par Survivor Space: 2365992 -> 2373984
INFO  [Service Thread] 2015-12-18 17:22:28,949 GCInspector.java:284 - 
ConcurrentMarkSweep GC in 350ms.  CMS Old Gen: 648243024 -> 90897272; 
Par Eden Space: 2150168 -> 43487192; Par Survivor Space: 2401872 -> 2410728


After modifying jvm.options to increase -Xms & -Xmx (to 1536M):

INFO  [Service Thread] 2015-12-21 11:39:24,918 GCInspector.java:284 - 
ConcurrentMarkSweep GC in 342ms.  CMS Old Gen: 18579136 -> 16305144; 
Code Cache: 8600128 -> 10898752; Compressed Class Space: 3431288 -> 
3761496; Metaspace: 29551832 -> 33307352; Par Eden Space: 4822000 -> 
94853272;
INFO  [Service Thread] 2015-12-21 11:39:30,710 GCInspector.java:284 - 
ParNew GC in 206ms.  CMS Old Gen: 22932208 -> 41454520; Par Eden Space: 
167772160 -> 0; Par Survivor Space: 13144872 -> 20971520
INFO  [Service Thread] 2015-12-21 13:08:14,922 GCInspector.java:284 - 
ConcurrentMarkSweep GC in 468ms.  CMS Old Gen: 21418016 -> 16146528; 
Code Cache: 11693888 -> 11744704; Compressed Class Space: 4331224 -> 
4344192; Metaspace: 37191144 -> 37249960; Par Eden Space: 146089224 -> 
148476848;
INFO  [Service Thread] 2015-12-21 13:08:53,068 GCInspector.java:284 - 
ParNew GC in 216ms.  CMS Old Gen: 16146528 -> 26858568; Par Eden Space: 
167772160 -> 0;


Earlier the node had OpenJDK 8.  For today's tests I installed and used 
Oracle Java 8.

Do the above messages provide any clue? Or is there any debug logging I 
can enable to investigate further?
Thanks,
Dinesh.

On 12/18/2015 9:56 PM, Tyler Hobbs wrote:
>
> On Fri, Dec 18, 2015 at 9:17 AM, DuyHai Doan <doanduyhai@gmail.com 
> <ma...@gmail.com>> wrote:
>
>     Cassandra will perform a full table scan and fetch all the data in
>     memory to apply the aggregate function.
>
>
> Just to clarify for others on the list: when executing aggregation 
> functions, Cassandra /will/ use paging internally, so at most one page 
> worth of data will be held in memory at a time.  However, if your 
> aggregation function retains a large amount of data, this may 
> contribute to heap pressure.
>
>
> -- 
> Tyler Hobbs
> DataStax <http://datastax.com/>

Re: Cassandra 3.1 - Aggregation query failure

Posted by DuyHai Doan <do...@gmail.com>.

A quick update on this issue.

Today, when playing with a UDA, I also got the exception:

java.security.AccessControlException: access denied
       ("java.io.FilePermission" "/xxxxx/logback.xml" "read")

What is definitely strange is that re-executing the same query works. I
could not reproduce the issue after that.

I'm wondering if it only occurs when the UDA execution exceeds the warning
timeout and is re-scheduled...

On Tue, Dec 29, 2015 at 9:52 PM, Tyler Hobbs <ty...@datastax.com> wrote:

>
>> 1. Is it possible to "tune" the page size or is it hard-coded internally ?
>>
>
> If a page size is set for the request at the driver level, that page size
> will be used internally.  Otherwise, it defaults to something reasonable
> (probably ~5k rows).
>
>
>> 2. Is read-repair performed on EACH page or is it done on the whole
>> requested rows once they are fetched ?
>>
>
> It's performed on each page as it's read.  Do note that read repair
> doesn't happen for multi-partition range reads, regardless of paging or
> aggregation.
>
>
>>
>> Question 2. is relevant in some particular scenarios when the user is
>> using CL QUORUM (or more) and some replicas are out-of-sync. Even in the
>> case of aggregation over a single partition, if this partition is wide and
>> spans many fetch pages, the time the coordinator performs all the
>> read-repair and reconcile over QUORUM replicas, the query may timeout very
>> quickly.
>>
>
> Yes, that's possible.  Timeouts for these queries should be adjusted
> accordingly.  It's worth noting that the read_request_timeout_in_ms setting
> applies per-page, so coordinator-level timeouts shouldn't be severely
> affected by this.
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>

Re: Cassandra 3.1 - Aggregation query failure

Posted by Tyler Hobbs <ty...@datastax.com>.

>
>
> 1. Is it possible to "tune" the page size or is it hard-coded internally ?
>

If a page size is set for the request at the driver level, that page size
will be used internally.  Otherwise, it defaults to something reasonable
(probably ~5k rows).


> 2. Is read-repair performed on EACH page or is it done on the whole
> requested rows once they are fetched ?
>

It's performed on each page as it's read.  Do note that read repair doesn't
happen for multi-partition range reads, regardless of paging or aggregation.


>
> Question 2. is relevant in some particular scenarios when the user is
> using CL QUORUM (or more) and some replicas are out-of-sync. Even in the
> case of aggregation over a single partition, if this partition is wide and
> spans many fetch pages, the time the coordinator performs all the
> read-repair and reconcile over QUORUM replicas, the query may timeout very
> quickly.
>

Yes, that's possible.  Timeouts for these queries should be adjusted
accordingly.  It's worth noting that the read_request_timeout_in_ms setting
applies per-page, so coordinator-level timeouts shouldn't be severely
affected by this.

-- 
Tyler Hobbs
DataStax <http://datastax.com/>

Re: Cassandra 3.1 - Aggregation query failure

Posted by Dinesh Shanbhag <di...@isanasystems.com>.

There is nothing in the system.log when the aggregation query fails.

Thanks for the Datastax clarification.

Thanks,
Dinesh.

On 12/24/2015 2:46 PM, DuyHai Doan wrote:
> The exception stack trace at client side shows some issue with File 
> Permission. Try to look for the same error message in system.log to 
> chase down the root issue.
>
> "Would trying the Datastax distribution offer any better chances?" --> 
> No, DSC is just a packaging of C* OSS
>
> On Thu, Dec 24, 2015 at 7:07 AM, Dinesh Shanbhag 
> <dinesh.shanbhag@isanasystems.com 
> <ma...@isanasystems.com>> wrote:
>
>
>     Even if aggregation that forces a full table scan across
>     partitions is not recommended, the message/exception does seems
>     unrelated to partitioning:
>
>        cqlsh:flightdata> select late_flights(uniquecarrier, depdel15) from
>        flightsbydate in ('2015-09-15', '2015-09-16',
>        '2015-09-17', '2015-09-18', '2015-09-19', '2015-09-20',
>     '2015-09-21');
>
>        Traceback (most recent call last):
>           File "CassandraInstall-3.1/bin/cqlsh.py", line 1258, in
>        perform_simple_statement
>             result = future.result()
>           File
>      "/home/wpl/CassandraInstall-3.1/bin/../lib/cassandra-driver-internal-only-3.0.0-6af642d.zip/cassandra-driver-3.0.0-6af642d/cassandra/cluster.py",
>
>        line 3122, in result
>             raise self._final_exception
>        FunctionFailure: code=1400 [User Defined Function failure]
>        message="execution of 'flightdata.state_late_flights[map<text,
>        frozen<tuple<int, int>>>, text, decimal]' failed:
>        java.security.AccessControlException: access denied
>        ("java.io.FilePermission"
>        "/home/wpl/CassandraInstall-3.1/conf/logback.xml" "read")"
>
>     Is that right?
>
>     And note that this same aggregation query (on a subset of the
>     month's days) does complete successfully sometimes.
>
>     The behavior is similar with Cassandra 3.0 as well: on the same
>     set of days, the query sometimes succeeds, fails most times. 
>     Would trying the Datastax distribution offer any better chances?
>
>     Thanks,
>     Dinesh.
>
>
>     On 12/24/2015 2:59 AM, DuyHai Doan wrote:
>
>         Thanks for the pointer on internal paging Tyler, I missed this
>         one. But then it raises some questions:
>
>         1. Is it possible to "tune" the page size or is it hard-coded
>         internally ?
>         2. Is read-repair performed on EACH page or is it done on the
>         whole requested rows once they are fetched ?
>
>         Question 2. is relevant in some particular scenarios when the
>         user is using CL QUORUM (or more) and some replicas are
>         out-of-sync. Even in the case of aggregation over a single
>         partition, if this partition is wide and spans many fetch
>         pages, the time the coordinator performs all the read-repair
>         and reconcile over QUORUM replicas, the query may timeout very
>         quickly.
>
>
>         On Fri, Dec 18, 2015 at 5:26 PM, Tyler Hobbs
>         <tyler@datastax.com <ma...@datastax.com>
>         <mailto:tyler@datastax.com <ma...@datastax.com>>> wrote:
>
>
>             On Fri, Dec 18, 2015 at 9:17 AM, DuyHai Doan
>         <doanduyhai@gmail.com <ma...@gmail.com>
>             <mailto:doanduyhai@gmail.com
>         <ma...@gmail.com>>> wrote:
>
>                 Cassandra will perform a full table scan and fetch all the
>                 data in memory to apply the aggregate function.
>
>
>             Just to clarify for others on the list: when executing
>         aggregation
>             functions, Cassandra /will/ use paging internally, so at
>         most one
>             page worth of data will be held in memory at a time. 
>         However, if
>             your aggregation function retains a large amount of data,
>         this may
>             contribute to heap pressure.
>
>
>             --     Tyler Hobbs
>             DataStax <http://datastax.com/>
>
>
>
>

Re: Cassandra 3.1 - Aggregation query failure

Posted by DuyHai Doan <do...@gmail.com>.

The exception stack trace on the client side shows an issue with a file
permission. Try to look for the same error message in system.log to chase
down the root issue.

"Would trying the Datastax distribution offer any better chances?" --> No,
DSC is just a packaging of C* OSS

On Thu, Dec 24, 2015 at 7:07 AM, Dinesh Shanbhag <
dinesh.shanbhag@isanasystems.com> wrote:

>
> Even if aggregation that forces a full table scan across partitions is not
> recommended, the message/exception does seems unrelated to partitioning:
>
>    cqlsh:flightdata> select late_flights(uniquecarrier, depdel15) from
>    flightsbydate in ('2015-09-15', '2015-09-16',
>    '2015-09-17', '2015-09-18', '2015-09-19', '2015-09-20', '2015-09-21');
>
>    Traceback (most recent call last):
>       File "CassandraInstall-3.1/bin/cqlsh.py", line 1258, in
>    perform_simple_statement
>         result = future.result()
>       File
>
>  "/home/wpl/CassandraInstall-3.1/bin/../lib/cassandra-driver-internal-only-3.0.0-6af642d.zip/cassandra-driver-3.0.0-6af642d/cassandra/cluster.py",
>
>    line 3122, in result
>         raise self._final_exception
>    FunctionFailure: code=1400 [User Defined Function failure]
>    message="execution of 'flightdata.state_late_flights[map<text,
>    frozen<tuple<int, int>>>, text, decimal]' failed:
>    java.security.AccessControlException: access denied
>    ("java.io.FilePermission"
>    "/home/wpl/CassandraInstall-3.1/conf/logback.xml" "read")"
>
> Is that right?
>
> And note that this same aggregation query (on a subset of the month's
> days) does complete successfully sometimes.
>
> The behavior is similar with Cassandra 3.0 as well: on the same set of
> days, the query sometimes succeeds, fails most times.  Would trying the
> Datastax distribution offer any better chances?
>
> Thanks,
> Dinesh.
>
>
> On 12/24/2015 2:59 AM, DuyHai Doan wrote:
>
>> Thanks for the pointer on internal paging Tyler, I missed this one. But
>> then it raises some questions:
>>
>> 1. Is it possible to "tune" the page size or is it hard-coded internally ?
>> 2. Is read-repair performed on EACH page or is it done on the whole
>> requested rows once they are fetched ?
>>
>> Question 2. is relevant in some particular scenarios when the user is
>> using CL QUORUM (or more) and some replicas are out-of-sync. Even in the
>> case of aggregation over a single partition, if this partition is wide and
>> spans many fetch pages, the time the coordinator performs all the
>> read-repair and reconcile over QUORUM replicas, the query may timeout very
>> quickly.
>>
>>
>> On Fri, Dec 18, 2015 at 5:26 PM, Tyler Hobbs <tyler@datastax.com <mailto:
>> tyler@datastax.com>> wrote:
>>
>>
>>     On Fri, Dec 18, 2015 at 9:17 AM, DuyHai Doan <doanduyhai@gmail.com
>>     <ma...@gmail.com>> wrote:
>>
>>         Cassandra will perform a full table scan and fetch all the
>>         data in memory to apply the aggregate function.
>>
>>
>>     Just to clarify for others on the list: when executing aggregation
>>     functions, Cassandra /will/ use paging internally, so at most one
>>     page worth of data will be held in memory at a time.  However, if
>>     your aggregation function retains a large amount of data, this may
>>     contribute to heap pressure.
>>
>>
>>     --     Tyler Hobbs
>>     DataStax <http://datastax.com/>
>>
>>
>>
>

Re: Cassandra 3.1 - Aggregation query failure

Posted by Dinesh Shanbhag <di...@isanasystems.com>.

Even if aggregation that forces a full table scan across partitions is 
not recommended, the message/exception does seem unrelated to partitioning:

    cqlsh:flightdata> select late_flights(uniquecarrier, depdel15) from
    flightsbydate where flightdate in ('2015-09-15', '2015-09-16',
    '2015-09-17', '2015-09-18', '2015-09-19', '2015-09-20', '2015-09-21');

    Traceback (most recent call last):
       File "CassandraInstall-3.1/bin/cqlsh.py", line 1258, in
    perform_simple_statement
         result = future.result()
       File
    "/home/wpl/CassandraInstall-3.1/bin/../lib/cassandra-driver-internal-only-3.0.0-6af642d.zip/cassandra-driver-3.0.0-6af642d/cassandra/cluster.py",
    line 3122, in result
         raise self._final_exception
    FunctionFailure: code=1400 [User Defined Function failure]
    message="execution of 'flightdata.state_late_flights[map<text,
    frozen<tuple<int, int>>>, text, decimal]' failed:
    java.security.AccessControlException: access denied
    ("java.io.FilePermission"
    "/home/wpl/CassandraInstall-3.1/conf/logback.xml" "read")"

Is that right?

And note that this same aggregation query (on a subset of the month's 
days) does complete successfully sometimes.

The behavior is similar with Cassandra 3.0 as well: on the same set of 
days, the query sometimes succeeds but fails most of the time.  Would 
trying the Datastax distribution offer any better chances?

Thanks,
Dinesh.


On 12/24/2015 2:59 AM, DuyHai Doan wrote:
> Thanks for the pointer on internal paging Tyler, I missed this one. 
> But then it raises some questions:
>
> 1. Is it possible to "tune" the page size or is it hard-coded internally ?
> 2. Is read-repair performed on EACH page or is it done on the whole 
> requested rows once they are fetched ?
>
> Question 2. is relevant in some particular scenarios when the user is 
> using CL QUORUM (or more) and some replicas are out-of-sync. Even in 
> the case of aggregation over a single partition, if this partition is 
> wide and spans many fetch pages, the time the coordinator performs all 
> the read-repair and reconcile over QUORUM replicas, the query may 
> timeout very quickly.
>
>
> On Fri, Dec 18, 2015 at 5:26 PM, Tyler Hobbs <tyler@datastax.com 
> <ma...@datastax.com>> wrote:
>
>
>     On Fri, Dec 18, 2015 at 9:17 AM, DuyHai Doan <doanduyhai@gmail.com
>     <ma...@gmail.com>> wrote:
>
>         Cassandra will perform a full table scan and fetch all the
>         data in memory to apply the aggregate function.
>
>
>     Just to clarify for others on the list: when executing aggregation
>     functions, Cassandra /will/ use paging internally, so at most one
>     page worth of data will be held in memory at a time.  However, if
>     your aggregation function retains a large amount of data, this may
>     contribute to heap pressure.
>
>
>     -- 
>     Tyler Hobbs
>     DataStax <http://datastax.com/>
>
>

Re: Cassandra 3.1 - Aggregation query failure

Posted by DuyHai Doan <do...@gmail.com>.

Thanks for the pointer on internal paging Tyler, I missed this one. But
then it raises some questions:

1. Is it possible to "tune" the page size or is it hard-coded internally?
2. Is read-repair performed on EACH page or is it done on the whole set of
requested rows once they are fetched?

Question 2 is relevant in some particular scenarios when the user is using
CL QUORUM (or more) and some replicas are out-of-sync. Even in the case of
aggregation over a single partition, if this partition is wide and spans
many fetch pages, then by the time the coordinator has performed all the
read-repair and reconciliation over QUORUM replicas, the query may have
timed out.

On Fri, Dec 18, 2015 at 5:26 PM, Tyler Hobbs <ty...@datastax.com> wrote:

>
> On Fri, Dec 18, 2015 at 9:17 AM, DuyHai Doan <do...@gmail.com> wrote:
>
>> Cassandra will perform a full table scan and fetch all the data in memory
>> to apply the aggregate function.
>
>
> Just to clarify for others on the list: when executing aggregation
> functions, Cassandra *will* use paging internally, so at most one page
> worth of data will be held in memory at a time.  However, if your
> aggregation function retains a large amount of data, this may contribute to
> heap pressure.
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>

Re: Cassandra 3.1 - Aggregation query failure

Posted by Tyler Hobbs <ty...@datastax.com>.

On Fri, Dec 18, 2015 at 9:17 AM, DuyHai Doan <do...@gmail.com> wrote:

> Cassandra will perform a full table scan and fetch all the data in memory
> to apply the aggregate function.

Just to clarify for others on the list: when executing aggregation
functions, Cassandra *will* use paging internally, so at most one page
worth of data will be held in memory at a time.  However, if your
aggregation function retains a large amount of data, this may contribute to
heap pressure.
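
To illustrate the point above, here is a minimal Python sketch of a
page-by-page fold. It is not Cassandra's implementation: the names pages,
state_late_flights and aggregate are hypothetical, the 5000-row default only
mirrors the "~5k rows" guess made elsewhere in this thread, and the state
function is a simplified stand-in for the UDF, not a port of it. Only one
page of rows and the aggregate state are alive at any time.

```python
from itertools import islice


def pages(rows, page_size):
    """Yield successive lists of at most page_size rows."""
    it = iter(rows)
    while True:
        page = list(islice(it, page_size))
        if not page:
            return
        yield page


def state_late_flights(state, carrier, dep_delayed):
    """Simplified stand-in for the state function (an assumption, not the UDF).

    state maps carrier -> (delayed_flights, total_flights).
    """
    delayed, total = state.get(carrier, (0, 0))
    state[carrier] = (delayed + (1 if dep_delayed > 0 else 0), total + 1)
    return state


def aggregate(rows, page_size=5000):
    """Fold the state function over the rows, one page at a time."""
    state = {}
    for page in pages(rows, page_size):
        # Only this page and the state are held in memory here.
        for carrier, dep_delayed in page:
            state = state_late_flights(state, carrier, dep_delayed)
    return state


if __name__ == "__main__":
    sample = [("AA", 1), ("AA", 0), ("DL", 0)]
    print(aggregate(sample, page_size=2))
```

In this sketch the state holds one small tuple per carrier, so it stays tiny
no matter how many rows are scanned. A state that grows with the number of
rows would be the case that adds heap pressure.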

-- 
Tyler Hobbs
DataStax <http://datastax.com/>

Re: Cassandra 3.1 - Aggregation query failure

Posted by DuyHai Doan <do...@gmail.com>.

Hello

There are 2 details that are important here:

1. The node has only 4GB of RAM
2. The aggregation on all ~450000 rows always fails, sometimes
immediately, sometimes after 30-60 seconds

The consequence of point 1 is that the JVM heap size is small: 1GB

The formula to compute max heap size is given in
$CASSANDRA_HOME/conf/cassandra-env.sh

maxHeapSize = max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB))
            = max(min(2GB, 1024MB), min(1GB, 8GB))
            = max(1024MB, 1GB) = 1GB
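
As a quick check of the arithmetic, here is a small Python sketch of the same
formula. The function name default_max_heap_mb is made up for illustration,
and all values are in MB; the real computation lives in the shell script.

```python
def default_max_heap_mb(system_ram_mb: int) -> int:
    """max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB)), everything in MB."""
    half = system_ram_mb // 2
    quarter = system_ram_mb // 4
    return max(min(half, 1024), min(quarter, 8192))


if __name__ == "__main__":
    # 4096 MB is the 4GB node discussed in this thread.
    print(default_max_heap_mb(4096))
```

For the 4GB node this gives 1024 MB, which matches the 1GB worked out above.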

The consequence of point 2 is that since you're not restricting the
partition key in the query ("select late_flights(uniquecarrier, depdel15)
from  flightsbydate;") Cassandra will perform a full table scan and fetch
all the data in memory to apply the aggregate function.

With a small Java heap size, there is a possibility that Cassandra runs out
of memory or suffers from long old-generation GC cycles.

Check the /var/log/cassandra/system.log file and look for lines with "GC";
they will tell you if you're running into long GC pauses.
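
If grepping by hand gets tedious, a rough Python sketch like the one below
can pull the pause time and old-generation sizes out of GCInspector lines.
The regular expression assumes the line layout of the GCInspector messages
posted in this thread (each message on a single line in the real log file);
other versions or collectors may log differently. The names parse_gc_line and
long_pauses and the 200 ms threshold are arbitrary choices for illustration.

```python
import re

# Assumed layout: "<collector> GC in <N>ms.  CMS Old Gen: <before> -> <after>; ..."
GC_LINE = re.compile(
    r"(?P<collector>ConcurrentMarkSweep|ParNew) GC in (?P<ms>\d+)ms\."
    r".*?CMS Old Gen: (?P<before>\d+) -> (?P<after>\d+)"
)


def parse_gc_line(line):
    """Return a dict describing one GC message, or None if the line does not match."""
    m = GC_LINE.search(line)
    if m is None:
        return None
    return {
        "collector": m.group("collector"),
        "pause_ms": int(m.group("ms")),
        "old_gen_before": int(m.group("before")),
        "old_gen_after": int(m.group("after")),
    }


def long_pauses(lines, threshold_ms=200):
    """Return parsed GC events whose pause is at least threshold_ms."""
    events = (parse_gc_line(line) for line in lines)
    return [e for e in events if e is not None and e["pause_ms"] >= threshold_ms]


if __name__ == "__main__":
    # Default log path mentioned above; adjust for your installation.
    with open("/var/log/cassandra/system.log") as log:
        for event in long_pauses(log):
            print(event)
```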



On Fri, Dec 18, 2015 at 1:57 PM, Dinesh Shanbhag <
dinesh.shanbhag@isanasystems.com> wrote:

>
>
> I am trying out Aggregations in a single node Cassandra 3.1 installation.
> The node has 4GB RAM.  The table being aggregated on contains ~450000
> rows.  It contains information on US domestic flights for a single month
> (from
> http://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=236&DB_Short_Name=On-Time
> ).
>
> CREATE AGGREGATE flightdata.late_flights(text, decimal)
>     SFUNC state_late_flights
>     STYPE map<text, frozen<tuple<int, int>>>
>     INITCOND {};
>
> The late_flights aggregation function uses a state_late_flights()
> User-Defined Function that maintains a map of uniquecarrier to
> tuple<int,int>.  The first int in the tuple represents delayed flights of
> the corresponding uniquecarrier.  The second int represents total flights
> of the uniquecarrier.
>
> This aggregation query on a subset of the days of the month works:
>
>    cqlsh:flightdata> select late_flights(uniquecarrier, depdel15) from
>    flightsbydate *where flightdate in ('2015-09-15', '2015-09-16',
>    '2015-09-17', '2015-09-18', '2015-09-19', '2015-09-20', '2015-09-21')*;
>
>      flightdata.late_flights(uniquecarrier, depdel15)
>
>  -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>      {'AA': (2395, 17138), 'AS': (234, 3308), 'B6': (703, 4832), 'DL':
>    (1452, 17311), 'EV': (1028, 10502), 'F9': (221, 1837), 'HA': (79,
>    1414), 'MQ': (892, 4926), 'NK': (535, 2300), 'OO': (1539, 11299),
>    'UA': (1422, 9792), 'VX': (181, 1209), 'WN': (3446, 23659)}
>
>    (1 rows)
>
>    Warnings :
>    Aggregation query used on multiple partition keys (IN restriction)
>
>
> However, the aggregation on all ~450000 rows always fails, sometimes
> immediately, sometimes after 30-60 seconds:
>
>    cqlsh:flightdata> select late_flights(uniquecarrier, depdel15) from
>    flightsbydate;
>
>    Traceback (most recent call last):
>       File "CassandraInstall-3.1/bin/cqlsh.py", line 1258, in
>    perform_simple_statement
>         result = future.result()
>       File
>
>  "/home/wpl/CassandraInstall-3.1/bin/../lib/cassandra-driver-internal-only-3.0.0-6af642d.zip/cassandra-driver-3.0.0-6af642d/cassandra/cluster.py",
>    line 3122, in result
>         raise self._final_exception
>    FunctionFailure: code=1400 [User Defined Function failure]
>    message="execution of 'flightdata.state_late_flights[map<text,
>    frozen<tuple<int, int>>>, text, decimal]' failed:
>    java.security.AccessControlException: access denied
>    ("java.io.FilePermission"
>    "/home/wpl/CassandraInstall-3.1/conf/logback.xml" "read")"
>
>
> While this query runs, CPU utilization is 100% - 120%, Peak RAM used is
> less than 3.5GB.
>
> Just in case it is useful, the state_late_flights User-Defined function:
>
>    cqlsh:flightdata> describe function state_late_flights;
>
>    CREATE FUNCTION flightdata.state_late_flights(state map<text,
>    frozen<tuple<int, int>>>, flid text, fldelay decimal)
>         CALLED ON NULL INPUT
>         RETURNS map<text, frozen<tuple<int, int>>>
>         LANGUAGE java
>         AS $$com.datastax.driver.core.TupleType tt =
>
>  com.datastax.driver.core.TupleType.of(com.datastax.driver.core.ProtocolVersion.NEWEST_SUPPORTED,
>    com.datastax.driver.core.CodecRegistry.DEFAULT_INSTANCE,
>    com.datastax.driver.core.DataType.cint(),
>    com.datastax.driver.core.DataType.cint());
>    com.datastax.driver.core.TupleValue tv = tt.newValue(); tv.setInt(0,
>    0); tv.setInt(1, 1); if (flid == null) { state.put("EMPTY", tv);
>    return state; } if (state.get(flid) != null) {  tv =
>    (com.datastax.driver.core.TupleValue) state.get(flid);  tv.setInt(1,
>    tv.getInt(1) + 1); if
>    (fldelay.compareTo(java.math.BigDecimal.valueOf(0)) == 1) {
>    tv.setInt(0, tv.getInt(0) + 1); } } state.put(flid, tv); return
>    state;$$;
>
>
> What should be checked on to investigate this further?
> Thanks,
> Dinesh.
>