You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cassandra.apache.org by Subodh Nijsure <su...@gmail.com> on 2014/08/17 12:45:24 UTC

Strange select result when using date grater than query

Hello,

I am fairly new to cassandra so this might be naieve question:

I have table that currently has following entries:

SELECT asset_id,event_time,sensor_type, temperature,humidity from
temp_humidity_data where asset_id='2';

 asset_id | event_time               | sensor_type | temperature | humidity
----------+--------------------------+-------------+-------------+----------
        2 | 2014-08-17 03:33:16-0700 |           1 |      74.768 |   65.768
        2 | 2014-08-17 03:33:17-0700 |           1 |      67.228 |   91.228
        2 | 2014-08-17 03:33:19-0700 |           1 |       61.97 |    73.97

Now if I execute a query :

SELECT asset_id,event_time,sensor_type, temperature,humidity from
temp_humidity_data where asset_id='2' and event_time > '2014-08-17
03:33:20'  ALLOW FILTERING;

it gives me back same results (!), I expected it to give me 0 results.

 asset_id | event_time               | sensor_type | temperature | humidity
----------+--------------------------+-------------+-------------+----------
        2 | 2014-08-17 03:33:16-0700 |           1 |      74.768 |   65.768
        2 | 2014-08-17 03:33:17-0700 |           1 |      67.228 |   91.228
        2 | 2014-08-17 03:33:19-0700 |           1 |       61.97 |    73.97

am I doing something wrong?

Note I have created table   with following options.

            CREATE TABLE temp_humidity_data (
                asset_id text,
                event_time timestamp,
                sensor_serial_number text,
                sensor_type int,
                temperature float,
                humidity float,
                polling_freq int,
                PRIMARY KEY(asset_id ,event_time)
            ) WITH CLUSTERING ORDER BY (event_time ASC)
            AND caching = '{"keys":"ALL", "rows_per_partition":"ALL"}'

I have also created following indexes:

CREATE INDEX event_time_index ON temp_humidity_data (event_time);

Also of note is, since actual installation I will be running against
large time series data I have configured  'row_cache_size_in_mb: 20'

I am running cqlsh 5.0.1 , and cassandra version 2.1.0-rc3

Would appreciate any suggestion on why the date grater-than query is
returning all the results?

-Subodh

Re: Strange select result when using date grater than query

Posted by Subodh Nijsure <su...@gmail.com>.
I am running csql on same machine as my cassandra server.

I am observing really strange behavior if I do this query all  3 rows show up.

 SELECT asset_id,event_time,sensor_type, temperature,humidity from
temp_humidity_data   ALLOW FILTERING;

 asset_id | event_time               | sensor_type | temperature | humidity
----------+--------------------------+-------------+-------------+----------
        2 | 2014-08-17 13:36:00-0700 |           1 |      75.147 |   91.147
        2 | 2014-08-17 13:36:02-0700 |           1 |      66.308 |   72.308
        2 | 2014-08-17 13:36:49-0700 |           1 |      65.397 |   77.397

Now if I do following only one record shows up correctly.

SELECT asset_id,event_time,sensor_type, temperature,humidity from
temp_humidity_data where event_time > '2014-08-17 13:36:02-0700'
ALLOW FILTERING;

 asset_id | event_time               | sensor_type | temperature | humidity
----------+--------------------------+-------------+-------------+----------
        2 | 2014-08-17 13:36:49-0700 |           1 |      65.397 |   77.397


However if I add additional column asset_id='2' to where condition all
three rows show up? So this is not timezone issue, but may be I don't
understand how csql does AND in its queries?

SELECT asset_id,event_time,sensor_type, temperature,humidity from
temp_humidity_data where event_time > '2014-08-17 13:36:02-0700' AND
asset_id='2';

 asset_id | event_time               | sensor_type | temperature | humidity
----------+--------------------------+-------------+-------------+----------
        2 | 2014-08-17 13:36:00-0700 |           1 |      75.147 |   91.147
        2 | 2014-08-17 13:36:02-0700 |           1 |      66.308 |   72.308
        2 | 2014-08-17 13:36:49-0700 |           1 |      65.397 |   77.397

help!

-Subodh

On Sun, Aug 17, 2014 at 8:09 AM, Jack Krupansky <ja...@basetechnology.com> wrote:
> I should have asked where your coordinator node is located. Check its time
> zone, relative to GMT.
>
> cqlsh is simply formatting the time stamp for your local display. That is
> separate from the actual query execution on the server coordinator node.
> cqlsh is merely a "client", not the "server". And separate from the actual
> data, which is stored in GMT.
>
>
> -- Jack Krupansky
>
> -----Original Message----- From: Subodh Nijsure
> Sent: Sunday, August 17, 2014 10:04 AM
> To: user@cassandra.apache.org
> Subject: Re: Strange select result when using date grater than query
>
>
> I am in PST ( Oakland ).
>
> I am storing the timestamp in UTC in my insert code, and I see that
> cqlsh converts the timestamp to local timezone? i.e.  if I set TZ=EST
> cqlsh shows me time stamps in EST like this for the same data set.
>
> SELECT asset_id,event_time,sensor_type, temperature,humidity from
> temp_humidity_data where asset_id='2';
> asset_id | event_time               | sensor_type | temperature | humidity
> ----------+--------------------------+-------------+-------------+----------
>        2 | 2014-08-17 05:33:16-0500 |           1 |      74.768 |   65.768
>        2 | 2014-08-17 05:33:17-0500 |           1 |      67.228 |   91.228
>        2 | 2014-08-17 05:33:19-0500 |           1 |       61.97 |    73.97
>
>
> So for query i though I should be giving time strings in local timezone too,
> no?
>
> -Subodh
>
> On Sun, Aug 17, 2014 at 5:17 AM, Jack Krupansky <ja...@basetechnology.com>
> wrote:
>>
>> Are you more than 7 time zones behind GMT? If so, that would make 03:33
>> your
>> query less than 03:33-0700  Your query is using the default time zone,
>> which
>> will be the time zone configured for the coordinator node executing the
>> query.
>>
>> IOW, where are you?
>>
>> -- Jack Krupansky
>>
>> -----Original Message----- From: Subodh Nijsure
>> Sent: Sunday, August 17, 2014 6:45 AM
>> To: user@cassandra.apache.org
>> Subject: Strange select result when using date grater than query
>>
>>
>> Hello,
>>
>> I am fairly new to cassandra so this might be naieve question:
>>
>> I have table that currently has following entries:
>>
>> SELECT asset_id,event_time,sensor_type, temperature,humidity from
>> temp_humidity_data where asset_id='2';
>>
>> asset_id | event_time               | sensor_type | temperature | humidity
>>
>> ----------+--------------------------+-------------+-------------+----------
>>        2 | 2014-08-17 03:33:16-0700 |           1 |      74.768 |   65.768
>>        2 | 2014-08-17 03:33:17-0700 |           1 |      67.228 |   91.228
>>        2 | 2014-08-17 03:33:19-0700 |           1 |       61.97 |    73.97
>>
>> Now if I execute a query :
>>
>> SELECT asset_id,event_time,sensor_type, temperature,humidity from
>> temp_humidity_data where asset_id='2' and event_time > '2014-08-17
>> 03:33:20'  ALLOW FILTERING;
>>
>> it gives me back same results (!), I expected it to give me 0 results.
>>
>> asset_id | event_time               | sensor_type | temperature | humidity
>>
>> ----------+--------------------------+-------------+-------------+----------
>>        2 | 2014-08-17 03:33:16-0700 |           1 |      74.768 |   65.768
>>        2 | 2014-08-17 03:33:17-0700 |           1 |      67.228 |   91.228
>>        2 | 2014-08-17 03:33:19-0700 |           1 |       61.97 |    73.97
>>
>> am I doing something wrong?
>>
>> Note I have created table   with following options.
>>
>>            CREATE TABLE temp_humidity_data (
>>                asset_id text,
>>                event_time timestamp,
>>                sensor_serial_number text,
>>                sensor_type int,
>>                temperature float,
>>                humidity float,
>>                polling_freq int,
>>                PRIMARY KEY(asset_id ,event_time)
>>            ) WITH CLUSTERING ORDER BY (event_time ASC)
>>            AND caching = '{"keys":"ALL", "rows_per_partition":"ALL"}'
>>
>> I have also created following indexes:
>>
>> CREATE INDEX event_time_index ON temp_humidity_data (event_time);
>>
>> Also of note is, since actual installation I will be running against
>> large time series data I have configured  'row_cache_size_in_mb: 20'
>>
>> I am running cqlsh 5.0.1 , and cassandra version 2.1.0-rc3
>>
>> Would appreciate any suggestion on why the date grater-than query is
>> returning all the results?
>>
>> -Subodh
>
>

Re: Strange select result when using date grater than query

Posted by Jack Krupansky <ja...@basetechnology.com>.
I should have asked where your coordinator node is located. Check its time 
zone, relative to GMT.

cqlsh is simply formatting the time stamp for your local display. That is 
separate from the actual query execution on the server coordinator node. 
cqlsh is merely a "client", not the "server". And separate from the actual 
data, which is stored in GMT.

-- Jack Krupansky

-----Original Message----- 
From: Subodh Nijsure
Sent: Sunday, August 17, 2014 10:04 AM
To: user@cassandra.apache.org
Subject: Re: Strange select result when using date grater than query

I am in PST ( Oakland ).

I am storing the timestamp in UTC in my insert code, and I see that
cqlsh converts the timestamp to local timezone? i.e.  if I set TZ=EST
cqlsh shows me time stamps in EST like this for the same data set.

SELECT asset_id,event_time,sensor_type, temperature,humidity from
temp_humidity_data where asset_id='2';
asset_id | event_time               | sensor_type | temperature | humidity
----------+--------------------------+-------------+-------------+----------
        2 | 2014-08-17 05:33:16-0500 |           1 |      74.768 |   65.768
        2 | 2014-08-17 05:33:17-0500 |           1 |      67.228 |   91.228
        2 | 2014-08-17 05:33:19-0500 |           1 |       61.97 |    73.97


So for query i though I should be giving time strings in local timezone too, 
no?

-Subodh

On Sun, Aug 17, 2014 at 5:17 AM, Jack Krupansky <ja...@basetechnology.com> 
wrote:
> Are you more than 7 time zones behind GMT? If so, that would make 03:33 
> your
> query less than 03:33-0700  Your query is using the default time zone, 
> which
> will be the time zone configured for the coordinator node executing the
> query.
>
> IOW, where are you?
>
> -- Jack Krupansky
>
> -----Original Message----- From: Subodh Nijsure
> Sent: Sunday, August 17, 2014 6:45 AM
> To: user@cassandra.apache.org
> Subject: Strange select result when using date grater than query
>
>
> Hello,
>
> I am fairly new to cassandra so this might be naieve question:
>
> I have table that currently has following entries:
>
> SELECT asset_id,event_time,sensor_type, temperature,humidity from
> temp_humidity_data where asset_id='2';
>
> asset_id | event_time               | sensor_type | temperature | humidity
> ----------+--------------------------+-------------+-------------+----------
>        2 | 2014-08-17 03:33:16-0700 |           1 |      74.768 |   65.768
>        2 | 2014-08-17 03:33:17-0700 |           1 |      67.228 |   91.228
>        2 | 2014-08-17 03:33:19-0700 |           1 |       61.97 |    73.97
>
> Now if I execute a query :
>
> SELECT asset_id,event_time,sensor_type, temperature,humidity from
> temp_humidity_data where asset_id='2' and event_time > '2014-08-17
> 03:33:20'  ALLOW FILTERING;
>
> it gives me back same results (!), I expected it to give me 0 results.
>
> asset_id | event_time               | sensor_type | temperature | humidity
> ----------+--------------------------+-------------+-------------+----------
>        2 | 2014-08-17 03:33:16-0700 |           1 |      74.768 |   65.768
>        2 | 2014-08-17 03:33:17-0700 |           1 |      67.228 |   91.228
>        2 | 2014-08-17 03:33:19-0700 |           1 |       61.97 |    73.97
>
> am I doing something wrong?
>
> Note I have created table   with following options.
>
>            CREATE TABLE temp_humidity_data (
>                asset_id text,
>                event_time timestamp,
>                sensor_serial_number text,
>                sensor_type int,
>                temperature float,
>                humidity float,
>                polling_freq int,
>                PRIMARY KEY(asset_id ,event_time)
>            ) WITH CLUSTERING ORDER BY (event_time ASC)
>            AND caching = '{"keys":"ALL", "rows_per_partition":"ALL"}'
>
> I have also created following indexes:
>
> CREATE INDEX event_time_index ON temp_humidity_data (event_time);
>
> Also of note is, since actual installation I will be running against
> large time series data I have configured  'row_cache_size_in_mb: 20'
>
> I am running cqlsh 5.0.1 , and cassandra version 2.1.0-rc3
>
> Would appreciate any suggestion on why the date grater-than query is
> returning all the results?
>
> -Subodh 


Re: Strange select result when using date grater than query

Posted by Subodh Nijsure <su...@gmail.com>.
I am in PST ( Oakland ).

I am storing the timestamp in UTC in my insert code, and I see that
cqlsh converts the timestamp to local timezone? i.e.  if I set TZ=EST
cqlsh shows me time stamps in EST like this for the same data set.

SELECT asset_id,event_time,sensor_type, temperature,humidity from
temp_humidity_data where asset_id='2';
 asset_id | event_time               | sensor_type | temperature | humidity
----------+--------------------------+-------------+-------------+----------
        2 | 2014-08-17 05:33:16-0500 |           1 |      74.768 |   65.768
        2 | 2014-08-17 05:33:17-0500 |           1 |      67.228 |   91.228
        2 | 2014-08-17 05:33:19-0500 |           1 |       61.97 |    73.97


So for query i though I should be giving time strings in local timezone too, no?

-Subodh

On Sun, Aug 17, 2014 at 5:17 AM, Jack Krupansky <ja...@basetechnology.com> wrote:
> Are you more than 7 time zones behind GMT? If so, that would make 03:33 your
> query less than 03:33-0700  Your query is using the default time zone, which
> will be the time zone configured for the coordinator node executing the
> query.
>
> IOW, where are you?
>
> -- Jack Krupansky
>
> -----Original Message----- From: Subodh Nijsure
> Sent: Sunday, August 17, 2014 6:45 AM
> To: user@cassandra.apache.org
> Subject: Strange select result when using date grater than query
>
>
> Hello,
>
> I am fairly new to cassandra so this might be naieve question:
>
> I have table that currently has following entries:
>
> SELECT asset_id,event_time,sensor_type, temperature,humidity from
> temp_humidity_data where asset_id='2';
>
> asset_id | event_time               | sensor_type | temperature | humidity
> ----------+--------------------------+-------------+-------------+----------
>        2 | 2014-08-17 03:33:16-0700 |           1 |      74.768 |   65.768
>        2 | 2014-08-17 03:33:17-0700 |           1 |      67.228 |   91.228
>        2 | 2014-08-17 03:33:19-0700 |           1 |       61.97 |    73.97
>
> Now if I execute a query :
>
> SELECT asset_id,event_time,sensor_type, temperature,humidity from
> temp_humidity_data where asset_id='2' and event_time > '2014-08-17
> 03:33:20'  ALLOW FILTERING;
>
> it gives me back same results (!), I expected it to give me 0 results.
>
> asset_id | event_time               | sensor_type | temperature | humidity
> ----------+--------------------------+-------------+-------------+----------
>        2 | 2014-08-17 03:33:16-0700 |           1 |      74.768 |   65.768
>        2 | 2014-08-17 03:33:17-0700 |           1 |      67.228 |   91.228
>        2 | 2014-08-17 03:33:19-0700 |           1 |       61.97 |    73.97
>
> am I doing something wrong?
>
> Note I have created table   with following options.
>
>            CREATE TABLE temp_humidity_data (
>                asset_id text,
>                event_time timestamp,
>                sensor_serial_number text,
>                sensor_type int,
>                temperature float,
>                humidity float,
>                polling_freq int,
>                PRIMARY KEY(asset_id ,event_time)
>            ) WITH CLUSTERING ORDER BY (event_time ASC)
>            AND caching = '{"keys":"ALL", "rows_per_partition":"ALL"}'
>
> I have also created following indexes:
>
> CREATE INDEX event_time_index ON temp_humidity_data (event_time);
>
> Also of note is, since actual installation I will be running against
> large time series data I have configured  'row_cache_size_in_mb: 20'
>
> I am running cqlsh 5.0.1 , and cassandra version 2.1.0-rc3
>
> Would appreciate any suggestion on why the date grater-than query is
> returning all the results?
>
> -Subodh

Re: Strange select result when using date grater than query

Posted by Jack Krupansky <ja...@basetechnology.com>.
Are you more than 7 time zones behind GMT? If so, that would make 03:33 your 
query less than 03:33-0700  Your query is using the default time zone, which 
will be the time zone configured for the coordinator node executing the 
query.

IOW, where are you?

-- Jack Krupansky

-----Original Message----- 
From: Subodh Nijsure
Sent: Sunday, August 17, 2014 6:45 AM
To: user@cassandra.apache.org
Subject: Strange select result when using date grater than query

Hello,

I am fairly new to cassandra so this might be naieve question:

I have table that currently has following entries:

SELECT asset_id,event_time,sensor_type, temperature,humidity from
temp_humidity_data where asset_id='2';

asset_id | event_time               | sensor_type | temperature | humidity
----------+--------------------------+-------------+-------------+----------
        2 | 2014-08-17 03:33:16-0700 |           1 |      74.768 |   65.768
        2 | 2014-08-17 03:33:17-0700 |           1 |      67.228 |   91.228
        2 | 2014-08-17 03:33:19-0700 |           1 |       61.97 |    73.97

Now if I execute a query :

SELECT asset_id,event_time,sensor_type, temperature,humidity from
temp_humidity_data where asset_id='2' and event_time > '2014-08-17
03:33:20'  ALLOW FILTERING;

it gives me back same results (!), I expected it to give me 0 results.

asset_id | event_time               | sensor_type | temperature | humidity
----------+--------------------------+-------------+-------------+----------
        2 | 2014-08-17 03:33:16-0700 |           1 |      74.768 |   65.768
        2 | 2014-08-17 03:33:17-0700 |           1 |      67.228 |   91.228
        2 | 2014-08-17 03:33:19-0700 |           1 |       61.97 |    73.97

am I doing something wrong?

Note I have created table   with following options.

            CREATE TABLE temp_humidity_data (
                asset_id text,
                event_time timestamp,
                sensor_serial_number text,
                sensor_type int,
                temperature float,
                humidity float,
                polling_freq int,
                PRIMARY KEY(asset_id ,event_time)
            ) WITH CLUSTERING ORDER BY (event_time ASC)
            AND caching = '{"keys":"ALL", "rows_per_partition":"ALL"}'

I have also created following indexes:

CREATE INDEX event_time_index ON temp_humidity_data (event_time);

Also of note is, since actual installation I will be running against
large time series data I have configured  'row_cache_size_in_mb: 20'

I am running cqlsh 5.0.1 , and cassandra version 2.1.0-rc3

Would appreciate any suggestion on why the date grater-than query is
returning all the results?

-Subodh