You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@phoenix.apache.org by Binu Mathew <bm...@groupon.com> on 2016/01/29 02:57:56 UTC

to_date not working as expected

Phoenix version 4.4.0

Issues with Phoenix when used with HBase 0.96.0.2.0

2 Issues:

*ISSUE:* to_date Function is not converting string data types in valid date
formats to a DATE data type when used in the WHERE clause for date
comparison.

Below is a query I ran against a Phoenix view in which I use the ‘to_date’
function to convert 2 VARCHAR columns to date.
1. column ‘created_at_ts’ stored as VARCHAR in format such as 2009-05-05
15:40:10.000
2. column ‘created_at_date’ stored as VARCHAR in format such as 2009-05-05

Observe that the ‘to_date’ function coverts the 2 VARCHAR columns to dates:


select to_date("created_at_ts"), to_date("created_at_date") from
"gp_subscriptions" limit 5;

+-------------------------------------------------+---------------------------------------------------+
| TO_DATE(subscriber."created_at_ts", null, null) |
TO_DATE(subscriber."created_at_date", null, null) |
+-------------------------------------------------+---------------------------------------------------+
| 2009-05-05 15:40:10.000                         | 2009-05-05
00:00:00.000                           |
| 2012-11-22 07:37:34.000                         | 2012-11-22
00:00:00.000                           |
| 2010-07-24 14:12:33.000                         | 2010-07-24
00:00:00.000                           |
| 2012-11-22 07:38:04.000                         | 2012-11-22
00:00:00.000                           |
| 2012-11-22 07:38:10.000                         | 2012-11-22
00:00:00.000                           |
+-------------------------------------------------+---------------------------------------------------+


Here is another query in which I’m using the ‘to_date’ function on string
literals in the WHERE clause for date comparison .

Observer that the ‘to_date’ function coverts the string literals to dates
and the the date comparison correctly evaluates:

select '1' from "gp_subscriptions" where to_date('2009-05-05 15:40:10.000')
= to_date('2009-05-05 15:40:10.000') limit 2;
2 rows selected (0.035 seconds)


Now when I try the date comparison using the columns from my view, it fails:

select '1' from "gp_subscriptions" where to_date("created_at_ts") =
to_date('2009-05-05 15:40:10.000') limit 2;

Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
BooleanExpressionFilter failed during reading: Could not initialize class
org.apache.phoenix.util.DateUtil$ISODateFormatParser
Caused by: java.lang.NoClassDefFoundError: Could not initialize class
org.apache.phoenix.util.DateUtil$ISODateFormatParser

Also fails with same error when I try: select '1' from "gp_subscriptions"
where to_date("created_at_ts") = to_date('2009-05-05') limit 2;

Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
BooleanExpressionFilter failed during reading: Could not initialize class
org.apache.phoenix.util.DateUtil$ISODateFormatParser
Caused by: java.lang.NoClassDefFoundError: Could not initialize class
org.apache.phoenix.util.DateUtil$ISODateFormatParser


*ISSUE: *Date comparisons on string literals are not evaluating correctly
such that dates in the future get interpreted as being less than dates in
the past.

Test case 1:
2009-05-05 15:40:10.000 is greater than (in the future) 2005-05-05
15:40:10.000

The following query should return 2 rows, however, it does not return any
rows:

select '1' from "gp_subscriptions" where to_date('2009-05-05 15:40:10.000')
> to_date('2005-05-05 15:40:10.000') limit 2;
No rows selected (0.024 seconds)


The following query should return no rows, however, it returns 2 rows:

select '1' from "gp_subscriptions" where to_date('2009-05-05 15:40:10.000')
< to_date('2005-05-05 15:40:10.000') limit 2;
2 rows selected (0.033 seconds)

Test case 2:
2009-05-05 is greater than (in the future) than 1970-05-05

The following query should return 2 rows, however, it does not return any
rows:

select '1' from "gp_subscriptions" where to_date('2009-05-05') >
to_date('1970-05-05') limit 2;
No rows selected (0.024 seconds)


The following query should return no rows, however, it returns 2 rows:

select '1' from "gp_subscriptions" where to_date('2009-05-05') <
to_date('1970-05-05') limit 2;
2 rows selected (0.033 seconds)


-- 
*Binu Mathew*
Data Engineering

3101 Park Blvd., Palo Alto, CA 94306
Mobile: 630.267.5938
Groupon
<http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>

Re: to_date not working as expected

Posted by Binu Mathew <bm...@groupon.com>.
If and when a patch becomes available, can you please update this thread.

This is a blocker:
1. We have existing HBase tables that users do not necessarily want to port
over to Phoenix tables
2. Building Phoenix views to the existing HBase tables is a better solution
3. The inability to cast the date data types in the WHERE clause and
compare against another date prevents users from performance range scans.

Unfortunately, we are unable to upgrade our Hortonworks Hadoop
distribution, HDP 2.3, to use Phoenix 4.6.

HDP 2.3 came default with Phoenix 4.4 and we can't seem to upgrade a single
package, 4.4, to 4.6 on the HDP 2.3 distribution.

Can you provide us with a patch to resolve this issue?

Thanks,

On Sat, Jan 30, 2016 at 11:45 AM, Thomas D'Silva <td...@salesforce.com>
wrote:

> Binu,
>
> I am able to repro the issue by manually running the test from the patch
> from https://issues.apache.org/jira/browse/PHOENIX-1769 .
> I will investigate further.
>
> Thanks,
> Thomas
>
>
> On Fri, Jan 29, 2016 at 4:26 PM, Binu Mathew <bm...@groupon.com> wrote:
>
>> That doesn't seem to work.
>>
>> Phoenix is not recognizing that created_at in the WERE clause is a
>> derived column from to_date("created_at_date").
>>
>> Some relational databases support this type of functionality.
>>
>> On Fri, Jan 29, 2016 at 4:16 PM, Alok Singh <al...@cloudability.com>
>> wrote:
>>
>>> Does this work:
>>>
>>> select 1, to_date("created_at_date") as created_at from
>>> "gp_subscriptions" where created_at > to_date('2010-10-10') limit 3;
>>>
>>> Alok
>>>
>>> Alok
>>>
>>> alok@cloudability.com
>>>
>>> On Fri, Jan 29, 2016 at 3:54 PM, Binu Mathew <bm...@groupon.com>
>>> wrote:
>>>
>>>> Thank you for the reply.
>>>>
>>>> I mistakenly wrote that we are using Phoenix with HBase .96. This was a
>>>> typo. We are using HBase .98 with Phoenix 4.4
>>>>
>>>> I tried the UNSIGNED types and still encountering the same issue.
>>>>
>>>> My field has the following data:
>>>>
>>>> select "created_at_date" from "gp_subscriptions" limit 3;
>>>>
>>>> +------------------------------------------+
>>>> |             created_at_date              |
>>>> +------------------------------------------+
>>>> | 2012-11-22                               |
>>>> | 2012-11-22                               |
>>>> | 2012-11-26                               |
>>>> +------------------------------------------+
>>>>
>>>> to_date function works when I convert the VARCHAR field to a DATE:
>>>>
>>>> select to_date("created_at_date") from "gp_subscriptions" limit 3;
>>>>
>>>> +---------------------------------------------------+
>>>> | TO_DATE(subscriber."created_at_date", null, null) |
>>>> +---------------------------------------------------+
>>>> | 2012-11-22 00:00:00.000                           |
>>>> | 2012-11-22 00:00:00.000                           |
>>>> | 2012-11-26 00:00:00.000                           |
>>>> +---------------------------------------------------+
>>>>
>>>>
>>>> However, I can't use the to_date function in the WHERE clause:
>>>>
>>>> select 1 from "gp_subscriptions" where to_date("created_at_date") >
>>>> to_date('2010-10-10') limit 3;
>>>>
>>>> java.lang.RuntimeException:
>>>> org.apache.phoenix.exception.PhoenixIOException:
>>>> org.apache.phoenix.exception.PhoenixIOException:
>>>> org.apache.hadoop.hbase.DoNotRetryIOException:
>>>> java.lang.reflect.InvocationTargetException
>>>>
>>>> Another issue is how Phoenix evaluates dates. In the queries below, I'm
>>>> evaluating string literals:
>>>>
>>>> 2009-05-05 is greater than (in the future) than 1970-05-05
>>>>
>>>> The following query should return 2 rows, however, it does not return
>>>> any rows:
>>>>
>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
>>>> to_date('1970-05-05') limit 2;
>>>> No rows selected (0.024 seconds)
>>>>
>>>>
>>>> The following query should return no rows, however, it returns 2 rows:
>>>>
>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
>>>> to_date('1970-05-05') limit 2;
>>>> 2 rows selected (0.033 seconds)
>>>>
>>>> Thanks,
>>>>
>>>> On Thu, Jan 28, 2016 at 8:07 PM, James Taylor <ja...@apache.org>
>>>> wrote:
>>>>
>>>>> Hi Binu,
>>>>> Phoenix has never supported HBase 0.96, so I'm not sure where you got
>>>>> the release from.
>>>>>
>>>>> I recommend upgrading to a later, supported version of HBase and a
>>>>> later version of Phoenix. Give the 4.7.0 RC a try.
>>>>>
>>>>> One other tip in particular for views you create over existing HBase
>>>>> tables. Use the UNSIGNED types documented here[1] as these use the same
>>>>> serialization as the Bytes methods provided by HBase. If you tell Phoenix
>>>>> the wrong type, it won't know so would produce erroneous data and queries.
>>>>>
>>>>> Thanks,
>>>>> James
>>>>>
>>>>> [1] https://phoenix.apache.org/language/datatypes.html
>>>>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__phoenix.apache.org_language_datatypes.html&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=qZf4wLz966HePHU8YevWHvcdJWrlFV84gXZLYftu1II&e=>
>>>>>
>>>>> On Thu, Jan 28, 2016 at 5:57 PM, Binu Mathew <bm...@groupon.com>
>>>>> wrote:
>>>>>
>>>>>> Phoenix version 4.4.0
>>>>>>
>>>>>> Issues with Phoenix when used with HBase 0.96.0.2.0
>>>>>>
>>>>>> 2 Issues:
>>>>>>
>>>>>> *ISSUE:* to_date Function is not converting string data types in
>>>>>> valid date formats to a DATE data type when used in the WHERE clause for
>>>>>> date comparison.
>>>>>>
>>>>>> Below is a query I ran against a Phoenix view in which I use the
>>>>>> ‘to_date’ function to convert 2 VARCHAR columns to date.
>>>>>> 1. column ‘created_at_ts’ stored as VARCHAR in format such as
>>>>>> 2009-05-05 15:40:10.000
>>>>>> 2. column ‘created_at_date’ stored as VARCHAR in format such as
>>>>>> 2009-05-05
>>>>>>
>>>>>> Observe that the ‘to_date’ function coverts the 2 VARCHAR columns to
>>>>>> dates:
>>>>>>
>>>>>>
>>>>>> select to_date("created_at_ts"), to_date("created_at_date") from
>>>>>> "gp_subscriptions" limit 5;
>>>>>>
>>>>>>
>>>>>> +-------------------------------------------------+---------------------------------------------------+
>>>>>> | TO_DATE(subscriber."created_at_ts", null, null) |
>>>>>> TO_DATE(subscriber."created_at_date", null, null) |
>>>>>>
>>>>>> +-------------------------------------------------+---------------------------------------------------+
>>>>>> | 2009-05-05 15:40:10.000                         | 2009-05-05
>>>>>> 00:00:00.000                           |
>>>>>> | 2012-11-22 07:37:34.000                         | 2012-11-22
>>>>>> 00:00:00.000                           |
>>>>>> | 2010-07-24 14:12:33.000                         | 2010-07-24
>>>>>> 00:00:00.000                           |
>>>>>> | 2012-11-22 07:38:04.000                         | 2012-11-22
>>>>>> 00:00:00.000                           |
>>>>>> | 2012-11-22 07:38:10.000                         | 2012-11-22
>>>>>> 00:00:00.000                           |
>>>>>>
>>>>>> +-------------------------------------------------+---------------------------------------------------+
>>>>>>
>>>>>>
>>>>>> Here is another query in which I’m using the ‘to_date’ function on
>>>>>> string literals in the WHERE clause for date comparison .
>>>>>>
>>>>>> Observer that the ‘to_date’ function coverts the string literals to
>>>>>> dates and the the date comparison correctly evaluates:
>>>>>>
>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>>>> 15:40:10.000') = to_date('2009-05-05 15:40:10.000') limit 2;
>>>>>> 2 rows selected (0.035 seconds)
>>>>>>
>>>>>>
>>>>>> Now when I try the date comparison using the columns from my view, it
>>>>>> fails:
>>>>>>
>>>>>> select '1' from "gp_subscriptions" where to_date("created_at_ts") =
>>>>>> to_date('2009-05-05 15:40:10.000') limit 2;
>>>>>>
>>>>>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>>>>>> BooleanExpressionFilter failed during reading: Could not initialize class
>>>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>>> Caused by: java.lang.NoClassDefFoundError: Could not initialize class
>>>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>>>
>>>>>> Also fails with same error when I try: select '1' from
>>>>>> "gp_subscriptions" where to_date("created_at_ts") = to_date('2009-05-05')
>>>>>> limit 2;
>>>>>>
>>>>>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>>>>>> BooleanExpressionFilter failed during reading: Could not initialize class
>>>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>>> Caused by: java.lang.NoClassDefFoundError: Could not initialize class
>>>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>>>
>>>>>>
>>>>>> *ISSUE: *Date comparisons on string literals are not evaluating
>>>>>> correctly such that dates in the future get interpreted as being less than
>>>>>> dates in the past.
>>>>>>
>>>>>> Test case 1:
>>>>>> 2009-05-05 15:40:10.000 is greater than (in the future) 2005-05-05
>>>>>> 15:40:10.000
>>>>>>
>>>>>> The following query should return 2 rows, however, it does not return
>>>>>> any rows:
>>>>>>
>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>>>> 15:40:10.000') > to_date('2005-05-05 15:40:10.000') limit 2;
>>>>>> No rows selected (0.024 seconds)
>>>>>>
>>>>>>
>>>>>> The following query should return no rows, however, it returns 2 rows:
>>>>>>
>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>>>> 15:40:10.000') < to_date('2005-05-05 15:40:10.000') limit 2;
>>>>>> 2 rows selected (0.033 seconds)
>>>>>>
>>>>>> Test case 2:
>>>>>> 2009-05-05 is greater than (in the future) than 1970-05-05
>>>>>>
>>>>>> The following query should return 2 rows, however, it does not return
>>>>>> any rows:
>>>>>>
>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
>>>>>> to_date('1970-05-05') limit 2;
>>>>>> No rows selected (0.024 seconds)
>>>>>>
>>>>>>
>>>>>> The following query should return no rows, however, it returns 2 rows:
>>>>>>
>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
>>>>>> to_date('1970-05-05') limit 2;
>>>>>> 2 rows selected (0.033 seconds)
>>>>>>
>>>>>>
>>>>>> --
>>>>>> *Binu Mathew*
>>>>>> Data Engineering
>>>>>>
>>>>>> 3101 Park Blvd., Palo Alto, CA 94306
>>>>>> Mobile: 630.267.5938
>>>>>> Groupon
>>>>>> <https://urldefense.proofpoint.com/v2/url?u=http-3A__www.google.com_url-3Fq-3Dhttp-253A-252F-252Fwww.groupon.com-252F-26sa-3DD-26sntz-3D1-26usg-3DAFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=ZLtx84pLqRnFvwXQsJT-IoadaGYt0KObWPttSMqW-xY&e=>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> *Binu Mathew*
>>>> Data Engineering
>>>>
>>>> 3101 Park Blvd., Palo Alto, CA 94306
>>>> Mobile: 630.267.5938
>>>> Groupon
>>>> <http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> *Binu Mathew*
>> Data Engineering
>>
>> 3101 Park Blvd., Palo Alto, CA 94306
>> Mobile: 630.267.5938
>> Groupon
>> <http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>
>>
>>
>>
>>
>


-- 
*Binu Mathew*
Data Engineering

3101 Park Blvd., Palo Alto, CA 94306
Mobile: 630.267.5938
Groupon
<http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>

Re: to_date not working as expected

Posted by Binu Mathew <bm...@groupon.com>.
The patch resolved our issue.

We just updated the jar file on all nodes in the cluster, and our test
cases are now passing.

This issue can be closed.

Thank you.

On Tue, Feb 2, 2016 at 3:45 PM, Thomas D'Silva <td...@salesforce.com>
wrote:

> I have a patch out for PHOENIX-1769
> <https://issues.apache.org/jira/browse/PHOENIX-1769> . The issue is that
> the jruby-complete-1.6.8.jar that is present in then hbase lib directory
> contains jodatime 1.6 classes which get picked up instead of the 2.7
> classes that are included in the phoenix server jar. You could try removing
> this jar from the lib directory (from all the servers on your cluster), I
> think its only required for the hbase shell.
>
> On Tue, Feb 2, 2016 at 11:41 AM, Binu Mathew <bm...@groupon.com> wrote:
>
>> Hi Thomas,
>>
>> Any update on this?
>>
>> If and when a patch becomes available, can you please update this thread.
>>
>> This is a blocker:
>> 1. We have existing HBase tables that users do not necessarily want to
>> port over to Phoenix tables
>> 2. Building Phoenix views to the existing HBase tables is a better
>> solution
>> 3. The inability to cast the date data types in the WHERE clause and
>> compare against another date prevents users from performance range scans.
>>
>> Unfortunately, we are unable to upgrade our Hortonworks Hadoop
>> distribution, HDP 2.3, to use Phoenix 4.6.
>>
>> HDP 2.3 came default with Phoenix 4.4 and we can't seem to upgrade a
>> single package, 4.4, to 4.6 on the HDP 2.3 distribution.
>>
>> Can you provide us with a patch to resolve this issue?
>>
>> Thanks,
>>
>> On Sat, Jan 30, 2016 at 11:45 AM, Thomas D'Silva <td...@salesforce.com>
>> wrote:
>>
>>> Binu,
>>>
>>> I am able to repro the issue by manually running the test from the patch
>>> from https://issues.apache.org/jira/browse/PHOENIX-1769 .
>>> I will investigate further.
>>>
>>> Thanks,
>>> Thomas
>>>
>>>
>>> On Fri, Jan 29, 2016 at 4:26 PM, Binu Mathew <bm...@groupon.com>
>>> wrote:
>>>
>>>> That doesn't seem to work.
>>>>
>>>> Phoenix is not recognizing that created_at in the WERE clause is a
>>>> derived column from to_date("created_at_date").
>>>>
>>>> Some relational databases support this type of functionality.
>>>>
>>>> On Fri, Jan 29, 2016 at 4:16 PM, Alok Singh <al...@cloudability.com>
>>>> wrote:
>>>>
>>>>> Does this work:
>>>>>
>>>>> select 1, to_date("created_at_date") as created_at from
>>>>> "gp_subscriptions" where created_at > to_date('2010-10-10') limit 3;
>>>>>
>>>>> Alok
>>>>>
>>>>> Alok
>>>>>
>>>>> alok@cloudability.com
>>>>>
>>>>> On Fri, Jan 29, 2016 at 3:54 PM, Binu Mathew <bm...@groupon.com>
>>>>> wrote:
>>>>>
>>>>>> Thank you for the reply.
>>>>>>
>>>>>> I mistakenly wrote that we are using Phoenix with HBase .96. This was
>>>>>> a typo. We are using HBase .98 with Phoenix 4.4
>>>>>>
>>>>>> I tried the UNSIGNED types and still encountering the same issue.
>>>>>>
>>>>>> My field has the following data:
>>>>>>
>>>>>> select "created_at_date" from "gp_subscriptions" limit 3;
>>>>>>
>>>>>> +------------------------------------------+
>>>>>> |             created_at_date              |
>>>>>> +------------------------------------------+
>>>>>> | 2012-11-22                               |
>>>>>> | 2012-11-22                               |
>>>>>> | 2012-11-26                               |
>>>>>> +------------------------------------------+
>>>>>>
>>>>>> to_date function works when I convert the VARCHAR field to a DATE:
>>>>>>
>>>>>> select to_date("created_at_date") from "gp_subscriptions" limit 3;
>>>>>>
>>>>>> +---------------------------------------------------+
>>>>>> | TO_DATE(subscriber."created_at_date", null, null) |
>>>>>> +---------------------------------------------------+
>>>>>> | 2012-11-22 00:00:00.000                           |
>>>>>> | 2012-11-22 00:00:00.000                           |
>>>>>> | 2012-11-26 00:00:00.000                           |
>>>>>> +---------------------------------------------------+
>>>>>>
>>>>>>
>>>>>> However, I can't use the to_date function in the WHERE clause:
>>>>>>
>>>>>> select 1 from "gp_subscriptions" where to_date("created_at_date") >
>>>>>> to_date('2010-10-10') limit 3;
>>>>>>
>>>>>> java.lang.RuntimeException:
>>>>>> org.apache.phoenix.exception.PhoenixIOException:
>>>>>> org.apache.phoenix.exception.PhoenixIOException:
>>>>>> org.apache.hadoop.hbase.DoNotRetryIOException:
>>>>>> java.lang.reflect.InvocationTargetException
>>>>>>
>>>>>> Another issue is how Phoenix evaluates dates. In the queries below,
>>>>>> I'm evaluating string literals:
>>>>>>
>>>>>> 2009-05-05 is greater than (in the future) than 1970-05-05
>>>>>>
>>>>>> The following query should return 2 rows, however, it does not return
>>>>>> any rows:
>>>>>>
>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
>>>>>> to_date('1970-05-05') limit 2;
>>>>>> No rows selected (0.024 seconds)
>>>>>>
>>>>>>
>>>>>> The following query should return no rows, however, it returns 2 rows:
>>>>>>
>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
>>>>>> to_date('1970-05-05') limit 2;
>>>>>> 2 rows selected (0.033 seconds)
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> On Thu, Jan 28, 2016 at 8:07 PM, James Taylor <jamestaylor@apache.org
>>>>>> > wrote:
>>>>>>
>>>>>>> Hi Binu,
>>>>>>> Phoenix has never supported HBase 0.96, so I'm not sure where you
>>>>>>> got the release from.
>>>>>>>
>>>>>>> I recommend upgrading to a later, supported version of HBase and a
>>>>>>> later version of Phoenix. Give the 4.7.0 RC a try.
>>>>>>>
>>>>>>> One other tip in particular for views you create over existing HBase
>>>>>>> tables. Use the UNSIGNED types documented here[1] as these use the same
>>>>>>> serialization as the Bytes methods provided by HBase. If you tell Phoenix
>>>>>>> the wrong type, it won't know so would produce erroneous data and queries.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> James
>>>>>>>
>>>>>>> [1] https://phoenix.apache.org/language/datatypes.html
>>>>>>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__phoenix.apache.org_language_datatypes.html&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=qZf4wLz966HePHU8YevWHvcdJWrlFV84gXZLYftu1II&e=>
>>>>>>>
>>>>>>> On Thu, Jan 28, 2016 at 5:57 PM, Binu Mathew <bm...@groupon.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Phoenix version 4.4.0
>>>>>>>>
>>>>>>>> Issues with Phoenix when used with HBase 0.96.0.2.0
>>>>>>>>
>>>>>>>> 2 Issues:
>>>>>>>>
>>>>>>>> *ISSUE:* to_date Function is not converting string data types in
>>>>>>>> valid date formats to a DATE data type when used in the WHERE clause for
>>>>>>>> date comparison.
>>>>>>>>
>>>>>>>> Below is a query I ran against a Phoenix view in which I use the
>>>>>>>> ‘to_date’ function to convert 2 VARCHAR columns to date.
>>>>>>>> 1. column ‘created_at_ts’ stored as VARCHAR in format such as
>>>>>>>> 2009-05-05 15:40:10.000
>>>>>>>> 2. column ‘created_at_date’ stored as VARCHAR in format such as
>>>>>>>> 2009-05-05
>>>>>>>>
>>>>>>>> Observe that the ‘to_date’ function coverts the 2 VARCHAR columns
>>>>>>>> to dates:
>>>>>>>>
>>>>>>>>
>>>>>>>> select to_date("created_at_ts"), to_date("created_at_date") from
>>>>>>>> "gp_subscriptions" limit 5;
>>>>>>>>
>>>>>>>>
>>>>>>>> +-------------------------------------------------+---------------------------------------------------+
>>>>>>>> | TO_DATE(subscriber."created_at_ts", null, null) |
>>>>>>>> TO_DATE(subscriber."created_at_date", null, null) |
>>>>>>>>
>>>>>>>> +-------------------------------------------------+---------------------------------------------------+
>>>>>>>> | 2009-05-05 15:40:10.000                         | 2009-05-05
>>>>>>>> 00:00:00.000                           |
>>>>>>>> | 2012-11-22 07:37:34.000                         | 2012-11-22
>>>>>>>> 00:00:00.000                           |
>>>>>>>> | 2010-07-24 14:12:33.000                         | 2010-07-24
>>>>>>>> 00:00:00.000                           |
>>>>>>>> | 2012-11-22 07:38:04.000                         | 2012-11-22
>>>>>>>> 00:00:00.000                           |
>>>>>>>> | 2012-11-22 07:38:10.000                         | 2012-11-22
>>>>>>>> 00:00:00.000                           |
>>>>>>>>
>>>>>>>> +-------------------------------------------------+---------------------------------------------------+
>>>>>>>>
>>>>>>>>
>>>>>>>> Here is another query in which I’m using the ‘to_date’ function on
>>>>>>>> string literals in the WHERE clause for date comparison .
>>>>>>>>
>>>>>>>> Observer that the ‘to_date’ function coverts the string literals to
>>>>>>>> dates and the the date comparison correctly evaluates:
>>>>>>>>
>>>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>>>>>> 15:40:10.000') = to_date('2009-05-05 15:40:10.000') limit 2;
>>>>>>>> 2 rows selected (0.035 seconds)
>>>>>>>>
>>>>>>>>
>>>>>>>> Now when I try the date comparison using the columns from my view,
>>>>>>>> it fails:
>>>>>>>>
>>>>>>>> select '1' from "gp_subscriptions" where to_date("created_at_ts") =
>>>>>>>> to_date('2009-05-05 15:40:10.000') limit 2;
>>>>>>>>
>>>>>>>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>>>>>>>> BooleanExpressionFilter failed during reading: Could not initialize class
>>>>>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>>>>> Caused by: java.lang.NoClassDefFoundError: Could not initialize
>>>>>>>> class org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>>>>>
>>>>>>>> Also fails with same error when I try: select '1' from
>>>>>>>> "gp_subscriptions" where to_date("created_at_ts") = to_date('2009-05-05')
>>>>>>>> limit 2;
>>>>>>>>
>>>>>>>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>>>>>>>> BooleanExpressionFilter failed during reading: Could not initialize class
>>>>>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>>>>> Caused by: java.lang.NoClassDefFoundError: Could not initialize
>>>>>>>> class org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>>>>>
>>>>>>>>
>>>>>>>> *ISSUE: *Date comparisons on string literals are not evaluating
>>>>>>>> correctly such that dates in the future get interpreted as being less than
>>>>>>>> dates in the past.
>>>>>>>>
>>>>>>>> Test case 1:
>>>>>>>> 2009-05-05 15:40:10.000 is greater than (in the future) 2005-05-05
>>>>>>>> 15:40:10.000
>>>>>>>>
>>>>>>>> The following query should return 2 rows, however, it does not
>>>>>>>> return any rows:
>>>>>>>>
>>>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>>>>>> 15:40:10.000') > to_date('2005-05-05 15:40:10.000') limit 2;
>>>>>>>> No rows selected (0.024 seconds)
>>>>>>>>
>>>>>>>>
>>>>>>>> The following query should return no rows, however, it returns 2
>>>>>>>> rows:
>>>>>>>>
>>>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>>>>>> 15:40:10.000') < to_date('2005-05-05 15:40:10.000') limit 2;
>>>>>>>> 2 rows selected (0.033 seconds)
>>>>>>>>
>>>>>>>> Test case 2:
>>>>>>>> 2009-05-05 is greater than (in the future) than 1970-05-05
>>>>>>>>
>>>>>>>> The following query should return 2 rows, however, it does not
>>>>>>>> return any rows:
>>>>>>>>
>>>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
>>>>>>>> to_date('1970-05-05') limit 2;
>>>>>>>> No rows selected (0.024 seconds)
>>>>>>>>
>>>>>>>>
>>>>>>>> The following query should return no rows, however, it returns 2
>>>>>>>> rows:
>>>>>>>>
>>>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
>>>>>>>> to_date('1970-05-05') limit 2;
>>>>>>>> 2 rows selected (0.033 seconds)
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> *Binu Mathew*
>>>>>>>> Data Engineering
>>>>>>>>
>>>>>>>> 3101 Park Blvd., Palo Alto, CA 94306
>>>>>>>> Mobile: 630.267.5938
>>>>>>>> Groupon
>>>>>>>> <https://urldefense.proofpoint.com/v2/url?u=http-3A__www.google.com_url-3Fq-3Dhttp-253A-252F-252Fwww.groupon.com-252F-26sa-3DD-26sntz-3D1-26usg-3DAFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=ZLtx84pLqRnFvwXQsJT-IoadaGYt0KObWPttSMqW-xY&e=>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> *Binu Mathew*
>>>>>> Data Engineering
>>>>>>
>>>>>> 3101 Park Blvd., Palo Alto, CA 94306
>>>>>> Mobile: 630.267.5938
>>>>>> Groupon
>>>>>> <http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> *Binu Mathew*
>>>> Data Engineering
>>>>
>>>> 3101 Park Blvd., Palo Alto, CA 94306
>>>> Mobile: 630.267.5938
>>>> Groupon
>>>> <http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> *Binu Mathew*
>> Data Engineering
>>
>> 3101 Park Blvd., Palo Alto, CA 94306
>> Mobile: 630.267.5938
>> Groupon
>> <http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>
>>
>>
>>
>>
>


-- 
*Binu Mathew*
Data Engineering

3101 Park Blvd., Palo Alto, CA 94306
Mobile: 630.267.5938
Groupon
<http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>

Re: to_date not working as expected

Posted by Thomas D'Silva <td...@salesforce.com>.
I have a patch out for PHOENIX-1769
<https://issues.apache.org/jira/browse/PHOENIX-1769> . The issue is that
the jruby-complete-1.6.8.jar that is present in then hbase lib directory
contains jodatime 1.6 classes which get picked up instead of the 2.7
classes that are included in the phoenix server jar. You could try removing
this jar from the lib directory (from all the servers on your cluster), I
think its only required for the hbase shell.

On Tue, Feb 2, 2016 at 11:41 AM, Binu Mathew <bm...@groupon.com> wrote:

> Hi Thomas,
>
> Any update on this?
>
> If and when a patch becomes available, can you please update this thread.
>
> This is a blocker:
> 1. We have existing HBase tables that users do not necessarily want to
> port over to Phoenix tables
> 2. Building Phoenix views to the existing HBase tables is a better
> solution
> 3. The inability to cast the date data types in the WHERE clause and
> compare against another date prevents users from performance range scans.
>
> Unfortunately, we are unable to upgrade our Hortonworks Hadoop
> distribution, HDP 2.3, to use Phoenix 4.6.
>
> HDP 2.3 came default with Phoenix 4.4 and we can't seem to upgrade a
> single package, 4.4, to 4.6 on the HDP 2.3 distribution.
>
> Can you provide us with a patch to resolve this issue?
>
> Thanks,
>
> On Sat, Jan 30, 2016 at 11:45 AM, Thomas D'Silva <td...@salesforce.com>
> wrote:
>
>> Binu,
>>
>> I am able to repro the issue by manually running the test from the patch
>> from https://issues.apache.org/jira/browse/PHOENIX-1769 .
>> I will investigate further.
>>
>> Thanks,
>> Thomas
>>
>>
>> On Fri, Jan 29, 2016 at 4:26 PM, Binu Mathew <bm...@groupon.com> wrote:
>>
>>> That doesn't seem to work.
>>>
>>> Phoenix is not recognizing that created_at in the WERE clause is a
>>> derived column from to_date("created_at_date").
>>>
>>> Some relational databases support this type of functionality.
>>>
>>> On Fri, Jan 29, 2016 at 4:16 PM, Alok Singh <al...@cloudability.com>
>>> wrote:
>>>
>>>> Does this work:
>>>>
>>>> select 1, to_date("created_at_date") as created_at from
>>>> "gp_subscriptions" where created_at > to_date('2010-10-10') limit 3;
>>>>
>>>> Alok
>>>>
>>>> Alok
>>>>
>>>> alok@cloudability.com
>>>>
>>>> On Fri, Jan 29, 2016 at 3:54 PM, Binu Mathew <bm...@groupon.com>
>>>> wrote:
>>>>
>>>>> Thank you for the reply.
>>>>>
>>>>> I mistakenly wrote that we are using Phoenix with HBase .96. This was
>>>>> a typo. We are using HBase .98 with Phoenix 4.4
>>>>>
>>>>> I tried the UNSIGNED types and still encountering the same issue.
>>>>>
>>>>> My field has the following data:
>>>>>
>>>>> select "created_at_date" from "gp_subscriptions" limit 3;
>>>>>
>>>>> +------------------------------------------+
>>>>> |             created_at_date              |
>>>>> +------------------------------------------+
>>>>> | 2012-11-22                               |
>>>>> | 2012-11-22                               |
>>>>> | 2012-11-26                               |
>>>>> +------------------------------------------+
>>>>>
>>>>> to_date function works when I convert the VARCHAR field to a DATE:
>>>>>
>>>>> select to_date("created_at_date") from "gp_subscriptions" limit 3;
>>>>>
>>>>> +---------------------------------------------------+
>>>>> | TO_DATE(subscriber."created_at_date", null, null) |
>>>>> +---------------------------------------------------+
>>>>> | 2012-11-22 00:00:00.000                           |
>>>>> | 2012-11-22 00:00:00.000                           |
>>>>> | 2012-11-26 00:00:00.000                           |
>>>>> +---------------------------------------------------+
>>>>>
>>>>>
>>>>> However, I can't use the to_date function in the WHERE clause:
>>>>>
>>>>> select 1 from "gp_subscriptions" where to_date("created_at_date") >
>>>>> to_date('2010-10-10') limit 3;
>>>>>
>>>>> java.lang.RuntimeException:
>>>>> org.apache.phoenix.exception.PhoenixIOException:
>>>>> org.apache.phoenix.exception.PhoenixIOException:
>>>>> org.apache.hadoop.hbase.DoNotRetryIOException:
>>>>> java.lang.reflect.InvocationTargetException
>>>>>
>>>>> Another issue is how Phoenix evaluates dates. In the queries below,
>>>>> I'm evaluating string literals:
>>>>>
>>>>> 2009-05-05 is greater than (in the future) than 1970-05-05
>>>>>
>>>>> The following query should return 2 rows, however, it does not return
>>>>> any rows:
>>>>>
>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
>>>>> to_date('1970-05-05') limit 2;
>>>>> No rows selected (0.024 seconds)
>>>>>
>>>>>
>>>>> The following query should return no rows, however, it returns 2 rows:
>>>>>
>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
>>>>> to_date('1970-05-05') limit 2;
>>>>> 2 rows selected (0.033 seconds)
>>>>>
>>>>> Thanks,
>>>>>
>>>>> On Thu, Jan 28, 2016 at 8:07 PM, James Taylor <ja...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Hi Binu,
>>>>>> Phoenix has never supported HBase 0.96, so I'm not sure where you got
>>>>>> the release from.
>>>>>>
>>>>>> I recommend upgrading to a later, supported version of HBase and a
>>>>>> later version of Phoenix. Give the 4.7.0 RC a try.
>>>>>>
>>>>>> One other tip in particular for views you create over existing HBase
>>>>>> tables. Use the UNSIGNED types documented here[1] as these use the same
>>>>>> serialization as the Bytes methods provided by HBase. If you tell Phoenix
>>>>>> the wrong type, it won't know so would produce erroneous data and queries.
>>>>>>
>>>>>> Thanks,
>>>>>> James
>>>>>>
>>>>>> [1] https://phoenix.apache.org/language/datatypes.html
>>>>>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__phoenix.apache.org_language_datatypes.html&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=qZf4wLz966HePHU8YevWHvcdJWrlFV84gXZLYftu1II&e=>
>>>>>>
>>>>>> On Thu, Jan 28, 2016 at 5:57 PM, Binu Mathew <bm...@groupon.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Phoenix version 4.4.0
>>>>>>>
>>>>>>> Issues with Phoenix when used with HBase 0.96.0.2.0
>>>>>>>
>>>>>>> 2 Issues:
>>>>>>>
>>>>>>> *ISSUE:* to_date Function is not converting string data types in
>>>>>>> valid date formats to a DATE data type when used in the WHERE clause for
>>>>>>> date comparison.
>>>>>>>
>>>>>>> Below is a query I ran against a Phoenix view in which I use the
>>>>>>> ‘to_date’ function to convert 2 VARCHAR columns to date.
>>>>>>> 1. column ‘created_at_ts’ stored as VARCHAR in format such as
>>>>>>> 2009-05-05 15:40:10.000
>>>>>>> 2. column ‘created_at_date’ stored as VARCHAR in format such as
>>>>>>> 2009-05-05
>>>>>>>
>>>>>>> Observe that the ‘to_date’ function coverts the 2 VARCHAR columns to
>>>>>>> dates:
>>>>>>>
>>>>>>>
>>>>>>> select to_date("created_at_ts"), to_date("created_at_date") from
>>>>>>> "gp_subscriptions" limit 5;
>>>>>>>
>>>>>>>
>>>>>>> +-------------------------------------------------+---------------------------------------------------+
>>>>>>> | TO_DATE(subscriber."created_at_ts", null, null) |
>>>>>>> TO_DATE(subscriber."created_at_date", null, null) |
>>>>>>>
>>>>>>> +-------------------------------------------------+---------------------------------------------------+
>>>>>>> | 2009-05-05 15:40:10.000                         | 2009-05-05
>>>>>>> 00:00:00.000                           |
>>>>>>> | 2012-11-22 07:37:34.000                         | 2012-11-22
>>>>>>> 00:00:00.000                           |
>>>>>>> | 2010-07-24 14:12:33.000                         | 2010-07-24
>>>>>>> 00:00:00.000                           |
>>>>>>> | 2012-11-22 07:38:04.000                         | 2012-11-22
>>>>>>> 00:00:00.000                           |
>>>>>>> | 2012-11-22 07:38:10.000                         | 2012-11-22
>>>>>>> 00:00:00.000                           |
>>>>>>>
>>>>>>> +-------------------------------------------------+---------------------------------------------------+
>>>>>>>
>>>>>>>
>>>>>>> Here is another query in which I’m using the ‘to_date’ function on
>>>>>>> string literals in the WHERE clause for date comparison .
>>>>>>>
>>>>>>> Observer that the ‘to_date’ function coverts the string literals to
>>>>>>> dates and the the date comparison correctly evaluates:
>>>>>>>
>>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>>>>> 15:40:10.000') = to_date('2009-05-05 15:40:10.000') limit 2;
>>>>>>> 2 rows selected (0.035 seconds)
>>>>>>>
>>>>>>>
>>>>>>> Now when I try the date comparison using the columns from my view,
>>>>>>> it fails:
>>>>>>>
>>>>>>> select '1' from "gp_subscriptions" where to_date("created_at_ts") =
>>>>>>> to_date('2009-05-05 15:40:10.000') limit 2;
>>>>>>>
>>>>>>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>>>>>>> BooleanExpressionFilter failed during reading: Could not initialize class
>>>>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>>>> Caused by: java.lang.NoClassDefFoundError: Could not initialize
>>>>>>> class org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>>>>
>>>>>>> Also fails with same error when I try: select '1' from
>>>>>>> "gp_subscriptions" where to_date("created_at_ts") = to_date('2009-05-05')
>>>>>>> limit 2;
>>>>>>>
>>>>>>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>>>>>>> BooleanExpressionFilter failed during reading: Could not initialize class
>>>>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>>>> Caused by: java.lang.NoClassDefFoundError: Could not initialize
>>>>>>> class org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>>>>
>>>>>>>
>>>>>>> *ISSUE: *Date comparisons on string literals are not evaluating
>>>>>>> correctly such that dates in the future get interpreted as being less than
>>>>>>> dates in the past.
>>>>>>>
>>>>>>> Test case 1:
>>>>>>> 2009-05-05 15:40:10.000 is greater than (in the future) 2005-05-05
>>>>>>> 15:40:10.000
>>>>>>>
>>>>>>> The following query should return 2 rows, however, it does not
>>>>>>> return any rows:
>>>>>>>
>>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>>>>> 15:40:10.000') > to_date('2005-05-05 15:40:10.000') limit 2;
>>>>>>> No rows selected (0.024 seconds)
>>>>>>>
>>>>>>>
>>>>>>> The following query should return no rows, however, it returns 2
>>>>>>> rows:
>>>>>>>
>>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>>>>> 15:40:10.000') < to_date('2005-05-05 15:40:10.000') limit 2;
>>>>>>> 2 rows selected (0.033 seconds)
>>>>>>>
>>>>>>> Test case 2:
>>>>>>> 2009-05-05 is greater than (in the future) than 1970-05-05
>>>>>>>
>>>>>>> The following query should return 2 rows, however, it does not
>>>>>>> return any rows:
>>>>>>>
>>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
>>>>>>> to_date('1970-05-05') limit 2;
>>>>>>> No rows selected (0.024 seconds)
>>>>>>>
>>>>>>>
>>>>>>> The following query should return no rows, however, it returns 2
>>>>>>> rows:
>>>>>>>
>>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
>>>>>>> to_date('1970-05-05') limit 2;
>>>>>>> 2 rows selected (0.033 seconds)
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> *Binu Mathew*
>>>>>>> Data Engineering
>>>>>>>
>>>>>>> 3101 Park Blvd., Palo Alto, CA 94306
>>>>>>> Mobile: 630.267.5938
>>>>>>> Groupon
>>>>>>> <https://urldefense.proofpoint.com/v2/url?u=http-3A__www.google.com_url-3Fq-3Dhttp-253A-252F-252Fwww.groupon.com-252F-26sa-3DD-26sntz-3D1-26usg-3DAFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=ZLtx84pLqRnFvwXQsJT-IoadaGYt0KObWPttSMqW-xY&e=>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> *Binu Mathew*
>>>>> Data Engineering
>>>>>
>>>>> 3101 Park Blvd., Palo Alto, CA 94306
>>>>> Mobile: 630.267.5938
>>>>> Groupon
>>>>> <http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> *Binu Mathew*
>>> Data Engineering
>>>
>>> 3101 Park Blvd., Palo Alto, CA 94306
>>> Mobile: 630.267.5938
>>> Groupon
>>> <http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>
>>>
>>>
>>>
>>>
>>
>
>
> --
> *Binu Mathew*
> Data Engineering
>
> 3101 Park Blvd., Palo Alto, CA 94306
> Mobile: 630.267.5938
> Groupon
> <http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>
>
>
>
>

Re: to_date not working as expected

Posted by Binu Mathew <bm...@groupon.com>.
Hi Thomas,

Any update on this?

If and when a patch becomes available, can you please update this thread.

This is a blocker:
1. We have existing HBase tables that users do not necessarily want to port
over to Phoenix tables
2. Building Phoenix views to the existing HBase tables is a better solution
3. The inability to cast the date data types in the WHERE clause and
compare against another date prevents users from performance range scans.

Unfortunately, we are unable to upgrade our Hortonworks Hadoop
distribution, HDP 2.3, to use Phoenix 4.6.

HDP 2.3 came default with Phoenix 4.4 and we can't seem to upgrade a single
package, 4.4, to 4.6 on the HDP 2.3 distribution.

Can you provide us with a patch to resolve this issue?

Thanks,

On Sat, Jan 30, 2016 at 11:45 AM, Thomas D'Silva <td...@salesforce.com>
wrote:

> Binu,
>
> I am able to repro the issue by manually running the test from the patch
> from https://issues.apache.org/jira/browse/PHOENIX-1769 .
> I will investigate further.
>
> Thanks,
> Thomas
>
>
> On Fri, Jan 29, 2016 at 4:26 PM, Binu Mathew <bm...@groupon.com> wrote:
>
>> That doesn't seem to work.
>>
>> Phoenix is not recognizing that created_at in the WERE clause is a
>> derived column from to_date("created_at_date").
>>
>> Some relational databases support this type of functionality.
>>
>> On Fri, Jan 29, 2016 at 4:16 PM, Alok Singh <al...@cloudability.com>
>> wrote:
>>
>>> Does this work:
>>>
>>> select 1, to_date("created_at_date") as created_at from
>>> "gp_subscriptions" where created_at > to_date('2010-10-10') limit 3;
>>>
>>> Alok
>>>
>>> Alok
>>>
>>> alok@cloudability.com
>>>
>>> On Fri, Jan 29, 2016 at 3:54 PM, Binu Mathew <bm...@groupon.com>
>>> wrote:
>>>
>>>> Thank you for the reply.
>>>>
>>>> I mistakenly wrote that we are using Phoenix with HBase .96. This was a
>>>> typo. We are using HBase .98 with Phoenix 4.4
>>>>
>>>> I tried the UNSIGNED types and still encountering the same issue.
>>>>
>>>> My field has the following data:
>>>>
>>>> select "created_at_date" from "gp_subscriptions" limit 3;
>>>>
>>>> +------------------------------------------+
>>>> |             created_at_date              |
>>>> +------------------------------------------+
>>>> | 2012-11-22                               |
>>>> | 2012-11-22                               |
>>>> | 2012-11-26                               |
>>>> +------------------------------------------+
>>>>
>>>> to_date function works when I convert the VARCHAR field to a DATE:
>>>>
>>>> select to_date("created_at_date") from "gp_subscriptions" limit 3;
>>>>
>>>> +---------------------------------------------------+
>>>> | TO_DATE(subscriber."created_at_date", null, null) |
>>>> +---------------------------------------------------+
>>>> | 2012-11-22 00:00:00.000                           |
>>>> | 2012-11-22 00:00:00.000                           |
>>>> | 2012-11-26 00:00:00.000                           |
>>>> +---------------------------------------------------+
>>>>
>>>>
>>>> However, I can't use the to_date function in the WHERE clause:
>>>>
>>>> select 1 from "gp_subscriptions" where to_date("created_at_date") >
>>>> to_date('2010-10-10') limit 3;
>>>>
>>>> java.lang.RuntimeException:
>>>> org.apache.phoenix.exception.PhoenixIOException:
>>>> org.apache.phoenix.exception.PhoenixIOException:
>>>> org.apache.hadoop.hbase.DoNotRetryIOException:
>>>> java.lang.reflect.InvocationTargetException
>>>>
>>>> Another issue is how Phoenix evaluates dates. In the queries below, I'm
>>>> evaluating string literals:
>>>>
>>>> 2009-05-05 is greater than (in the future) than 1970-05-05
>>>>
>>>> The following query should return 2 rows, however, it does not return
>>>> any rows:
>>>>
>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
>>>> to_date('1970-05-05') limit 2;
>>>> No rows selected (0.024 seconds)
>>>>
>>>>
>>>> The following query should return no rows, however, it returns 2 rows:
>>>>
>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
>>>> to_date('1970-05-05') limit 2;
>>>> 2 rows selected (0.033 seconds)
>>>>
>>>> Thanks,
>>>>
>>>> On Thu, Jan 28, 2016 at 8:07 PM, James Taylor <ja...@apache.org>
>>>> wrote:
>>>>
>>>>> Hi Binu,
>>>>> Phoenix has never supported HBase 0.96, so I'm not sure where you got
>>>>> the release from.
>>>>>
>>>>> I recommend upgrading to a later, supported version of HBase and a
>>>>> later version of Phoenix. Give the 4.7.0 RC a try.
>>>>>
>>>>> One other tip in particular for views you create over existing HBase
>>>>> tables. Use the UNSIGNED types documented here[1] as these use the same
>>>>> serialization as the Bytes methods provided by HBase. If you tell Phoenix
>>>>> the wrong type, it won't know so would produce erroneous data and queries.
>>>>>
>>>>> Thanks,
>>>>> James
>>>>>
>>>>> [1] https://phoenix.apache.org/language/datatypes.html
>>>>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__phoenix.apache.org_language_datatypes.html&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=qZf4wLz966HePHU8YevWHvcdJWrlFV84gXZLYftu1II&e=>
>>>>>
>>>>> On Thu, Jan 28, 2016 at 5:57 PM, Binu Mathew <bm...@groupon.com>
>>>>> wrote:
>>>>>
>>>>>> Phoenix version 4.4.0
>>>>>>
>>>>>> Issues with Phoenix when used with HBase 0.96.0.2.0
>>>>>>
>>>>>> 2 Issues:
>>>>>>
>>>>>> *ISSUE:* to_date Function is not converting string data types in
>>>>>> valid date formats to a DATE data type when used in the WHERE clause for
>>>>>> date comparison.
>>>>>>
>>>>>> Below is a query I ran against a Phoenix view in which I use the
>>>>>> ‘to_date’ function to convert 2 VARCHAR columns to date.
>>>>>> 1. column ‘created_at_ts’ stored as VARCHAR in format such as
>>>>>> 2009-05-05 15:40:10.000
>>>>>> 2. column ‘created_at_date’ stored as VARCHAR in format such as
>>>>>> 2009-05-05
>>>>>>
>>>>>> Observe that the ‘to_date’ function coverts the 2 VARCHAR columns to
>>>>>> dates:
>>>>>>
>>>>>>
>>>>>> select to_date("created_at_ts"), to_date("created_at_date") from
>>>>>> "gp_subscriptions" limit 5;
>>>>>>
>>>>>>
>>>>>> +-------------------------------------------------+---------------------------------------------------+
>>>>>> | TO_DATE(subscriber."created_at_ts", null, null) |
>>>>>> TO_DATE(subscriber."created_at_date", null, null) |
>>>>>>
>>>>>> +-------------------------------------------------+---------------------------------------------------+
>>>>>> | 2009-05-05 15:40:10.000                         | 2009-05-05
>>>>>> 00:00:00.000                           |
>>>>>> | 2012-11-22 07:37:34.000                         | 2012-11-22
>>>>>> 00:00:00.000                           |
>>>>>> | 2010-07-24 14:12:33.000                         | 2010-07-24
>>>>>> 00:00:00.000                           |
>>>>>> | 2012-11-22 07:38:04.000                         | 2012-11-22
>>>>>> 00:00:00.000                           |
>>>>>> | 2012-11-22 07:38:10.000                         | 2012-11-22
>>>>>> 00:00:00.000                           |
>>>>>>
>>>>>> +-------------------------------------------------+---------------------------------------------------+
>>>>>>
>>>>>>
>>>>>> Here is another query in which I’m using the ‘to_date’ function on
>>>>>> string literals in the WHERE clause for date comparison .
>>>>>>
>>>>>> Observer that the ‘to_date’ function coverts the string literals to
>>>>>> dates and the the date comparison correctly evaluates:
>>>>>>
>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>>>> 15:40:10.000') = to_date('2009-05-05 15:40:10.000') limit 2;
>>>>>> 2 rows selected (0.035 seconds)
>>>>>>
>>>>>>
>>>>>> Now when I try the date comparison using the columns from my view, it
>>>>>> fails:
>>>>>>
>>>>>> select '1' from "gp_subscriptions" where to_date("created_at_ts") =
>>>>>> to_date('2009-05-05 15:40:10.000') limit 2;
>>>>>>
>>>>>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>>>>>> BooleanExpressionFilter failed during reading: Could not initialize class
>>>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>>> Caused by: java.lang.NoClassDefFoundError: Could not initialize class
>>>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>>>
>>>>>> Also fails with same error when I try: select '1' from
>>>>>> "gp_subscriptions" where to_date("created_at_ts") = to_date('2009-05-05')
>>>>>> limit 2;
>>>>>>
>>>>>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>>>>>> BooleanExpressionFilter failed during reading: Could not initialize class
>>>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>>> Caused by: java.lang.NoClassDefFoundError: Could not initialize class
>>>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>>>
>>>>>>
>>>>>> *ISSUE: *Date comparisons on string literals are not evaluating
>>>>>> correctly such that dates in the future get interpreted as being less than
>>>>>> dates in the past.
>>>>>>
>>>>>> Test case 1:
>>>>>> 2009-05-05 15:40:10.000 is greater than (in the future) 2005-05-05
>>>>>> 15:40:10.000
>>>>>>
>>>>>> The following query should return 2 rows, however, it does not return
>>>>>> any rows:
>>>>>>
>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>>>> 15:40:10.000') > to_date('2005-05-05 15:40:10.000') limit 2;
>>>>>> No rows selected (0.024 seconds)
>>>>>>
>>>>>>
>>>>>> The following query should return no rows, however, it returns 2 rows:
>>>>>>
>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>>>> 15:40:10.000') < to_date('2005-05-05 15:40:10.000') limit 2;
>>>>>> 2 rows selected (0.033 seconds)
>>>>>>
>>>>>> Test case 2:
>>>>>> 2009-05-05 is greater than (in the future) than 1970-05-05
>>>>>>
>>>>>> The following query should return 2 rows, however, it does not return
>>>>>> any rows:
>>>>>>
>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
>>>>>> to_date('1970-05-05') limit 2;
>>>>>> No rows selected (0.024 seconds)
>>>>>>
>>>>>>
>>>>>> The following query should return no rows, however, it returns 2 rows:
>>>>>>
>>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
>>>>>> to_date('1970-05-05') limit 2;
>>>>>> 2 rows selected (0.033 seconds)
>>>>>>
>>>>>>
>>>>>> --
>>>>>> *Binu Mathew*
>>>>>> Data Engineering
>>>>>>
>>>>>> 3101 Park Blvd., Palo Alto, CA 94306
>>>>>> Mobile: 630.267.5938
>>>>>> Groupon
>>>>>> <https://urldefense.proofpoint.com/v2/url?u=http-3A__www.google.com_url-3Fq-3Dhttp-253A-252F-252Fwww.groupon.com-252F-26sa-3DD-26sntz-3D1-26usg-3DAFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=ZLtx84pLqRnFvwXQsJT-IoadaGYt0KObWPttSMqW-xY&e=>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> *Binu Mathew*
>>>> Data Engineering
>>>>
>>>> 3101 Park Blvd., Palo Alto, CA 94306
>>>> Mobile: 630.267.5938
>>>> Groupon
>>>> <http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> *Binu Mathew*
>> Data Engineering
>>
>> 3101 Park Blvd., Palo Alto, CA 94306
>> Mobile: 630.267.5938
>> Groupon
>> <http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>
>>
>>
>>
>>
>


-- 
*Binu Mathew*
Data Engineering

3101 Park Blvd., Palo Alto, CA 94306
Mobile: 630.267.5938
Groupon
<http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>

Re: to_date not working as expected

Posted by Thomas D'Silva <td...@salesforce.com>.
Binu,

I am able to repro the issue by manually running the test from the patch
from https://issues.apache.org/jira/browse/PHOENIX-1769 .
I will investigate further.

Thanks,
Thomas

On Fri, Jan 29, 2016 at 4:26 PM, Binu Mathew <bm...@groupon.com> wrote:

> That doesn't seem to work.
>
> Phoenix is not recognizing that created_at in the WERE clause is a derived
> column from to_date("created_at_date").
>
> Some relational databases support this type of functionality.
>
> On Fri, Jan 29, 2016 at 4:16 PM, Alok Singh <al...@cloudability.com> wrote:
>
>> Does this work:
>>
>> select 1, to_date("created_at_date") as created_at from
>> "gp_subscriptions" where created_at > to_date('2010-10-10') limit 3;
>>
>> Alok
>>
>> Alok
>>
>> alok@cloudability.com
>>
>> On Fri, Jan 29, 2016 at 3:54 PM, Binu Mathew <bm...@groupon.com> wrote:
>>
>>> Thank you for the reply.
>>>
>>> I mistakenly wrote that we are using Phoenix with HBase .96. This was a
>>> typo. We are using HBase .98 with Phoenix 4.4
>>>
>>> I tried the UNSIGNED types and still encountering the same issue.
>>>
>>> My field has the following data:
>>>
>>> select "created_at_date" from "gp_subscriptions" limit 3;
>>>
>>> +------------------------------------------+
>>> |             created_at_date              |
>>> +------------------------------------------+
>>> | 2012-11-22                               |
>>> | 2012-11-22                               |
>>> | 2012-11-26                               |
>>> +------------------------------------------+
>>>
>>> to_date function works when I convert the VARCHAR field to a DATE:
>>>
>>> select to_date("created_at_date") from "gp_subscriptions" limit 3;
>>>
>>> +---------------------------------------------------+
>>> | TO_DATE(subscriber."created_at_date", null, null) |
>>> +---------------------------------------------------+
>>> | 2012-11-22 00:00:00.000                           |
>>> | 2012-11-22 00:00:00.000                           |
>>> | 2012-11-26 00:00:00.000                           |
>>> +---------------------------------------------------+
>>>
>>>
>>> However, I can't use the to_date function in the WHERE clause:
>>>
>>> select 1 from "gp_subscriptions" where to_date("created_at_date") >
>>> to_date('2010-10-10') limit 3;
>>>
>>> java.lang.RuntimeException:
>>> org.apache.phoenix.exception.PhoenixIOException:
>>> org.apache.phoenix.exception.PhoenixIOException:
>>> org.apache.hadoop.hbase.DoNotRetryIOException:
>>> java.lang.reflect.InvocationTargetException
>>>
>>> Another issue is how Phoenix evaluates dates. In the queries below, I'm
>>> evaluating string literals:
>>>
>>> 2009-05-05 is greater than (in the future) than 1970-05-05
>>>
>>> The following query should return 2 rows, however, it does not return
>>> any rows:
>>>
>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
>>> to_date('1970-05-05') limit 2;
>>> No rows selected (0.024 seconds)
>>>
>>>
>>> The following query should return no rows, however, it returns 2 rows:
>>>
>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
>>> to_date('1970-05-05') limit 2;
>>> 2 rows selected (0.033 seconds)
>>>
>>> Thanks,
>>>
>>> On Thu, Jan 28, 2016 at 8:07 PM, James Taylor <ja...@apache.org>
>>> wrote:
>>>
>>>> Hi Binu,
>>>> Phoenix has never supported HBase 0.96, so I'm not sure where you got
>>>> the release from.
>>>>
>>>> I recommend upgrading to a later, supported version of HBase and a
>>>> later version of Phoenix. Give the 4.7.0 RC a try.
>>>>
>>>> One other tip in particular for views you create over existing HBase
>>>> tables. Use the UNSIGNED types documented here[1] as these use the same
>>>> serialization as the Bytes methods provided by HBase. If you tell Phoenix
>>>> the wrong type, it won't know so would produce erroneous data and queries.
>>>>
>>>> Thanks,
>>>> James
>>>>
>>>> [1] https://phoenix.apache.org/language/datatypes.html
>>>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__phoenix.apache.org_language_datatypes.html&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=qZf4wLz966HePHU8YevWHvcdJWrlFV84gXZLYftu1II&e=>
>>>>
>>>> On Thu, Jan 28, 2016 at 5:57 PM, Binu Mathew <bm...@groupon.com>
>>>> wrote:
>>>>
>>>>> Phoenix version 4.4.0
>>>>>
>>>>> Issues with Phoenix when used with HBase 0.96.0.2.0
>>>>>
>>>>> 2 Issues:
>>>>>
>>>>> *ISSUE:* to_date Function is not converting string data types in
>>>>> valid date formats to a DATE data type when used in the WHERE clause for
>>>>> date comparison.
>>>>>
>>>>> Below is a query I ran against a Phoenix view in which I use the
>>>>> ‘to_date’ function to convert 2 VARCHAR columns to date.
>>>>> 1. column ‘created_at_ts’ stored as VARCHAR in format such as
>>>>> 2009-05-05 15:40:10.000
>>>>> 2. column ‘created_at_date’ stored as VARCHAR in format such as
>>>>> 2009-05-05
>>>>>
>>>>> Observe that the ‘to_date’ function coverts the 2 VARCHAR columns to
>>>>> dates:
>>>>>
>>>>>
>>>>> select to_date("created_at_ts"), to_date("created_at_date") from
>>>>> "gp_subscriptions" limit 5;
>>>>>
>>>>>
>>>>> +-------------------------------------------------+---------------------------------------------------+
>>>>> | TO_DATE(subscriber."created_at_ts", null, null) |
>>>>> TO_DATE(subscriber."created_at_date", null, null) |
>>>>>
>>>>> +-------------------------------------------------+---------------------------------------------------+
>>>>> | 2009-05-05 15:40:10.000                         | 2009-05-05
>>>>> 00:00:00.000                           |
>>>>> | 2012-11-22 07:37:34.000                         | 2012-11-22
>>>>> 00:00:00.000                           |
>>>>> | 2010-07-24 14:12:33.000                         | 2010-07-24
>>>>> 00:00:00.000                           |
>>>>> | 2012-11-22 07:38:04.000                         | 2012-11-22
>>>>> 00:00:00.000                           |
>>>>> | 2012-11-22 07:38:10.000                         | 2012-11-22
>>>>> 00:00:00.000                           |
>>>>>
>>>>> +-------------------------------------------------+---------------------------------------------------+
>>>>>
>>>>>
>>>>> Here is another query in which I’m using the ‘to_date’ function on
>>>>> string literals in the WHERE clause for date comparison .
>>>>>
>>>>> Observer that the ‘to_date’ function coverts the string literals to
>>>>> dates and the the date comparison correctly evaluates:
>>>>>
>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>>> 15:40:10.000') = to_date('2009-05-05 15:40:10.000') limit 2;
>>>>> 2 rows selected (0.035 seconds)
>>>>>
>>>>>
>>>>> Now when I try the date comparison using the columns from my view, it
>>>>> fails:
>>>>>
>>>>> select '1' from "gp_subscriptions" where to_date("created_at_ts") =
>>>>> to_date('2009-05-05 15:40:10.000') limit 2;
>>>>>
>>>>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>>>>> BooleanExpressionFilter failed during reading: Could not initialize class
>>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>> Caused by: java.lang.NoClassDefFoundError: Could not initialize class
>>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>>
>>>>> Also fails with same error when I try: select '1' from
>>>>> "gp_subscriptions" where to_date("created_at_ts") = to_date('2009-05-05')
>>>>> limit 2;
>>>>>
>>>>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>>>>> BooleanExpressionFilter failed during reading: Could not initialize class
>>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>> Caused by: java.lang.NoClassDefFoundError: Could not initialize class
>>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>>
>>>>>
>>>>> *ISSUE: *Date comparisons on string literals are not evaluating
>>>>> correctly such that dates in the future get interpreted as being less than
>>>>> dates in the past.
>>>>>
>>>>> Test case 1:
>>>>> 2009-05-05 15:40:10.000 is greater than (in the future) 2005-05-05
>>>>> 15:40:10.000
>>>>>
>>>>> The following query should return 2 rows, however, it does not return
>>>>> any rows:
>>>>>
>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>>> 15:40:10.000') > to_date('2005-05-05 15:40:10.000') limit 2;
>>>>> No rows selected (0.024 seconds)
>>>>>
>>>>>
>>>>> The following query should return no rows, however, it returns 2 rows:
>>>>>
>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>>> 15:40:10.000') < to_date('2005-05-05 15:40:10.000') limit 2;
>>>>> 2 rows selected (0.033 seconds)
>>>>>
>>>>> Test case 2:
>>>>> 2009-05-05 is greater than (in the future) than 1970-05-05
>>>>>
>>>>> The following query should return 2 rows, however, it does not return
>>>>> any rows:
>>>>>
>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
>>>>> to_date('1970-05-05') limit 2;
>>>>> No rows selected (0.024 seconds)
>>>>>
>>>>>
>>>>> The following query should return no rows, however, it returns 2 rows:
>>>>>
>>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
>>>>> to_date('1970-05-05') limit 2;
>>>>> 2 rows selected (0.033 seconds)
>>>>>
>>>>>
>>>>> --
>>>>> *Binu Mathew*
>>>>> Data Engineering
>>>>>
>>>>> 3101 Park Blvd., Palo Alto, CA 94306
>>>>> Mobile: 630.267.5938
>>>>> Groupon
>>>>> <https://urldefense.proofpoint.com/v2/url?u=http-3A__www.google.com_url-3Fq-3Dhttp-253A-252F-252Fwww.groupon.com-252F-26sa-3DD-26sntz-3D1-26usg-3DAFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=ZLtx84pLqRnFvwXQsJT-IoadaGYt0KObWPttSMqW-xY&e=>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> *Binu Mathew*
>>> Data Engineering
>>>
>>> 3101 Park Blvd., Palo Alto, CA 94306
>>> Mobile: 630.267.5938
>>> Groupon
>>> <http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>
>>>
>>>
>>>
>>>
>>
>
>
> --
> *Binu Mathew*
> Data Engineering
>
> 3101 Park Blvd., Palo Alto, CA 94306
> Mobile: 630.267.5938
> Groupon
> <http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>
>
>
>
>

Re: to_date not working as expected

Posted by Binu Mathew <bm...@groupon.com>.
That doesn't seem to work.

Phoenix is not recognizing that created_at in the WERE clause is a derived
column from to_date("created_at_date").

Some relational databases support this type of functionality.

On Fri, Jan 29, 2016 at 4:16 PM, Alok Singh <al...@cloudability.com> wrote:

> Does this work:
>
> select 1, to_date("created_at_date") as created_at from
> "gp_subscriptions" where created_at > to_date('2010-10-10') limit 3;
>
> Alok
>
> Alok
>
> alok@cloudability.com
>
> On Fri, Jan 29, 2016 at 3:54 PM, Binu Mathew <bm...@groupon.com> wrote:
>
>> Thank you for the reply.
>>
>> I mistakenly wrote that we are using Phoenix with HBase .96. This was a
>> typo. We are using HBase .98 with Phoenix 4.4
>>
>> I tried the UNSIGNED types and still encountering the same issue.
>>
>> My field has the following data:
>>
>> select "created_at_date" from "gp_subscriptions" limit 3;
>>
>> +------------------------------------------+
>> |             created_at_date              |
>> +------------------------------------------+
>> | 2012-11-22                               |
>> | 2012-11-22                               |
>> | 2012-11-26                               |
>> +------------------------------------------+
>>
>> to_date function works when I convert the VARCHAR field to a DATE:
>>
>> select to_date("created_at_date") from "gp_subscriptions" limit 3;
>>
>> +---------------------------------------------------+
>> | TO_DATE(subscriber."created_at_date", null, null) |
>> +---------------------------------------------------+
>> | 2012-11-22 00:00:00.000                           |
>> | 2012-11-22 00:00:00.000                           |
>> | 2012-11-26 00:00:00.000                           |
>> +---------------------------------------------------+
>>
>>
>> However, I can't use the to_date function in the WHERE clause:
>>
>> select 1 from "gp_subscriptions" where to_date("created_at_date") >
>> to_date('2010-10-10') limit 3;
>>
>> java.lang.RuntimeException:
>> org.apache.phoenix.exception.PhoenixIOException:
>> org.apache.phoenix.exception.PhoenixIOException:
>> org.apache.hadoop.hbase.DoNotRetryIOException:
>> java.lang.reflect.InvocationTargetException
>>
>> Another issue is how Phoenix evaluates dates. In the queries below, I'm
>> evaluating string literals:
>>
>> 2009-05-05 is greater than (in the future) than 1970-05-05
>>
>> The following query should return 2 rows, however, it does not return any
>> rows:
>>
>> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
>> to_date('1970-05-05') limit 2;
>> No rows selected (0.024 seconds)
>>
>>
>> The following query should return no rows, however, it returns 2 rows:
>>
>> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
>> to_date('1970-05-05') limit 2;
>> 2 rows selected (0.033 seconds)
>>
>> Thanks,
>>
>> On Thu, Jan 28, 2016 at 8:07 PM, James Taylor <ja...@apache.org>
>> wrote:
>>
>>> Hi Binu,
>>> Phoenix has never supported HBase 0.96, so I'm not sure where you got
>>> the release from.
>>>
>>> I recommend upgrading to a later, supported version of HBase and a later
>>> version of Phoenix. Give the 4.7.0 RC a try.
>>>
>>> One other tip in particular for views you create over existing HBase
>>> tables. Use the UNSIGNED types documented here[1] as these use the same
>>> serialization as the Bytes methods provided by HBase. If you tell Phoenix
>>> the wrong type, it won't know so would produce erroneous data and queries.
>>>
>>> Thanks,
>>> James
>>>
>>> [1] https://phoenix.apache.org/language/datatypes.html
>>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__phoenix.apache.org_language_datatypes.html&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=qZf4wLz966HePHU8YevWHvcdJWrlFV84gXZLYftu1II&e=>
>>>
>>> On Thu, Jan 28, 2016 at 5:57 PM, Binu Mathew <bm...@groupon.com>
>>> wrote:
>>>
>>>> Phoenix version 4.4.0
>>>>
>>>> Issues with Phoenix when used with HBase 0.96.0.2.0
>>>>
>>>> 2 Issues:
>>>>
>>>> *ISSUE:* to_date Function is not converting string data types in valid
>>>> date formats to a DATE data type when used in the WHERE clause for date
>>>> comparison.
>>>>
>>>> Below is a query I ran against a Phoenix view in which I use the
>>>> ‘to_date’ function to convert 2 VARCHAR columns to date.
>>>> 1. column ‘created_at_ts’ stored as VARCHAR in format such as
>>>> 2009-05-05 15:40:10.000
>>>> 2. column ‘created_at_date’ stored as VARCHAR in format such as
>>>> 2009-05-05
>>>>
>>>> Observe that the ‘to_date’ function coverts the 2 VARCHAR columns to
>>>> dates:
>>>>
>>>>
>>>> select to_date("created_at_ts"), to_date("created_at_date") from
>>>> "gp_subscriptions" limit 5;
>>>>
>>>>
>>>> +-------------------------------------------------+---------------------------------------------------+
>>>> | TO_DATE(subscriber."created_at_ts", null, null) |
>>>> TO_DATE(subscriber."created_at_date", null, null) |
>>>>
>>>> +-------------------------------------------------+---------------------------------------------------+
>>>> | 2009-05-05 15:40:10.000                         | 2009-05-05
>>>> 00:00:00.000                           |
>>>> | 2012-11-22 07:37:34.000                         | 2012-11-22
>>>> 00:00:00.000                           |
>>>> | 2010-07-24 14:12:33.000                         | 2010-07-24
>>>> 00:00:00.000                           |
>>>> | 2012-11-22 07:38:04.000                         | 2012-11-22
>>>> 00:00:00.000                           |
>>>> | 2012-11-22 07:38:10.000                         | 2012-11-22
>>>> 00:00:00.000                           |
>>>>
>>>> +-------------------------------------------------+---------------------------------------------------+
>>>>
>>>>
>>>> Here is another query in which I’m using the ‘to_date’ function on
>>>> string literals in the WHERE clause for date comparison .
>>>>
>>>> Observer that the ‘to_date’ function coverts the string literals to
>>>> dates and the the date comparison correctly evaluates:
>>>>
>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>> 15:40:10.000') = to_date('2009-05-05 15:40:10.000') limit 2;
>>>> 2 rows selected (0.035 seconds)
>>>>
>>>>
>>>> Now when I try the date comparison using the columns from my view, it
>>>> fails:
>>>>
>>>> select '1' from "gp_subscriptions" where to_date("created_at_ts") =
>>>> to_date('2009-05-05 15:40:10.000') limit 2;
>>>>
>>>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>>>> BooleanExpressionFilter failed during reading: Could not initialize class
>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>> Caused by: java.lang.NoClassDefFoundError: Could not initialize class
>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>
>>>> Also fails with same error when I try: select '1' from
>>>> "gp_subscriptions" where to_date("created_at_ts") = to_date('2009-05-05')
>>>> limit 2;
>>>>
>>>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>>>> BooleanExpressionFilter failed during reading: Could not initialize class
>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>> Caused by: java.lang.NoClassDefFoundError: Could not initialize class
>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>
>>>>
>>>> *ISSUE: *Date comparisons on string literals are not evaluating
>>>> correctly such that dates in the future get interpreted as being less than
>>>> dates in the past.
>>>>
>>>> Test case 1:
>>>> 2009-05-05 15:40:10.000 is greater than (in the future) 2005-05-05
>>>> 15:40:10.000
>>>>
>>>> The following query should return 2 rows, however, it does not return
>>>> any rows:
>>>>
>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>> 15:40:10.000') > to_date('2005-05-05 15:40:10.000') limit 2;
>>>> No rows selected (0.024 seconds)
>>>>
>>>>
>>>> The following query should return no rows, however, it returns 2 rows:
>>>>
>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>> 15:40:10.000') < to_date('2005-05-05 15:40:10.000') limit 2;
>>>> 2 rows selected (0.033 seconds)
>>>>
>>>> Test case 2:
>>>> 2009-05-05 is greater than (in the future) than 1970-05-05
>>>>
>>>> The following query should return 2 rows, however, it does not return
>>>> any rows:
>>>>
>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
>>>> to_date('1970-05-05') limit 2;
>>>> No rows selected (0.024 seconds)
>>>>
>>>>
>>>> The following query should return no rows, however, it returns 2 rows:
>>>>
>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
>>>> to_date('1970-05-05') limit 2;
>>>> 2 rows selected (0.033 seconds)
>>>>
>>>>
>>>> --
>>>> *Binu Mathew*
>>>> Data Engineering
>>>>
>>>> 3101 Park Blvd., Palo Alto, CA 94306
>>>> Mobile: 630.267.5938
>>>> Groupon
>>>> <https://urldefense.proofpoint.com/v2/url?u=http-3A__www.google.com_url-3Fq-3Dhttp-253A-252F-252Fwww.groupon.com-252F-26sa-3DD-26sntz-3D1-26usg-3DAFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=ZLtx84pLqRnFvwXQsJT-IoadaGYt0KObWPttSMqW-xY&e=>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> *Binu Mathew*
>> Data Engineering
>>
>> 3101 Park Blvd., Palo Alto, CA 94306
>> Mobile: 630.267.5938
>> Groupon
>> <http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>
>>
>>
>>
>>
>


-- 
*Binu Mathew*
Data Engineering

3101 Park Blvd., Palo Alto, CA 94306
Mobile: 630.267.5938
Groupon
<http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>

Re: to_date not working as expected

Posted by Alok Singh <al...@cloudability.com>.
Does this work:

select 1, to_date("created_at_date") as created_at from "gp_subscriptions"
where created_at > to_date('2010-10-10') limit 3;

Alok

Alok

alok@cloudability.com

On Fri, Jan 29, 2016 at 3:54 PM, Binu Mathew <bm...@groupon.com> wrote:

> Thank you for the reply.
>
> I mistakenly wrote that we are using Phoenix with HBase .96. This was a
> typo. We are using HBase .98 with Phoenix 4.4
>
> I tried the UNSIGNED types and still encountering the same issue.
>
> My field has the following data:
>
> select "created_at_date" from "gp_subscriptions" limit 3;
>
> +------------------------------------------+
> |             created_at_date              |
> +------------------------------------------+
> | 2012-11-22                               |
> | 2012-11-22                               |
> | 2012-11-26                               |
> +------------------------------------------+
>
> to_date function works when I convert the VARCHAR field to a DATE:
>
> select to_date("created_at_date") from "gp_subscriptions" limit 3;
>
> +---------------------------------------------------+
> | TO_DATE(subscriber."created_at_date", null, null) |
> +---------------------------------------------------+
> | 2012-11-22 00:00:00.000                           |
> | 2012-11-22 00:00:00.000                           |
> | 2012-11-26 00:00:00.000                           |
> +---------------------------------------------------+
>
>
> However, I can't use the to_date function in the WHERE clause:
>
> select 1 from "gp_subscriptions" where to_date("created_at_date") >
> to_date('2010-10-10') limit 3;
>
> java.lang.RuntimeException:
> org.apache.phoenix.exception.PhoenixIOException:
> org.apache.phoenix.exception.PhoenixIOException:
> org.apache.hadoop.hbase.DoNotRetryIOException:
> java.lang.reflect.InvocationTargetException
>
> Another issue is how Phoenix evaluates dates. In the queries below, I'm
> evaluating string literals:
>
> 2009-05-05 is greater than (in the future) than 1970-05-05
>
> The following query should return 2 rows, however, it does not return any
> rows:
>
> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
> to_date('1970-05-05') limit 2;
> No rows selected (0.024 seconds)
>
>
> The following query should return no rows, however, it returns 2 rows:
>
> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
> to_date('1970-05-05') limit 2;
> 2 rows selected (0.033 seconds)
>
> Thanks,
>
> On Thu, Jan 28, 2016 at 8:07 PM, James Taylor <ja...@apache.org>
> wrote:
>
>> Hi Binu,
>> Phoenix has never supported HBase 0.96, so I'm not sure where you got the
>> release from.
>>
>> I recommend upgrading to a later, supported version of HBase and a later
>> version of Phoenix. Give the 4.7.0 RC a try.
>>
>> One other tip in particular for views you create over existing HBase
>> tables. Use the UNSIGNED types documented here[1] as these use the same
>> serialization as the Bytes methods provided by HBase. If you tell Phoenix
>> the wrong type, it won't know so would produce erroneous data and queries.
>>
>> Thanks,
>> James
>>
>> [1] https://phoenix.apache.org/language/datatypes.html
>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__phoenix.apache.org_language_datatypes.html&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=qZf4wLz966HePHU8YevWHvcdJWrlFV84gXZLYftu1II&e=>
>>
>> On Thu, Jan 28, 2016 at 5:57 PM, Binu Mathew <bm...@groupon.com> wrote:
>>
>>> Phoenix version 4.4.0
>>>
>>> Issues with Phoenix when used with HBase 0.96.0.2.0
>>>
>>> 2 Issues:
>>>
>>> *ISSUE:* to_date Function is not converting string data types in valid
>>> date formats to a DATE data type when used in the WHERE clause for date
>>> comparison.
>>>
>>> Below is a query I ran against a Phoenix view in which I use the
>>> ‘to_date’ function to convert 2 VARCHAR columns to date.
>>> 1. column ‘created_at_ts’ stored as VARCHAR in format such as 2009-05-05
>>> 15:40:10.000
>>> 2. column ‘created_at_date’ stored as VARCHAR in format such as
>>> 2009-05-05
>>>
>>> Observe that the ‘to_date’ function coverts the 2 VARCHAR columns to
>>> dates:
>>>
>>>
>>> select to_date("created_at_ts"), to_date("created_at_date") from
>>> "gp_subscriptions" limit 5;
>>>
>>>
>>> +-------------------------------------------------+---------------------------------------------------+
>>> | TO_DATE(subscriber."created_at_ts", null, null) |
>>> TO_DATE(subscriber."created_at_date", null, null) |
>>>
>>> +-------------------------------------------------+---------------------------------------------------+
>>> | 2009-05-05 15:40:10.000                         | 2009-05-05
>>> 00:00:00.000                           |
>>> | 2012-11-22 07:37:34.000                         | 2012-11-22
>>> 00:00:00.000                           |
>>> | 2010-07-24 14:12:33.000                         | 2010-07-24
>>> 00:00:00.000                           |
>>> | 2012-11-22 07:38:04.000                         | 2012-11-22
>>> 00:00:00.000                           |
>>> | 2012-11-22 07:38:10.000                         | 2012-11-22
>>> 00:00:00.000                           |
>>>
>>> +-------------------------------------------------+---------------------------------------------------+
>>>
>>>
>>> Here is another query in which I’m using the ‘to_date’ function on
>>> string literals in the WHERE clause for date comparison .
>>>
>>> Observer that the ‘to_date’ function coverts the string literals to
>>> dates and the the date comparison correctly evaluates:
>>>
>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>> 15:40:10.000') = to_date('2009-05-05 15:40:10.000') limit 2;
>>> 2 rows selected (0.035 seconds)
>>>
>>>
>>> Now when I try the date comparison using the columns from my view, it
>>> fails:
>>>
>>> select '1' from "gp_subscriptions" where to_date("created_at_ts") =
>>> to_date('2009-05-05 15:40:10.000') limit 2;
>>>
>>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>>> BooleanExpressionFilter failed during reading: Could not initialize class
>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>> Caused by: java.lang.NoClassDefFoundError: Could not initialize class
>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>
>>> Also fails with same error when I try: select '1' from
>>> "gp_subscriptions" where to_date("created_at_ts") = to_date('2009-05-05')
>>> limit 2;
>>>
>>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>>> BooleanExpressionFilter failed during reading: Could not initialize class
>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>> Caused by: java.lang.NoClassDefFoundError: Could not initialize class
>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>
>>>
>>> *ISSUE: *Date comparisons on string literals are not evaluating
>>> correctly such that dates in the future get interpreted as being less than
>>> dates in the past.
>>>
>>> Test case 1:
>>> 2009-05-05 15:40:10.000 is greater than (in the future) 2005-05-05
>>> 15:40:10.000
>>>
>>> The following query should return 2 rows, however, it does not return
>>> any rows:
>>>
>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>> 15:40:10.000') > to_date('2005-05-05 15:40:10.000') limit 2;
>>> No rows selected (0.024 seconds)
>>>
>>>
>>> The following query should return no rows, however, it returns 2 rows:
>>>
>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>> 15:40:10.000') < to_date('2005-05-05 15:40:10.000') limit 2;
>>> 2 rows selected (0.033 seconds)
>>>
>>> Test case 2:
>>> 2009-05-05 is greater than (in the future) than 1970-05-05
>>>
>>> The following query should return 2 rows, however, it does not return
>>> any rows:
>>>
>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
>>> to_date('1970-05-05') limit 2;
>>> No rows selected (0.024 seconds)
>>>
>>>
>>> The following query should return no rows, however, it returns 2 rows:
>>>
>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
>>> to_date('1970-05-05') limit 2;
>>> 2 rows selected (0.033 seconds)
>>>
>>>
>>> --
>>> *Binu Mathew*
>>> Data Engineering
>>>
>>> 3101 Park Blvd., Palo Alto, CA 94306
>>> Mobile: 630.267.5938
>>> Groupon
>>> <https://urldefense.proofpoint.com/v2/url?u=http-3A__www.google.com_url-3Fq-3Dhttp-253A-252F-252Fwww.groupon.com-252F-26sa-3DD-26sntz-3D1-26usg-3DAFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=ZLtx84pLqRnFvwXQsJT-IoadaGYt0KObWPttSMqW-xY&e=>
>>>
>>>
>>>
>>>
>>
>
>
> --
> *Binu Mathew*
> Data Engineering
>
> 3101 Park Blvd., Palo Alto, CA 94306
> Mobile: 630.267.5938
> Groupon
> <http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>
>
>
>
>

Re: to_date not working as expected

Posted by Binu Mathew <bm...@groupon.com>.
Thank you for the reply.

I mistakenly wrote that we are using Phoenix with HBase .96. This was a
typo. We are using HBase .98 with Phoenix 4.4

I tried the UNSIGNED types and still encountering the same issue.

My field has the following data:

select "created_at_date" from "gp_subscriptions" limit 3;

+------------------------------------------+
|             created_at_date              |
+------------------------------------------+
| 2012-11-22                               |
| 2012-11-22                               |
| 2012-11-26                               |
+------------------------------------------+

to_date function works when I convert the VARCHAR field to a DATE:

select to_date("created_at_date") from "gp_subscriptions" limit 3;

+---------------------------------------------------+
| TO_DATE(subscriber."created_at_date", null, null) |
+---------------------------------------------------+
| 2012-11-22 00:00:00.000                           |
| 2012-11-22 00:00:00.000                           |
| 2012-11-26 00:00:00.000                           |
+---------------------------------------------------+


However, I can't use the to_date function in the WHERE clause:

select 1 from "gp_subscriptions" where to_date("created_at_date") >
to_date('2010-10-10') limit 3;

java.lang.RuntimeException:
org.apache.phoenix.exception.PhoenixIOException:
org.apache.phoenix.exception.PhoenixIOException:
org.apache.hadoop.hbase.DoNotRetryIOException:
java.lang.reflect.InvocationTargetException

Another issue is how Phoenix evaluates dates. In the queries below, I'm
evaluating string literals:

2009-05-05 is greater than (in the future) than 1970-05-05

The following query should return 2 rows, however, it does not return any
rows:

select '1' from "gp_subscriptions" where to_date('2009-05-05') >
to_date('1970-05-05') limit 2;
No rows selected (0.024 seconds)


The following query should return no rows, however, it returns 2 rows:

select '1' from "gp_subscriptions" where to_date('2009-05-05') <
to_date('1970-05-05') limit 2;
2 rows selected (0.033 seconds)

Thanks,

On Thu, Jan 28, 2016 at 8:07 PM, James Taylor <ja...@apache.org>
wrote:

> Hi Binu,
> Phoenix has never supported HBase 0.96, so I'm not sure where you got the
> release from.
>
> I recommend upgrading to a later, supported version of HBase and a later
> version of Phoenix. Give the 4.7.0 RC a try.
>
> One other tip in particular for views you create over existing HBase
> tables. Use the UNSIGNED types documented here[1] as these use the same
> serialization as the Bytes methods provided by HBase. If you tell Phoenix
> the wrong type, it won't know so would produce erroneous data and queries.
>
> Thanks,
> James
>
> [1] https://phoenix.apache.org/language/datatypes.html
> <https://urldefense.proofpoint.com/v2/url?u=https-3A__phoenix.apache.org_language_datatypes.html&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=qZf4wLz966HePHU8YevWHvcdJWrlFV84gXZLYftu1II&e=>
>
> On Thu, Jan 28, 2016 at 5:57 PM, Binu Mathew <bm...@groupon.com> wrote:
>
>> Phoenix version 4.4.0
>>
>> Issues with Phoenix when used with HBase 0.96.0.2.0
>>
>> 2 Issues:
>>
>> *ISSUE:* to_date Function is not converting string data types in valid
>> date formats to a DATE data type when used in the WHERE clause for date
>> comparison.
>>
>> Below is a query I ran against a Phoenix view in which I use the
>> ‘to_date’ function to convert 2 VARCHAR columns to date.
>> 1. column ‘created_at_ts’ stored as VARCHAR in format such as 2009-05-05
>> 15:40:10.000
>> 2. column ‘created_at_date’ stored as VARCHAR in format such as 2009-05-05
>>
>> Observe that the ‘to_date’ function coverts the 2 VARCHAR columns to
>> dates:
>>
>>
>> select to_date("created_at_ts"), to_date("created_at_date") from
>> "gp_subscriptions" limit 5;
>>
>>
>> +-------------------------------------------------+---------------------------------------------------+
>> | TO_DATE(subscriber."created_at_ts", null, null) |
>> TO_DATE(subscriber."created_at_date", null, null) |
>>
>> +-------------------------------------------------+---------------------------------------------------+
>> | 2009-05-05 15:40:10.000                         | 2009-05-05
>> 00:00:00.000                           |
>> | 2012-11-22 07:37:34.000                         | 2012-11-22
>> 00:00:00.000                           |
>> | 2010-07-24 14:12:33.000                         | 2010-07-24
>> 00:00:00.000                           |
>> | 2012-11-22 07:38:04.000                         | 2012-11-22
>> 00:00:00.000                           |
>> | 2012-11-22 07:38:10.000                         | 2012-11-22
>> 00:00:00.000                           |
>>
>> +-------------------------------------------------+---------------------------------------------------+
>>
>>
>> Here is another query in which I’m using the ‘to_date’ function on string
>> literals in the WHERE clause for date comparison .
>>
>> Observer that the ‘to_date’ function coverts the string literals to dates
>> and the the date comparison correctly evaluates:
>>
>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>> 15:40:10.000') = to_date('2009-05-05 15:40:10.000') limit 2;
>> 2 rows selected (0.035 seconds)
>>
>>
>> Now when I try the date comparison using the columns from my view, it
>> fails:
>>
>> select '1' from "gp_subscriptions" where to_date("created_at_ts") =
>> to_date('2009-05-05 15:40:10.000') limit 2;
>>
>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>> BooleanExpressionFilter failed during reading: Could not initialize class
>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>> Caused by: java.lang.NoClassDefFoundError: Could not initialize class
>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>
>> Also fails with same error when I try: select '1' from "gp_subscriptions"
>> where to_date("created_at_ts") = to_date('2009-05-05') limit 2;
>>
>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>> BooleanExpressionFilter failed during reading: Could not initialize class
>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>> Caused by: java.lang.NoClassDefFoundError: Could not initialize class
>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>
>>
>> *ISSUE: *Date comparisons on string literals are not evaluating
>> correctly such that dates in the future get interpreted as being less than
>> dates in the past.
>>
>> Test case 1:
>> 2009-05-05 15:40:10.000 is greater than (in the future) 2005-05-05
>> 15:40:10.000
>>
>> The following query should return 2 rows, however, it does not return any
>> rows:
>>
>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>> 15:40:10.000') > to_date('2005-05-05 15:40:10.000') limit 2;
>> No rows selected (0.024 seconds)
>>
>>
>> The following query should return no rows, however, it returns 2 rows:
>>
>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>> 15:40:10.000') < to_date('2005-05-05 15:40:10.000') limit 2;
>> 2 rows selected (0.033 seconds)
>>
>> Test case 2:
>> 2009-05-05 is greater than (in the future) than 1970-05-05
>>
>> The following query should return 2 rows, however, it does not return any
>> rows:
>>
>> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
>> to_date('1970-05-05') limit 2;
>> No rows selected (0.024 seconds)
>>
>>
>> The following query should return no rows, however, it returns 2 rows:
>>
>> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
>> to_date('1970-05-05') limit 2;
>> 2 rows selected (0.033 seconds)
>>
>>
>> --
>> *Binu Mathew*
>> Data Engineering
>>
>> 3101 Park Blvd., Palo Alto, CA 94306
>> Mobile: 630.267.5938
>> Groupon
>> <https://urldefense.proofpoint.com/v2/url?u=http-3A__www.google.com_url-3Fq-3Dhttp-253A-252F-252Fwww.groupon.com-252F-26sa-3DD-26sntz-3D1-26usg-3DAFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=ZLtx84pLqRnFvwXQsJT-IoadaGYt0KObWPttSMqW-xY&e=>
>>
>>
>>
>>
>


-- 
*Binu Mathew*
Data Engineering

3101 Park Blvd., Palo Alto, CA 94306
Mobile: 630.267.5938
Groupon
<http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>

Re: to_date not working as expected

Posted by James Taylor <ja...@apache.org>.
Hi Binu,
Phoenix has never supported HBase 0.96, so I'm not sure where you got the
release from.

I recommend upgrading to a later, supported version of HBase and a later
version of Phoenix. Give the 4.7.0 RC a try.

One other tip in particular for views you create over existing HBase
tables. Use the UNSIGNED types documented here[1] as these use the same
serialization as the Bytes methods provided by HBase. If you tell Phoenix
the wrong type, it won't know so would produce erroneous data and queries.

Thanks,
James

[1] https://phoenix.apache.org/language/datatypes.html

On Thu, Jan 28, 2016 at 5:57 PM, Binu Mathew <bm...@groupon.com> wrote:

> Phoenix version 4.4.0
>
> Issues with Phoenix when used with HBase 0.96.0.2.0
>
> 2 Issues:
>
> *ISSUE:* to_date Function is not converting string data types in valid
> date formats to a DATE data type when used in the WHERE clause for date
> comparison.
>
> Below is a query I ran against a Phoenix view in which I use the ‘to_date’
> function to convert 2 VARCHAR columns to date.
> 1. column ‘created_at_ts’ stored as VARCHAR in format such as 2009-05-05
> 15:40:10.000
> 2. column ‘created_at_date’ stored as VARCHAR in format such as 2009-05-05
>
> Observe that the ‘to_date’ function coverts the 2 VARCHAR columns to dates:
>
>
> select to_date("created_at_ts"), to_date("created_at_date") from
> "gp_subscriptions" limit 5;
>
>
> +-------------------------------------------------+---------------------------------------------------+
> | TO_DATE(subscriber."created_at_ts", null, null) |
> TO_DATE(subscriber."created_at_date", null, null) |
>
> +-------------------------------------------------+---------------------------------------------------+
> | 2009-05-05 15:40:10.000                         | 2009-05-05
> 00:00:00.000                           |
> | 2012-11-22 07:37:34.000                         | 2012-11-22
> 00:00:00.000                           |
> | 2010-07-24 14:12:33.000                         | 2010-07-24
> 00:00:00.000                           |
> | 2012-11-22 07:38:04.000                         | 2012-11-22
> 00:00:00.000                           |
> | 2012-11-22 07:38:10.000                         | 2012-11-22
> 00:00:00.000                           |
>
> +-------------------------------------------------+---------------------------------------------------+
>
>
> Here is another query in which I’m using the ‘to_date’ function on string
> literals in the WHERE clause for date comparison .
>
> Observer that the ‘to_date’ function coverts the string literals to dates
> and the the date comparison correctly evaluates:
>
> select '1' from "gp_subscriptions" where to_date('2009-05-05
> 15:40:10.000') = to_date('2009-05-05 15:40:10.000') limit 2;
> 2 rows selected (0.035 seconds)
>
>
> Now when I try the date comparison using the columns from my view, it
> fails:
>
> select '1' from "gp_subscriptions" where to_date("created_at_ts") =
> to_date('2009-05-05 15:40:10.000') limit 2;
>
> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
> BooleanExpressionFilter failed during reading: Could not initialize class
> org.apache.phoenix.util.DateUtil$ISODateFormatParser
> Caused by: java.lang.NoClassDefFoundError: Could not initialize class
> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>
> Also fails with same error when I try: select '1' from "gp_subscriptions"
> where to_date("created_at_ts") = to_date('2009-05-05') limit 2;
>
> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
> BooleanExpressionFilter failed during reading: Could not initialize class
> org.apache.phoenix.util.DateUtil$ISODateFormatParser
> Caused by: java.lang.NoClassDefFoundError: Could not initialize class
> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>
>
> *ISSUE: *Date comparisons on string literals are not evaluating correctly
> such that dates in the future get interpreted as being less than dates in
> the past.
>
> Test case 1:
> 2009-05-05 15:40:10.000 is greater than (in the future) 2005-05-05
> 15:40:10.000
>
> The following query should return 2 rows, however, it does not return any
> rows:
>
> select '1' from "gp_subscriptions" where to_date('2009-05-05
> 15:40:10.000') > to_date('2005-05-05 15:40:10.000') limit 2;
> No rows selected (0.024 seconds)
>
>
> The following query should return no rows, however, it returns 2 rows:
>
> select '1' from "gp_subscriptions" where to_date('2009-05-05
> 15:40:10.000') < to_date('2005-05-05 15:40:10.000') limit 2;
> 2 rows selected (0.033 seconds)
>
> Test case 2:
> 2009-05-05 is greater than (in the future) than 1970-05-05
>
> The following query should return 2 rows, however, it does not return any
> rows:
>
> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
> to_date('1970-05-05') limit 2;
> No rows selected (0.024 seconds)
>
>
> The following query should return no rows, however, it returns 2 rows:
>
> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
> to_date('1970-05-05') limit 2;
> 2 rows selected (0.033 seconds)
>
>
> --
> *Binu Mathew*
> Data Engineering
>
> 3101 Park Blvd., Palo Alto, CA 94306
> Mobile: 630.267.5938
> Groupon
> <http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>
>
>
>
>