phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Binu Mathew <bmat...@groupon.com>
Subject Re: to_date not working as expected
Date Fri, 29 Jan 2016 23:54:26 GMT
Thank you for the reply.

I mistakenly wrote that we are using Phoenix with HBase .96. This was a
typo. We are using HBase .98 with Phoenix 4.4

I tried the UNSIGNED types and still encountering the same issue.

My field has the following data:

select "created_at_date" from "gp_subscriptions" limit 3;

+------------------------------------------+
|             created_at_date              |
+------------------------------------------+
| 2012-11-22                               |
| 2012-11-22                               |
| 2012-11-26                               |
+------------------------------------------+

to_date function works when I convert the VARCHAR field to a DATE:

select to_date("created_at_date") from "gp_subscriptions" limit 3;

+---------------------------------------------------+
| TO_DATE(subscriber."created_at_date", null, null) |
+---------------------------------------------------+
| 2012-11-22 00:00:00.000                           |
| 2012-11-22 00:00:00.000                           |
| 2012-11-26 00:00:00.000                           |
+---------------------------------------------------+


However, I can't use the to_date function in the WHERE clause:

select 1 from "gp_subscriptions" where to_date("created_at_date") >
to_date('2010-10-10') limit 3;

java.lang.RuntimeException:
org.apache.phoenix.exception.PhoenixIOException:
org.apache.phoenix.exception.PhoenixIOException:
org.apache.hadoop.hbase.DoNotRetryIOException:
java.lang.reflect.InvocationTargetException

Another issue is how Phoenix evaluates dates. In the queries below, I'm
evaluating string literals:

2009-05-05 is greater than (in the future) than 1970-05-05

The following query should return 2 rows, however, it does not return any
rows:

select '1' from "gp_subscriptions" where to_date('2009-05-05') >
to_date('1970-05-05') limit 2;
No rows selected (0.024 seconds)


The following query should return no rows, however, it returns 2 rows:

select '1' from "gp_subscriptions" where to_date('2009-05-05') <
to_date('1970-05-05') limit 2;
2 rows selected (0.033 seconds)

Thanks,

On Thu, Jan 28, 2016 at 8:07 PM, James Taylor <jamestaylor@apache.org>
wrote:

> Hi Binu,
> Phoenix has never supported HBase 0.96, so I'm not sure where you got the
> release from.
>
> I recommend upgrading to a later, supported version of HBase and a later
> version of Phoenix. Give the 4.7.0 RC a try.
>
> One other tip in particular for views you create over existing HBase
> tables. Use the UNSIGNED types documented here[1] as these use the same
> serialization as the Bytes methods provided by HBase. If you tell Phoenix
> the wrong type, it won't know so would produce erroneous data and queries.
>
> Thanks,
> James
>
> [1] https://phoenix.apache.org/language/datatypes.html
> <https://urldefense.proofpoint.com/v2/url?u=https-3A__phoenix.apache.org_language_datatypes.html&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=qZf4wLz966HePHU8YevWHvcdJWrlFV84gXZLYftu1II&e=>
>
> On Thu, Jan 28, 2016 at 5:57 PM, Binu Mathew <bmathew@groupon.com> wrote:
>
>> Phoenix version 4.4.0
>>
>> Issues with Phoenix when used with HBase 0.96.0.2.0
>>
>> 2 Issues:
>>
>> *ISSUE:* to_date Function is not converting string data types in valid
>> date formats to a DATE data type when used in the WHERE clause for date
>> comparison.
>>
>> Below is a query I ran against a Phoenix view in which I use the
>> ‘to_date’ function to convert 2 VARCHAR columns to date.
>> 1. column ‘created_at_ts’ stored as VARCHAR in format such as 2009-05-05
>> 15:40:10.000
>> 2. column ‘created_at_date’ stored as VARCHAR in format such as 2009-05-05
>>
>> Observe that the ‘to_date’ function coverts the 2 VARCHAR columns to
>> dates:
>>
>>
>> select to_date("created_at_ts"), to_date("created_at_date") from
>> "gp_subscriptions" limit 5;
>>
>>
>> +-------------------------------------------------+---------------------------------------------------+
>> | TO_DATE(subscriber."created_at_ts", null, null) |
>> TO_DATE(subscriber."created_at_date", null, null) |
>>
>> +-------------------------------------------------+---------------------------------------------------+
>> | 2009-05-05 15:40:10.000                         | 2009-05-05
>> 00:00:00.000                           |
>> | 2012-11-22 07:37:34.000                         | 2012-11-22
>> 00:00:00.000                           |
>> | 2010-07-24 14:12:33.000                         | 2010-07-24
>> 00:00:00.000                           |
>> | 2012-11-22 07:38:04.000                         | 2012-11-22
>> 00:00:00.000                           |
>> | 2012-11-22 07:38:10.000                         | 2012-11-22
>> 00:00:00.000                           |
>>
>> +-------------------------------------------------+---------------------------------------------------+
>>
>>
>> Here is another query in which I’m using the ‘to_date’ function on string
>> literals in the WHERE clause for date comparison .
>>
>> Observer that the ‘to_date’ function coverts the string literals to dates
>> and the the date comparison correctly evaluates:
>>
>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>> 15:40:10.000') = to_date('2009-05-05 15:40:10.000') limit 2;
>> 2 rows selected (0.035 seconds)
>>
>>
>> Now when I try the date comparison using the columns from my view, it
>> fails:
>>
>> select '1' from "gp_subscriptions" where to_date("created_at_ts") =
>> to_date('2009-05-05 15:40:10.000') limit 2;
>>
>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>> BooleanExpressionFilter failed during reading: Could not initialize class
>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>> Caused by: java.lang.NoClassDefFoundError: Could not initialize class
>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>
>> Also fails with same error when I try: select '1' from "gp_subscriptions"
>> where to_date("created_at_ts") = to_date('2009-05-05') limit 2;
>>
>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>> BooleanExpressionFilter failed during reading: Could not initialize class
>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>> Caused by: java.lang.NoClassDefFoundError: Could not initialize class
>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>
>>
>> *ISSUE: *Date comparisons on string literals are not evaluating
>> correctly such that dates in the future get interpreted as being less than
>> dates in the past.
>>
>> Test case 1:
>> 2009-05-05 15:40:10.000 is greater than (in the future) 2005-05-05
>> 15:40:10.000
>>
>> The following query should return 2 rows, however, it does not return any
>> rows:
>>
>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>> 15:40:10.000') > to_date('2005-05-05 15:40:10.000') limit 2;
>> No rows selected (0.024 seconds)
>>
>>
>> The following query should return no rows, however, it returns 2 rows:
>>
>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>> 15:40:10.000') < to_date('2005-05-05 15:40:10.000') limit 2;
>> 2 rows selected (0.033 seconds)
>>
>> Test case 2:
>> 2009-05-05 is greater than (in the future) than 1970-05-05
>>
>> The following query should return 2 rows, however, it does not return any
>> rows:
>>
>> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
>> to_date('1970-05-05') limit 2;
>> No rows selected (0.024 seconds)
>>
>>
>> The following query should return no rows, however, it returns 2 rows:
>>
>> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
>> to_date('1970-05-05') limit 2;
>> 2 rows selected (0.033 seconds)
>>
>>
>> --
>> *Binu Mathew*
>> Data Engineering
>>
>> 3101 Park Blvd., Palo Alto, CA 94306
>> Mobile: 630.267.5938
>> Groupon
>> <https://urldefense.proofpoint.com/v2/url?u=http-3A__www.google.com_url-3Fq-3Dhttp-253A-252F-252Fwww.groupon.com-252F-26sa-3DD-26sntz-3D1-26usg-3DAFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=ZLtx84pLqRnFvwXQsJT-IoadaGYt0KObWPttSMqW-xY&e=>
>>
>>
>>
>>
>


-- 
*Binu Mathew*
Data Engineering

3101 Park Blvd., Palo Alto, CA 94306
Mobile: 630.267.5938
Groupon
<http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>

Mime
View raw message