phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alok Singh <a...@cloudability.com>
Subject Re: to_date not working as expected
Date Sat, 30 Jan 2016 00:16:02 GMT
Does this work:

select 1, to_date("created_at_date") as created_at from "gp_subscriptions"
where created_at > to_date('2010-10-10') limit 3;

Alok

Alok

alok@cloudability.com

On Fri, Jan 29, 2016 at 3:54 PM, Binu Mathew <bmathew@groupon.com> wrote:

> Thank you for the reply.
>
> I mistakenly wrote that we are using Phoenix with HBase .96. This was a
> typo. We are using HBase .98 with Phoenix 4.4
>
> I tried the UNSIGNED types and still encountering the same issue.
>
> My field has the following data:
>
> select "created_at_date" from "gp_subscriptions" limit 3;
>
> +------------------------------------------+
> |             created_at_date              |
> +------------------------------------------+
> | 2012-11-22                               |
> | 2012-11-22                               |
> | 2012-11-26                               |
> +------------------------------------------+
>
> to_date function works when I convert the VARCHAR field to a DATE:
>
> select to_date("created_at_date") from "gp_subscriptions" limit 3;
>
> +---------------------------------------------------+
> | TO_DATE(subscriber."created_at_date", null, null) |
> +---------------------------------------------------+
> | 2012-11-22 00:00:00.000                           |
> | 2012-11-22 00:00:00.000                           |
> | 2012-11-26 00:00:00.000                           |
> +---------------------------------------------------+
>
>
> However, I can't use the to_date function in the WHERE clause:
>
> select 1 from "gp_subscriptions" where to_date("created_at_date") >
> to_date('2010-10-10') limit 3;
>
> java.lang.RuntimeException:
> org.apache.phoenix.exception.PhoenixIOException:
> org.apache.phoenix.exception.PhoenixIOException:
> org.apache.hadoop.hbase.DoNotRetryIOException:
> java.lang.reflect.InvocationTargetException
>
> Another issue is how Phoenix evaluates dates. In the queries below, I'm
> evaluating string literals:
>
> 2009-05-05 is greater than (in the future) than 1970-05-05
>
> The following query should return 2 rows, however, it does not return any
> rows:
>
> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
> to_date('1970-05-05') limit 2;
> No rows selected (0.024 seconds)
>
>
> The following query should return no rows, however, it returns 2 rows:
>
> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
> to_date('1970-05-05') limit 2;
> 2 rows selected (0.033 seconds)
>
> Thanks,
>
> On Thu, Jan 28, 2016 at 8:07 PM, James Taylor <jamestaylor@apache.org>
> wrote:
>
>> Hi Binu,
>> Phoenix has never supported HBase 0.96, so I'm not sure where you got the
>> release from.
>>
>> I recommend upgrading to a later, supported version of HBase and a later
>> version of Phoenix. Give the 4.7.0 RC a try.
>>
>> One other tip in particular for views you create over existing HBase
>> tables. Use the UNSIGNED types documented here[1] as these use the same
>> serialization as the Bytes methods provided by HBase. If you tell Phoenix
>> the wrong type, it won't know so would produce erroneous data and queries.
>>
>> Thanks,
>> James
>>
>> [1] https://phoenix.apache.org/language/datatypes.html
>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__phoenix.apache.org_language_datatypes.html&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=qZf4wLz966HePHU8YevWHvcdJWrlFV84gXZLYftu1II&e=>
>>
>> On Thu, Jan 28, 2016 at 5:57 PM, Binu Mathew <bmathew@groupon.com> wrote:
>>
>>> Phoenix version 4.4.0
>>>
>>> Issues with Phoenix when used with HBase 0.96.0.2.0
>>>
>>> 2 Issues:
>>>
>>> *ISSUE:* to_date Function is not converting string data types in valid
>>> date formats to a DATE data type when used in the WHERE clause for date
>>> comparison.
>>>
>>> Below is a query I ran against a Phoenix view in which I use the
>>> ‘to_date’ function to convert 2 VARCHAR columns to date.
>>> 1. column ‘created_at_ts’ stored as VARCHAR in format such as 2009-05-05
>>> 15:40:10.000
>>> 2. column ‘created_at_date’ stored as VARCHAR in format such as
>>> 2009-05-05
>>>
>>> Observe that the ‘to_date’ function coverts the 2 VARCHAR columns to
>>> dates:
>>>
>>>
>>> select to_date("created_at_ts"), to_date("created_at_date") from
>>> "gp_subscriptions" limit 5;
>>>
>>>
>>> +-------------------------------------------------+---------------------------------------------------+
>>> | TO_DATE(subscriber."created_at_ts", null, null) |
>>> TO_DATE(subscriber."created_at_date", null, null) |
>>>
>>> +-------------------------------------------------+---------------------------------------------------+
>>> | 2009-05-05 15:40:10.000                         | 2009-05-05
>>> 00:00:00.000                           |
>>> | 2012-11-22 07:37:34.000                         | 2012-11-22
>>> 00:00:00.000                           |
>>> | 2010-07-24 14:12:33.000                         | 2010-07-24
>>> 00:00:00.000                           |
>>> | 2012-11-22 07:38:04.000                         | 2012-11-22
>>> 00:00:00.000                           |
>>> | 2012-11-22 07:38:10.000                         | 2012-11-22
>>> 00:00:00.000                           |
>>>
>>> +-------------------------------------------------+---------------------------------------------------+
>>>
>>>
>>> Here is another query in which I’m using the ‘to_date’ function on
>>> string literals in the WHERE clause for date comparison .
>>>
>>> Observer that the ‘to_date’ function coverts the string literals to
>>> dates and the the date comparison correctly evaluates:
>>>
>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>> 15:40:10.000') = to_date('2009-05-05 15:40:10.000') limit 2;
>>> 2 rows selected (0.035 seconds)
>>>
>>>
>>> Now when I try the date comparison using the columns from my view, it
>>> fails:
>>>
>>> select '1' from "gp_subscriptions" where to_date("created_at_ts") =
>>> to_date('2009-05-05 15:40:10.000') limit 2;
>>>
>>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>>> BooleanExpressionFilter failed during reading: Could not initialize class
>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>> Caused by: java.lang.NoClassDefFoundError: Could not initialize class
>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>
>>> Also fails with same error when I try: select '1' from
>>> "gp_subscriptions" where to_date("created_at_ts") = to_date('2009-05-05')
>>> limit 2;
>>>
>>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>>> BooleanExpressionFilter failed during reading: Could not initialize class
>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>> Caused by: java.lang.NoClassDefFoundError: Could not initialize class
>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>
>>>
>>> *ISSUE: *Date comparisons on string literals are not evaluating
>>> correctly such that dates in the future get interpreted as being less than
>>> dates in the past.
>>>
>>> Test case 1:
>>> 2009-05-05 15:40:10.000 is greater than (in the future) 2005-05-05
>>> 15:40:10.000
>>>
>>> The following query should return 2 rows, however, it does not return
>>> any rows:
>>>
>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>> 15:40:10.000') > to_date('2005-05-05 15:40:10.000') limit 2;
>>> No rows selected (0.024 seconds)
>>>
>>>
>>> The following query should return no rows, however, it returns 2 rows:
>>>
>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>> 15:40:10.000') < to_date('2005-05-05 15:40:10.000') limit 2;
>>> 2 rows selected (0.033 seconds)
>>>
>>> Test case 2:
>>> 2009-05-05 is greater than (in the future) than 1970-05-05
>>>
>>> The following query should return 2 rows, however, it does not return
>>> any rows:
>>>
>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
>>> to_date('1970-05-05') limit 2;
>>> No rows selected (0.024 seconds)
>>>
>>>
>>> The following query should return no rows, however, it returns 2 rows:
>>>
>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
>>> to_date('1970-05-05') limit 2;
>>> 2 rows selected (0.033 seconds)
>>>
>>>
>>> --
>>> *Binu Mathew*
>>> Data Engineering
>>>
>>> 3101 Park Blvd., Palo Alto, CA 94306
>>> Mobile: 630.267.5938
>>> Groupon
>>> <https://urldefense.proofpoint.com/v2/url?u=http-3A__www.google.com_url-3Fq-3Dhttp-253A-252F-252Fwww.groupon.com-252F-26sa-3DD-26sntz-3D1-26usg-3DAFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=ZLtx84pLqRnFvwXQsJT-IoadaGYt0KObWPttSMqW-xY&e=>
>>>
>>>
>>>
>>>
>>
>
>
> --
> *Binu Mathew*
> Data Engineering
>
> 3101 Park Blvd., Palo Alto, CA 94306
> Mobile: 630.267.5938
> Groupon
> <http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>
>
>
>
>

Mime
View raw message