phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Binu Mathew <bmat...@groupon.com>
Subject Re: to_date not working as expected
Date Sat, 30 Jan 2016 00:26:42 GMT
That doesn't seem to work.

Phoenix is not recognizing that created_at in the WERE clause is a derived
column from to_date("created_at_date").

Some relational databases support this type of functionality.

On Fri, Jan 29, 2016 at 4:16 PM, Alok Singh <alok@cloudability.com> wrote:

> Does this work:
>
> select 1, to_date("created_at_date") as created_at from
> "gp_subscriptions" where created_at > to_date('2010-10-10') limit 3;
>
> Alok
>
> Alok
>
> alok@cloudability.com
>
> On Fri, Jan 29, 2016 at 3:54 PM, Binu Mathew <bmathew@groupon.com> wrote:
>
>> Thank you for the reply.
>>
>> I mistakenly wrote that we are using Phoenix with HBase .96. This was a
>> typo. We are using HBase .98 with Phoenix 4.4
>>
>> I tried the UNSIGNED types and still encountering the same issue.
>>
>> My field has the following data:
>>
>> select "created_at_date" from "gp_subscriptions" limit 3;
>>
>> +------------------------------------------+
>> |             created_at_date              |
>> +------------------------------------------+
>> | 2012-11-22                               |
>> | 2012-11-22                               |
>> | 2012-11-26                               |
>> +------------------------------------------+
>>
>> to_date function works when I convert the VARCHAR field to a DATE:
>>
>> select to_date("created_at_date") from "gp_subscriptions" limit 3;
>>
>> +---------------------------------------------------+
>> | TO_DATE(subscriber."created_at_date", null, null) |
>> +---------------------------------------------------+
>> | 2012-11-22 00:00:00.000                           |
>> | 2012-11-22 00:00:00.000                           |
>> | 2012-11-26 00:00:00.000                           |
>> +---------------------------------------------------+
>>
>>
>> However, I can't use the to_date function in the WHERE clause:
>>
>> select 1 from "gp_subscriptions" where to_date("created_at_date") >
>> to_date('2010-10-10') limit 3;
>>
>> java.lang.RuntimeException:
>> org.apache.phoenix.exception.PhoenixIOException:
>> org.apache.phoenix.exception.PhoenixIOException:
>> org.apache.hadoop.hbase.DoNotRetryIOException:
>> java.lang.reflect.InvocationTargetException
>>
>> Another issue is how Phoenix evaluates dates. In the queries below, I'm
>> evaluating string literals:
>>
>> 2009-05-05 is greater than (in the future) than 1970-05-05
>>
>> The following query should return 2 rows, however, it does not return any
>> rows:
>>
>> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
>> to_date('1970-05-05') limit 2;
>> No rows selected (0.024 seconds)
>>
>>
>> The following query should return no rows, however, it returns 2 rows:
>>
>> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
>> to_date('1970-05-05') limit 2;
>> 2 rows selected (0.033 seconds)
>>
>> Thanks,
>>
>> On Thu, Jan 28, 2016 at 8:07 PM, James Taylor <jamestaylor@apache.org>
>> wrote:
>>
>>> Hi Binu,
>>> Phoenix has never supported HBase 0.96, so I'm not sure where you got
>>> the release from.
>>>
>>> I recommend upgrading to a later, supported version of HBase and a later
>>> version of Phoenix. Give the 4.7.0 RC a try.
>>>
>>> One other tip in particular for views you create over existing HBase
>>> tables. Use the UNSIGNED types documented here[1] as these use the same
>>> serialization as the Bytes methods provided by HBase. If you tell Phoenix
>>> the wrong type, it won't know so would produce erroneous data and queries.
>>>
>>> Thanks,
>>> James
>>>
>>> [1] https://phoenix.apache.org/language/datatypes.html
>>> <https://urldefense.proofpoint.com/v2/url?u=https-3A__phoenix.apache.org_language_datatypes.html&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=qZf4wLz966HePHU8YevWHvcdJWrlFV84gXZLYftu1II&e=>
>>>
>>> On Thu, Jan 28, 2016 at 5:57 PM, Binu Mathew <bmathew@groupon.com>
>>> wrote:
>>>
>>>> Phoenix version 4.4.0
>>>>
>>>> Issues with Phoenix when used with HBase 0.96.0.2.0
>>>>
>>>> 2 Issues:
>>>>
>>>> *ISSUE:* to_date Function is not converting string data types in valid
>>>> date formats to a DATE data type when used in the WHERE clause for date
>>>> comparison.
>>>>
>>>> Below is a query I ran against a Phoenix view in which I use the
>>>> ‘to_date’ function to convert 2 VARCHAR columns to date.
>>>> 1. column ‘created_at_ts’ stored as VARCHAR in format such as
>>>> 2009-05-05 15:40:10.000
>>>> 2. column ‘created_at_date’ stored as VARCHAR in format such as
>>>> 2009-05-05
>>>>
>>>> Observe that the ‘to_date’ function coverts the 2 VARCHAR columns to
>>>> dates:
>>>>
>>>>
>>>> select to_date("created_at_ts"), to_date("created_at_date") from
>>>> "gp_subscriptions" limit 5;
>>>>
>>>>
>>>> +-------------------------------------------------+---------------------------------------------------+
>>>> | TO_DATE(subscriber."created_at_ts", null, null) |
>>>> TO_DATE(subscriber."created_at_date", null, null) |
>>>>
>>>> +-------------------------------------------------+---------------------------------------------------+
>>>> | 2009-05-05 15:40:10.000                         | 2009-05-05
>>>> 00:00:00.000                           |
>>>> | 2012-11-22 07:37:34.000                         | 2012-11-22
>>>> 00:00:00.000                           |
>>>> | 2010-07-24 14:12:33.000                         | 2010-07-24
>>>> 00:00:00.000                           |
>>>> | 2012-11-22 07:38:04.000                         | 2012-11-22
>>>> 00:00:00.000                           |
>>>> | 2012-11-22 07:38:10.000                         | 2012-11-22
>>>> 00:00:00.000                           |
>>>>
>>>> +-------------------------------------------------+---------------------------------------------------+
>>>>
>>>>
>>>> Here is another query in which I’m using the ‘to_date’ function on
>>>> string literals in the WHERE clause for date comparison .
>>>>
>>>> Observer that the ‘to_date’ function coverts the string literals to
>>>> dates and the the date comparison correctly evaluates:
>>>>
>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>> 15:40:10.000') = to_date('2009-05-05 15:40:10.000') limit 2;
>>>> 2 rows selected (0.035 seconds)
>>>>
>>>>
>>>> Now when I try the date comparison using the columns from my view, it
>>>> fails:
>>>>
>>>> select '1' from "gp_subscriptions" where to_date("created_at_ts") =
>>>> to_date('2009-05-05 15:40:10.000') limit 2;
>>>>
>>>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>>>> BooleanExpressionFilter failed during reading: Could not initialize class
>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>> Caused by: java.lang.NoClassDefFoundError: Could not initialize class
>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>
>>>> Also fails with same error when I try: select '1' from
>>>> "gp_subscriptions" where to_date("created_at_ts") = to_date('2009-05-05')
>>>> limit 2;
>>>>
>>>> Caused by: org.apache.hadoop.hbase.DoNotRetryIOException:
>>>> BooleanExpressionFilter failed during reading: Could not initialize class
>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>> Caused by: java.lang.NoClassDefFoundError: Could not initialize class
>>>> org.apache.phoenix.util.DateUtil$ISODateFormatParser
>>>>
>>>>
>>>> *ISSUE: *Date comparisons on string literals are not evaluating
>>>> correctly such that dates in the future get interpreted as being less than
>>>> dates in the past.
>>>>
>>>> Test case 1:
>>>> 2009-05-05 15:40:10.000 is greater than (in the future) 2005-05-05
>>>> 15:40:10.000
>>>>
>>>> The following query should return 2 rows, however, it does not return
>>>> any rows:
>>>>
>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>> 15:40:10.000') > to_date('2005-05-05 15:40:10.000') limit 2;
>>>> No rows selected (0.024 seconds)
>>>>
>>>>
>>>> The following query should return no rows, however, it returns 2 rows:
>>>>
>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05
>>>> 15:40:10.000') < to_date('2005-05-05 15:40:10.000') limit 2;
>>>> 2 rows selected (0.033 seconds)
>>>>
>>>> Test case 2:
>>>> 2009-05-05 is greater than (in the future) than 1970-05-05
>>>>
>>>> The following query should return 2 rows, however, it does not return
>>>> any rows:
>>>>
>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') >
>>>> to_date('1970-05-05') limit 2;
>>>> No rows selected (0.024 seconds)
>>>>
>>>>
>>>> The following query should return no rows, however, it returns 2 rows:
>>>>
>>>> select '1' from "gp_subscriptions" where to_date('2009-05-05') <
>>>> to_date('1970-05-05') limit 2;
>>>> 2 rows selected (0.033 seconds)
>>>>
>>>>
>>>> --
>>>> *Binu Mathew*
>>>> Data Engineering
>>>>
>>>> 3101 Park Blvd., Palo Alto, CA 94306
>>>> Mobile: 630.267.5938
>>>> Groupon
>>>> <https://urldefense.proofpoint.com/v2/url?u=http-3A__www.google.com_url-3Fq-3Dhttp-253A-252F-252Fwww.groupon.com-252F-26sa-3DD-26sntz-3D1-26usg-3DAFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg&d=CwMFaQ&c=LNdz7nrxyGFUIUTz2qIULQ&r=kx0aYbS3d_bROblEBwnOWEEpQQo7ummi3XKbSfjcDxI&m=DYIhYC8MzfMvMlb_vB97jlBOyX5-ROGZnBR8uHjL9AM&s=ZLtx84pLqRnFvwXQsJT-IoadaGYt0KObWPttSMqW-xY&e=>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> *Binu Mathew*
>> Data Engineering
>>
>> 3101 Park Blvd., Palo Alto, CA 94306
>> Mobile: 630.267.5938
>> Groupon
>> <http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>
>>
>>
>>
>>
>


-- 
*Binu Mathew*
Data Engineering

3101 Park Blvd., Palo Alto, CA 94306
Mobile: 630.267.5938
Groupon
<http://www.google.com/url?q=http%3A%2F%2Fwww.groupon.com%2F&sa=D&sntz=1&usg=AFrqEzcC80FkwsjyolWTKAH1sZ9yU2t0xg>

Mime
View raw message