phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Nick Dimiduk <ndimi...@gmail.com>
Subject Re: Phoenix/hbase part of composite rowkey retrieval
Date Fri, 10 Oct 2014 16:39:50 GMT
Hi Poonam,

Date/time format parsing is a cesspool. Might I recommend you frontline the
Phoenix ingest with a data cleansing step that normalizes all the dates?
This is probably easier done in Java/Scala/Clojure/Perl/Python/Ruby/C# than
SQL.

-n

---
Thanks,
Nick
n10k.com
hbaseinaction.com

On Fri, Oct 10, 2014 at 9:17 AM, Poonam Ligade <poonam.v.ligade@gmail.com>
wrote:

> Hi Krishna,
>
> Ya definitely. We tried that initially a lot. But while storing data in
> Date type we were getting lot of formatting exceptions.
>
> ERROR util.CSVCommonsLoader: Error upserting record [5/20/2013 12:00:00
> AM]: java.text.ParseException: Unparseable date: "5/20/2013 12:00:00 AM"
>
> so we had to choose fixed size Char.
> and we cant revert it now.
>
> Regards,
> Poonam.
>
> On Fri, Oct 10, 2014 at 9:24 PM, Krishna <research800@gmail.com> wrote:
>
>> Hi Poonam, You should consider storing date fields in DATE/TIME
>> datatypes instead of Char(22).
>>
>>
>> On Friday, October 10, 2014, Poonam Ligade <poonam.v.ligade@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I have phoenix table with below schema,
>>> CREATE TABLE IF NOT EXISTS TEST (
>>> ID  BIGINT NOT NULL,
>>> Date1 char(22) NOT NULL,
>>> StID INTEGER NOT NULL,
>>> ....
>>> ..........
>>> CONSTRAINT PK PRIMARY KEY (ID,Date1,StID)
>>> );
>>>
>>> I have dates in below format
>>> 01/02/2013 12:00:00 AM
>>> 02/07/2013 12:00:00 AM
>>> 03/25/2013 12:00:00 AM
>>> 4/12/2013 12:00:00 AM
>>> 5/16/2013 12:00:00 AM
>>>
>>> I have over 200+milion rows of size 110GB in this table.
>>>
>>> I frequently need to retrieve rows based on Date1.
>>> To get data for January month, I am doing
>>> select *  from TEST where TO_DATE(Date1,'MM/dd/yyyy')  <
>>> TO_DATE('02/01/2013','MM/dd/yyyy');
>>> or
>>> select *  from TEST where substr(Date1,0,2)='01';
>>>
>>> or to get week range
>>> select *  from TEST  where TO_DATE(Date1,'MM/dd/yyyy')  <
>>> TO_DATE('03/18/2013','MM/dd/yyyy') and  TO_DATE(Date1,'MM/dd/yyyy')  >
>>> TO_DATE('03/10/2013','MM/dd/yyyy')
>>>
>>> But these queries are slow,
>>> Can you suggest more performant queries, to fetch data based on date
>>> range in phoenix.
>>>
>>> I am using Hortonworks hadoop 2.1, hbase-0.98 and phoenix 4.1
>>>
>>> Regards,
>>> Poonam.
>>>
>>
>

Mime
View raw message