phoenix-user mailing list archives

From Marek Wiewiorka <marek.wiewio...@gmail.com>
Subject Re: hive external table
Date Fri, 10 Apr 2015 16:20:32 GMT
Hi Nicolas - thanks a lot, I will give it a try and let you know!

Thanks!
Marek

2015-04-10 11:12 GMT+02:00 Nicolas Maillard <nmaillard@hortonworks.com>:

>  Hello Marek
>
>
>  I have corrected your type issues in the public Phoenix-Hive project; you
> can go ahead and test the latest version.
>
> I have tested against Phoenix 4.3 to make sure all is well.
>
> That being said, this build still depends on Phoenix 4.2. Moving to 4.3
> requires a small rewrite that has already been done in the private repo, and
> I'd rather update everyone at the same time, when everything gets merged back
> into standard Phoenix. So this is a patch for you and the public repo.
>
>
>  Let me know if everything is better now. I have many improvements planned
> for the next versions of this connector, but I am still looking for initial
> testers to make sure it is all stable.
>
>
>  regards
>
>
>      Nicolas Maillard - Solutions Engineer EMEA, Hortonworks <http://hortonworks.com/>
>   +33 (0) 668176605 | Skype: nicolas.maillard.hortonworks
>       ------------------------------
> *From:* Nicolas Maillard
> *Sent:* Thursday, April 09, 2015 6:52 PM
> *To:* user@phoenix.apache.org
> *Subject:* Re: hive external table
>
>
> Hello Marek
>
>
>  Thanks for the test. I'm doing a first fix as we speak and will extend my
> unit tests to make sure it stays fixed.
>
> The timestamp fix is good to go; I'll run a couple more tests and push a new
> commit this evening.
>
>
>      Nicolas Maillard - Solutions Engineer EMEA, Hortonworks <http://hortonworks.com/>
>   +33 (0) 668176605 | Skype: nicolas.maillard.hortonworks
>       ------------------------------
> *From:* Marek Wiewiorka <marek.wiewiorka@gmail.com>
> *Sent:* Tuesday, April 07, 2015 10:48 AM
>
> *To:* user@phoenix.apache.org
> *Subject:* Re: hive external table
>
>  Hi Nicolas,
> I transformed hu_date (the timestamp column) into bigint using the TO_NUMBER
> function in Phoenix.
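> (Roughly like this - a sketch rather than my exact statement, and the source
> table name se_dwh.homes_usage_day is an assumption:
>
>  UPSERT INTO se_dwh.homes_usage_day_export
>      SELECT CAST(TO_NUMBER(hu_date) AS BIGINT), hu_ho_id, hu_stream_id, hu_usage
>      FROM se_dwh.homes_usage_day;
> )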
> I created a new external table in hive as follows:
>  hive> CREATE EXTERNAL TABLE phx_usage_day(
>     >      hu_date bigint,
>     >     hu_ho_id int,
>     >     hu_stream_id int,
>     >     hu_usage double)
>     > STORED BY  "org.apache.phoenix.hive.PhoenixStorageHandler"
>     > TBLPROPERTIES(
>     >     'phoenix.hbase.table.name'='se_dwh.homes_usage_day_export',
>     >     'phoenix.zookeeper.znode.parent'='hbase-unsecure',
>     >     'phoenix.rowkeys'='hu_date,hu_ho_id,hu_stream_id,hu_usage'
>     > );
>  However, the problem is now with the double type (no longer with timestamp):
>  hive> select * from phx_usage_day limit 1;
> OK
> Failed with exception
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException:
> java.lang.ClassCastException: org.apache.hadoop.io.DoubleWritable cannot be
> cast to org.apache.hadoop.hive.serde2.io.DoubleWritable
>
>  So I changed the hu_usage data type from double to float, and then I got an
> exception like this:
>
>  hive> select * from phx_usage_day limit 1;
> OK
> java.lang.ClassCastException: java.lang.Double cannot be cast to
> java.lang.Float
>         at
> org.apache.phoenix.hive.util.HiveTypeUtil.SQLType2Writable(HiveTypeUtil.java:111)
>         at
> org.apache.phoenix.hive.PhoenixSerde.deserialize(PhoenixSerde.java:100)
>         at
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:647)
>         at
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:561)
>         at
> org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:137)
>         at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1524)
>         at
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:285)
>         at
> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
>         at
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
>         at
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793)
>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Failed with exception
> java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException
> Time taken: 0.069 seconds
>
>
>  Finally, I changed it to string and it worked:
>  hive> select * from phx_usage_day limit 5;
> OK
> 1388534400000   1000    0       8.42
> 1388534400000   1001    0       7.071000000000001
> 1388534400000   1002    0       9.101999999999997
> 1388534400000   1003    0       26.349999999999994
> 1388534400000   1004    0       48.421
> Time taken: 0.064 seconds, Fetched: 5 row(s)
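>
>  (With the string workaround the values can still be used numerically by
> casting at query time - something like:
>
>  select hu_date, cast(hu_usage as double) as hu_usage from phx_usage_day;
>
>  though a native double mapping would of course be nicer.)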
>
>
>  So I think there is a problem with both Timestamp and Double types.
> I didn't try Float.
>
>  Regards,
> Marek
>
>
>
>
>
> 2015-04-06 22:59 GMT+02:00 Nicolas Maillard <nmaillard@hortonworks.com>:
>
>>  Hello Marek
>>
>>
>>  I am currently using a private branch prior to pushing the contribution,
>> and I remember hitting your problem when moving to 4.3.
>>
>> I'll correct the public GitHub repo and the dependency during the
>> contribution submission.
>>
>>
>>      Nicolas Maillard - Solutions Engineer EMEA, Hortonworks <http://hortonworks.com/>
>>   +33 (0) 668176605 | Skype: nicolas.maillard.hortonworks
>>       ------------------------------
>> *From:* Marek Wiewiorka <marek.wiewiorka@gmail.com>
>> *Sent:* Monday, April 06, 2015 10:31 PM
>>
>> *To:* user@phoenix.apache.org
>> *Subject:* Re: hive external table
>>
>>   Thanks Nicolas - could you let me know which branch I should use to build
>> your project against Phoenix 4.3.0?
>> The master of your repo seems to use 4.2.2...
>> Sure, I can live with epoch secs/millis instead of timestamp - however, it
>> would be nice to have the timestamp mapping implemented.
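>>  (For queries, converting back should be doable on the Hive side with
>> from_unixtime - e.g., with a hypothetical bigint column epoch_ms holding
>> epoch milliseconds:
>>
>>  select from_unixtime(cast(epoch_ms / 1000 as bigint)) from some_table;
>>
>>  but a native timestamp mapping would still be cleaner.)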
>>
>>
>>  Regards,
>> Marek
>>
>> 2015-04-06 22:25 GMT+02:00 Nicolas Maillard <nmaillard@hortonworks.com>:
>>
>>>  Hello Marek
>>>
>>>
>>>  Your error seems to point to a conversion mismatch when bringing back
>>> the timestamp type from Phoenix: essentially, the timestamp is brought
>>> back as a Text value on the Hive side. I'll fix that asap.
>>>
>>> As a not-so-nice workaround, try using a long - that should work - but I
>>> will fix the issue at hand.
>>>
>>>
>>>  Concerning your table: there is no problem in your declaration, but make
>>> sure your znode really is hbase-unsecure. It is usually hbase (the
>>> default), but in my Hortonworks sandbox it is hbase-unsecure, which is why
>>> I specify it.
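>>>  (The value to check is zookeeper.znode.parent in hbase-site.xml - on the
>>> sandbox it looks like this:
>>>
>>>  <property>
>>>    <name>zookeeper.znode.parent</name>
>>>    <value>/hbase-unsecure</value>
>>>  </property>
>>>  )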
>>>
>>>
>>>
>>>
>>>      Nicolas Maillard - Solutions Engineer EMEA, Hortonworks <http://hortonworks.com/>
>>>   +33 (0) 668176605 | Skype: nicolas.maillard.hortonworks
>>>       ------------------------------
>>> *From:* Marek Wiewiorka <marek.wiewiorka@gmail.com>
>>> *Sent:* Monday, April 06, 2015 10:05 PM
>>> *To:* user@phoenix.apache.org
>>> *Subject:* Re: hive external table
>>>
>>>   Hi Nicolas - thanks for your response!
>>>
>>> Where can I find the latest branch? I used master from this repo:
>>> https://github.com/nmaillard/Phoenix-Hive
>>>  and it is built against 4.2.2. I changed the dependency in the pom and
>>> tried to build against 4.3, and I got a series of errors like these:
>>>  [ERROR] COMPILATION ERROR :
>>> [INFO] -------------------------------------------------------------
>>> [ERROR]
>>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[29,33]
>>> cannot find symbol
>>>   symbol:   class PDataType
>>>   location: package org.apache.phoenix.schema
>>> [ERROR]
>>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[93,19]
>>> cannot find symbol
>>>   symbol:   class PDataType
>>>   location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
>>> [ERROR]
>>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[102,19]
>>> cannot find symbol
>>>   symbol:   class PDataType
>>>   location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
>>> [ERROR]
>>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[116,19]
>>> cannot find symbol
>>>   symbol:   class PDataType
>>>   location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
>>> [ERROR]
>>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[128,19]
>>> cannot find symbol
>>>   symbol:   class PDataType
>>>   location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
>>> [ERROR]
>>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[164,19]
>>> cannot find symbol
>>>   symbol:   class PDataType
>>>   location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
>>> [ERROR]
>>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[216,13]
>>> cannot find symbol
>>>   symbol:   class PDataType
>>>   location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
>>> [ERROR]
>>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/PhoenixMetaHook.java:[41,33]
>>> cannot find symbol
>>>   symbol:   class PDataType
>>>   location: package org.apache.phoenix.schema
>>> [ERROR]
>>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/PhoenixSerde.java:[41,33]
>>> cannot find symbol
>>>
>>>
>>>  Could you also please advise how I should define an external table in
>>> Hive, using Phoenix-Hive, for a table created in Phoenix as follows:
>>>
>>>  create table se_dwh.homes_usage_hour (
>>>     hu_ts time not null,
>>>     hu_ho_id integer not null,
>>>     hu_stream_id integer not null,
>>>     hu_usage double
>>>     constraint pk PRIMARY KEY (hu_ts, hu_ho_id, hu_stream_id));
>>>
>>>  I'm not sure how to set the column mapping correctly - I tried something
>>> like this:
>>> CREATE EXTERNAL TABLE phx_usage_hour(
>>>      hu_ts timestamp,
>>>     hu_ho_id int,
>>>     hu_stream_id int,
>>>     hu_usage double)
>>> STORED BY  "org.apache.phoenix.hive.PhoenixStorageHandler"
>>> TBLPROPERTIES(
>>>     'phoenix.hbase.table.name'='se_dwh.homes_usage_hour',
>>>     'phoenix.zookeeper.znode.parent'='hbase-unsecure',
>>>     'phoenix.rowkeys'='hu_ts,hu_ho_id,hu_stream_id'
>>> );
>>>
>>>  It was created without a problem - but then I ran a simple query:
>>>
>>>  hive> select * from phx_usage_hour limit 1;
>>> OK
>>> Failed with exception
>>> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException:
>>> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to
>>> org.apache.hadoop.hive.serde2.io.TimestampWritable
>>> Time taken: 0.084 seconds
>>>
>>>  I compiled your project using Phoenix 4.2.2, but I have version 4.3
>>> installed with HBase - could that be the root cause of the exception?
>>>
>>>  Thanks!
>>> Marek
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> 2015-04-06 20:57 GMT+02:00 Nicolas Maillard <nmaillard@hortonworks.com>:
>>>
>>>>  Hello Marek
>>>>
>>>>
>>>>  There are a couple of ways of reaching Phoenix through Hive:
>>>>
>>>> - One is calling the HBase layer directly with the Hive-HBase connector,
>>>> but this has some caveats.
>>>>
>>>> - The second is this project I am working on. The latest branch is built
>>>> against Phoenix 4.3, but building it against 4.2.2 should not create any
>>>> problems from a usage standpoint; if it does, I need to fix it.
>>>> Could you explain the problem, or file an issue on the project?
>>>>
>>>> - Third, if it is urgent, you could dump your Hive data in CSV format and
>>>> use the Phoenix bulk loader (a rough sketch follows below).
>>>>
>>>>
>>>>  Whichever way you decide to go, please do forward some details about
>>>> your error so I can fix things if needed.
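>>>>
>>>>  For the third option, the flow is roughly as follows (paths and table
>>>> names are just placeholders):
>>>>
>>>>  -- in Hive: dump the table as CSV to HDFS
>>>>  INSERT OVERWRITE DIRECTORY '/tmp/my_export'
>>>>  ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
>>>>  SELECT * FROM my_hive_table;
>>>>
>>>>  then load the files with the Phoenix bulk load tool, e.g.:
>>>>
>>>>  hadoop jar phoenix-4.3.0-client.jar \
>>>>    org.apache.phoenix.mapreduce.CsvBulkLoadTool \
>>>>    --table MY_TABLE --input /tmp/my_export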
>>>>
>>>>
>>>>
>>>>      Nicolas Maillard - Solutions Engineer EMEA, Hortonworks <http://hortonworks.com/>
>>>>   +33 (0) 668176605 | Skype: nicolas.maillard.hortonworks
>>>>       ------------------------------
>>>> *From:* Marek Wiewiorka <marek.wiewiorka@gmail.com>
>>>> *Sent:* Monday, April 06, 2015 5:45 PM
>>>> *To:* user@phoenix.apache.org
>>>> *Subject:* hive external table
>>>>
>>>>   Hi All,
>>>> I am trying to create an external table mapped to a Phoenix table and I
>>>> cannot make it work.
>>>> I tried this project:
>>>> https://github.com/nmaillard/Phoenix-Hive
>>>>  but it fails to build against Phoenix 4.3.0.
>>>> I compiled it using 4.2.2, but it still throws an exception when I try
>>>> to access an external table.
>>>>
>>>>  Before I dig deeper, I would like to ask: is there any other way of
>>>> using Phoenix in a Hive environment?
>>>>  Or maybe you would recommend a different approach for transferring data
>>>> from Phoenix to Hive?
>>>>
>>>>  Many thanks in advance!
>>>>
>>>>  Marek
>>>>
>>>
>>>
>>
>
