phoenix-user mailing list archives

From Marek Wiewiorka <marek.wiewio...@gmail.com>
Subject Re: hive external table
Date Tue, 07 Apr 2015 08:48:29 GMT
Hi Nicolas,
I transformed hu_date (a timestamp column) into bigint using the to_number
function in Phoenix.
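Roughly like this on the Phoenix side (a sketch; the source table name
se_dwh.homes_usage_day is my assumption):

upsert into se_dwh.homes_usage_day_export
select to_number(hu_date), hu_ho_id, hu_stream_id, hu_usage  -- to_number on a timestamp yields epoch millis
from se_dwh.homes_usage_day;
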
I created a new external table in hive as follows:
hive> CREATE EXTERNAL TABLE phx_usage_day(
    >      hu_date bigint,
    >     hu_ho_id int,
    >     hu_stream_id int,
    >     hu_usage double)
    > STORED BY  "org.apache.phoenix.hive.PhoenixStorageHandler"
    > TBLPROPERTIES(
    >     'phoenix.hbase.table.name'='se_dwh.homes_usage_day_export',
    >     'phoenix.zookeeper.znode.parent'='hbase-unsecure',
    >     'phoenix.rowkeys'='hu_date,hu_ho_id,hu_stream_id,hu_usage'
    > );
However, the problem is now with the double type (no longer with timestamp):
hive> select * from phx_usage_day limit 1;
OK
Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.DoubleWritable cannot be cast to org.apache.hadoop.hive.serde2.io.DoubleWritable

So I changed the hu_usage data type from double to float, and then I got an
exception like this:

hive> select * from phx_usage_day limit 1;
OK
java.lang.ClassCastException: java.lang.Double cannot be cast to java.lang.Float
        at org.apache.phoenix.hive.util.HiveTypeUtil.SQLType2Writable(HiveTypeUtil.java:111)
        at org.apache.phoenix.hive.PhoenixSerde.deserialize(PhoenixSerde.java:100)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:647)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:561)
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:137)
        at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1524)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:285)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException
Time taken: 0.069 seconds


Finally, I changed it to string and it worked:
hive> select * from phx_usage_day limit 5;
OK
1388534400000   1000    0       8.42
1388534400000   1001    0       7.071000000000001
1388534400000   1002    0       9.101999999999997
1388534400000   1003    0       26.349999999999994
1388534400000   1004    0       48.421
Time taken: 0.064 seconds, Fetched: 5 row(s)


So I think there is a problem with both Timestamp and Double types.
I didn't try Float.
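
Casting the string back works as a stopgap on the Hive side (just a sketch;
hu_usage is the only column needing a cast, since hu_date is already a bigint):

hive> select hu_date, hu_ho_id, hu_stream_id,
    >        cast(hu_usage as double) as hu_usage
    > from phx_usage_day limit 5;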

Regards,
Marek





2015-04-06 22:59 GMT+02:00 Nicolas Maillard <nmaillard@hortonworks.com>:

>  Hello Marek
>
>
>  I am currently using a private branch prior to pushing the contribution.
> I remember having your problem when moving to 4.3.
>
> I'll correct the public GitHub repo and the dependency during the contribution
> submission.
>
>
> Nicolas Maillard, Solutions Engineer - EMEA
> Hortonworks, Inc. <http://hortonworks.com/>
> +33 (0) 668176605 | Skype: nicolas.maillard.hortonworks
>       ------------------------------
> *From:* Marek Wiewiorka <marek.wiewiorka@gmail.com>
> *Sent:* Monday, April 06, 2015 10:31 PM
>
> *To:* user@phoenix.apache.org
> *Subject:* Re: hive external table
>
>  Thanks Nicolas - could you just let me know which branch I should use to
> build your project against Phoenix 4.3.0?
> The master of your repo seems to use 4.2.2...
> Sure, I can live with using epoch secs/millis instead of timestamp - however it
> would be nice to have the timestamp mapping implemented.
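>
> On the Hive side the epoch millis can still be rendered per query in the
> meantime, something like (a sketch, assuming hu_ts is mapped as bigint millis):
>
> SELECT from_unixtime(CAST(hu_ts / 1000 AS BIGINT)) AS hu_ts_readable,
>        hu_ho_id, hu_stream_id, hu_usage
> FROM phx_usage_hour LIMIT 1;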
>
>
>  Regards,
> Marek
>
> 2015-04-06 22:25 GMT+02:00 Nicolas Maillard <nmaillard@hortonworks.com>:
>
>>  Hello Marek
>>
>>
>>  Your error seems to point to a conversion mismatch when bringing back
>> the timestamp type from Phoenix; essentially, the timestamp is being brought
>> back as a Text value in the Hive world. I'll fix that asap.
>>
>> As a not-so-nice workaround, try using a long; it should work, but I will fix
>> the issue at hand.
>>
>>
>>  Concerning your table, there is no problem in your declaration. Make sure
>> your znode is hbase-unsecure though; it is usually hbase (the default), but in
>> my Hortonworks sandbox it is hbase-unsecure, which is why I specify it.
>>
>>
>>
>>
>> Nicolas Maillard, Solutions Engineer - EMEA
>> Hortonworks, Inc. <http://hortonworks.com/>
>> +33 (0) 668176605 | Skype: nicolas.maillard.hortonworks
>>       ------------------------------
>> *From:* Marek Wiewiorka <marek.wiewiorka@gmail.com>
>> *Sent:* Monday, April 06, 2015 10:05 PM
>> *To:* user@phoenix.apache.org
>> *Subject:* Re: hive external table
>>
>>   Hi Nicolas - thanks for your response!
>>
>> Where can I find the latest branch - I used master from this repo:
>> https://github.com/nmaillard/Phoenix-Hive
>>  and it is built against 4.2.2 - I changed the dependency in the pom and tried
>> to build it against 4.3, and I got a series of errors like these:
>>  [ERROR] COMPILATION ERROR :
>> [INFO] -------------------------------------------------------------
>> [ERROR]
>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[29,33]
>> cannot find symbol
>>   symbol:   class PDataType
>>   location: package org.apache.phoenix.schema
>> [ERROR]
>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[93,19]
>> cannot find symbol
>>   symbol:   class PDataType
>>   location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
>> [ERROR]
>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[102,19]
>> cannot find symbol
>>   symbol:   class PDataType
>>   location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
>> [ERROR]
>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[116,19]
>> cannot find symbol
>>   symbol:   class PDataType
>>   location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
>> [ERROR]
>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[128,19]
>> cannot find symbol
>>   symbol:   class PDataType
>>   location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
>> [ERROR]
>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[164,19]
>> cannot find symbol
>>   symbol:   class PDataType
>>   location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
>> [ERROR]
>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[216,13]
>> cannot find symbol
>>   symbol:   class PDataType
>>   location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
>> [ERROR]
>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/PhoenixMetaHook.java:[41,33]
>> cannot find symbol
>>   symbol:   class PDataType
>>   location: package org.apache.phoenix.schema
>> [ERROR]
>> /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/PhoenixSerde.java:[41,33]
>> cannot find symbol
>>
>>
>>  Could you also please advise how I should define an external table in
>> Hive using Phoenix-Hive for a table created in Phoenix as follows:
>>
>>  create table se_dwh.homes_usage_hour
>> (hu_ts time not null ,
>> hu_ho_id integer not null ,
>> hu_stream_id integer not null,
>> hu_usage double constraint pk PRIMARY KEY(hu_ts,hu_ho_id,hu_stream_id) );
>>
>>  I'm not sure how to set the column mapping correctly - I tried something
>> like this:
>> CREATE EXTERNAL TABLE phx_usage_hour(
>>      hu_ts timestamp,
>>     hu_ho_id int,
>>     hu_stream_id int,
>>     hu_usage double)
>> STORED BY  "org.apache.phoenix.hive.PhoenixStorageHandler"
>> TBLPROPERTIES(
>>     'phoenix.hbase.table.name'='se_dwh.homes_usage_hour',
>>     'phoenix.zookeeper.znode.parent'='hbase-unsecure',
>>     'phoenix.rowkeys'='hu_ts,hu_ho_id,hu_stream_id'
>> );
>>
>>  It was created without a problem - but when I ran a simple query:
>>
>>  hive> select * from phx_usage_hour limit 1;
>> OK
>> Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.hive.serde2.io.TimestampWritable
>> Time taken: 0.084 seconds
>>
>>  I compiled your project using Phoenix 4.2.2, but I have version 4.3
>> installed in HBase - could that be the root cause of the exception?
>>
>>  Thanks!
>> Marek
>>
>> 2015-04-06 20:57 GMT+02:00 Nicolas Maillard <nmaillard@hortonworks.com>:
>>
>>>  Hello Marek
>>>
>>>
>>>  There are a couple of ways of reaching Phoenix through Hive:
>>>
>>> - One is calling the HBase layer directly with the Hive HBase
>>> connector, but this has some caveats.
>>>
>>> - Second is this project I am working on. The latest branch is built
>>> against Phoenix 4.3, but building it against 4.2.2 should not create any
>>> problems from a usage standpoint; if it does, I need to fix it.
>>>
>>> Could you explain the issue or file an issue on the project?
>>>
>>> - Third, if it is urgent, you could dump your Hive data in CSV format
>>> and use the Phoenix bulk loader (a sketch follows below).
>>>
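>>> For example (a sketch; the output path and delimiter are assumptions, and
>>> the resulting CSV files would then go through Phoenix's CsvBulkLoadTool):
>>>
>>> INSERT OVERWRITE DIRECTORY '/tmp/homes_usage_csv'
>>> ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
>>> SELECT * FROM homes_usage_hour;
>>>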
>>>
>>>  Whichever way you decide to go, please do forward some details about
>>> your error so I can fix it if needed.
>>>
>>>
>>>
>>> Nicolas Maillard, Solutions Engineer - EMEA
>>> Hortonworks, Inc. <http://hortonworks.com/>
>>> +33 (0) 668176605 | Skype: nicolas.maillard.hortonworks
>>>       ------------------------------
>>> *From:* Marek Wiewiorka <marek.wiewiorka@gmail.com>
>>> *Sent:* Monday, April 06, 2015 5:45 PM
>>> *To:* user@phoenix.apache.org
>>> *Subject:* hive external table
>>>
>>>   Hi All,
>>> I am trying to create an external table mapped to a Phoenix table and I
>>> cannot make it work.
>>> I tried this project:
>>> https://github.com/nmaillard/Phoenix-Hive
>>>  but it fails to build against Phoenix 4.3.0.
>>> I compiled it using 4.2.2, but it still throws an exception while
>>> trying to access an external table.
>>>
>>>  Before I try to dig deeper - I would like to ask if there is any
>>> other way of using Phoenix in a Hive environment.
>>>  Maybe you can recommend a different approach for transferring data from
>>> Phoenix to Hive?
>>>
>>>  Many thanks in advance!
>>>
>>>  Marek
>>>
>>
>>
>
