phoenix-user mailing list archives

From Nicolas Maillard <nmaill...@hortonworks.com>
Subject Re: hive external table
Date Fri, 10 Apr 2015 09:12:01 GMT
Hello Marek


I have corrected your type issues in the public Phoenix-Hive project, so you can go ahead and
test the latest version.

I have tested against Phoenix 4.3 to make sure all is well.

That being said, this build still depends on Phoenix 4.2. The move to 4.3 required a small
rewrite that has been done in the private repo; I'd rather update everyone at the same time,
when everything gets merged back into standard Phoenix. So this is a patch for you and the public repo.


Let me know if everything is better now. I have many improvements planned for the next versions
of this connector, but I am still looking for initial testers to make sure it is stable.


regards


Nicolas Maillard
Solutions Engineer - EMEA, Hortonworks <http://hortonworks.com/>
+33 (0) 668176605 | Skype: nicolas.maillard.hortonworks



________________________________
From: Nicolas Maillard
Sent: Thursday, April 09, 2015 6:52 PM
To: user@phoenix.apache.org
Subject: Re: hive external table


Hello Marek


Thanks for the test. I'm doing a first fix as we speak and will extend my unit tests to make
sure it stays fixed.

The timestamp fix is good to go; I'll run a couple more tests and push a new commit this evening.





________________________________
From: Marek Wiewiorka <marek.wiewiorka@gmail.com>
Sent: Tuesday, April 07, 2015 10:48 AM
To: user@phoenix.apache.org
Subject: Re: hive external table

Hi Nicolas,
I transformed hu_date (a timestamp column) into bigint using the TO_NUMBER function in Phoenix.
I created a new external table in hive as follows:
hive> CREATE EXTERNAL TABLE phx_usage_day(
    >      hu_date bigint,
    >     hu_ho_id int,
    >     hu_stream_id int,
    >     hu_usage double)
    > STORED BY  "org.apache.phoenix.hive.PhoenixStorageHandler"
    > TBLPROPERTIES(
    >     'phoenix.hbase.table.name'='se_dwh.homes_usage_day_export',
    >     'phoenix.zookeeper.znode.parent'='hbase-unsecure',
    >     'phoenix.rowkeys'='hu_date,hu_ho_id,hu_stream_id,hu_usage '
    > );
However, the problem is now with the double type (no longer with timestamp):
hive> select * from phx_usage_day limit 1;
OK
Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException:
java.lang.ClassCastException: org.apache.hadoop.io.DoubleWritable cannot be cast to org.apache.hadoop.hive.serde2.io.DoubleWritable
Ti

So I changed the hu_usage data type from double to float, and then I got an exception like this:

hive> select * from phx_usage_day limit 1;
OK
java.lang.ClassCastException: java.lang.Double cannot be cast to java.lang.Float
        at org.apache.phoenix.hive.util.HiveTypeUtil.SQLType2Writable(HiveTypeUtil.java:111)
        at org.apache.phoenix.hive.PhoenixSerde.deserialize(PhoenixSerde.java:100)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:647)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:561)
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:137)
        at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1524)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:285)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
        at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.SerDeException
Time taken: 0.069 seconds


Finally, I changed it to string and it worked:
hive> select * from phx_usage_day limit 5;
OK
1388534400000   1000    0       8.42
1388534400000   1001    0       7.071000000000001
1388534400000   1002    0       9.101999999999997
1388534400000   1003    0       26.349999999999994
1388534400000   1004    0       48.421
Time taken: 0.064 seconds, Fetched: 5 row(s)
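
As a quick sanity check on that workaround (a sketch added here, not part of the original thread): the hu_date values in the first column are epoch milliseconds, and they decode to the dates you would expect.

```python
from datetime import datetime, timezone

# hu_date from the query output above is epoch milliseconds.
hu_date_millis = 1388534400000

# Divide by 1000 to get epoch seconds, then decode as UTC.
dt = datetime.fromtimestamp(hu_date_millis / 1000, tz=timezone.utc)
print(dt.isoformat())  # 2014-01-01T00:00:00+00:00
```

So storing the Phoenix TIME/TIMESTAMP as a bigint loses nothing except the convenience of Hive's native timestamp type.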


So I think there is a problem with both Timestamp and Double types.
I didn't try Float.
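
To summarize the workarounds found in this thread as a lookup (illustrative only; these names are not part of the connector):

```python
# Phoenix column type -> Hive column type that deserialized without
# ClassCastExceptions in this build of the connector (per this thread).
phoenix_to_hive_workaround = {
    "TIME/TIMESTAMP": "bigint",  # store epoch millis, e.g. via Phoenix TO_NUMBER
    "DOUBLE": "string",          # both double and float hit ClassCastExceptions
    "INTEGER": "int",            # mapped cleanly
}
print(phoenix_to_hive_workaround["DOUBLE"])
```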

Regards,
Marek





2015-04-06 22:59 GMT+02:00 Nicolas Maillard <nmaillard@hortonworks.com>:

Hello Marek


I am currently using a private branch prior to pushing the contribution. I remember having
your problem when moving to 4.3.

I'll correct the public GitHub repo and the dependency during the contribution submission.





________________________________
From: Marek Wiewiorka <marek.wiewiorka@gmail.com>
Sent: Monday, April 06, 2015 10:31 PM

To: user@phoenix.apache.org
Subject: Re: hive external table

Thanks Nicolas - could you just let me know which branch I should use to build your project
against Phoenix 4.3.0?
Master of your repo seems to use 4.2.2...
Sure, I can live with using epoch seconds/millis instead of timestamp; however, it would be nice
to have the timestamp mapping implemented.


Regards,
Marek

2015-04-06 22:25 GMT+02:00 Nicolas Maillard <nmaillard@hortonworks.com>:

Hello Marek


Your error seems to point to a conversion mismatch when bringing the timestamp type back from
Phoenix: essentially, the timestamp is being brought back as a Text value on the Hive side.
I'll fix that ASAP.

As a not-so-nice workaround, try using a long; it should work, but I will fix the underlying issue.


Concerning your table, there is no problem in your declaration. Make sure your znode parent is
hbase-unsecure, though; it is usually hbase (the default), but in my Hortonworks sandbox it is
hbase-unsecure, which is why I specify it.







________________________________
From: Marek Wiewiorka <marek.wiewiorka@gmail.com>
Sent: Monday, April 06, 2015 10:05 PM
To: user@phoenix.apache.org
Subject: Re: hive external table

Hi Nicolas - thanks for your response!

Where can I find the latest branch? I used master from this repo:
https://github.com/nmaillard/Phoenix-Hive
and it is built against 4.2.2. I changed the dependency in the pom, tried to build it against 4.3,
and got a series of errors like these:
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[29,33]
cannot find symbol
  symbol:   class PDataType
  location: package org.apache.phoenix.schema
[ERROR] /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[93,19]
cannot find symbol
  symbol:   class PDataType
  location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
[ERROR] /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[102,19]
cannot find symbol
  symbol:   class PDataType
  location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
[ERROR] /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[116,19]
cannot find symbol
  symbol:   class PDataType
  location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
[ERROR] /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[128,19]
cannot find symbol
  symbol:   class PDataType
  location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
[ERROR] /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[164,19]
cannot find symbol
  symbol:   class PDataType
  location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
[ERROR] /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/util/HiveConfigurationUtil.java:[216,13]
cannot find symbol
  symbol:   class PDataType
  location: class org.apache.phoenix.hive.util.HiveConfigurationUtil
[ERROR] /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/PhoenixMetaHook.java:[41,33]
cannot find symbol
  symbol:   class PDataType
  location: package org.apache.phoenix.schema
[ERROR] /home/mw/Phoenix-Hive/src/main/java/org/apache/phoenix/hive/PhoenixSerde.java:[41,33]
cannot find symbol


Could you also please advise how I should define an external table in Hive, using Phoenix-Hive,
for a table created in Phoenix as follows:

create table se_dwh.homes_usage_hour
(hu_ts time not null ,
hu_ho_id integer not null ,
hu_stream_id integer not null,
hu_usage double constraint pk PRIMARY KEY(hu_ts,hu_ho_id,hu_stream_id) );

I'm not sure how to set the column mapping correctly - I tried something like this:
CREATE EXTERNAL TABLE phx_usage_hour(
     hu_ts timestamp,
    hu_ho_id int,
    hu_stream_id int,
    hu_usage double)
STORED BY  "org.apache.phoenix.hive.PhoenixStorageHandler"
TBLPROPERTIES(
    'phoenix.hbase.table.name'='se_dwh.homes_usage_hour',
    'phoenix.zookeeper.znode.parent'='hbase-unsecure',
    'phoenix.rowkeys'='hu_ts,hu_ho_id,hu_stream_id'
);

It was created without a problem - but when I run a simple query:

hive> select * from phx_usage_hour limit 1;
OK
Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException:
java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.hive.serde2.io.TimestampWritable
Time taken: 0.084 seconds

I compiled your project against Phoenix 4.2.2, but I have version 4.3 installed in HBase; could
that be the root cause of the exception?

Thanks!
Marek









2015-04-06 20:57 GMT+02:00 Nicolas Maillard <nmaillard@hortonworks.com>:

Hello Marek


There are a couple of ways of reaching Phoenix through Hive:

- One is calling the HBase layer directly with the Hive-HBase connector, but this has some
caveats.

- The second is this project I am working on. The latest branch is built against Phoenix 4.3,
but building it against 4.2.2 should not create any problems from a usage standpoint; if it does,
I need to fix it.

Could you explain the issue, or file one on the project?

- Third, if it is urgent, you could dump your Hive data to CSV and use the Phoenix
bulk loader.
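
As a sketch of that third option (added here for illustration; the jar path, table name, and HDFS input path are assumptions, not from this thread), Phoenix ships a MapReduce CSV bulk load tool that can be invoked through `hadoop jar`:

```python
# Sketch: after exporting the Hive table to CSV on HDFS, invoke Phoenix's
# CsvBulkLoadTool. All paths and the table name below are hypothetical.
phoenix_client_jar = "/usr/lib/phoenix/phoenix-client.jar"  # assumed location
cmd = [
    "hadoop", "jar", phoenix_client_jar,
    "org.apache.phoenix.mapreduce.CsvBulkLoadTool",
    "--table", "SE_DWH.HOMES_USAGE_HOUR",    # target Phoenix table
    "--input", "/tmp/homes_usage_hour.csv",  # CSV exported from Hive (HDFS path)
]
print(" ".join(cmd))
```

The command is only printed here, not executed; you would run it on a node with the Hadoop and Phoenix client installed.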


Whichever way you decide to go, please do forward some details about your error so I can fix it
if needed.






________________________________
From: Marek Wiewiorka <marek.wiewiorka@gmail.com>
Sent: Monday, April 06, 2015 5:45 PM
To: user@phoenix.apache.org
Subject: hive external table

Hi All,
I'm trying to create an external table mapped to a Phoenix table and I cannot make it work.
I tried this project:
https://github.com/nmaillard/Phoenix-Hive
but it fails to build against Phoenix 4.3.0.
I compiled it using 4.2.2, but it still throws an exception when I try to access
an external table.

Before I dig deeper, I would like to ask if there is any other way of using Phoenix
in a Hive environment.
Maybe you can recommend a different approach to transferring data from Phoenix
to Hive?

Many thanks in advance!

Marek



