phoenix-user mailing list archives

From James Taylor <jamestay...@apache.org>
Subject Re: Hive UDF for creating row key in HBASE
Date Mon, 18 Dec 2017 19:29:02 GMT
Hi Chethan,
As Ethan mentioned, take a look first at the Phoenix/Hive integration. If
that doesn't work for you, the best way to get the row key for a Phoenix
table is to execute an UPSERT VALUES against the primary key columns
without committing it. We have a utility function that returns the Cells
that would be submitted to the server, which you can use to get the row
key. You can do this through a "connectionless" JDBC connection, so you
don't need any RPCs (you still execute the CREATE TABLE statement so that
Phoenix knows the metadata, but no RPCs are made).

Take a look at ConnectionlessTest.testConnectionlessUpsert() for an example.
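The zero-byte concatenation that comes up later in this thread can be sketched in plain Java. This is a standalone illustration, not Phoenix code: the class and method names are made up, and Phoenix's real row-key encoding also depends on the column types and sort order.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of how Phoenix joins VARCHAR primary-key parts with a zero byte
// ("\x00") in the row key, and how such a key splits back into columns.
// RowKeySketch, buildRowKey, and splitRowKey are illustrative names only.
public class RowKeySketch {

    // Phoenix's separator for VARCHAR key parts is the NUL character.
    static final String SEPARATOR = "\u0000";

    // Join key parts with the separator; the last part is unterminated,
    // matching the UDF discussed in this thread.
    static String buildRowKey(String... parts) {
        return String.join(SEPARATOR, parts);
    }

    // Split a row key back into its column values.
    static List<String> splitRowKey(String rowKey) {
        return Arrays.asList(rowKey.split(SEPARATOR));
    }

    public static void main(String[] args) {
        String key = buildRowKey("tenant1", "2017-12-16", "event42");
        System.out.println(splitRowKey(key)); // the three original parts
    }
}
```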

Thanks,
James

On Sun, Dec 17, 2017 at 1:19 PM, Ethan <ewang@apache.org> wrote:

>
> Hi Chethan,
>
> When you write data from HDFS, are you planning to use Hive to do the ETL?
> Could we do something like reading from HDFS and using Phoenix to write into
> HBase?
>
> There is https://phoenix.apache.org/hive_storage_handler.html, but I think
> it enables Hive to read from a Phoenix table, not the other way around.
>
> Thanks,
>
> On December 16, 2017 at 8:09:10 PM, Chethan Bhawarlal (
> cbhawarlal@collectivei.com) wrote:
>
> Hi Dev,
>
> Currently I am planning to write data from HDFS to HBase, and I am using
> Phoenix to read the data back.
>
> Phoenix concatenates its primary key columns, separated by a zero byte
> ("\x00"), and stores the result in HBase as the row key.
>
> I want to write a custom UDF in Hive to create the HBase row key value such
> that Phoenix will be able to split it back into multiple columns.
>
> Following is the custom UDF code I am trying to write:
>
>
> import org.apache.hadoop.hive.ql.exec.Description;
> import org.apache.hadoop.hive.ql.exec.UDF;
> import org.apache.hadoop.hive.ql.udf.UDFType;
>
> @UDFType(deterministic = true)
> @Description(name = "hbasekeygenerator",
>     value = "_FUNC_(existing) - Returns a unique rowkey value for hbase")
> public class CIHbaseKeyGenerator extends UDF {
>
>     public String evaluate(String[] args) {
>         // Phoenix separates VARCHAR key parts with a zero byte. Note that
>         // Byte.toString((byte) 0x00) would return the string "0", not the
>         // NUL character, so the separator must be the character '\u0000'.
>         final char separator = '\u0000';
>         StringBuilder sb = new StringBuilder();
>
>         // Append every part but the last, each followed by the separator.
>         for (int i = 0; i < args.length - 1; ++i) {
>             sb.append(args[i]);
>             sb.append(separator);
>         }
>         // The last key part is left unterminated.
>         sb.append(args[args.length - 1]);
>         return sb.toString();
>     }
> }
>
>
> Following are my questions:
>
> 1. Is it possible to emulate the behavior of Phoenix (decoding) using a
> custom Hive UDF?
>
> 2. If it is possible, what is the best approach? It would be great if
> someone could share some pointers on this.
>
>
> Thanks,
>
> Chethan.
>
