phoenix-user mailing list archives

From Anchal Agrawal <anc...@yahoo-inc.com>
Subject Re: Signed long values in column
Date Thu, 30 Jul 2015 01:23:39 GMT
Hi James,
Thanks for your reply. I don't understand the issue fully - do HBase's Bytes.toBytes() methods
not have the same sort order as that of Phoenix? I'd really appreciate it if you could give
more insight on this. Their documentation doesn't mention the sort order. If negative numbers
sort ahead of the positive numbers, why is that incompatible with Phoenix?

It's interesting because it seems that we can't have columns in Phoenix views/tables that
hold negative long values. In my setup, it is not feasible to create a new Phoenix table
and copy over the data: the table is very large, and we'd need to recreate the Phoenix
table with updated data every time we want to run queries.

Is it feasible to write a UDF (for a SELECT statement) that converts the bytearray in that
column to a long value? If it is, would I use the Tuple object that's passed in to the evaluate()
method to get the values in that column? I tried using Tuple's getValue() method to grab the
bytearray in the column, but I'm running into issues. I'm looking at ToNumberFunction for
reference.
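
For reference, the decoding step on its own (separate from the UDF plumbing) might look like the sketch below. It assumes the column holds exactly 8 big-endian two's-complement bytes, which is the layout Bytes.toBytes(long) writes. The class and method names are hypothetical, and how the bytes are obtained from the Tuple inside evaluate() is deliberately left out, since that part is the open question here.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Phoenix or HBase.
final class RawLongDecoder {
    // Decodes 8 big-endian two's-complement bytes starting at offset.
    // Assumption: the value was written by Bytes.toBytes(long).
    static long decode(byte[] buf, int offset, int length) {
        if (length != Long.BYTES) {
            throw new IllegalArgumentException("expected 8 bytes, got " + length);
        }
        // ByteBuffer reads big-endian by default.
        return ByteBuffer.wrap(buf, offset, length).getLong();
    }

    public static void main(String[] args) {
        // Stand-in for a stored cell value: the 8 bytes of -42.
        byte[] cell = ByteBuffer.allocate(Long.BYTES).putLong(-42L).array();
        System.out.println(decode(cell, 0, cell.length));
    }
}
```

HBase's own Bytes.toLong should perform the same conversion; the plain-Java version is only used here so the sketch has no external dependencies.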

I really appreciate your help.
- Anchal 

     On Wednesday, July 29, 2015 8:43 AM, James Taylor <jamestaylor@apache.org> wrote:
   

Hi Anchal,
Phoenix depends on the sort order of the serialized bytes to match the natural
sort order of the column value. The HBase Bytes.toBytes() methods do not meet this requirement,
as negative numbers will sort ahead of positive numbers. About the only option you have in
this case is to create a new Phoenix table and copy the data over from your old table. If
the data is being created by some external process, then you'd need to change it to use the
PDataType toBytes() method instead of the HBase Bytes.toBytes() method.
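
The mismatch between byte order and numeric order can be checked with a small plain-Java sketch. It emulates Bytes.toBytes(long) with a big-endian ByteBuffer and compares the results as unsigned bytes, the way HBase compares keys. Under that comparison the raw bytes of a negative long (sign bit set) compare greater than those of a positive long, so the byte order disagrees with the numeric order. The sign-bit flip shown as the order-preserving alternative is an assumption about how a sortable signed encoding can work, not a statement of Phoenix's exact format; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical demo class; requires Java 9+ for Arrays.compareUnsigned.
final class SortOrderSketch {
    // Big-endian two's-complement bytes, the layout Bytes.toBytes(long) writes.
    static byte[] rawBytes(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    // Assumed order-preserving variant: the same bytes with the sign bit flipped.
    static byte[] signFlippedBytes(long v) {
        return rawBytes(v ^ Long.MIN_VALUE);
    }

    // Unsigned lexicographic comparison, normalized to -1, 0, or 1.
    static int compare(byte[] a, byte[] b) {
        return Integer.signum(Arrays.compareUnsigned(a, b));
    }

    public static void main(String[] args) {
        // Numerically -1 < 1, so an order-preserving encoding should yield -1.
        System.out.println("raw:          " + compare(rawBytes(-1L), rawBytes(1L)));
        System.out.println("sign-flipped: " + compare(signFlippedBytes(-1L), signFlippedBytes(1L)));
    }
}
```

With the raw encoding the comparison result is positive for (-1, 1), while the sign-flipped encoding gives a negative result, matching numeric order.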
It's possible that Phoenix could relax this constraint for columns that are not part of the
primary key constraint - please file a JIRA for this. We'd need to define a new PDataType
(it could share almost all of its implementation with PUnsignedLong) and handle ORDER BY
differently for these types.
Thanks,
James
On Tue, Jul 28, 2015 at 10:40 PM, Anchal Agrawal <anchal@yahoo-inc.com> wrote:

Hi,
I'm creating a Phoenix view of an existing HBase table on v4.4.0.

Command: CREATE VIEW "table_name" (pk VARBINARY PRIMARY KEY, "cf"."col" DATA_TYPE_HERE);
The col column has long values that are serialized by Bytes.toBytes(long) but since some values
are negative, I can't use UNSIGNED_LONG. I tried BIGINT instead since the documentation says
that it maps to java.lang.Long, but that resulted in incorrect column values. The datatype
documentation for UNSIGNED_LONG says "use the regular signed type instead" - which datatype
is this referring to? LONG isn't supported.

I could create the view with the column values as bytearrays and write a UDF to extract long
values, but I think that will add to the latency. Is there a way around this? I really appreciate
your help.

Sincerely,
Anchal Agrawal



  