phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Matthew Johnson <matt.john...@algomi.com>
Subject RE: BigDecimal casting issue?
Date Fri, 27 Feb 2015 12:02:18 GMT
Thanks James,



We are  not using BigDecimal for primary key at all, the main use case
would be something like *(rowid varchar primary key, producttype varchar,
price decimal)* and then we would want to do aggregate number functions
like AVG, MAX etc. So the vast majority would be full table scans anyway.



If I had a separate column where the price was stored as a string, eg *(rowid
varchar primary key, producttype varchar, price decimal, s_price varchar)*
will Phoenix allow me to use aggregate number functions along with the
to_number function? Something like AVG(TO_NUMBER(s_price))?



Thanks!

Matt





-----Original Message-----
From: James Taylor [mailto:jamestaylor@apache.org]
Sent: 26 February 2015 18:02
To: user
Subject: Re: BigDecimal casting issue?



The problem with the old way that HBase represents BigDecimal is that the
serialized bytes don't sort the same way that the BigDecimal does (FWIW,
but orthogonal to this discussion and not something that will help you with
this particular situation, a new type system was introduced in HBase to fix
this). If it's a matter of writing a conversion script that runs overnight,
I'd encourage you to do that.

If that's not possible, there are a few alternatives, but performance won't
be as good.



Phoenix requires that the serialized bytes sort order matches the sort
order of the objects they represent so that range scans and order bys work
correctly. For example, let's say you had the following schema:



CREATE TABLE t (n DECIMAL PRIMARY KEY);



and issued the following query:



SELECT * FROM t WHERE n > 0.5 AND n < 0.9;



Phoenix will compile this into a scan that has a start row of the
serialized bytes of 0.5 and a stop row of the serialized bytes of 0.9.

If the serialized byte sort order doesn't match the object sort order,
you'd get incorrect results. We do support UNSIGNED_* types for byte,
short, int, and long because the positive numbers meet this criteria.

We can't have an UNSIGNED_DECIMAL, because even restricted to just positive
BigDecimal values, this contract is not met.



1) You could declare your data byte for BigDecimal as VARBINARY and do the
byte[]->BigDecimal conversion in your code. Note that the filtering shown
above wouldn't work correctly, but an equality filter would be ok.

CREATE TABLE t (n VARBINARY PRIMARY KEY); SELECT * FROM t WHERE n = 0.5;



2) You could stringify your BigDecimal and convert it on-the-fly using our
TO_NUMBER built-in:

CREATE TABLE t (n VARCHAR PRIMARY KEY);

SELECT * FROM t WHERE TO_NUMBER(n) > 0.5 AND TO_NUMBER(n) < 0.9;



Note that this query would be a full table scan.



HTH,

James





On Thu, Feb 26, 2015 at 6:39 AM, Matthew Johnson <matt.johnson@algomi.com>
wrote:

> Thanks guys,

>

>

>

> Unfortunately I cannot easily change the way the value is serialized

> into HBase because that value is used by multiple other projects that

> don’t use Phoenix. Over time I could potentially migrate all these

> projects but I was hoping for a slightly more immediate solution. I

> could write an overnight batch job that scans through the entire

> table, and for every value that is a BigDecimal I could duplicate it

> into a new column with the PDataType Decimal, but this seems kinda
inefficient and would duplicate a lot of data.

>

>

>

> I had a look at the documentation (and tried a few combinations) but

> could not find an UNSIGNED_DECIMAL type for Phoenix.

>

>

>

> Is there any other way I can retrieve BigDecimal values with Phoenix?

> If I wanted to write my own wrapper for Phoenix that supports

> BigDecimal, would I have to patch just the client or would I have to
patch the server as well?

>

>

>

> Cheers,

>

> Matt

>

>

>

>

>

> From: anil gupta [mailto:anilgupta84@gmail.com <anilgupta84@gmail.com>]

> Sent: 26 February 2015 13:44

> To: user@phoenix.apache.org

> Subject: Re: BigDecimal casting issue?

>

>

>

> Hi Matthew,

>

> If you want phoenix to read a BigDecimal value out of HBase. You will

> need to use PDataType.DECIMAL.toBytes(BigDecimal) to serialize that

> value in HBase. IMO, serialization of of BigDecimal via PDataType is

> better because it enables byte array comparison on BigDecimal values.

>

> Thanks,

>

> Anil

>

>

>

> On Thu, Feb 26, 2015 at 4:30 AM, Gabriel Reid <gabriel.reid@gmail.com>

> wrote:

>

> Hi Matt,

>

> Although the object representation of the Phoenix DECIMAL type is

> BigDecimal, the byte-level encoding is different than that of

> Bytes.toBytes(BigDecimal). The reason for this is to allow for

> ordering of these values based comparison of binary values. Sorting

> the values with binary value comparison based on the return value of

> Bytes.toBytes(BigDecimal) will not result in the correct ordering of

> values.

>

> As you may have noticed, many data types in Phoenix have an UNSIGNED_*

> counter part which uses the same underlying encoding as Bytes.toBytes,

> although these datatypes suffer from the same binary comparison issue

> as outlined above.

>

> - Gabriel

>

>

>

>

> On Thu, Feb 26, 2015 at 12:06 PM, Matthew Johnson

> <matt.johnson@algomi.com> wrote:

>> Hi all,

>>

>>

>>

>> I am trying to map an HBase column where I store java.math.BigDecimal

>> values

>> using:

>>

>>

>>

>> Bytes.toBytes(myBigDecimalValueInJava)

>>

>>

>>

>> My understanding from the “Data Types” page

>> (http://phoenix.apache.org/language/datatypes.html#decimal_type) is

>> that the DECIMAL type in Phoenix should map to this. However, when I

>> store:

>>

>>

>>

>> 102.1

>>

>>

>>

>> in HBase, Phoenix reads it back as:

>>

>>

>>

>> -1.02020201E+126

>>

>>

>>

>> Not sure whether I am using it wrong? I have tried creating the view

>> with data type DECIMAL and also DECIMAL(15,5) but both give the same
problem.

>> Is

>> anyone else able to successfully insert BigDecimal values via the

>> HBase client and retrieve them using Phoenix?

>>

>>

>>

>> FYI when I retrieve the value using HBase client:

>>

>>

>>

>> BigDecimal bd = Bytes.toBigDecimal(value);

>>

>>

>>

>> It correctly prints the value of 102.1.

>>

>>

>>

>> Any thoughts?

>>

>>

>>

>> Cheers,

>>

>> Matt

>>

>>

>>

>>

>

>

>

>

> --

>

> Thanks & Regards,

> Anil Gupta

Mime
View raw message