phoenix-user mailing list archives

From James Taylor <jamestay...@apache.org>
Subject Re: How to implement custom datatypes
Date Tue, 27 May 2014 18:23:11 GMT
Phoenix has support for ARRAY types.

For efficiency, Phoenix tries to operate on data types in their binary
form. If you're going to use them in the row key, there is a rule they
have to follow: the byte representation must sort in the natural sort
order of the values. If the server has to serialize and deserialize them
all the time, performance will be poor.
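
To illustrate the sort-order rule, here is a minimal sketch of one common way to make a signed long sort correctly as bytes: write it big-endian and flip the sign bit. The class and method names are made up for this example, and this is not claimed to be Phoenix's own code.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Phoenix.
public class SortableLongEncoding {

    // Big-endian bytes with the sign bit flipped, so negative values
    // compare lower than positive ones under unsigned byte comparison.
    public static byte[] encode(long value) {
        return ByteBuffer.allocate(Long.BYTES)
                .putLong(value ^ Long.MIN_VALUE)
                .array();
    }

    // Reverse of encode(): read the long and flip the sign bit back.
    public static long decode(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong() ^ Long.MIN_VALUE;
    }

    // Unsigned lexicographic comparison of two byte arrays, which is the
    // kind of comparison a byte-ordered row key is subject to.
    public static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xFF;
            int y = b[i] & 0xFF;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        long[] ascending = {Long.MIN_VALUE, -5L, 0L, 3L, Long.MAX_VALUE};
        for (int i = 0; i + 1 < ascending.length; i++) {
            byte[] lo = encode(ascending[i]);
            byte[] hi = encode(ascending[i + 1]);
            if (compareBytes(lo, hi) >= 0) {
                throw new IllegalStateException("byte order differs from numeric order");
            }
        }
        System.out.println("byte order matches numeric order");
    }
}
```

Without the sign-bit flip, the two's-complement bytes of a negative number would sort after those of a positive one.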

I think you'll likely run into issues by trying to create your own data
types rather than focusing on the problem you really want to solve. I'd
just use primitive types and drop down to BINARY if you come up with a good
binary representation of your spatio-temporal data type. Then once you get more
information around schema and row key design, benchmarking, built-in
functions, etc. you can look at how to improve your programming model by
implementing STRUCTs in Phoenix or introducing official new data types.
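
As a rough sketch of what "a good binary representation" could look like for a 2D line, the example below packs the two endpoints into a fixed 32-byte value that could be stored in a fixed-width binary column and interpreted at the application level. The class name, the layout (x1, y1, x2, y2 as big-endian doubles), and the width are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical application-level codec for a 2D line segment.
public class LineBinaryCodec {

    // Four 8-byte doubles: x1, y1, x2, y2.
    public static final int WIDTH = 4 * Double.BYTES;

    public static byte[] toBytes(double x1, double y1, double x2, double y2) {
        return ByteBuffer.allocate(WIDTH)
                .putDouble(x1).putDouble(y1)
                .putDouble(x2).putDouble(y2)
                .array();
    }

    // Returns {x1, y1, x2, y2}; rejects values of the wrong width.
    public static double[] fromBytes(byte[] bytes) {
        if (bytes.length != WIDTH) {
            throw new IllegalArgumentException("expected " + WIDTH + " bytes, got " + bytes.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        return new double[] {buf.getDouble(), buf.getDouble(), buf.getDouble(), buf.getDouble()};
    }

    public static void main(String[] args) {
        byte[] encoded = toBytes(1.5, -2.0, 10.25, 4.0);
        double[] line = fromBytes(encoded);
        System.out.println(encoded.length + " bytes: ("
                + line[0] + ", " + line[1] + ") -> (" + line[2] + ", " + line[3] + ")");
    }
}
```

Raw IEEE 754 doubles do not sort in numeric order as bytes, so a layout like this would only suit a non-key column; a row key would need an order-preserving encoding.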

If you go with your own data type, use null in place of new LongCodec().
Only primitive types have codecs - they provide a way to
serialize/deserialize without creating Java objects. Then you'd need to
implement the serialization of Instant through the toBytes() method and the
deserialization through the toObject() method of your new PDataType enum.
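
A standalone sketch of such a serialization pair is below. It assumes Instant can be reduced to epoch milliseconds and uses java.time.Instant as a stand-in for whatever Instant class you have; the class name and method signatures are hypothetical and are not the actual PDataType signatures, so the bodies would need adapting to the real overrides.

```java
import java.nio.ByteBuffer;
import java.time.Instant;

// Hypothetical helper; the PDataType overrides would delegate to
// something like this.
public class InstantSerde {

    // Serialize as 8 big-endian bytes of epoch millis with the sign bit
    // flipped, so the bytes sort in chronological order.
    // Sub-millisecond precision is dropped.
    public static byte[] toBytes(Instant instant) {
        long millis = instant.toEpochMilli();
        return ByteBuffer.allocate(Long.BYTES)
                .putLong(millis ^ Long.MIN_VALUE)
                .array();
    }

    // Deserialize from a slice of a larger buffer, since values are
    // typically read in place rather than copied out first.
    public static Instant toObject(byte[] bytes, int offset, int length) {
        if (length != Long.BYTES) {
            throw new IllegalArgumentException("expected " + Long.BYTES + " bytes, got " + length);
        }
        long millis = ByteBuffer.wrap(bytes, offset, length).getLong() ^ Long.MIN_VALUE;
        return Instant.ofEpochMilli(millis);
    }

    public static void main(String[] args) {
        Instant original = Instant.parse("2014-05-27T18:23:11Z");
        byte[] encoded = toBytes(original);
        Instant decoded = toObject(encoded, 0, encoded.length);
        System.out.println(original.equals(decoded));
    }
}
```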

Thanks,
James


On Tue, May 27, 2014 at 10:41 AM, faisal moeen <fmorakzai@gmail.com> wrote:

> Hi James,
>
> I could flatten the datatypes using arrays, but I guess that's not supported
> by Phoenix yet. How would you suggest representing a 2D line with existing
> datatypes?
>
> If I go down the existing way, what are the pros and cons? If I define my
> datatype like this:
>
> MYDT("MYDT", 123, Instant.class, new LongCodec()) {}
>
> Won't it give me an object of "Instant" at the client?
>
> Regards
> Faisal Moeen
>
>
> On Tue, May 27, 2014 at 7:10 PM, James Taylor <jamestaylor@apache.org> wrote:
>
>> Hi Faisal,
>> Thanks for sharing that document - it's very interesting. Sounds like a
>> good use case for a STRUCT (
>> https://issues.apache.org/jira/browse/PHOENIX-477), but we don't have
>> support for that yet.
>>
>> You could continue down the path you're going by implementing your own
>> data type, but none of the SQL tooling would know how to interpret your
>> type. Have you considered flattening your data model into primitives? It's
>> not as elegant, but instead of passing a struct through to your built-in
>> functions, you could for example pass in the primitives (or arrays). This
>> might be sufficient for understanding the best HBase schema design and
>> benchmarking over various row key designs.
>>
>> Regards,
>> James
>>
>>
>> On Tue, May 27, 2014 at 9:41 AM, faisal moeen <fmorakzai@gmail.com> wrote:
>>
>>> Hi James,
>>>
>>> Some of my datatypes can use the existing sql types but most of them
>>> require to have custom implementations. I plan to start with point & line.
>>> Please find attached the explanation of some of the types. I prefer to have
>>> its support at the server level because I also plan to implement a
>>> spatio-temporal join later. In that case, the local join would be through a
>>> co-processor so I will need the support of the types and operations at the
>>> region servers.
>>>
>>> Regards
>>>
>>>
>>> On Tue, May 27, 2014 at 6:30 PM, James Taylor <jamestaylor@apache.org> wrote:
>>>
>>>> Hi Faisal,
>>>> That sounds very interesting. What will be the structure of your custom
>>>> data type? Would you be able to get away with using the existing fixed
>>>> binary type and interpret the value at the application level? Or perhaps
>>>> an array of one of our primitive types? In general, each PDataType maps to a
>>>> SQL type (http://docs.oracle.com/javase/6/docs/api/java/sql/Types.html).
>>>> Thanks,
>>>> James
>>>>
>>>>
>>>> On Tue, May 27, 2014 at 9:04 AM, faisal moeen <fmorakzai@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am trying to implement custom spatio-temporal datatypes for Phoenix.
>>>>> How can I do that?
>>>>>
>>>>> Until now I have added my custom datatype "MYDT" to PDataType and
>>>>> specified it to have a sqltype=123.
>>>>> I chose this number because I am not sure what to add here. I can
>>>>> create a table with this but when I insert something, its type is shown
>>>>> as NULL.
>>>>>
>>>>> I am using HBase 0.94.18 with Phoenix 3.0.
>>>>>
>>>>> Any help is appreciated.
>>>>>
>>>>> Regards
>>>>>
>>>>> --
>>>>>
>>>>> Regards
>>>>> Faisal Moeen
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> Regards
>>> Faisal Moeen
>>>
>>
>>
>
>
> --
>
> Regards
> Faisal Moeen
>
