phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From James Taylor <jamestay...@apache.org>
Subject Re: Select dynamic column content
Date Thu, 08 Jan 2015 02:53:21 GMT
If the dynamic column approach with the cf.* feature (PHOENIX-374) meets
your needs, that's good feedback. FWIW, you would not need to create all
the views up front at schema creation time. You can create them
on-the-fly. All views share the same, single underlying HBase table, so no
HBase metadata operations would be performed when you create a new
view. Creating a new view more or less correspond to doing a Put of a few
rows in the SYSTEM.CATALOG table (the HBase table that holds your
metadata), so it's not an expensive/heavy-weight operation to create and
alter them (as it is with a typical RDBMS). Thanks to HBase, you can create
an "infinite" set of them.

Thanks,
James

On Wed, Jan 7, 2015 at 12:58 AM, Sumanta Gh <sumanta.gh@tcs.com> wrote:

> Thanks James for replying.
> The below example is really a smart way to map dynamic columns into static
> ones. I will implement the idea in another case.
> But I can not create these views at schema creation time and I really have
> an infinite set of values for event_type.
> Keeping a column for all dynamic column names in a cell is perhaps better
> solution for my case.
> Finally, I insist the community to provide an API similar to cf.* to solve
> this kind of issue with simplicity.
>
> Regards
> Sumanta
>
>
> -----James Taylor <jamestaylor@apache.org> wrote: -----
> To: user <user@phoenix.apache.org>
> From: James Taylor <jamestaylor@apache.org>
> Date: 01/07/2015 01:35PM
>
> Subject: Re: Select dynamic column content
>
> Hi Sumanta,
> Another alternative option is to leverage support for VIEWs in Phoenix (
> http://phoenix.apache.org/views.html). In many use cases I've seen where
> there are hundreds of sparse columns defined for a schema, there's a column
> that determines *which* sparse columns are applicable for a given row. An
> example would be a metric_event table in which an event_type column value
> defines which columns are associated with a given event_type. This might be
> modelled with a base table and a series of views like this:
>
> -- base table
> CREATE TABLE metric_event (
>     event_type CHAR(2),
>     event_id VARCHAR,
>     created_date DATE,
>     CONSTRAINT pk PRIMARY KEY (event_type, event_id));
>
> -- updatable view for garbage collection event
> CREATE VIEW gc_event( -- define columns specific to a GC event
>     duration BIGINT,
>     eden_space BIGINT,
>     survivor_space BIGINT,
>     ...)
> AS SELECT * FROM event WHERE event_type = 'GC'
>
> -- updatable view for IO event
> CREATE VIEW io_event( -- define columns specific to an IO event
>     io_wait_time BIGINT,
>     iops BIGINT,
>     read_workload INTEGER,
>     write_workload INTEGER,
>     ...)
> AS SELECT * FROM event WHERE event_type = 'IO'
>
> Using this approach, Phoenix will keep track of the columns each view for
> you so you don't have to. All views reside in the same HBase table (the one
> corresponding to your base table). When you do a SELECT * FROM io_event,
> you'll get back only those columns defined for that view plus the columns
> from the base table.
>
> A further orthogonal level of granularity allows you to define your base
> table as a MULTI_TENANT table and have each VIEW vary depending on the
> particular user (identified at connection time through a property). Just as
> with the scenario described above, each user may define different columns
> that only exist for that particular user. See
> http://phoenix.apache.org/multi-tenancy.html for more info.
>
> HTH,
>
>     James
>
> On Tue, Jan 6, 2015 at 9:43 AM, Jesse Yates <jesse.k.yates@gmail.com>
> wrote:
>
>> And it looks like you already figured that out :)
>>
>>
>> On Tue, Jan 6, 2015, 9:43 AM Jesse Yates <jesse.k.yates@gmail.com> wrote:
>>
>>> You wouldn't even need another table, just a single VARCHAR[] column to
>>> keep the column names. Its ideal to keep it in the same row (possibly in
>>> another cf if you expect it to be large) since you get ACID compliance on
>>> that row, which you wouldn't get from using another table. You then just
>>> upsert the names column at the same time you upsert the dynamic columns.
>>>
>>> Phoenix does something similar for tracing where there is an unknown
>>> number of annotations - we keep a trace annotation count column which then
>>> can be used to figure out the dynamic annotation column names (which are
>>> things like annotations.a0, .a1, .a2, etc)
>>>
>>> The downside is that you then need to do two queries to get all the
>>> columns, but until we implement the cf.* logic for dynamic columns, that's
>>> the best you can do.
>>>
>>> - jesse
>>>
>>> On Tue, Jan 6, 2015, 9:23 AM Sumanta Gh <sumanta.gh@tcs.com> wrote:
>>>
>>>> Thanks Nicolas for replying.
>>>> I am already managing dynamic column names either putting them in a
>>>> separate column or keeping the names in cache.
>>>> But looking at the pace Phoenix is evolving, IMHO this cf.* query
>>>> pattern would be very much helpful for users.
>>>> The stock HBase client is capable of doing that.
>>>> Because my table is extremely sparsed and I allow quite a random schema
>>>> in every row of the table, getting the content of the dynamic column in a
>>>> single query will save lot of time.
>>>> Looking forward to your completion of the work...
>>>>
>>>> Regards
>>>> Sumanta
>>>>
>>>>
>>>> -----Nicolas Maillard <nmaillard@hortonworks.com> wrote: -----
>>>> To: user@phoenix.apache.org
>>>> From: Nicolas Maillard <nmaillard@hortonworks.com>
>>>> Date: 01/06/2015 03:08PM
>>>> Subject: Re: Select dynamic column content
>>>>
>>>>
>>>> Hello Sumanta
>>>>
>>>> This is a last bit missing in the select cf.* query pattern that would
>>>> bring back not only known columns but all dynamic ones also. I never got
>>>> around to finishing that bit for different reasons. The best way would be
>>>> to finish that, other than that I do not see an easy way to retrieve
>>>>  dynamic columns of which you have lost the column name. I guess if there
>>>> is a logic ti these column names you could try to reconstruct or keep a
>>>> second table of keys to dynamic column names to find them in the after math.
>>>>
>>>> regards
>>>>
>>>> On Tue, Jan 6, 2015 at 8:01 AM, Sumanta Gh <sumanta.gh@tcs.com> wrote:
>>>>
>>>>> Hi,
>>>>> It has been a puzzle for me to get the content of dynamic columns in
a
>>>>> single SELECT * FROM query.
>>>>> Challenge is that I do not know the dynamic column names in advance so
>>>>> I can not mention a dynamic column in the SELECT query.
>>>>> Is there any way out? Please suggest.
>>>>>
>>>>> Regards
>>>>> Sumanta
>>>>>
>>>>> =====-----=====-----=====
>>>>> Notice: The information contained in this e-mail
>>>>> message and/or attachments to it may contain
>>>>> confidential or privileged information. If you are
>>>>> not the intended recipient, any dissemination, use,
>>>>> review, distribution, printing or copying of the
>>>>> information contained in this e-mail message
>>>>> and/or attachments to it are strictly prohibited. If
>>>>> you have received this communication in error,
>>>>> please notify us by reply e-mail or telephone and
>>>>> immediately and permanently delete the message
>>>>> and any attachments. Thank you
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Nicolas Maillard Solution Engineer
>>>> Phone:     +33 (0)6 68 17 66 05
>>>> Email:      nmaillard*@hortonworks.com* <nmaillard@hortonworks.com>
>>>> Website:   http://www.hortonworks.com
>>>>
>>>>
>>>>
>>>> CONFIDENTIALITY NOTICE
>>>> NOTICE: This message is intended for the use of the individual or
>>>> entity to which it is addressed and may contain information that is
>>>> confidential, privileged and exempt from disclosure under applicable law.
>>>> If the reader of this message is not the intended recipient, you are hereby
>>>> notified that any printing, copying, dissemination, distribution,
>>>> disclosure or forwarding of this communication is strictly prohibited. If
>>>> you have received this communication in error, please contact the sender
>>>> immediately and delete it from your system. Thank You.
>>>>
>>>
>

Mime
View raw message