phoenix-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sumanta Gh <sumanta...@tcs.com>
Subject Re: Select dynamic column content
Date Wed, 07 Jan 2015 08:58:25 GMT
Thanks James for replying. 
The below example is really a smart way to map dynamic columns into static ones. I will implement
the idea in another case.
But I can not create these views at schema creation time and I really have an infinite set
of values for event_type.
Keeping a column for all dynamic column names in a cell is perhaps better solution for my
case.
Finally, I insist the community to provide an API similar to cf.* to solve this kind of issue
with simplicity.

Regards
Sumanta 


-----James Taylor <jamestaylor@apache.org> wrote: -----
To: user <user@phoenix.apache.org>
From: James Taylor <jamestaylor@apache.org>
Date: 01/07/2015 01:35PM
Subject: Re: Select dynamic column content

Hi Sumanta,
Another alternative option is to leverage support for VIEWs in Phoenix (http://phoenix.apache.org/views.html).
In many use cases I've seen where there are hundreds of sparse columns defined for a schema,
there's a column that determines *which* sparse columns are applicable for a given row. An
example would be a metric_event table in which an event_type column value defines which columns
are associated with a given event_type. This might be modelled with a base table and a series
of views like this:

-- base table
CREATE TABLE metric_event (
    event_type CHAR(2), 
    event_id VARCHAR,
    created_date DATE,
    CONSTRAINT pk PRIMARY KEY (event_type, event_id));

-- updatable view for garbage collection event
CREATE VIEW gc_event( -- define columns specific to a GC event
    duration BIGINT,
    eden_space BIGINT,
    survivor_space BIGINT,
    ...)
AS SELECT * FROM event WHERE event_type = 'GC'

-- updatable view for IO event
CREATE VIEW io_event( -- define columns specific to an IO event
    io_wait_time BIGINT,
    iops BIGINT,
    read_workload INTEGER,
    write_workload INTEGER,
    ...)
AS SELECT * FROM event WHERE event_type = 'IO'

Using this approach, Phoenix will keep track of the columns each view for you so you don't
have to. All views reside in the same HBase table (the one corresponding to your base table).
When you do a SELECT * FROM io_event, you'll get back only those columns defined for that
view plus the columns from the base table.

A further orthogonal level of granularity allows you to define your base table as a MULTI_TENANT
table and have each VIEW vary depending on the particular user (identified at connection time
through a property). Just as with the scenario described above, each user may define different
columns that only exist for that particular user. See http://phoenix.apache.org/multi-tenancy.html
for more info.

HTH,

    James

On Tue, Jan 6, 2015 at 9:43 AM, Jesse Yates <jesse.k.yates@gmail.com> wrote:
And it looks like you already figured that out :)


On Tue, Jan 6, 2015, 9:43 AM Jesse Yates <jesse.k.yates@gmail.com> wrote:
You wouldn't even need another table, just a single VARCHAR[] column to keep the column names.
Its ideal to keep it in the same row (possibly in another cf if you expect it to be large)
since you get ACID compliance on that row, which you wouldn't get from using another table.
You then just upsert the names column at the same time you upsert the dynamic columns.

Phoenix does something similar for tracing where there is an unknown number of annotations
- we keep a trace annotation count column which then can be used to figure out the dynamic
annotation column names (which are things like annotations.a0, .a1, .a2, etc)

The downside is that you then need to do two queries to get all the columns, but until we
implement the cf.* logic for dynamic columns, that's the best you can do.

- jesse

On Tue, Jan 6, 2015, 9:23 AM Sumanta Gh <sumanta.gh@tcs.com> wrote:
Thanks Nicolas for replying.
I am already managing dynamic column names either putting them in a separate column or keeping
the names in cache.
But looking at the pace Phoenix is evolving, IMHO this cf.* query pattern would be very much
helpful for users.
The stock HBase client is capable of doing that.
Because my table is extremely sparsed and I allow quite a random schema in every row of the
table, getting the content of the dynamic column in a single query will save lot of time.
Looking forward to your completion of the work...

Regards
Sumanta 


-----Nicolas Maillard <nmaillard@hortonworks.com> wrote: -----
To: user@phoenix.apache.org
From: Nicolas Maillard <nmaillard@hortonworks.com>
Date: 01/06/2015 03:08PM
Subject: Re: Select dynamic column content


Hello Sumanta

This is a last bit missing in the select cf.* query pattern that would bring back not only
known columns but all dynamic ones also. I never got around to finishing that bit for different
reasons. The best way would be to finish that, other than that I do not see an easy way to
retrieve  dynamic columns of which you have lost the column name. I guess if there is a logic
ti these column names you could try to reconstruct or keep a second table of keys to dynamic
column names to find them in the after math.

regards

On Tue, Jan 6, 2015 at 8:01 AM, Sumanta Gh <sumanta.gh@tcs.com> wrote:
Hi, 
It has been a puzzle for me to get the content of dynamic columns in a single SELECT * FROM
query. 
Challenge is that I do not know the dynamic column names in advance so I can not mention a
dynamic column in the SELECT query. 
Is there any way out? Please suggest.

Regards
Sumanta
=====-----=====-----=====
Notice: The information contained in this e-mail
message and/or attachments to it may contain 
confidential or privileged information. If you are 
not the intended recipient, any dissemination, use, 
review, distribution, printing or copying of the 
information contained in this e-mail message 
and/or attachments to it are strictly prohibited. If 
you have received this communication in error, 
please notify us by reply e-mail or telephone and 
immediately and permanently delete the message 
and any attachments. Thank you




-- 
Nicolas Maillard Solution Engineer
Phone:ÿÿ ÿÿ+33 (0)6 68 17 66 05
Email:ÿÿÿ ÿÿnmaillard@hortonworks.com
Website:ÿÿÿhttp://www.hortonworks.com



CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to which it is addressed
and may contain information that is confidential, privileged and exempt from disclosure under
applicable law. If the reader of this message is not the intended recipient, you are hereby
notified that any printing, copying, dissemination, distribution, disclosure or forwarding
of this communication is strictly prohibited. If you have received this communication in error,
please contact the sender immediately and delete it from your system. Thank You.


Mime
View raw message