phoenix-user mailing list archives

From "Vasudevan, Ramkrishna S" <ramkrishna.s.vasude...@intel.com>
Subject RE: short name for columns
Date Tue, 20 Jan 2015 06:37:40 GMT
Hi

Currently the HBase block encoding feature tries to avoid as much duplication as possible in
the row keys, family names, and column qualifier names.  If there are two cells

Row1/cf1:qual1/val1
Row1/cf1:qual2/val2 

then we try to find the common part of the two keys.  The first key is stored as is, but
for the second key we do not write the common part, from 'Row1' up to 'qual', because the
row and CF are the same.  Even the qualifier names share the common prefix 'qual'.
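
The idea can be sketched with a toy prefix encoder. This is only an illustration, assuming each
key is a flat byte string such as b"Row1/cf1:qual1"; it is not HBase's actual encoder or on-disk
format, and the function names are made up for the example.

```python
def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def prefix_encode(keys):
    """Encode each key as (bytes shared with the previous key, remaining suffix)."""
    encoded = []
    prev = b""
    for key in keys:
        shared = common_prefix_len(prev, key)
        encoded.append((shared, key[shared:]))
        prev = key
    return encoded


def prefix_decode(encoded):
    """Rebuild the full keys from (shared length, suffix) pairs."""
    keys = []
    prev = b""
    for shared, suffix in encoded:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys


keys = [b"Row1/cf1:qual1", b"Row1/cf1:qual2"]
encoded = prefix_encode(keys)
# The first key is stored whole; the second keeps only the byte after "Row1/cf1:qual".
print(encoded)
```

In this toy model the second key shrinks from 14 bytes to a single suffix byte plus the
shared-length counter.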

So if the key values have more repetitive parts, we get better encoding.  So maybe in the
Phoenix layer, if we find that the column names are long and do not follow a repetitive naming
structure, we could rename the column qualifiers to make use of the above encoding capability.
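
A rough sketch of such a renaming layer follows. Phoenix does not do this today, so everything
here is hypothetical: the ColumnAliases class, the short_name helper, and the sample column
names are invented for illustration, and a real implementation would keep the mapping in a
metadata table rather than in memory.

```python
import string


def short_name(index: int) -> str:
    """Map 0 -> 'a', 25 -> 'z', 26 -> 'aa', 27 -> 'ab', and so on."""
    letters = string.ascii_lowercase
    name = ""
    index += 1  # switch to 1-based bijective base-26 numbering
    while index > 0:
        index, rem = divmod(index - 1, 26)
        name = letters[rem] + name
    return name


class ColumnAliases:
    """Hypothetical in-memory mapping between real column names and short qualifiers."""

    def __init__(self):
        self._to_short = {}
        self._to_long = {}

    def register(self, column: str) -> str:
        """Assign the next free short name to a column (idempotent)."""
        if column not in self._to_short:
            alias = short_name(len(self._to_short))
            self._to_short[column] = alias
            self._to_long[alias] = column
        return self._to_short[column]

    def encode_row(self, row: dict) -> dict:
        """Translate real column names to short qualifiers before writing."""
        return {self._to_short[name]: value for name, value in row.items()}

    def decode_row(self, row: dict) -> dict:
        """Translate short qualifiers back to real names for the result set."""
        return {self._to_long[alias]: value for alias, value in row.items()}


aliases = ColumnAliases()
for column in ["customer_first_name", "customer_last_name"]:
    aliases.register(column)

stored = aliases.encode_row({"customer_first_name": "Ann", "customer_last_name": "Lee"})
restored = aliases.decode_row(stored)
```

Short qualifiers reduce the bytes stored per cell even without block encoding, and the
encoding then has less left to factor out.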

Regards
Ram

-----Original Message-----
From: James Taylor [mailto:jamestaylor@apache.org] 
Sent: Monday, January 19, 2015 10:30 PM
To: user
Subject: Re: short name for columns

Good idea. Phoenix doesn't do that today. I'm hoping that HBase can come up with better block
encodings that factor this kind of information out without perf taking a hit. They actually
have one (TRIE), but I'm not sure how stable it is. Also, I'm not sure how well the existing
encodings do for this (maybe good enough?).

Please file a JIRA. Thanks,

James

On Mon, Jan 19, 2015 at 7:41 AM, Anil Gupta <anilgupta84@gmail.com> wrote:
> You mean to have support for column aliases?
> If yes, then +1 for that.
>
> Sent from my iPhone
>
> On Jan 19, 2015, at 3:49 AM, Bulvik, Noam <Noam.Bulvik@teoco.com> wrote:
>
> Hi,
>
>
>
> Do you plan to support assigning short names to columns as part of
> Phoenix's features? I.e., when creating a table using Phoenix DDL, there
> would be a metadata table that maps each column name to a short
> name (like a,b,c … aa,bb….). Each time there is a query, the SQL
> that the user writes would be converted to the short names to query
> the db, and the short names would be converted back to the real names
> in the result set.
>
>
>
> This may save a lot of space because the name of a column is part of 
> each row saved in the files.
>
>
>
> Regards,
>
> Noam