Posted to user@phoenix.apache.org by "Bulvik, Noam" <No...@teoco.com> on 2015/01/19 12:49:23 UTC

short name for columns

Hi,

Do you plan to support assigning short names to columns as a Phoenix feature? That is, when a table is created with Phoenix DDL, a metadata table would map each column name to a short name (a, b, c, ... aa, bb, ...). For each query, the SQL written by the user would be converted to the short names before querying the database, and the short names would be converted back to the real names in the result set.

This may save a lot of space because the name of a column is part of each row saved in the files.
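
A minimal sketch of the mapping described above, in Python and for illustration only. The a, b, ..., z, aa, ab, ... ordering, the sample column names, and the helper names (short_names, build_mapping, bytes_saved) are assumptions, not anything Phoenix provides. The size estimate ignores compression and block encoding.

```python
import itertools
import string


def short_names():
    """Yield a, b, ..., z, aa, ab, ... (an assumed, spreadsheet-style scheme)."""
    for length in itertools.count(1):
        for letters in itertools.product(string.ascii_lowercase, repeat=length):
            yield "".join(letters)


def build_mapping(columns):
    """Assign each declared column name the next unused short name."""
    return dict(zip(columns, short_names()))


def bytes_saved(mapping, row_count):
    """Rough saving in name bytes if every row stores every column once."""
    per_row = sum(len(name) - len(short) for name, short in mapping.items())
    return per_row * row_count


# Hypothetical column names, as they would appear in the DDL.
columns = ["customer_identifier", "subscription_start_date", "total_bytes_transferred"]
to_short = build_mapping(columns)
to_long = {short: name for name, short in to_short.items()}

# A stored row keyed by short qualifiers is translated back for the result set.
stored_row = {"a": 42, "b": "2015-01-19", "c": 1024}
result_row = {to_long[qualifier]: value for qualifier, value in stored_row.items()}

print(to_short)
print(result_row)
print(bytes_saved(to_short, row_count=1_000_000))
```

The metadata table would hold the equivalent of to_short, and the reverse lookup to_long is what restores the real names in the result set.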

Regards,
Noam
Information in this e-mail and its attachments is confidential and privileged under the TEOCO confidentiality terms that can be reviewed here<http://www.teoco.com/email-disclaimer>.

Re: short name for columns

Posted by James Taylor <ja...@apache.org>.
Thanks, Noam. I opened HBASE-12883 as well. I think this kind of pure
storage optimization should be done at the HBase level.

    James

On Mon, Jan 19, 2015 at 11:07 PM, Bulvik, Noam <No...@teoco.com> wrote:
> I opened https://issues.apache.org/jira/browse/PHOENIX-1598
> This feature can be used with prefix encoding; there is no contradiction between these two features.

RE: short name for columns

Posted by "Bulvik, Noam" <No...@teoco.com>.
I opened https://issues.apache.org/jira/browse/PHOENIX-1598
This feature can be used with prefix encoding; there is no contradiction between these two features.

-----Original Message-----
From: James Taylor [mailto:jamestaylor@apache.org]
Sent: Monday, January 19, 2015 7:00 PM
To: user
Subject: Re: short name for columns

Good idea. Phoenix doesn't do that today. I'm hoping that HBase can come up with better block encodings that factor this kind of information out without perf taking a hit. They actually have one (TRIE), but I'm not sure how stable it is. Also, I'm not sure how well the existing encodings do for this (maybe good enough?).

Please file a JIRA. Thanks,

James


RE: short name for columns

Posted by "Vasudevan, Ramkrishna S" <ra...@intel.com>.
Hi

Currently, the encoding feature tries to avoid as much duplication as possible in the row keys, family names, and column qualifier names. Suppose there are two cells:

Row1/cf1:qual1/val1
Row1/cf1:qual2/val2 

We then try to find the part that is common to both keys. The first key is stored as is, but for the second key we do not write the common part (from 'Row1' through 'qual'), because the row and column family are the same and even the qualifier names share the prefix 'qual'.

So the more repetitive the keys are, the better the encoding. Maybe, in the Phoenix layer, if we find that the column names are long and follow a non-repetitive naming structure, we could rename the column qualifiers to make use of this encoding capability.
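
The idea can be sketched in a few lines of Python. This is a simplified illustration, not HBase's actual on-disk format: the flat 'row/cf:qualifier' byte strings and the function names (common_prefix_len, prefix_encode, prefix_decode) are assumptions made for the example.

```python
def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def prefix_encode(keys):
    """Store each key as (bytes shared with the previous key, remaining suffix)."""
    encoded = []
    previous = b""
    for key in keys:
        shared = common_prefix_len(previous, key)
        encoded.append((shared, key[shared:]))
        previous = key
    return encoded


def prefix_decode(encoded):
    """Rebuild the original keys from (shared, suffix) pairs."""
    keys = []
    previous = b""
    for shared, suffix in encoded:
        key = previous[:shared] + suffix
        keys.append(key)
        previous = key
    return keys


def suffix_bytes(encoded):
    """Total key bytes actually written after prefix encoding."""
    return sum(len(suffix) for _, suffix in encoded)


# The two cells from the example above: only the final byte of the second key is written.
repetitive = [b"Row1/cf1:qual1", b"Row1/cf1:qual2"]
print(prefix_encode(repetitive))

# Hypothetical long qualifiers with no shared prefix benefit much less.
non_repetitive = [b"Row1/cf1:customer_identifier", b"Row1/cf1:total_bytes"]
print(suffix_bytes(prefix_encode(repetitive)), suffix_bytes(prefix_encode(non_repetitive)))
```

Renaming the qualifiers in the second list to short names would shrink the suffix that still has to be written for each cell.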

Regards
Ram

-----Original Message-----
From: James Taylor [mailto:jamestaylor@apache.org] 
Sent: Monday, January 19, 2015 10:30 PM
To: user
Subject: Re: short name for columns

Good idea. Phoenix doesn't do that today. I'm hoping that HBase can come up with better block encodings that factor this kind of information out without perf taking a hit. They actually have one (TRIE), but I'm not sure how stable it is. Also, I'm not sure how well the existing encodings do for this (maybe good enough?).

Please file a JIRA. Thanks,

James


Re: short name for columns

Posted by James Taylor <ja...@apache.org>.
Good idea. Phoenix doesn't do that today. I'm hoping that HBase can
come up with better block encodings that factor this kind of
information out without perf taking a hit. They actually have one
(TRIE), but I'm not sure how stable it is. Also, I'm not sure how well
the existing encodings do for this (maybe good enough?).

Please file a JIRA. Thanks,

James

On Mon, Jan 19, 2015 at 7:41 AM, Anil Gupta <an...@gmail.com> wrote:
> You mean to have a support for aliases for columns?
> If yes, then +1 for that.

Re: short name for columns

Posted by Anil Gupta <an...@gmail.com>.
You mean to have a support for aliases for columns?
If yes, then +1 for that.


> On Jan 19, 2015, at 3:49 AM, Bulvik, Noam <No...@teoco.com> wrote:
> 
> Hi,
>  
> Do you plan to support assigning short names to columns as a Phoenix feature? That is, when a table is created with Phoenix DDL, a metadata table would map each column name to a short name (a, b, c, ... aa, bb, ...). For each query, the SQL written by the user would be converted to the short names before querying the database, and the short names would be converted back to the real names in the result set.
>  
> This may save a lot of space because the name of a column is part of each row saved in the files.
>  
> Regards,
> Noam  