You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@phoenix.apache.org by Sumanta Gh <su...@tcs.com> on 2015/01/06 08:01:15 UTC

Select dynamic column content

Hi,
It has been a puzzle for me to get the content of dynamic columns in a 
single SELECT * FROM query. 
Challenge is that I do not know the dynamic column names in advance so I 
can not mention a dynamic column in the SELECT query.
Is there any way out? Please suggest.

Regards
Sumanta 
=====-----=====-----=====
Notice: The information contained in this e-mail
message and/or attachments to it may contain 
confidential or privileged information. If you are 
not the intended recipient, any dissemination, use, 
review, distribution, printing or copying of the 
information contained in this e-mail message 
and/or attachments to it are strictly prohibited. If 
you have received this communication in error, 
please notify us by reply e-mail or telephone and 
immediately and permanently delete the message 
and any attachments. Thank you



Re: Select dynamic column content

Posted by James Taylor <ja...@apache.org>.
Hi Sumanta,
Another alternative option is to leverage support for VIEWs in Phoenix (
http://phoenix.apache.org/views.html). In many use cases I've seen where
there are hundreds of sparse columns defined for a schema, there's a column
that determines *which* sparse columns are applicable for a given row. An
example would be a metric_event table in which an event_type column value
defines which columns are associated with a given event_type. This might be
modelled with a base table and a series of views like this:

-- base table
CREATE TABLE metric_event (
    event_type CHAR(2),
    event_id VARCHAR,
    created_date DATE,
    CONSTRAINT pk PRIMARY KEY (event_type, event_id));

-- updatable view for garbage collection event
CREATE VIEW gc_event( -- define columns specific to a GC event
    duration BIGINT,
    eden_space BIGINT,
    survivor_space BIGINT,
    ...)
AS SELECT * FROM event WHERE event_type = 'GC'

-- updatable view for IO event
CREATE VIEW io_event( -- define columns specific to an IO event
    io_wait_time BIGINT,
    iops BIGINT,
    read_workload INTEGER,
    write_workload INTEGER,
    ...)
AS SELECT * FROM event WHERE event_type = 'IO'

Using this approach, Phoenix will keep track of the columns each view for
you so you don't have to. All views reside in the same HBase table (the one
corresponding to your base table). When you do a SELECT * FROM io_event,
you'll get back only those columns defined for that view plus the columns
from the base table.

A further orthogonal level of granularity allows you to define your base
table as a MULTI_TENANT table and have each VIEW vary depending on the
particular user (identified at connection time through a property). Just as
with the scenario described above, each user may define different columns
that only exist for that particular user. See
http://phoenix.apache.org/multi-tenancy.html for more info.

HTH,

    James

On Tue, Jan 6, 2015 at 9:43 AM, Jesse Yates <je...@gmail.com> wrote:

> And it looks like you already figured that out :)
>
>
> On Tue, Jan 6, 2015, 9:43 AM Jesse Yates <je...@gmail.com> wrote:
>
>> You wouldn't even need another table, just a single VARCHAR[] column to
>> keep the column names. Its ideal to keep it in the same row (possibly in
>> another cf if you expect it to be large) since you get ACID compliance on
>> that row, which you wouldn't get from using another table. You then just
>> upsert the names column at the same time you upsert the dynamic columns.
>>
>> Phoenix does something similar for tracing where there is an unknown
>> number of annotations - we keep a trace annotation count column which then
>> can be used to figure out the dynamic annotation column names (which are
>> things like annotations.a0, .a1, .a2, etc)
>>
>> The downside is that you then need to do two queries to get all the
>> columns, but until we implement the cf.* logic for dynamic columns, that's
>> the best you can do.
>>
>> - jesse
>>
>> On Tue, Jan 6, 2015, 9:23 AM Sumanta Gh <su...@tcs.com> wrote:
>>
>>> Thanks Nicolas for replying.
>>> I am already managing dynamic column names either putting them in a
>>> separate column or keeping the names in cache.
>>> But looking at the pace Phoenix is evolving, IMHO this cf.* query
>>> pattern would be very much helpful for users.
>>> The stock HBase client is capable of doing that.
>>> Because my table is extremely sparsed and I allow quite a random schema
>>> in every row of the table, getting the content of the dynamic column in a
>>> single query will save lot of time.
>>> Looking forward to your completion of the work...
>>>
>>> Regards
>>> Sumanta
>>>
>>>
>>> -----Nicolas Maillard <nm...@hortonworks.com> wrote: -----
>>> To: user@phoenix.apache.org
>>> From: Nicolas Maillard <nm...@hortonworks.com>
>>> Date: 01/06/2015 03:08PM
>>> Subject: Re: Select dynamic column content
>>>
>>>
>>> Hello Sumanta
>>>
>>> This is a last bit missing in the select cf.* query pattern that would
>>> bring back not only known columns but all dynamic ones also. I never got
>>> around to finishing that bit for different reasons. The best way would be
>>> to finish that, other than that I do not see an easy way to retrieve
>>>  dynamic columns of which you have lost the column name. I guess if there
>>> is a logic ti these column names you could try to reconstruct or keep a
>>> second table of keys to dynamic column names to find them in the after math.
>>>
>>> regards
>>>
>>> On Tue, Jan 6, 2015 at 8:01 AM, Sumanta Gh <su...@tcs.com> wrote:
>>>
>>>> Hi,
>>>> It has been a puzzle for me to get the content of dynamic columns in a
>>>> single SELECT * FROM query.
>>>> Challenge is that I do not know the dynamic column names in advance so
>>>> I can not mention a dynamic column in the SELECT query.
>>>> Is there any way out? Please suggest.
>>>>
>>>> Regards
>>>> Sumanta
>>>>
>>>> =====-----=====-----=====
>>>> Notice: The information contained in this e-mail
>>>> message and/or attachments to it may contain
>>>> confidential or privileged information. If you are
>>>> not the intended recipient, any dissemination, use,
>>>> review, distribution, printing or copying of the
>>>> information contained in this e-mail message
>>>> and/or attachments to it are strictly prohibited. If
>>>> you have received this communication in error,
>>>> please notify us by reply e-mail or telephone and
>>>> immediately and permanently delete the message
>>>> and any attachments. Thank you
>>>>
>>>>
>>>
>>>
>>> --
>>> Nicolas Maillard Solution Engineer
>>> Phone:     +33 (0)6 68 17 66 05
>>> Email:      nmaillard*@hortonworks.com* <nm...@hortonworks.com>
>>> Website:   http://www.hortonworks.com
>>>
>>>
>>>
>>> CONFIDENTIALITY NOTICE
>>> NOTICE: This message is intended for the use of the individual or entity
>>> to which it is addressed and may contain information that is confidential,
>>> privileged and exempt from disclosure under applicable law. If the reader
>>> of this message is not the intended recipient, you are hereby notified that
>>> any printing, copying, dissemination, distribution, disclosure or
>>> forwarding of this communication is strictly prohibited. If you have
>>> received this communication in error, please contact the sender immediately
>>> and delete it from your system. Thank You.
>>>
>>

Re: Select dynamic column content

Posted by Jesse Yates <je...@gmail.com>.
And it looks like you already figured that out :)

On Tue, Jan 6, 2015, 9:43 AM Jesse Yates <je...@gmail.com> wrote:

> You wouldn't even need another table, just a single VARCHAR[] column to
> keep the column names. Its ideal to keep it in the same row (possibly in
> another cf if you expect it to be large) since you get ACID compliance on
> that row, which you wouldn't get from using another table. You then just
> upsert the names column at the same time you upsert the dynamic columns.
>
> Phoenix does something similar for tracing where there is an unknown
> number of annotations - we keep a trace annotation count column which then
> can be used to figure out the dynamic annotation column names (which are
> things like annotations.a0, .a1, .a2, etc)
>
> The downside is that you then need to do two queries to get all the
> columns, but until we implement the cf.* logic for dynamic columns, that's
> the best you can do.
>
> - jesse
>
> On Tue, Jan 6, 2015, 9:23 AM Sumanta Gh <su...@tcs.com> wrote:
>
>> Thanks Nicolas for replying.
>> I am already managing dynamic column names either putting them in a
>> separate column or keeping the names in cache.
>> But looking at the pace Phoenix is evolving, IMHO this cf.* query pattern
>> would be very much helpful for users.
>> The stock HBase client is capable of doing that.
>> Because my table is extremely sparsed and I allow quite a random schema
>> in every row of the table, getting the content of the dynamic column in a
>> single query will save lot of time.
>> Looking forward to your completion of the work...
>>
>> Regards
>> Sumanta
>>
>>
>> -----Nicolas Maillard <nm...@hortonworks.com> wrote: -----
>> To: user@phoenix.apache.org
>> From: Nicolas Maillard <nm...@hortonworks.com>
>> Date: 01/06/2015 03:08PM
>> Subject: Re: Select dynamic column content
>>
>>
>> Hello Sumanta
>>
>> This is a last bit missing in the select cf.* query pattern that would
>> bring back not only known columns but all dynamic ones also. I never got
>> around to finishing that bit for different reasons. The best way would be
>> to finish that, other than that I do not see an easy way to retrieve
>>  dynamic columns of which you have lost the column name. I guess if there
>> is a logic ti these column names you could try to reconstruct or keep a
>> second table of keys to dynamic column names to find them in the after math.
>>
>> regards
>>
>> On Tue, Jan 6, 2015 at 8:01 AM, Sumanta Gh <su...@tcs.com> wrote:
>>
>>> Hi,
>>> It has been a puzzle for me to get the content of dynamic columns in a
>>> single SELECT * FROM query.
>>> Challenge is that I do not know the dynamic column names in advance so I
>>> can not mention a dynamic column in the SELECT query.
>>> Is there any way out? Please suggest.
>>>
>>> Regards
>>> Sumanta
>>>
>>> =====-----=====-----=====
>>> Notice: The information contained in this e-mail
>>> message and/or attachments to it may contain
>>> confidential or privileged information. If you are
>>> not the intended recipient, any dissemination, use,
>>> review, distribution, printing or copying of the
>>> information contained in this e-mail message
>>> and/or attachments to it are strictly prohibited. If
>>> you have received this communication in error,
>>> please notify us by reply e-mail or telephone and
>>> immediately and permanently delete the message
>>> and any attachments. Thank you
>>>
>>>
>>
>>
>> --
>> Nicolas Maillard Solution Engineer
>> Phone:     +33 (0)6 68 17 66 05
>> Email:      nmaillard*@hortonworks.com* <nm...@hortonworks.com>
>> Website:   http://www.hortonworks.com
>>
>>
>>
>> CONFIDENTIALITY NOTICE
>> NOTICE: This message is intended for the use of the individual or entity
>> to which it is addressed and may contain information that is confidential,
>> privileged and exempt from disclosure under applicable law. If the reader
>> of this message is not the intended recipient, you are hereby notified that
>> any printing, copying, dissemination, distribution, disclosure or
>> forwarding of this communication is strictly prohibited. If you have
>> received this communication in error, please contact the sender immediately
>> and delete it from your system. Thank You.
>>
>

Re: Select dynamic column content

Posted by Jesse Yates <je...@gmail.com>.
You wouldn't even need another table, just a single VARCHAR[] column to
keep the column names. Its ideal to keep it in the same row (possibly in
another cf if you expect it to be large) since you get ACID compliance on
that row, which you wouldn't get from using another table. You then just
upsert the names column at the same time you upsert the dynamic columns.

Phoenix does something similar for tracing where there is an unknown number
of annotations - we keep a trace annotation count column which then can be
used to figure out the dynamic annotation column names (which are things
like annotations.a0, .a1, .a2, etc)

The downside is that you then need to do two queries to get all the
columns, but until we implement the cf.* logic for dynamic columns, that's
the best you can do.

- jesse

On Tue, Jan 6, 2015, 9:23 AM Sumanta Gh <su...@tcs.com> wrote:

> Thanks Nicolas for replying.
> I am already managing dynamic column names either putting them in a
> separate column or keeping the names in cache.
> But looking at the pace Phoenix is evolving, IMHO this cf.* query pattern
> would be very much helpful for users.
> The stock HBase client is capable of doing that.
> Because my table is extremely sparsed and I allow quite a random schema in
> every row of the table, getting the content of the dynamic column in a
> single query will save lot of time.
> Looking forward to your completion of the work...
>
> Regards
> Sumanta
>
>
> -----Nicolas Maillard <nm...@hortonworks.com> wrote: -----
> To: user@phoenix.apache.org
> From: Nicolas Maillard <nm...@hortonworks.com>
> Date: 01/06/2015 03:08PM
> Subject: Re: Select dynamic column content
>
>
> Hello Sumanta
>
> This is a last bit missing in the select cf.* query pattern that would
> bring back not only known columns but all dynamic ones also. I never got
> around to finishing that bit for different reasons. The best way would be
> to finish that, other than that I do not see an easy way to retrieve
>  dynamic columns of which you have lost the column name. I guess if there
> is a logic ti these column names you could try to reconstruct or keep a
> second table of keys to dynamic column names to find them in the after math.
>
> regards
>
> On Tue, Jan 6, 2015 at 8:01 AM, Sumanta Gh <su...@tcs.com> wrote:
>
>> Hi,
>> It has been a puzzle for me to get the content of dynamic columns in a
>> single SELECT * FROM query.
>> Challenge is that I do not know the dynamic column names in advance so I
>> can not mention a dynamic column in the SELECT query.
>> Is there any way out? Please suggest.
>>
>> Regards
>> Sumanta
>>
>> =====-----=====-----=====
>> Notice: The information contained in this e-mail
>> message and/or attachments to it may contain
>> confidential or privileged information. If you are
>> not the intended recipient, any dissemination, use,
>> review, distribution, printing or copying of the
>> information contained in this e-mail message
>> and/or attachments to it are strictly prohibited. If
>> you have received this communication in error,
>> please notify us by reply e-mail or telephone and
>> immediately and permanently delete the message
>> and any attachments. Thank you
>>
>>
>
>
> --
> Nicolas Maillard Solution Engineer
> Phone:     +33 (0)6 68 17 66 05
> Email:      nmaillard*@hortonworks.com* <nm...@hortonworks.com>
> Website:   http://www.hortonworks.com
>
>
>
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity
> to which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified that
> any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.
>

Re: Select dynamic column content

Posted by Sumanta Gh <su...@tcs.com>.
Thanks Nicolas for replying.
I am already managing dynamic column names either putting them in a separate column or keeping the names in cache.
But looking at the pace Phoenix is evolving, IMHO this cf.* query pattern would be very much helpful for users.
The stock HBase client is capable of doing that.
Because my table is extremely sparsed and I allow quite a random schema in every row of the table, getting the content of the dynamic column in a single query will save lot of time.
Looking forward to your completion of the work...

Regards
Sumanta 


-----Nicolas Maillard <nm...@hortonworks.com> wrote: -----
To: user@phoenix.apache.org
From: Nicolas Maillard <nm...@hortonworks.com>
Date: 01/06/2015 03:08PM
Subject: Re: Select dynamic column content

Hello Sumanta

This is a last bit missing in the select cf.* query pattern that would bring back not only known columns but all dynamic ones also. I never got around to finishing that bit for different reasons. The best way would be to finish that, other than that I do not see an easy way to retrieve  dynamic columns of which you have lost the column name. I guess if there is a logic ti these column names you could try to reconstruct or keep a second table of keys to dynamic column names to find them in the after math.

regards

On Tue, Jan 6, 2015 at 8:01 AM, Sumanta Gh <su...@tcs.com> wrote:
Hi, 
It has been a puzzle for me to get the content of dynamic columns in a single SELECT * FROM query. 
Challenge is that I do not know the dynamic column names in advance so I can not mention a dynamic column in the SELECT query. 
Is there any way out? Please suggest.

Regards
Sumanta
=====-----=====-----=====
Notice: The information contained in this e-mail
message and/or attachments to it may contain 
confidential or privileged information. If you are 
not the intended recipient, any dissemination, use, 
review, distribution, printing or copying of the 
information contained in this e-mail message 
and/or attachments to it are strictly prohibited. If 
you have received this communication in error, 
please notify us by reply e-mail or telephone and 
immediately and permanently delete the message 
and any attachments. Thank you




-- 
Nicolas Maillard Solution Engineer
Phone:     +33 (0)6 68 17 66 05
Email:      nmaillard@hortonworks.com
Website:   http://www.hortonworks.com



CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.

Re: Select dynamic column content

Posted by Nicolas Maillard <nm...@hortonworks.com>.
Hello Sumanta

This is a last bit missing in the select cf.* query pattern that would
bring back not only known columns but all dynamic ones also. I never got
around to finishing that bit for different reasons. The best way would be
to finish that, other than that I do not see an easy way to retrieve
 dynamic columns of which you have lost the column name. I guess if there
is a logic ti these column names you could try to reconstruct or keep a
second table of keys to dynamic column names to find them in the after math.

regards

On Tue, Jan 6, 2015 at 8:01 AM, Sumanta Gh <su...@tcs.com> wrote:

> Hi,
> It has been a puzzle for me to get the content of dynamic columns in a
> single SELECT * FROM query.
> Challenge is that I do not know the dynamic column names in advance so I
> can not mention a dynamic column in the SELECT query.
> Is there any way out? Please suggest.
>
> Regards
> Sumanta
>
> =====-----=====-----=====
> Notice: The information contained in this e-mail
> message and/or attachments to it may contain
> confidential or privileged information. If you are
> not the intended recipient, any dissemination, use,
> review, distribution, printing or copying of the
> information contained in this e-mail message
> and/or attachments to it are strictly prohibited. If
> you have received this communication in error,
> please notify us by reply e-mail or telephone and
> immediately and permanently delete the message
> and any attachments. Thank you
>
>


-- 
Nicolas Maillard Solution Engineer
Phone:     +33 (0)6 68 17 66 05
Email:      nmaillard*@hortonworks.com* <nm...@hortonworks.com>
Website:   http://www.hortonworks.com

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.