Posted to user@phoenix.apache.org by Shawn Li <sh...@gmail.com> on 2018/12/26 21:59:43 UTC

column mapping schema decoding

Hi,

Phoenix 4.10 introduced the column mapping feature. There are four types of
mapping scheme (https://phoenix.apache.org/columnencoding.html). Is there
any documentation that shows how a string column name in Phoenix is
encoded/mapped to a numeric column qualifier in HBase?

We are using the Lily HBase Indexer for batch indexing, so if the column
qualifier is a number, we need to find a way to decode it back to the
original string column name.

Thanks,
Shawn

Re: column mapping schema decoding

Posted by Thomas D'Silva <td...@salesforce.com>.

The encoded column qualifiers do not start at one (see
QueryConstants.ENCODED_CQ_COUNTER_INITIAL_VALUE). It's best to use
QualifierEncodingScheme, as was suggested.
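
A minimal sketch of what this implies for the us_population example in this
thread, under two assumptions that are not stated in the thread: that the
counter starts at 11 (a stand-in for
QueryConstants.ENCODED_CQ_COUNTER_INITIAL_VALUE; check the constant in your
Phoenix version), and that non-key columns receive consecutive numbers from a
single counter shared by all column families. The class and method names are
hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class QualifierAssignmentSketch {
    // ASSUMPTION: stand-in for QueryConstants.ENCODED_CQ_COUNTER_INITIAL_VALUE.
    // The value 11 does not come from this thread; verify it for your version.
    static final int ASSUMED_INITIAL_VALUE = 11;

    // Assigns consecutive qualifier numbers to non-key columns in declaration
    // order. ASSUMPTION: one counter is shared by all column families.
    static Map<String, Integer> assign(List<String> columns, int initialValue) {
        Map<String, Integer> result = new LinkedHashMap<>();
        int next = initialValue;
        for (String column : columns) {
            result.put(column, next++);
        }
        return result;
    }

    public static void main(String[] args) {
        // Non-key columns of us_population; STATE and CITY are part of the
        // primary key and are stored in the row key, not as qualifiers.
        List<String> columns =
            List.of("A.POPULATION", "A.TYPE", "B.ZIPCODE", "B.QUANTITY");
        assign(columns, ASSUMED_INITIAL_VALUE)
            .forEach((name, number) -> System.out.println(name + " -> " + number));
    }
}
```

Because both assumptions may differ between Phoenix versions and storage
schemes, a guess like this is fragile; the mapping recorded in SYSTEM.CATALOG
is the authoritative source.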

On Wed, Jan 2, 2019 at 3:53 PM Shawn Li <sh...@gmail.com> wrote:

> Hi Jaanai and Pedro,
>
> Any input for my example?
>
> Thanks,
> Shawn

Re: column mapping schema decoding

Posted by Shawn Li <sh...@gmail.com>.

Hi Jaanai and Pedro,

Any input for my example?

Thanks,
Shawn

On Thu, Dec 27, 2018, 12:34 Shawn Li <shawnlijob@gmail.com> wrote:

> Hi Jaanai,
>
> Thanks for the input. So the encoding scheme is not simply first come,
> first assigned (as in my example: A.population -> 1, A.type -> 2;
> B.zipcode -> 1, B.quantity -> 2)? To decode it, will we have to use the
> QualifierEncodingScheme class? The reason we want to use column mapping
> is the improvement in query performance mentioned on the Phoenix
> website. We have tables with between 100 and 200 columns.
>
> Thanks,
> Shawn

Re: column mapping schema decoding

Posted by Shawn Li <sh...@gmail.com>.

Hi Jaanai,

Thanks for the input. So the encoding scheme is not simply first come, first
assigned (as in my example: A.population -> 1, A.type -> 2; B.zipcode -> 1,
B.quantity -> 2)? To decode it, will we have to use the
QualifierEncodingScheme class? The reason we want to use column mapping is
the improvement in query performance mentioned on the Phoenix website. We
have tables with between 100 and 200 columns.

Thanks,
Shawn

On Thu, Dec 27, 2018 at 2:22 AM Jaanai Zhang <cl...@gmail.com> wrote:

> The actual column name and the encoded qualifier number are stored in the
> SYSTEM.CATALOG table, in the fields COLUMN_NAME (string) and
> COLUMN_QUALIFIER (binary) respectively. QualifierEncodingScheme can be used
> to decode/encode COLUMN_QUALIFIER, but the process is a little complicated.
>
> For your scenario, it may be better to use the original column names.
>
>
>
>
> ----------------------------------------
>    Jaanai Zhang
>    Best regards!

Re: column mapping schema decoding

Posted by Jaanai Zhang <cl...@gmail.com>.

The actual column name and the encoded qualifier number are stored in the
SYSTEM.CATALOG table, in the fields COLUMN_NAME (string) and
COLUMN_QUALIFIER (binary) respectively. QualifierEncodingScheme can be used
to decode/encode COLUMN_QUALIFIER, but the process is a little complicated.

For your scenario, it may be better to use the original column names.




----------------------------------------
   Jaanai Zhang
   Best regards!
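
The lookup described above can be sketched as a reverse index from column
family and qualifier number to column name. This is only an in-memory
illustration: it assumes the rows have already been read from SYSTEM.CATALOG
and that the binary COLUMN_QUALIFIER has already been decoded to an int (for
example with QualifierEncodingScheme). The class, its methods and the
qualifier numbers in the sample rows are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class QualifierLookupSketch {
    // One catalog row after decoding: column family, column name, and the
    // qualifier number obtained from the binary COLUMN_QUALIFIER value.
    static final class CatalogRow {
        final String family;
        final String columnName;
        final int qualifier;

        CatalogRow(String family, String columnName, int qualifier) {
            this.family = family;
            this.columnName = columnName;
            this.qualifier = qualifier;
        }
    }

    // Builds the reverse index: "family:qualifier" -> column name.
    static Map<String, String> buildIndex(List<CatalogRow> rows) {
        Map<String, String> index = new HashMap<>();
        for (CatalogRow row : rows) {
            index.put(row.family + ":" + row.qualifier, row.columnName);
        }
        return index;
    }

    // Resolves a qualifier seen in an HBase cell back to the Phoenix name,
    // or returns null when the catalog has no matching entry.
    static String resolve(Map<String, String> index, String family, int qualifier) {
        return index.get(family + ":" + qualifier);
    }

    public static void main(String[] args) {
        // HYPOTHETICAL rows: the qualifier numbers are invented for illustration.
        List<CatalogRow> rows = List.of(
            new CatalogRow("A", "POPULATION", 11),
            new CatalogRow("A", "TYPE", 12),
            new CatalogRow("B", "ZIPCODE", 13));
        Map<String, String> index = buildIndex(rows);
        System.out.println(resolve(index, "B", 13));
    }
}
```

An indexer could build such an index once per table and refresh it when the
table's columns change, rather than querying the catalog for every cell.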



On Thu, Dec 27, 2018 at 7:17 AM Shawn Li <sh...@gmail.com> wrote:

> Hi Pedro,
>
> Thanks for the reply. Can you explain a little more? For example, if we
> use COLUMN_ENCODED_BYTES = 1, how are the columns in the following table
> DDL converted to numbered column qualifiers in HBase (for example, which
> number does A.population map to, and which number does B.zipcode map to)?
>
> CREATE TABLE IF NOT EXISTS us_population (
>       state CHAR(2) NOT NULL,
>       city VARCHAR NOT NULL,
>       A.population BIGINT,
>       A.type CHAR,
>       B.zipcode CHAR(5),
>       B.quantity INT CONSTRAINT my_pk PRIMARY KEY (state, city));
>
>
> Thanks,
> Shawn

Re: column mapping schema decoding

Posted by Shawn Li <sh...@gmail.com>.

Hi Pedro,

Thanks for the reply. Can you explain a little more? For example, if we use
COLUMN_ENCODED_BYTES = 1, how are the columns in the following table DDL
converted to numbered column qualifiers in HBase (for example, which number
does A.population map to, and which number does B.zipcode map to)?

CREATE TABLE IF NOT EXISTS us_population (
      state CHAR(2) NOT NULL,
      city VARCHAR NOT NULL,
      A.population BIGINT,
      A.type CHAR,
      B.zipcode CHAR(5),
      B.quantity INT CONSTRAINT my_pk PRIMARY KEY (state, city));


Thanks,
Shawn

On Wed, Dec 26, 2018 at 6:00 PM Pedro Boado <pb...@apache.org> wrote:

> Hi,
>
> The column mapping is stored in the SYSTEM.CATALOG table. There is only
> one column mapping strategy, which uses between 1 and 4 bytes to represent
> the column number. Regardless of the encoded column size, the column name
> lookup strategy remains the same.
>
> Hope it helps,
>
> Pedro.

Re: column mapping schema decoding

Posted by Pedro Boado <pb...@apache.org>.

Hi,

The column mapping is stored in the SYSTEM.CATALOG table. There is only one
column mapping strategy, which uses between 1 and 4 bytes to represent the
column number. Regardless of the encoded column size, the column name lookup
strategy remains the same.

Hope it helps,

Pedro.
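
To make the 1 to 4 byte range concrete, here is a small sketch that computes
the largest column number each width can represent and the smallest width
that covers a given column count. It assumes the number is limited only by
the byte width, capped at Integer.MAX_VALUE for 4 bytes, and it ignores any
qualifier values Phoenix may reserve. The class and method names are
hypothetical.

```java
public class QualifierWidthSketch {
    // Largest number representable in the given number of qualifier bytes,
    // capped at Integer.MAX_VALUE for the 4-byte case.
    static long maxColumnNumber(int bytes) {
        if (bytes < 1 || bytes > 4) {
            throw new IllegalArgumentException("bytes must be between 1 and 4");
        }
        long unsignedMax = (1L << (8 * bytes)) - 1;
        return Math.min(unsignedMax, Integer.MAX_VALUE);
    }

    // Smallest width whose range covers the requested column count.
    static int smallestWidthFor(long columnCount) {
        for (int bytes = 1; bytes <= 4; bytes++) {
            if (maxColumnNumber(bytes) >= columnCount) {
                return bytes;
            }
        }
        throw new IllegalArgumentException("too many columns: " + columnCount);
    }

    public static void main(String[] args) {
        for (int bytes = 1; bytes <= 4; bytes++) {
            System.out.println(bytes + " byte(s): up to " + maxColumnNumber(bytes));
        }
        // Tables with 100 to 200 columns, as mentioned in this thread.
        System.out.println("200 columns need " + smallestWidthFor(200) + " byte(s)");
    }
}
```

Under these assumptions a table with 100 to 200 columns fits in a single
byte, although columns added later also consume numbers from the same range.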

On Wed, 26 Dec 2018, 23:00 Shawn Li <shawnlijob@gmail.com> wrote:

> Hi,
>
> Phoenix 4.10 introduced the column mapping feature. There are four types of
> mapping scheme (https://phoenix.apache.org/columnencoding.html). Is there
> any documentation that shows how a string column name in Phoenix is
> encoded/mapped to a numeric column qualifier in HBase?
>
> We are using the Lily HBase Indexer for batch indexing, so if the column
> qualifier is a number, we need to find a way to decode it back to the
> original string column name.
>
> Thanks,
> Shawn
>