You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@phoenix.apache.org by Jan Van Besien <ja...@gmail.com> on 2021/11/02 13:33:45 UTC

more salt buckets than regions

Phoenix queries seem to return incorrect results when defining tables
with more salt buckets than regions. Is this a known problem or am I
misunderstanding something?

I tested with phoenix 5.1.2 and hbase 2.1.4.

Create a table with 4 salt buckets and 2 initial regions and add some values:

create table JVB (ID BIGINT NOT NULL, NAME VARCHAR, CONSTRAINT pk
PRIMARY KEY (ID)) SALT_BUCKETS = 4 SPLIT ON (5);
upsert into JVB values (0, 'A');
upsert into JVB values (1, 'A');
upsert into JVB values (2, 'A');
upsert into JVB values (3, 'A');
upsert into JVB values (4, 'A');
upsert into JVB values (5, 'A');
upsert into JVB values (6, 'A');
upsert into JVB values (7, 'A');
upsert into JVB values (8, 'A');
upsert into JVB values (9, 'A');

Example queries that are incorrect:

0: jdbc:phoenix:> select * from JVB where ID < 4;
+----+------+
| ID | NAME |
+----+------+
| 3  | A    |
+----+------+
1 row selected (0.011 seconds)
0: jdbc:phoenix:> select * from JVB where ID > 4;
+----+------+
| ID | NAME |
+----+------+
| 7  | A    |
| 2  | A    |
| 6  | A    |
| 1  | A    |
| 5  | A    |
| 9  | A    |
| 0  | A    |
| 4  | A    |
| 8  | A    |
+----+------+
9 rows selected (0.012 seconds)

Thanks,
Jan

Re: more salt buckets than regions

Posted by Istvan Toth <st...@cloudera.com>.
PHOENIX-4906 has the best explanation I think.

Created  PHOENIX-6587 <https://issues.apache.org/jira/browse/PHOENIX-6587> to
disallow this scenario.

On Wed, Nov 3, 2021 at 10:02 AM Jan Van Besien <ja...@gmail.com>
wrote:

> Thanks for confirming. Do you have a reference to the most relevant JIRA
> ticket? I found PHOENIX-4906 but I'm not sure if that's the most relevant
> one.
>
> Jan
>
> On Tue, 2 Nov 2021 at 22:47, Istvan Toth <st...@cloudera.com> wrote:
>
>> It's a known issue.
>> A region may only have rows from a single bucket in it, and it is not
>> hard to get into the state that you describe.
>> We have at least one JIRA open for it.
>> Phoenix will automatically pre-split according to the salt buckets if you
>> don't explicitly specify the splits in the create table statement.
>>
>> On Tue, Nov 2, 2021 at 2:34 PM Jan Van Besien <ja...@gmail.com>
>> wrote:
>>
>>> Phoenix queries seem to return incorrect results when defining tables
>>> with more salt buckets than regions. Is this a known problem or am I
>>> misunderstanding something?
>>>
>>> I tested with phoenix 5.1.2 and hbase 2.1.4.
>>>
>>> Create a table with 4 salt buckets and 2 initial regions and add some
>>> values:
>>>
>>> create table JVB (ID BIGINT NOT NULL, NAME VARCHAR, CONSTRAINT pk
>>> PRIMARY KEY (ID)) SALT_BUCKETS = 4 SPLIT ON (5);
>>> upsert into JVB values (0, 'A');
>>> upsert into JVB values (1, 'A');
>>> upsert into JVB values (2, 'A');
>>> upsert into JVB values (3, 'A');
>>> upsert into JVB values (4, 'A');
>>> upsert into JVB values (5, 'A');
>>> upsert into JVB values (6, 'A');
>>> upsert into JVB values (7, 'A');
>>> upsert into JVB values (8, 'A');
>>> upsert into JVB values (9, 'A');
>>>
>>> Example queries that are incorrect:
>>>
>>> 0: jdbc:phoenix:> select * from JVB where ID < 4;
>>> +----+------+
>>> | ID | NAME |
>>> +----+------+
>>> | 3  | A    |
>>> +----+------+
>>> 1 row selected (0.011 seconds)
>>> 0: jdbc:phoenix:> select * from JVB where ID > 4;
>>> +----+------+
>>> | ID | NAME |
>>> +----+------+
>>> | 7  | A    |
>>> | 2  | A    |
>>> | 6  | A    |
>>> | 1  | A    |
>>> | 5  | A    |
>>> | 9  | A    |
>>> | 0  | A    |
>>> | 4  | A    |
>>> | 8  | A    |
>>> +----+------+
>>> 9 rows selected (0.012 seconds)
>>>
>>> Thanks,
>>> Jan
>>>
>>
>>
>> --
>> *István Tóth* | Staff Software Engineer
>> stoty@cloudera.com <https://www.cloudera.com>
>> [image: Cloudera] <https://www.cloudera.com/>
>> [image: Cloudera on Twitter] <https://twitter.com/cloudera> [image:
>> Cloudera on Facebook] <https://www.facebook.com/cloudera> [image:
>> Cloudera on LinkedIn] <https://www.linkedin.com/company/cloudera>
>> <https://www.cloudera.com/>
>> ------------------------------
>>
>

-- 
*István Tóth* | Staff Software Engineer
stoty@cloudera.com <https://www.cloudera.com>
[image: Cloudera] <https://www.cloudera.com/>
[image: Cloudera on Twitter] <https://twitter.com/cloudera> [image:
Cloudera on Facebook] <https://www.facebook.com/cloudera> [image: Cloudera
on LinkedIn] <https://www.linkedin.com/company/cloudera>
<https://www.cloudera.com/>
------------------------------

Re: more salt buckets than regions

Posted by Jan Van Besien <ja...@gmail.com>.
Thanks for confirming. Do you have a reference to the most relevant JIRA
ticket? I found PHOENIX-4906 but I'm not sure if that's the most relevant
one.

Jan

On Tue, 2 Nov 2021 at 22:47, Istvan Toth <st...@cloudera.com> wrote:

> It's a known issue.
> A region may only have rows from a single bucket in it, and it is not hard
> to get into the state that you describe.
> We have at least one JIRA open for it.
> Phoenix will automatically pre-split according to the salt buckets if you
> don't explicitly specify the splits in the create table statement.
>
> On Tue, Nov 2, 2021 at 2:34 PM Jan Van Besien <ja...@gmail.com>
> wrote:
>
>> Phoenix queries seem to return incorrect results when defining tables
>> with more salt buckets than regions. Is this a known problem or am I
>> misunderstanding something?
>>
>> I tested with phoenix 5.1.2 and hbase 2.1.4.
>>
>> Create a table with 4 salt buckets and 2 initial regions and add some
>> values:
>>
>> create table JVB (ID BIGINT NOT NULL, NAME VARCHAR, CONSTRAINT pk
>> PRIMARY KEY (ID)) SALT_BUCKETS = 4 SPLIT ON (5);
>> upsert into JVB values (0, 'A');
>> upsert into JVB values (1, 'A');
>> upsert into JVB values (2, 'A');
>> upsert into JVB values (3, 'A');
>> upsert into JVB values (4, 'A');
>> upsert into JVB values (5, 'A');
>> upsert into JVB values (6, 'A');
>> upsert into JVB values (7, 'A');
>> upsert into JVB values (8, 'A');
>> upsert into JVB values (9, 'A');
>>
>> Example queries that are incorrect:
>>
>> 0: jdbc:phoenix:> select * from JVB where ID < 4;
>> +----+------+
>> | ID | NAME |
>> +----+------+
>> | 3  | A    |
>> +----+------+
>> 1 row selected (0.011 seconds)
>> 0: jdbc:phoenix:> select * from JVB where ID > 4;
>> +----+------+
>> | ID | NAME |
>> +----+------+
>> | 7  | A    |
>> | 2  | A    |
>> | 6  | A    |
>> | 1  | A    |
>> | 5  | A    |
>> | 9  | A    |
>> | 0  | A    |
>> | 4  | A    |
>> | 8  | A    |
>> +----+------+
>> 9 rows selected (0.012 seconds)
>>
>> Thanks,
>> Jan
>>
>
>
> --
> *István Tóth* | Staff Software Engineer
> stoty@cloudera.com <https://www.cloudera.com>
> [image: Cloudera] <https://www.cloudera.com/>
> [image: Cloudera on Twitter] <https://twitter.com/cloudera> [image:
> Cloudera on Facebook] <https://www.facebook.com/cloudera> [image:
> Cloudera on LinkedIn] <https://www.linkedin.com/company/cloudera>
> <https://www.cloudera.com/>
> ------------------------------
>

Re: more salt buckets than regions

Posted by Istvan Toth <st...@cloudera.com>.
It's a known issue.
A region may only have rows from a single bucket in it, and it is not hard
to get into the state that you describe.
We have at least one JIRA open for it.
Phoenix will automatically pre-split according to the salt buckets if you
don't explicitly specify the splits in the create table statement.

On Tue, Nov 2, 2021 at 2:34 PM Jan Van Besien <ja...@gmail.com>
wrote:

> Phoenix queries seem to return incorrect results when defining tables
> with more salt buckets than regions. Is this a known problem or am I
> misunderstanding something?
>
> I tested with phoenix 5.1.2 and hbase 2.1.4.
>
> Create a table with 4 salt buckets and 2 initial regions and add some
> values:
>
> create table JVB (ID BIGINT NOT NULL, NAME VARCHAR, CONSTRAINT pk
> PRIMARY KEY (ID)) SALT_BUCKETS = 4 SPLIT ON (5);
> upsert into JVB values (0, 'A');
> upsert into JVB values (1, 'A');
> upsert into JVB values (2, 'A');
> upsert into JVB values (3, 'A');
> upsert into JVB values (4, 'A');
> upsert into JVB values (5, 'A');
> upsert into JVB values (6, 'A');
> upsert into JVB values (7, 'A');
> upsert into JVB values (8, 'A');
> upsert into JVB values (9, 'A');
>
> Example queries that are incorrect:
>
> 0: jdbc:phoenix:> select * from JVB where ID < 4;
> +----+------+
> | ID | NAME |
> +----+------+
> | 3  | A    |
> +----+------+
> 1 row selected (0.011 seconds)
> 0: jdbc:phoenix:> select * from JVB where ID > 4;
> +----+------+
> | ID | NAME |
> +----+------+
> | 7  | A    |
> | 2  | A    |
> | 6  | A    |
> | 1  | A    |
> | 5  | A    |
> | 9  | A    |
> | 0  | A    |
> | 4  | A    |
> | 8  | A    |
> +----+------+
> 9 rows selected (0.012 seconds)
>
> Thanks,
> Jan
>


-- 
*István Tóth* | Staff Software Engineer
stoty@cloudera.com <https://www.cloudera.com>
[image: Cloudera] <https://www.cloudera.com/>
[image: Cloudera on Twitter] <https://twitter.com/cloudera> [image:
Cloudera on Facebook] <https://www.facebook.com/cloudera> [image: Cloudera
on LinkedIn] <https://www.linkedin.com/company/cloudera>
<https://www.cloudera.com/>
------------------------------