You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@phoenix.apache.org by "Yunbo Fan (Jira)" <ji...@apache.org> on 2023/03/08 02:53:00 UTC

[jira] [Updated] (PHOENIX-6897) Aggregate on unverified index rows return wrong result

     [ https://issues.apache.org/jira/browse/PHOENIX-6897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yunbo Fan updated PHOENIX-6897:
-------------------------------
    Description: 
h4. Summary:
Upsert include three phases, and if failed after phase1, unverified index rows will leave in the index table. This will cause wrong result when do aggregate queries.
h4. Steps for reproduce
1. create table and index
{code}
create table students(id integer primary key, name varchar, status integer);
create index students_name_index on students(name, id) include (status);
{code}
2. upsert data using phoenix
{code}
upsert into students values(1, 'tom', 1);
upsert into students values(2, 'jerry', 2);
{code}
3. do phase1 by hbase shell, change status column value to '2' and verified column value to '2'
{code}
put 'STUDENTS_NAME_INDEX', "tom\x00\x80\x00\x00\x01", '0:0:STATUS', "\x80\x00\x00\x02"
put 'STUDENTS_NAME_INDEX', "tom\x00\x80\x00\x00\x01", '0:_0', "\x02"
{code}
notice: hbase shell can't parse colon in column, like '0:0:STATUS', you may need comment the line in hbase/lib/ruby/hbase/table.rb, see https://issues.apache.org/jira/browse/HBASE-13788
{code}
    # Returns family and (when has it) qualifier for a column name
    def parse_column_name(column)
      split = org.apache.hadoop.hbase.KeyValue.parseColumn(column.to_java_bytes)
      -> comment this line out #set_converter(split) if split.length > 1
      return split[0], (split.length > 1) ? split[1] : nil
    end
{code}
4. do query without aggregate, the result is right
{code}
0: jdbc:phoenix:> select status from students where name = 'tom';
+--------+
| STATUS |
+--------+
| 1      |
+--------+
{code}
5. do query with aggregate, get wrong result
{code}
0: jdbc:phoenix:> select count(*) from students where name = 'tom' and status = 1;
+----------+
| COUNT(1) |
+----------+
| 0        |
+----------+
{code}
6. using NO_INDEX hint
{code}
0: jdbc:phoenix:> select /*+ NO_INDEX */ count(*) from students where name = 'tom' and status = 1;
+----------+
| COUNT(1) |
+----------+
| 1        |
+----------+
{code}

  was:
h4. Summary:
Upsert include three phases, and if failed after phase1, unverified index rows will leave in the index table. This will cause wrong result when do aggregate queries.
h4. Steps for reproduce
1. create table and index


> Aggregate on unverified index rows return wrong result
> ------------------------------------------------------
>
>                 Key: PHOENIX-6897
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6897
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.1.2
>            Reporter: Yunbo Fan
>            Priority: Major
>
> h4. Summary:
> Upsert include three phases, and if failed after phase1, unverified index rows will leave in the index table. This will cause wrong result when do aggregate queries.
> h4. Steps for reproduce
> 1. create table and index
> {code}
> create table students(id integer primary key, name varchar, status integer);
> create index students_name_index on students(name, id) include (status);
> {code}
> 2. upsert data using phoenix
> {code}
> upsert into students values(1, 'tom', 1);
> upsert into students values(2, 'jerry', 2);
> {code}
> 3. do phase1 by hbase shell, change status column value to '2' and verified column value to '2'
> {code}
> put 'STUDENTS_NAME_INDEX', "tom\x00\x80\x00\x00\x01", '0:0:STATUS', "\x80\x00\x00\x02"
> put 'STUDENTS_NAME_INDEX', "tom\x00\x80\x00\x00\x01", '0:_0', "\x02"
> {code}
> notice: hbase shell can't parse colon in column, like '0:0:STATUS', you may need comment the line in hbase/lib/ruby/hbase/table.rb, see https://issues.apache.org/jira/browse/HBASE-13788
> {code}
>     # Returns family and (when has it) qualifier for a column name
>     def parse_column_name(column)
>       split = org.apache.hadoop.hbase.KeyValue.parseColumn(column.to_java_bytes)
>       -> comment this line out #set_converter(split) if split.length > 1
>       return split[0], (split.length > 1) ? split[1] : nil
>     end
> {code}
> 4. do query without aggregate, the result is right
> {code}
> 0: jdbc:phoenix:> select status from students where name = 'tom';
> +--------+
> | STATUS |
> +--------+
> | 1      |
> +--------+
> {code}
> 5. do query with aggregate, get wrong result
> {code}
> 0: jdbc:phoenix:> select count(*) from students where name = 'tom' and status = 1;
> +----------+
> | COUNT(1) |
> +----------+
> | 0        |
> +----------+
> {code}
> 6. using NO_INDEX hint
> {code}
> 0: jdbc:phoenix:> select /*+ NO_INDEX */ count(*) from students where name = 'tom' and status = 1;
> +----------+
> | COUNT(1) |
> +----------+
> | 1        |
> +----------+
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)