You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@phoenix.apache.org by "Pulkit Bhardwaj (JIRA)" <ji...@apache.org> on 2017/09/06 10:50:00 UTC

[jira] [Updated] (PHOENIX-4167) Phoenix SELECT query returns duplicate data in the same varchar/char column if a trim() is applied on the column AND a distinct arbitrary column is generated in the query

     [ https://issues.apache.org/jira/browse/PHOENIX-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pulkit Bhardwaj updated PHOENIX-4167:
-------------------------------------
    Description: 
1. Created a simple table in phoenix

{code:sql}
create table test_select(nam VARCHAR(20), address VARCHAR(20), id BIGINT constraint my_pk primary key (id));

{code}

2. Insert a sample row

{code:sql}
upsert into test_select (nam, address,id) values('user','place',1);
{code}

3. Confirm that the row is present


{code:sql}
0: jdbc:phoenix:> select * from test_select;
+---------+----------+-----+
|   NAM   | ADDRESS  | ID  |
+---------+----------+-----+
| user  | place   | 1   |
+---------+----------+-----+
{code}


4. Now run the following query

0: jdbc:phoenix:> select distinct 'arbitrary' as "test_column", trim(nam), trim(nam) from test_select;

This would generate the following output

+--------------+----------------+----------------+
| test_column  |   TRIM(NAM)    |   TRIM(NAM)    |
+--------------+----------------+----------------+
| arbitrary      | useruser  | useruser  |
+--------------+----------------+----------------+

As we can see the output for the trim(name) which should have been 'user' is actually printed as 'useruser'

The concatenation to the string is actually the number of times the column is printed.
The following

0: jdbc:phoenix:> select distinct 'arbitrary' as "test_column", trim(nam), trim(nam), trim(nam) from test_select;

Would generate

+--------------+-----------------------+-----------------------+-----------------------+
| test_column  |       TRIM(NAM)       |       TRIM(NAM)       |       TRIM(NAM)       |
+--------------+-----------------------+-----------------------+-----------------------+
| arbitrary      | useruseruser  | useruseruser  | useruseruser  |
+--------------+-----------------------+-----------------------+-----------------------+

A couple of things to notice

1. If I remove the —— distinct 'harshit' as "test_column" ——  The issue is not seen

0: jdbc:phoenix:> select trim(nam), trim(nam), trim(nam) from test_select;
+------------+------------+------------+
| TRIM(NAM)  | TRIM(NAM)  | TRIM(NAM)  |
+------------+------------+------------+
| user     | user     | user     |
+------------+------------+------------+

2. If I remove the trim() again the issue is not seen

0: jdbc:phoenix:> select  trim(nam), trim(nam) from test_select;
+------------+------------+
| TRIM(NAM)  | TRIM(NAM)  |
+------------+------------+
| user     | user     |
+------------+------------+


  was:
1. Created a simple table in phoenix

{code:sql}
create table test_select(nam VARCHAR(20), address VARCHAR(20), id BIGINT constraint my_pk primary key (id));

{code}

2. Insert a sample row

{code:sql}
upsert into test_select (nam, address,id) values('user','place',1);
{code}

3. Confirm that the row is present

0: jdbc:phoenix:> select * from test_select;
+---------+----------+-----+
|   NAM   | ADDRESS  | ID  |
+---------+----------+-----+
| user  | place   | 1   |
+---------+----------+-----+

4. Now run the following query

0: jdbc:phoenix:> select distinct 'arbitrary' as "test_column", trim(nam), trim(nam) from test_select;

This would generate the following output

+--------------+----------------+----------------+
| test_column  |   TRIM(NAM)    |   TRIM(NAM)    |
+--------------+----------------+----------------+
| arbitrary      | useruser  | useruser  |
+--------------+----------------+----------------+

As we can see the output for the trim(name) which should have been 'user' is actually printed as 'useruser'

The concatenation to the string is actually the number of times the column is printed.
The following

0: jdbc:phoenix:> select distinct 'arbitrary' as "test_column", trim(nam), trim(nam), trim(nam) from test_select;

Would generate

+--------------+-----------------------+-----------------------+-----------------------+
| test_column  |       TRIM(NAM)       |       TRIM(NAM)       |       TRIM(NAM)       |
+--------------+-----------------------+-----------------------+-----------------------+
| arbitrary      | useruseruser  | useruseruser  | useruseruser  |
+--------------+-----------------------+-----------------------+-----------------------+

A couple of things to notice

1. If I remove the —— distinct 'harshit' as "test_column" ——  The issue is not seen

0: jdbc:phoenix:> select trim(nam), trim(nam), trim(nam) from test_select;
+------------+------------+------------+
| TRIM(NAM)  | TRIM(NAM)  | TRIM(NAM)  |
+------------+------------+------------+
| user     | user     | user     |
+------------+------------+------------+

2. If I remove the trim() again the issue is not seen

0: jdbc:phoenix:> select  trim(nam), trim(nam) from test_select;
+------------+------------+
| TRIM(NAM)  | TRIM(NAM)  |
+------------+------------+
| user     | user     |
+------------+------------+



> Phoenix SELECT query returns duplicate data in the same varchar/char column if a trim() is applied on the column AND a distinct arbitrary column is generated in the query
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4167
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4167
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.7.0
>            Reporter: Pulkit Bhardwaj
>            Priority: Minor
>
> 1. Created a simple table in phoenix
> {code:sql}
> create table test_select(nam VARCHAR(20), address VARCHAR(20), id BIGINT constraint my_pk primary key (id));
> {code}
> 2. Insert a sample row
> {code:sql}
> upsert into test_select (nam, address,id) values('user','place',1);
> {code}
> 3. Confirm that the row is present
> {code:sql}
> 0: jdbc:phoenix:> select * from test_select;
> +---------+----------+-----+
> |   NAM   | ADDRESS  | ID  |
> +---------+----------+-----+
> | user  | place   | 1   |
> +---------+----------+-----+
> {code}
> 4. Now run the following query
> 0: jdbc:phoenix:> select distinct 'arbitrary' as "test_column", trim(nam), trim(nam) from test_select;
> This would generate the following output
> +--------------+----------------+----------------+
> | test_column  |   TRIM(NAM)    |   TRIM(NAM)    |
> +--------------+----------------+----------------+
> | arbitrary      | useruser  | useruser  |
> +--------------+----------------+----------------+
> As we can see the output for the trim(name) which should have been 'user' is actually printed as 'useruser'
> The concatenation to the string is actually the number of times the column is printed.
> The following
> 0: jdbc:phoenix:> select distinct 'arbitrary' as "test_column", trim(nam), trim(nam), trim(nam) from test_select;
> Would generate
> +--------------+-----------------------+-----------------------+-----------------------+
> | test_column  |       TRIM(NAM)       |       TRIM(NAM)       |       TRIM(NAM)       |
> +--------------+-----------------------+-----------------------+-----------------------+
> | arbitrary      | useruseruser  | useruseruser  | useruseruser  |
> +--------------+-----------------------+-----------------------+-----------------------+
> A couple of things to notice
> 1. If I remove the —— distinct 'harshit' as "test_column" ——  The issue is not seen
> 0: jdbc:phoenix:> select trim(nam), trim(nam), trim(nam) from test_select;
> +------------+------------+------------+
> | TRIM(NAM)  | TRIM(NAM)  | TRIM(NAM)  |
> +------------+------------+------------+
> | user     | user     | user     |
> +------------+------------+------------+
> 2. If I remove the trim() again the issue is not seen
> 0: jdbc:phoenix:> select  trim(nam), trim(nam) from test_select;
> +------------+------------+
> | TRIM(NAM)  | TRIM(NAM)  |
> +------------+------------+
> | user     | user     |
> +------------+------------+



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)