You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@phoenix.apache.org by "Ankit Singhal (Jira)" <ji...@apache.org> on 2020/05/06 01:37:00 UTC
[jira] [Updated] (PHOENIX-5884) Join query return empty result when
filters for both the tables are present
[ https://issues.apache.org/jira/browse/PHOENIX-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ankit Singhal updated PHOENIX-5884:
-----------------------------------
Description:
Let's assume DDL to be same for both the tables involved in a join
{code}
CREATE TABLE LeftTable ( id1 CHAR(6) NOT NULL, id2 VARCHAR(22) NOT NULL, id3 VARCHAR(12) NOT NULL, id4 CHAR(2) NOT NULL, id5 CHAR(6) NOT NULL, id6 VARCHAR(200) NOT NULL, id7 VARCHAR(50) NOT NULL, ts TIMESTAMP , CONSTRAINT PK_JOIN_AND_INTERSECTION_TABLE PRIMARY KEY(id1,id2,id3,id4,id5,id6,id7))
{code}
Following query return right results
{code}
SELECT m.*,r.* FROM LEFT_TABLE m join RIGHT_TABLE r on m.id3 = r.id3 and m.id2 = r.id2 and m.id4 = r.id4 and m.id5 = r.id5 and m.id1 = r.id1 and m.ts = r.ts where r.id1 IN ('201904','201905') and r.id2 = 'ID2_VAL' and r.id3 IN ('ID3_VAL','ID3_VAL2')
{code}
but When to optimize the query, filters for the left table are also added , query returned empty result . Though the filters are based on join condition so semantically above query and below query should be same.
{code}
SELECT m.*,r.* FROM LEFT_TABLE m join RIGHT_TABLE r on m.id3 = r.id3 and m.id2 = r.id2 and m.id4 = r.id4 and m.id5 = r.id5 and m.id1 = r.id1 and m.ts = r.ts where m.id1 IN ('201904','201905') and r.id1 IN ('201904','201905') and r.id2 = 'ID2_VAL' and m.id3 IN ('ID3_VAL','ID3_VAL2') and r.id3 IN ('ID3_VAL','ID3_VAL2')
{code}
was:
Let's assume DDL to be same for both the tables involved in a join
{code}
CREATE TABLE LeftTable ( id1 CHAR(6) NOT NULL, id2 VARCHAR(22) NOT NULL, id3 VARCHAR(12) NOT NULL, id4 CHAR(2) NOT NULL, id5 CHAR(6) NOT NULL, id6 VARCHAR(200) NOT NULL, id7 VARCHAR(50) NOT NULL, ts TIMESTAMP , CONSTRAINT PK_JOIN_AND_INTERSECTION_TABLE PRIMARY KEY(id1,id2,id3,id4,id5,id6,id7))
{code}
Following query return right results
{code}
SELECT m.*,r.* FROM LEFT_TABLE m join RIGHT_TABLE r on m.id3 = r.id3 and m.id2 = r.id2 and m.id4 = r.id4 and m.id5 = r.id5 and m.id1 = r.id1 and m.ts = r.ts where r.id1 IN ('201904','201905') and r.id2 = 'ID2_VAL' and r.id3 IN ('ID3_VAL','ID3_VAL2')
{code
but When to optimize the query, filters for the left table are also added , query returned empty result . Though the filters are based on join condition so semantically above query and below query should be same.
{code}
SELECT m.*,r.* FROM LEFT_TABLE m join RIGHT_TABLE r on m.id3 = r.id3 and m.id2 = r.id2 and m.id4 = r.id4 and m.id5 = r.id5 and m.id1 = r.id1 and m.ts = r.ts where m.id1 IN ('201904','201905') and r.id1 IN ('201904','201905') and r.id2 = 'ID2_VAL' and m.id3 IN ('ID3_VAL','ID3_VAL2') and r.id3 IN ('ID3_VAL','ID3_VAL2')
{code}
> Join query return empty result when filters for both the tables are present
> ---------------------------------------------------------------------------
>
> Key: PHOENIX-5884
> URL: https://issues.apache.org/jira/browse/PHOENIX-5884
> Project: Phoenix
> Issue Type: Bug
> Reporter: Ankit Singhal
> Assignee: Ankit Singhal
> Priority: Major
>
> Let's assume DDL to be same for both the tables involved in a join
> {code}
> CREATE TABLE LeftTable ( id1 CHAR(6) NOT NULL, id2 VARCHAR(22) NOT NULL, id3 VARCHAR(12) NOT NULL, id4 CHAR(2) NOT NULL, id5 CHAR(6) NOT NULL, id6 VARCHAR(200) NOT NULL, id7 VARCHAR(50) NOT NULL, ts TIMESTAMP , CONSTRAINT PK_JOIN_AND_INTERSECTION_TABLE PRIMARY KEY(id1,id2,id3,id4,id5,id6,id7))
> {code}
> Following query return right results
> {code}
> SELECT m.*,r.* FROM LEFT_TABLE m join RIGHT_TABLE r on m.id3 = r.id3 and m.id2 = r.id2 and m.id4 = r.id4 and m.id5 = r.id5 and m.id1 = r.id1 and m.ts = r.ts where r.id1 IN ('201904','201905') and r.id2 = 'ID2_VAL' and r.id3 IN ('ID3_VAL','ID3_VAL2')
> {code}
> but When to optimize the query, filters for the left table are also added , query returned empty result . Though the filters are based on join condition so semantically above query and below query should be same.
> {code}
> SELECT m.*,r.* FROM LEFT_TABLE m join RIGHT_TABLE r on m.id3 = r.id3 and m.id2 = r.id2 and m.id4 = r.id4 and m.id5 = r.id5 and m.id1 = r.id1 and m.ts = r.ts where m.id1 IN ('201904','201905') and r.id1 IN ('201904','201905') and r.id2 = 'ID2_VAL' and m.id3 IN ('ID3_VAL','ID3_VAL2') and r.id3 IN ('ID3_VAL','ID3_VAL2')
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)