You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@phoenix.apache.org by "Mate Szalay-Beko (Jira)" <ji...@apache.org> on 2020/11/23 17:33:00 UTC

[jira] [Created] (PHOENIX-6232) Same query fails with IllegalArgumentException if part of a join

Mate Szalay-Beko created PHOENIX-6232:
-----------------------------------------

             Summary: Same query fails with IllegalArgumentException if part of a join
                 Key: PHOENIX-6232
                 URL: https://issues.apache.org/jira/browse/PHOENIX-6232
             Project: Phoenix
          Issue Type: Bug
            Reporter: Mate Szalay-Beko


We were facing an interesting problem when a more complex query (with inner selects in the WHERE clause) succeeds alone, while the same query fails, if it is part of a join. I created a test table / query to reproduce the problem:
{code:sql}
DROP TABLE IF EXISTS test;
CREATE TABLE test (
      id INTEGER NOT NULL,
      test_id INTEGER,
      lastchanged TIMESTAMP,
      CONSTRAINT my_pk PRIMARY KEY (id));

UPSERT INTO test VALUES(0, 100, '2000-01-01 00:00:00.0');
UPSERT INTO test VALUES(1, 101, '2000-01-01 00:00:00.0');
UPSERT INTO test VALUES(2, 100, '2011-11-11 11:11:11.0');
{code}

*Query 1:* Example query, running fine in itself:
{code:sql}
SELECT id, test_id, lastchanged FROM test T
WHERE lastchanged = ( SELECT max(lastchanged) FROM test WHERE test_id = T.test_id )

Returns:
+----+---------+-----------------------+
| ID | TEST_ID |      LASTCHANGED      |
+----+---------+-----------------------+
| 1  | 101     | 2000-01-01 01:00:00.0 |
| 2  | 100     | 2011-11-11 12:11:11.0 |
+----+---------+-----------------------+
{code}

*Query 2:* Same query fails on the current master branch, when it is part of a larger (implicit) join:
{code:sql}
SELECT AAA.*
FROM 
(
  SELECT id, test_id, lastchanged FROM test T
  WHERE lastchanged = ( SELECT max(lastchanged) FROM test WHERE test_id = T.test_id )
) as AAA,
(
  SELECT id FROM test
) as BBB
WHERE AAA.id = BBB.id;


java.lang.IllegalArgumentException
	at org.apache.phoenix.thirdparty.com.google.common.base.Preconditions.checkArgument(Preconditions.java:128)
	at org.apache.phoenix.compile.TupleProjectionCompiler.createProjectedTable(TupleProjectionCompiler.java:66)
	at org.apache.phoenix.compile.QueryCompiler.compileSingleFlatQuery(QueryCompiler.java:663)
	at org.apache.phoenix.compile.QueryCompiler.compileJoinQuery(QueryCompiler.java:404)
	at org.apache.phoenix.compile.QueryCompiler.compileJoinQuery(QueryCompiler.java:302)
	at org.apache.phoenix.compile.QueryCompiler.compileSelect(QueryCompiler.java:249)
	at org.apache.phoenix.compile.QueryCompiler.compile(QueryCompiler.java:176)
	at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:504)
	at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:467)
	at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:309)
	at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:298)
	at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
	at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:297)
	at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:290)
	at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1933)
	at sqlline.Commands.executeSingleQuery(Commands.java:1054)
	at sqlline.Commands.execute(Commands.java:1003)
	at sqlline.Commands.sql(Commands.java:967)
	at sqlline.SqlLine.dispatch(SqlLine.java:734)
	at sqlline.SqlLine.begin(SqlLine.java:541)
	at sqlline.SqlLine.start(SqlLine.java:267)
	at sqlline.SqlLine.main(SqlLine.java:206)
{code}
I am not sure what the problem is exactly. My guess is that Phoenix tries to optimize (flatten) an inner-query, which it shouldn't, if we are inside a join (according to the check in the code which throws the exception).

The best workaround I found was to define an explicit join in the original query (Query 1), basically change the inner select into a join. This modified query return the same as the original one:


*Query 3:*
{code:sql}
SELECT T.id, T.test_id, T.lastchanged 
FROM 
  test T 
  LEFT JOIN (
    SELECT max(lastchanged) AS max_timestamp, 
           test_id AS max_timestamp_test_id
    FROM test
    GROUP BY test_id
  ) JOIN_TABLE ON JOIN_TABLE.max_timestamp_test_id = T.test_id
WHERE T.lastchanged = JOIN_TABLE.max_timestamp

Returns:
+------+-----------+-----------------------+
| T.ID | T.TEST_ID |     T.LASTCHANGED     |
+------+-----------+-----------------------+
| 1    | 101       | 2000-01-01 01:00:00.0 |
| 2    | 100       | 2011-11-11 12:11:11.0 |
+------+-----------+-----------------------+
{code}
*Query 4:* And the same modified query (query 3) now works inside a join:
{code:sql}
SELECT AAA.*
FROM 
(
  SELECT T.id, T.test_id, T.lastchanged 
  FROM 
    test T 
    LEFT JOIN (
      SELECT max(lastchanged) AS max_timestamp, 
             test_id AS max_timestamp_test_id
      FROM test
      GROUP BY test_id
    ) JOIN_TABLE ON JOIN_TABLE.max_timestamp_test_id = T.test_id
  WHERE T.lastchanged = JOIN_TABLE.max_timestamp
) as AAA,
(
  SELECT id FROM test
) as BBB
WHERE AAA.id = BBB.id;

Returns:
+------+-----------+-----------------------+
| T.ID | T.TEST_ID |     T.LASTCHANGED     |
+------+-----------+-----------------------+
| 1    | 101       | 2000-01-01 01:00:00.0 |
| 2    | 100       | 2011-11-11 12:11:11.0 |
+------+-----------+-----------------------+
{code}

I think Query 4 worked, as it is forcing Phoenix to drop the idea of optimizing it's inner-query (Query 3). Although, I can be wrong about the root cause...

Anyway, I think the bug should be fixed and Query 2 should run without exception.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)