You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@phoenix.apache.org by "Daniel Wong (JIRA)" <ji...@apache.org> on 2019/06/19 23:16:00 UTC

[jira] [Comment Edited] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies

    [ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16868090#comment-16868090 ] 

Daniel Wong edited comment on PHOENIX-4845 at 6/19/19 11:15 PM:
----------------------------------------------------------------

So I have began implementing this Jira and have done some initial scoping and approach thoughts.

Disallowed cases will not be supported:

 
{code:java}
String simpleDdl = "CREATE TABLE " + tableName + " (t_id VARCHAR NOT NULL,\n" + "k1 INTEGER NOT NULL,\n" + "k2 INTEGER NOT NULL,\n" + "C3.k3 INTEGER,\n" + "C2.v1 VARCHAR,\n" + "CONSTRAINT pk PRIMARY KEY (t_id, k1, k2)) ";

//Test RVC Offset columns must be coercible to a base table
@Test
public void testRVCIndexLookup() throws SQLException {
String failureSql = String.format("SELECT t_id, k1, k2 FROM %s OFFSET ('a', 'ab', 2)",tableName);
try {
conn.createStatement().execute(failureSql);
} catch (Exception e) {
System.out.println(e.getMessage());
return;
}
fail("Should not allow Index lookup with RVC Offset");
}

//Test Not Allow Index Access Path On Base Table
@Test
public void testRVCOffsetNotCoercible() throws SQLException {
String failureSql = String.format("SELECT t_id, k1, k2 FROM %s OFFSET ('a', 'ab', 2)",tableName);
try {
conn.createStatement().execute(failureSql);
} catch (Exception e) {
System.out.println(e.getMessage());
return;
}
fail("Should not allow non coercible values to PK in RVC Offset");
}

//Test Order By Not PK Order By Exception
@Test
public void testRVCOffsetNotAllowNonPKOrderBy() throws SQLException {
String failureSql = String.format("SELECT t_id, k1, k2, v1 FROM %s ORDER BY v1 OFFSET ('a', 1, 2)",tableName);
try {
conn.createStatement().execute(failureSql);
} catch (Exception e) {
System.out.println(e.getMessage());
return;
}
fail("Should not allow no PK order by with RVC Offset");
}

//Test Order By Partial PK Order By Exception
@Test
public void testRVCOffsetNotAllowPartialPKOrderBy() throws SQLException {
String failureSql = String.format("SELECT t_id, k1, k2 FROM %s ORDER BY 2 OFFSET ('a', 1, 2)",tableName);
try {
conn.createStatement().execute(failureSql);
} catch (Exception e) {
System.out.println(e.getMessage());
return;
}
fail("Should not allow partial PK order by with RVC Offset");
}

//Test Order By Not PK Order By Exception
@Test
public void testRVCOffsetNotAllowDifferentPKOrderBy() throws SQLException {
String failureSql = String.format("SELECT t_id, k1, k2 FROM %s ORDER BY 1 DESC,2,3 OFFSET ('a', 1, 2)",tableName);
try {
conn.createStatement().execute(failureSql);
} catch (Exception e) {
System.out.println(e.getMessage());
return;
}
fail("Should not allow differnt PK order by with RVC Offset");
}

//Test Not allow joins
@Test
public void testRVCOffsetNotAllowedInJoins() throws SQLException {
String failureSql = String.format("SELECT * FROM %s AS T1, %s AS T2 WHERE T1.t_id=T2.t_id OFFSET ('a', 1, 2)",tableName,tableName); //literal works
try {
conn.createStatement().execute(failureSql);
} catch (Exception e) {
System.out.println(e.getMessage());
return;
}
fail("Should not have JOIN in RVC Offset");
}

//Test Not allowed in subsquery
@Test
public void testRVCOffsetNotAllowedInSubQuery() throws SQLException {
String failureSql = String.format("SELECT * FROM (SELECT t_id, k2 FROM %s OFFSET ('a', 1, 2))",tableName);
try {
conn.createStatement().execute(failureSql);
} catch (Exception e) {
System.out.println(e.getMessage());
return;
}
fail("Should not have subquery with RVC Offset");
}


//Test Not allowed on subsquery
@Test
public void testRVCOffsetNotAllowedOnSubQuery() throws SQLException {
String failureSql = String.format("SELECT * FROM (SELECT t_id, k2 FROM %s) OFFSET ('a', 1, 2)",tableName);
try {
conn.createStatement().execute(failureSql);
} catch (Exception e) {
System.out.println(e.getMessage());
return;
}
fail("Should not have subquery with RVC Offset");
}

//Test RVC Offset must be a literal, cannot have column reference
@Test
public void testRVCOffsetLiteral() throws SQLException {
String sql = "SELECT * FROM " + tableName + " OFFSET ('a', 1, 2)"; //literal works
conn.createStatement().execute(sql);
String failureSql = "SELECT * FROM " + tableName + " OFFSET ('a', 1, k2)"; //column does works
try {
conn.createStatement().execute(failureSql);
} catch (Exception e) {
System.out.println(e.getMessage());
return;
}
fail("Should not have allowed column in RVC Offset");
}



{code}
Approach Possibilities:

I have been considering 2 posibilities for approaching this in Phoenix and am open to other suggestions as well.

Given DDL/Query:

Create Table TABLE (a UNSIGNED_TINYINT, b UNSIGNED_TINYINT, c UNSIGNED_TINYINT, CONSTANT pk PRIMARY KEY (a,b desc, c))
 SELECT * FROM TABLE OFFSET (1,2,3)

Approach 1: Query Rewrite

SELECT * FROM TABLE OFFSET (1,2,3)
 ->
 SELECT * FROM TABLE WHERE (A > 1 OR (A = 1 AND (B < 2 OR (B = 2 AND (C >= 3)))))

Approach 2: Mini - Resolution
 SELECT * FROM TABLE OFFSET (1,2,3)
 -> SELECT 1 FROM TABLE WHERE (a,b,c)=(1,2,3), (Pass the constructed tree to the WhereOptimizer essentially)
 -> Extract Byte Array From Constant [0x01FD03]
 -> Filter Scans start/end key by Constant

Right now I'm favoring approach 2 but this will require some additional hooks into the scan generation. Any one else have thoughts?  [~tdsilva] [~jamestaylor] [~lhofhansl]

 

 

 


was (Author: dbwong):
So I have began implementing this Jira and have done some initial scoping and approach thoughts.

Disallowed cases will not be supported:

 
{code:java}
//Test RVC Offset columns must be coercible to a base table
@Test
public void testRVCIndexLookup() throws SQLException {
String failureSql = String.format("SELECT t_id, k1, k2 FROM %s OFFSET ('a', 'ab', 2)",tableName);
try {
conn.createStatement().execute(failureSql);
} catch (Exception e) {
System.out.println(e.getMessage());
return;
}
fail("Should not allow Index lookup with RVC Offset");
}

//Test Not Allow Index Access Path On Base Table
@Test
public void testRVCOffsetNotCoercible() throws SQLException {
String failureSql = String.format("SELECT t_id, k1, k2 FROM %s OFFSET ('a', 'ab', 2)",tableName);
try {
conn.createStatement().execute(failureSql);
} catch (Exception e) {
System.out.println(e.getMessage());
return;
}
fail("Should not allow non coercible values to PK in RVC Offset");
}

//Test Order By Not PK Order By Exception
@Test
public void testRVCOffsetNotAllowNonPKOrderBy() throws SQLException {
String failureSql = String.format("SELECT t_id, k1, k2, v1 FROM %s ORDER BY v1 OFFSET ('a', 1, 2)",tableName);
try {
conn.createStatement().execute(failureSql);
} catch (Exception e) {
System.out.println(e.getMessage());
return;
}
fail("Should not allow no PK order by with RVC Offset");
}

//Test Order By Partial PK Order By Exception
@Test
public void testRVCOffsetNotAllowPartialPKOrderBy() throws SQLException {
String failureSql = String.format("SELECT t_id, k1, k2 FROM %s ORDER BY 2 OFFSET ('a', 1, 2)",tableName);
try {
conn.createStatement().execute(failureSql);
} catch (Exception e) {
System.out.println(e.getMessage());
return;
}
fail("Should not allow partial PK order by with RVC Offset");
}

//Test Order By Not PK Order By Exception
@Test
public void testRVCOffsetNotAllowDifferentPKOrderBy() throws SQLException {
String failureSql = String.format("SELECT t_id, k1, k2 FROM %s ORDER BY 1 DESC,2,3 OFFSET ('a', 1, 2)",tableName);
try {
conn.createStatement().execute(failureSql);
} catch (Exception e) {
System.out.println(e.getMessage());
return;
}
fail("Should not allow differnt PK order by with RVC Offset");
}

//Test Not allow joins
@Test
public void testRVCOffsetNotAllowedInJoins() throws SQLException {
String failureSql = String.format("SELECT * FROM %s AS T1, %s AS T2 WHERE T1.t_id=T2.t_id OFFSET ('a', 1, 2)",tableName,tableName); //literal works
try {
conn.createStatement().execute(failureSql);
} catch (Exception e) {
System.out.println(e.getMessage());
return;
}
fail("Should not have JOIN in RVC Offset");
}

//Test Not allowed in subsquery
@Test
public void testRVCOffsetNotAllowedInSubQuery() throws SQLException {
String failureSql = String.format("SELECT * FROM (SELECT t_id, k2 FROM %s OFFSET ('a', 1, 2))",tableName);
try {
conn.createStatement().execute(failureSql);
} catch (Exception e) {
System.out.println(e.getMessage());
return;
}
fail("Should not have subquery with RVC Offset");
}


//Test Not allowed on subsquery
@Test
public void testRVCOffsetNotAllowedOnSubQuery() throws SQLException {
String failureSql = String.format("SELECT * FROM (SELECT t_id, k2 FROM %s) OFFSET ('a', 1, 2)",tableName);
try {
conn.createStatement().execute(failureSql);
} catch (Exception e) {
System.out.println(e.getMessage());
return;
}
fail("Should not have subquery with RVC Offset");
}

//Test RVC Offset must be a literal, cannot have column reference
@Test
public void testRVCOffsetLiteral() throws SQLException {
String sql = "SELECT * FROM " + tableName + " OFFSET ('a', 1, 2)"; //literal works
conn.createStatement().execute(sql);
String failureSql = "SELECT * FROM " + tableName + " OFFSET ('a', 1, k2)"; //column does works
try {
conn.createStatement().execute(failureSql);
} catch (Exception e) {
System.out.println(e.getMessage());
return;
}
fail("Should not have allowed column in RVC Offset");
}



{code}
Approach Possibilities:

I have been considering 2 posibilities for approaching this in Phoenix and am open to other suggestions as well.



Given DDL/Query:

Create Table TABLE (a UNSIGNED_TINYINT, b UNSIGNED_TINYINT, c UNSIGNED_TINYINT, CONSTANT pk PRIMARY KEY (a,b desc, c))
SELECT * FROM TABLE OFFSET (1,2,3)

Approach 1: Query Rewrite
 
 SELECT * FROM TABLE OFFSET (1,2,3)
 ->
 SELECT * FROM TABLE WHERE (A > 1 OR (A = 1 AND (B < 2 OR (B = 2 AND (C >= 3)))))

Approach 2: Mini - Resolution
 SELECT * FROM TABLE OFFSET (1,2,3)
 -> SELECT 1 FROM TABLE WHERE (a,b,c)=(1,2,3), (Pass the constructed tree to the WhereOptimizer essentially)
 -> Extract Byte Array From Constant [0x01FD03]
 -> Filter Scans start/end key by Constant

Right now I'm favoring approach 2 but this will require some additional hooks into the scan generation. Any one else have thoughts?  [~tdsilva] [~jamestaylor] [~lhofhansl]




 

 

 

> Support using Row Value Constructors in OFFSET clause for paging in tables where the sort order of PK columns varies
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4845
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4845
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Thomas D'Silva
>            Assignee: Daniel Wong
>            Priority: Major
>              Labels: DESC, SFDC
>         Attachments: PHOENIX-offset.txt
>
>
> RVCs along with the LIMIT clause are useful for efficiently paging through rows (see [http://phoenix.apache.org/paged.html]). This works well if the pk columns are sorted ascending, we can always use the > operator to query for the next batch of row. 
> However if the PK of a table is (A  DESC, B DESC) we cannot use the following query to page through the data
> {code:java}
> SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20
> {code}
> Since the rows are sorted by A desc and then by B descending we need change the comparison order
> {code:java}
> SELECT * FROM TABLE WHERE (A, B) < (?, ?) ORDER BY A DESC, B DESC LIMIT 20
> {code}
> If the PK of a table contains columns with mixed sort order for eg (A  DESC, B) then we cannot use RVC to page through data. 
> If we supported using RVCs in the offset clause we could use the offset to set the start row of the scan. Clients would not have to have logic to determine the comparison operator. This would also support paging through data for tables where the PK columns are sorted in mixed order. 
> {code:java}
> SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?)
> {code}
> We would only allow using the offset if the rows are ordered by the sort order of the PK columns.
>  
> FYI [~jfernando_sfdc]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)