You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@phoenix.apache.org by "Jan Fernando (JIRA)" <ji...@apache.org> on 2014/07/19 05:20:38 UTC

[jira] [Commented] (PHOENIX-1100) Upsert Select with Sequence and Order By doesn't result in sequence getting assigned based on requested ordering

    [ https://issues.apache.org/jira/browse/PHOENIX-1100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067361#comment-14067361 ] 

Jan Fernando commented on PHOENIX-1100:
---------------------------------------

[~jamestaylor] As promised here is an IT test we can use to repro. Add this to UpsertSelectIT. If I remove SALT_BUCKETS= 64 from the DDL for DUMMY_SEQ_TEST_DATA, the tests pass which corroborates my theory that parallelization is causing this.

{code}
    @Test
    public void testUpsertSelectWithSequenceAndOrderByWithSalting() throws Exception {
    	
    		int numOfRecords = 2000;
        long ts = nextTimestamp();
        Properties props = new Properties();
	      props.setProperty(QueryServices.THREAD_POOL_SIZE_ATTRIB, Integer.toString(64));
	      props.setProperty(QueryServices.QUEUE_SIZE_ATTRIB, Integer.toString(500));
	      props.setProperty(PhoenixRuntime.CURRENT_SCN_ATTRIB, Long.toString(ts));
        Connection conn = DriverManager.getConnection(getUrl(), props);
        String ddl = "CREATE TABLE IF NOT EXISTS DUMMY_CURSOR_STORAGE ("
        				+ "ORGANIZATION_ID CHAR(15) NOT NULL, QUERY_ID CHAR(15) NOT NULL, CURSOR_ORDER BIGINT NOT NULL, K1 INTEGER, V1 INTEGER "
        				+ "CONSTRAINT MAIN_PK PRIMARY KEY (ORGANIZATION_ID, QUERY_ID, CURSOR_ORDER) "
        				+ ") SALT_BUCKETS = 64";
        conn.createStatement().execute(ddl);
        conn.createStatement().execute("CREATE TABLE DUMMY_SEQ_TEST_DATA " +
													        		 "(ORGANIZATION_ID CHAR(15) NOT NULL, k1 integer NOT NULL, v1 integer NOT NULL " +
													        		 "CONSTRAINT PK PRIMARY KEY (ORGANIZATION_ID, k1, v1) ) VERSIONS=1, SALT_BUCKETS = 64");
        conn.createStatement().execute("create sequence s cache " + Integer.MAX_VALUE);
        conn.close();

        props.setProperty(PhoenixRuntime.CURRENT_SCN_ATTRIB, Long.toString(ts + 10));
        conn = DriverManager.getConnection(getUrl(), props);
        for (int i = 0; i < numOfRecords; i++) {
        	conn.createStatement().execute("UPSERT INTO DUMMY_SEQ_TEST_DATA values ('00Dxx0000001gEH'," + i + "," + (i + 2) + ")");
        }
        conn.commit();

        props.setProperty(PhoenixRuntime.CURRENT_SCN_ATTRIB, Long.toString(ts + 15));
        conn = DriverManager.getConnection(getUrl(), props);
        conn.setAutoCommit(true);
        conn.createStatement().execute("UPSERT INTO DUMMY_CURSOR_STORAGE SELECT '00Dxx0000001gEH', 'MyQueryId', NEXT VALUE FOR S, k1, v1  FROM DUMMY_SEQ_TEST_DATA ORDER BY K1, V1");

        props.setProperty(PhoenixRuntime.CURRENT_SCN_ATTRIB, Long.toString(ts + 20));
        conn = DriverManager.getConnection(getUrl(), props);
        ResultSet rs = conn.createStatement().executeQuery("select count(*) from DUMMY_CURSOR_STORAGE");
        
        assertTrue(rs.next());
        assertEquals(numOfRecords, rs.getLong(1));
        conn.close();

        props.setProperty(PhoenixRuntime.CURRENT_SCN_ATTRIB, Long.toString(ts + 25));
        ResultSet rs2 = conn.createStatement().executeQuery("select cursor_order, k1, v1 from DUMMY_CURSOR_STORAGE order by cursor_order");
        long seq = 1;
        while (rs2.next()) {
            assertEquals(seq, rs2.getLong("cursor_order"));
            // This value should be the sequence - 1 as we said order by k1 in the UPSERT...SELECT, but is not because of sequence processing.
            assertEquals(seq - 1, rs2.getLong("k1"));
            seq++;
        }
        conn.close();
    
    }
{code}

> Upsert Select with Sequence and Order By doesn't result in sequence getting assigned based on requested ordering 
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-1100
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1100
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 4.0.0, 5.0.0
>            Reporter: Jan Fernando
>
> I working with Query More implementation I noticed that once we had sufficient parallelization in the SELECT portion of UPSERT...SELECT due to Salting or region splits the results were not returned in the correct order. The root cause appeared to be due to the fact that we are using sequences to build a cursor of data for our Query More implementation. What appears to be happening is that as we get the next sequence value from the SequenceManager during result processing parallel threads all increment the sequence and therefore the sequence values reflect the processing order of each iterator versus the actual order specified by the order by. The expectation is that the sequence value should reflect the requested order specified by the ORDER BY versus the processing order.



--
This message was sent by Atlassian JIRA
(v6.2#6252)