You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@phoenix.apache.org by "Brian Esserlieu (JIRA)" <ji...@apache.org> on 2016/09/20 23:23:20 UTC

[jira] [Updated] (PHOENIX-3301) Row Value Constructors Against Indexes Don't Work Correctly

     [ https://issues.apache.org/jira/browse/PHOENIX-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brian Esserlieu updated PHOENIX-3301:
-------------------------------------
    Description: 
Why does adding an ORDER BY change the number of records returned when there is a secondary index? 
Why without including an ORDER by are no records returned (seems to be hitting the base table and not the index)?

I've included repro steps below:

*Repro*

{code:title=repro.sql|borderStyle=solid}
DROP INDEX IF EXISTS TEST_INDEX ON TEST_TABLE;
DROP TABLE IF EXISTS TEST_TABLE;
CREATE TABLE IF NOT EXISTS TEST_TABLE (
    PK1 CHAR(15) NOT NULL,
    PK2 DATE NOT NULL
    CONSTRAINT PK PRIMARY KEY 
    ( 
        PK1,
        PK2 DESC
    )
);
CREATE INDEX IF NOT EXISTS TEST_INDEX ON TEST_TABLE (PK2, PK1);

UPSERT INTO TEST_TABLE (PK1, PK2) 
VALUES ('abc123',TO_DATE('2010-01-01T00:00:01Z'));
UPSERT INTO TEST_TABLE (PK1, PK2) 
VALUES ('abc123',TO_DATE('2010-01-01T00:00:02Z'));
UPSERT INTO TEST_TABLE (PK1, PK2) 
VALUES ('abc123',TO_DATE('2010-01-01T00:00:03Z'));

-- Selects 

--This select statement returns 0 rows, though it should be returning the 3 upserted rows from above
SELECT PK1, PK2
FROM TEST_TABLE 
WHERE (PK2, PK1) >= (TO_DATE('1970-01-01'), null)
-- the EXPLAIN shows the base table is being hit. Why is the base table hit with this RVC?
explain SELECT PK1, PK2
FROM TEST_TABLE 
WHERE (PK2, PK1) >= (TO_DATE('1970-01-01'), null)

--Note: by adding an ORDER BY statement, the query returns all 3 rows
SELECT PK1, PK2
FROM TEST_TABLE 
WHERE (PK2, PK1) >= (TO_DATE('1970-01-01'), null)
ORDER BY PK2, PK1;
-- the EXPLAIN shows the secondary index is now being used by this query
explain SELECT PK1, PK2
FROM TEST_TABLE 
WHERE (PK2, PK1) >= (TO_DATE('1970-01-01'), null)
ORDER BY PK2, PK1;
{code}

  was:
I was trying to run a delete statement using a row value constructor against a table's secondary index, and noticed no data was being deleted. Digging into the problem a bit more, I cant get any queries using a row value constructor against the secondary index to work at all. I've included repro steps below:

*Repro*

{code:title=repro.sql|borderStyle=solid}
DROP INDEX IF EXISTS TEST_INDEX ON TEST_TABLE;
DROP TABLE IF EXISTS TEST_TABLE;
CREATE TABLE IF NOT EXISTS TEST_TABLE (
pk1 VARCHAR NOT NULL,
pk2 VARCHAR NOT NULL,
pk3 VARCHAR NOT NULL,
pk4 DATE NOT NULL,
pk5 VARCHAR NOT NULL,
v1 VARCHAR
CONSTRAINT PK PRIMARY KEY 
(
pk1,
pk2,
pk3,
pk4 DESC,
pk5
)
) MULTI_TENANT=true,IMMUTABLE_ROWS=true;
CREATE INDEX TEST_INDEX ON TEST_TABLE (pk2, pk4, pk3, pk5);
upsert into TEST_TABLE (pk1, pk2, pk3, pk4, pk5, v1) values ('a', '001', '1', TO_DATE('2000-01-01'), 'A', 'value');
upsert into TEST_TABLE (pk1, pk2, pk3, pk4, pk5, v1) values ('a', '001', '1', TO_DATE('2000-01-01'), 'B', 'value');

-- THIS IS THIE ORDERING USING A DEFAULT ROW VALUE CONSTRUCTOR, USING
-- VALUES FOR THE ROW VALUE CONSTRUCTOR SUCH THAT BOTH TEST ROWS 
-- SHOULD BE RETURNED
SELECT *
FROM TEST_TABLE
WHERE 
pk1 = 'a' AND pk2 = '001' AND
(pk1, pk2, pk3, pk4, pk5) >= ('a', '001', '0', TO_DATE('1999-01-01'), 'A')

-- HERE ARE THE EXACT SAME VALUES, WITH FIELDS pk3 AND pk4 TRANSPOSED
SELECT *
FROM TEST_TABLE
WHERE 
pk1 = 'a' AND pk2 = '001' AND
(pk1, pk2, pk4, pk3, pk5) >= ('a', '001', TO_DATE('1999-01-01'), '0', 'A')
-- NOTE, THAT NOTHING IS RETURNED BY THIS QUERY, BUT THE TWO TEST ROWS SHOULD BE.
-- THIS SHOULD BE USING THE INDEX, BUT ISN'T:
EXPLAIN SELECT * FROM TEST_TABLE WHERE  pk1 = 'a' AND pk2 = '001' AND
(pk1, pk2, pk4, pk3, pk5) >= ('a', '001', TO_DATE('1999-01-01'), '0', 'A')

-- IF WE CHANGE THE INEQUALITY TO BE LESS THAN OR EQUALS WITH THE TRANSPOSED ROW
-- KEY, NOTE THAT THE TEST ROWS ARE RETURNED (THEY SHOULDN'T BE):
SELECT *
FROM TEST_TABLE
WHERE 
pk1 = 'a' AND pk2 = '001' AND
(pk1, pk2, pk4, pk3, pk5) <= ('a', '001', TO_DATE('1999-01-01'), '0', 'A')

-- HINTING TO USE AN INDEX DOESN'T WORK:
SELECT /*+ INDEX(TEST_TABLE TEST_INDEX) */ *
FROM TEST_TABLE
WHERE 
pk1 = 'a' AND pk2 = '001' AND
(pk1, pk2, pk4, pk3, pk5) >= ('a', '001', TO_DATE('1999-01-01'), '0', 'A')

-- FORCING AN ORDERING THAT WOULD CAUSE THE INDEX TO BE USED DOESN'T WORK EITHER:
SELECT *
FROM TEST_TABLE
WHERE 
pk1 = 'a' AND pk2 = '001' AND
(pk1, pk2, pk4, pk3, pk5) >= ('a', '001', TO_DATE('1999-01-01'), '0', 'A')
ORDER BY pk1, pk2, pk4, pk3, pk5

-- THIS IS THE ORIGINAL DELETE STATEMENT I TRIED TO RUN, BUT IS FAILING 
-- LIKELY FOR THE SAME REASON AS SELECTS ARE FAILING ABOVE
DELETE
FROM TEST_TABLE
WHERE 
pk1 = 'a' AND pk2 = '001' AND
(pk1, pk2, pk4, pk3, pk5) >= ('a', '001', TO_DATE('1999-01-01'), '0', 'A')
{code}


> Row Value Constructors Against Indexes Don't Work Correctly
> -----------------------------------------------------------
>
>                 Key: PHOENIX-3301
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3301
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.8.0
>            Reporter: Brian Esserlieu
>
> Why does adding an ORDER BY change the number of records returned when there is a secondary index? 
> Why without including an ORDER by are no records returned (seems to be hitting the base table and not the index)?
> I've included repro steps below:
> *Repro*
> {code:title=repro.sql|borderStyle=solid}
> DROP INDEX IF EXISTS TEST_INDEX ON TEST_TABLE;
> DROP TABLE IF EXISTS TEST_TABLE;
> CREATE TABLE IF NOT EXISTS TEST_TABLE (
>     PK1 CHAR(15) NOT NULL,
>     PK2 DATE NOT NULL
>     CONSTRAINT PK PRIMARY KEY 
>     ( 
>         PK1,
>         PK2 DESC
>     )
> );
> CREATE INDEX IF NOT EXISTS TEST_INDEX ON TEST_TABLE (PK2, PK1);
> UPSERT INTO TEST_TABLE (PK1, PK2) 
> VALUES ('abc123',TO_DATE('2010-01-01T00:00:01Z'));
> UPSERT INTO TEST_TABLE (PK1, PK2) 
> VALUES ('abc123',TO_DATE('2010-01-01T00:00:02Z'));
> UPSERT INTO TEST_TABLE (PK1, PK2) 
> VALUES ('abc123',TO_DATE('2010-01-01T00:00:03Z'));
> -- Selects 
> --This select statement returns 0 rows, though it should be returning the 3 upserted rows from above
> SELECT PK1, PK2
> FROM TEST_TABLE 
> WHERE (PK2, PK1) >= (TO_DATE('1970-01-01'), null)
> -- the EXPLAIN shows the base table is being hit. Why is the base table hit with this RVC?
> explain SELECT PK1, PK2
> FROM TEST_TABLE 
> WHERE (PK2, PK1) >= (TO_DATE('1970-01-01'), null)
> --Note: by adding an ORDER BY statement, the query returns all 3 rows
> SELECT PK1, PK2
> FROM TEST_TABLE 
> WHERE (PK2, PK1) >= (TO_DATE('1970-01-01'), null)
> ORDER BY PK2, PK1;
> -- the EXPLAIN shows the secondary index is now being used by this query
> explain SELECT PK1, PK2
> FROM TEST_TABLE 
> WHERE (PK2, PK1) >= (TO_DATE('1970-01-01'), null)
> ORDER BY PK2, PK1;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)