Posted to dev@phoenix.apache.org by "Kyle Buzsaki (JIRA)" <ji...@apache.org> on 2014/07/18 20:22:05 UTC

[jira] [Commented] (PHOENIX-1083) IN list of RVC combined with AND doesn't return expected rows

    [ https://issues.apache.org/jira/browse/PHOENIX-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14066676#comment-14066676 ] 

Kyle Buzsaki commented on PHOENIX-1083:
---------------------------------------

It looks like the issue here is again the WhereOptimizer producing an incorrect cnf. This time, the cause is that the KeySlotVisitor parses the keySlots incorrectly. I've dropped tenant_type_id from the test because it isn't relevant to the bug. The resulting DDL, DML, and query are as follows:
{code}
CREATE TABLE in_test (tenant_id VARCHAR(5) NOT NULL, id INTEGER NOT NULL, user VARCHAR CONSTRAINT pk PRIMARY KEY (tenant_id, id))

upsert into in_test (tenant_id, id, user) values ('a', 1, 'BonA')
upsert into in_test (tenant_id, id, user) values ('a', 2, 'BonB')

select id from in_test WHERE tenant_id = 'a' AND ((id, user) IN ((1, 'BonA'), (1, 'BonB')))
{code}

Here's what we get from the WhereOptimizer:
{code}
schema = [VARCHAR, INTEGER] // tenant_id, id
cnf = [ [ "a" ], ["\x80\x00\x00\x01BonA", "\x80\x00\x00\x01BonB"] ] // tenant_id, (id, user)
{code}

The InListExpression here is incorrectly parsed as a single key slot, so the slot for id is polluted with the user part of the RowValueConstructor. This can't work because the user column is not part of the row key.
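To make the mismatch concrete, here is a minimal Java sketch. It assumes that INTEGER is serialized as a 4-byte big-endian value with the sign bit flipped, which is consistent with the "\x80\x00\x00\x01" seen in the cnf above. The class and method names are hypothetical and not part of Phoenix, and the exact-match comparison is a simplification of what the scan actually does.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; none of these names exist in Phoenix.
class PollutedKeySlotSketch {
    // Assumed INTEGER encoding: big-endian int with the sign bit flipped,
    // so that 1 serializes as \x80\x00\x00\x01.
    static byte[] encodeInt(int v) {
        return ByteBuffer.allocate(4).putInt(v ^ Integer.MIN_VALUE).array();
    }

    // Append the bytes of a user value to an encoded id, as the bad cnf does.
    static byte[] concat(byte[] a, byte[] b) {
        byte[] out = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    // Simplification: a slot "matches" if one of its values equals the
    // id portion of the row key byte for byte.
    static boolean slotMatches(byte[][] slotValues, byte[] idKeyPart) {
        for (byte[] v : slotValues) {
            if (Arrays.equals(v, idKeyPart)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] idInRowKey = encodeInt(1); // id part of the row key for ('a', 1)
        byte[][] polluted = {
            concat(encodeInt(1), "BonA".getBytes(StandardCharsets.UTF_8)),
            concat(encodeInt(1), "BonB".getBytes(StandardCharsets.UTF_8)),
        };
        byte[][] correct = { encodeInt(1) };

        System.out.println(slotMatches(polluted, idInRowKey)); // prints false
        System.out.println(slotMatches(correct, idInRowKey));  // prints true
    }
}
```

Under these assumptions, the polluted values are always four bytes longer than the id part of any row key, so they can never line up with a stored row.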

Running the algebraically equivalent query:
{code}
select id from in_test WHERE tenant_id = 'a' and id = 1 and (user in ('BonA', 'BonB'))
{code}
we see what the correct output should be:
{code}
schema = [VARCHAR, INTEGER] // tenant_id, id
cnf = [ [ "a" ], ["\x80\x00\x00\x01"] ] // tenant_id, id
{code}
Here the WhereOptimizer correctly recognizes that it cannot optimize out the check on the user column. It returns that expression up to the WhereCompiler, where it is handled by a SingleCQKeyValueComparisonFilter.
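As a rough model of that division of labor, the Java sketch below evaluates the equivalent query in two steps: a key check on the primary key columns, then a residual check on user. Everything in it (class, method names, the in-memory table) is a hypothetical stand-in and not Phoenix code; it only shows that the two-step evaluation returns the expected row for the test data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of "key constraint + residual filter"; not Phoenix code.
class KeyPlusFilterSketch {
    static final class Row {
        final String tenantId;
        final int id;
        final String user;

        Row(String tenantId, int id, String user) {
            this.tenantId = tenantId;
            this.id = id;
            this.user = user;
        }
    }

    // Step 1: only the primary key columns (tenant_id, id) constrain the scan.
    static boolean keyMatches(Row r) {
        return r.tenantId.equals("a") && r.id == 1;
    }

    // Step 2: the non-key column is checked by a separate filter.
    static boolean residualMatches(Row r) {
        return Arrays.asList("BonA", "BonB").contains(r.user);
    }

    // Returns the ids of rows that pass both the key check and the filter.
    static List<Integer> query(List<Row> table) {
        List<Integer> ids = new ArrayList<>();
        for (Row r : table) {
            if (keyMatches(r) && residualMatches(r)) {
                ids.add(r.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Row> table = Arrays.asList(
            new Row("a", 1, "BonA"),
            new Row("a", 2, "BonB"));
        System.out.println(query(table)); // prints [1]
    }
}
```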

The solution here is to fix the parsing of the keySlots by the KeySlotVisitor. I'm not yet sure exactly which segment of code is the culprit, but I'll investigate further.

> IN list of RVC combined with AND doesn't return expected rows
> -------------------------------------------------------------
>
>                 Key: PHOENIX-1083
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1083
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 4.0.0, 5.0.0
>            Reporter: Samarth Jain
>            Assignee: James Taylor
>
> {code}
> CREATE TABLE in_test ( user VARCHAR, tenant_id VARCHAR(5) NOT NULL,tenant_type_id VARCHAR(3) NOT NULL,  id INTEGER NOT NULL CONSTRAINT pk PRIMARY KEY (tenant_id, tenant_type_id, id))
> upsert into in_test (tenant_id, tenant_type_id, id, user) values ('a', 'a', 1, 'BonA')
> upsert into in_test (tenant_id, tenant_type_id, id, user) values ('a', 'a', 2, 'BonB')
> select id from in_test WHERE tenant_id = 'a' and tenant_type_id = 'a' and ((id, user) IN ((1, 'BonA'),(1, 'BonA')))
> Rows returned - none. Should have returned one row. 
> {code}


