Posted to dev@phoenix.apache.org by "James Taylor (JIRA)" <ji...@apache.org> on 2014/08/07 04:53:11 UTC

[jira] [Commented] (PHOENIX-852) Optimize child/parent foreign key joins

    [ https://issues.apache.org/jira/browse/PHOENIX-852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088735#comment-14088735 ] 

James Taylor commented on PHOENIX-852:
--------------------------------------

I think it'd be easiest to just take the row key columns from the results of the smaller table and use them to generate a query with an IN clause. You can use our row value constructor syntax for this; see RowValueConstructorIT.testQueryMoreWithInListRowValueConstructor() for an example. You wouldn't even need to send a hash cache to the region servers. Just let the IN query go through the normal path, since it's already executed pretty optimally. I think you'd need to generate a different query plan altogether for this, though.
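
To make the idea concrete, here is a rough sketch of how the IN query text might be generated from the smaller side's row key columns. It is not code from Phoenix: the class and method names (InListJoinSketch, buildInQuery) and the table and column names in the usage note are hypothetical, and it only builds the SQL string with bind placeholders, assuming the row value constructor form "(col1, col2) IN ((?, ?), (?, ?))".

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Phoenix: builds the text of an IN query
// that uses row value constructor syntax over the given row key columns.
class InListJoinSketch {

    // table     - name of the larger table to query (assumed, for illustration)
    // pkColumns - leading row key columns taken from the smaller table's results
    // rowCount  - number of rows returned by the smaller table
    static String buildInQuery(String table, List<String> pkColumns, int rowCount) {
        if (pkColumns.isEmpty() || rowCount <= 0) {
            throw new IllegalArgumentException("need at least one column and one row");
        }
        // Left-hand side row value constructor, e.g. (ORG_ID, ENTITY_ID)
        String lhs = "(" + String.join(", ", pkColumns) + ")";
        // One placeholder tuple per row, e.g. (?, ?)
        String tuple = "("
                + String.join(", ", Collections.nCopies(pkColumns.size(), "?"))
                + ")";
        String inList = String.join(", ", Collections.nCopies(rowCount, tuple));
        return "SELECT * FROM " + table + " WHERE " + lhs + " IN (" + inList + ")";
    }
}
```

For two rows keyed on two columns, buildInQuery("PARENT", Arrays.asList("ORG_ID", "ENTITY_ID"), 2) yields "SELECT * FROM PARENT WHERE (ORG_ID, ENTITY_ID) IN ((?, ?), (?, ?))". The key values from the smaller table would then be bound in row order on a PreparedStatement, and the query would go through the normal path.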

> Optimize child/parent foreign key joins
> ---------------------------------------
>
>                 Key: PHOENIX-852
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-852
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: James Taylor
>            Assignee: Maryann Xue
>
> Often a join will occur from a child to a parent. Our current algorithm would do a full scan of one side or the other. We can do much better than that if the HashCache contains the PK (or even part of the PK) from the table being joined to. In these cases, we should drive the second scan through a skip scan on the server side.



--
This message was sent by Atlassian JIRA
(v6.2#6252)