Posted to dev@pig.apache.org by "Pradeep Kamath (JIRA)" <ji...@apache.org> on 2009/10/03 00:30:23 UTC

[jira] Updated: (PIG-953) Enable merge join in pig to work with loaders and store functions which can internally index sorted data

     [ https://issues.apache.org/jira/browse/PIG-953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath updated PIG-953:
-------------------------------

    Attachment: PIG-953-3.patch

Attached a patch that adds the SortColInfo implementation, which conveys sort column information in SortInfo. This patch also addresses PIG-981.

> Enable merge join in pig to work with loaders and store functions which can internally index sorted data 
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-953
>                 URL: https://issues.apache.org/jira/browse/PIG-953
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.3.0
>            Reporter: Pradeep Kamath
>            Assignee: Pradeep Kamath
>         Attachments: PIG-953-2.patch, PIG-953-3.patch, PIG-953.patch
>
>
> Currently, the merge join implementation in Pig constructs an index on the sorted data and uses that index to seek into the "right input" to perform the join efficiently. Some loaders (notably the Zebra loader) internally maintain an index on sorted data and can perform this seek efficiently using their own index. The use of the index therefore needs to be abstracted so that, when the loader supports indexing, Pig uses that index (indirectly, through the loader) and does not construct one itself.
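
The abstraction described in the issue can be illustrated with a minimal Java sketch. Everything in it is an assumption made for illustration and is not taken from the patch: the names IndexableLoader, SelfIndexingLoader, PigIndexedLoader, and MergeJoin are hypothetical, records are reduced to bare int keys, and the "index" is a toy. The point is only that the join code calls seekNear() on the right input and never needs to know whether the loader uses its own index or one built on its behalf.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical contract: a right-input loader that can seek on sorted data. */
interface IndexableLoader {
    /** Position the reader at the first record whose key is >= key. */
    void seekNear(int key);
    /** Return the next key, or null when the input is exhausted. */
    Integer next();
}

/** Hypothetical loader that owns its index (stand-in: binary search). */
final class SelfIndexingLoader implements IndexableLoader {
    private final int[] sortedKeys;
    private int pos = 0;
    SelfIndexingLoader(int[] sortedKeys) { this.sortedKeys = sortedKeys; }
    public void seekNear(int key) {
        int lo = 0, hi = sortedKeys.length;
        while (lo < hi) {                       // lower-bound binary search
            int mid = (lo + hi) >>> 1;
            if (sortedKeys[mid] < key) lo = mid + 1; else hi = mid;
        }
        pos = lo;
    }
    public Integer next() { return pos < sortedKeys.length ? sortedKeys[pos++] : null; }
}

/** Hypothetical fallback: an index is constructed for a loader that has none. */
final class PigIndexedLoader implements IndexableLoader {
    private static final int BLOCK = 2;         // records per indexed block
    private final int[] sortedKeys;
    private final int[] blockFirstKeys;         // sparse index: first key of each block
    private int pos = 0;
    PigIndexedLoader(int[] sortedKeys) {
        this.sortedKeys = sortedKeys;
        int blocks = (sortedKeys.length + BLOCK - 1) / BLOCK;
        blockFirstKeys = new int[blocks];
        for (int b = 0; b < blocks; b++) blockFirstKeys[b] = sortedKeys[b * BLOCK];
    }
    public void seekNear(int key) {
        int b = 0;                              // last block whose first key is < key
        while (b + 1 < blockFirstKeys.length && blockFirstKeys[b + 1] < key) b++;
        pos = b * BLOCK;
        while (pos < sortedKeys.length && sortedKeys[pos] < key) pos++;  // scan within block
    }
    public Integer next() { return pos < sortedKeys.length ? sortedKeys[pos++] : null; }
}

final class MergeJoin {
    /** Joins sorted left keys against the right loader; emits one key per matching pair. */
    static List<Integer> join(int[] sortedLeft, IndexableLoader right) {
        List<Integer> out = new ArrayList<>();
        if (sortedLeft.length == 0) return out;
        right.seekNear(sortedLeft[0]);          // skip the right-side prefix that cannot match
        Integer r = right.next();
        int i = 0;
        while (i < sortedLeft.length && r != null) {
            int l = sortedLeft[i];
            if (l < r) { i++; }
            else if (l > r) { r = right.next(); }
            else {
                int rightRun = 0;               // count right records sharing this key
                while (r != null && r == l) { rightRun++; r = right.next(); }
                while (i < sortedLeft.length && sortedLeft[i] == l) {
                    for (int k = 0; k < rightRun; k++) out.add(l);
                    i++;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] left = {1, 3, 3, 7};
        int[] right = {0, 2, 3, 3, 5, 7, 9};
        System.out.println(join(left, new SelfIndexingLoader(right)));  // [3, 3, 3, 3, 7]
        System.out.println(join(left, new PigIndexedLoader(right)));    // [3, 3, 3, 3, 7]
    }
}
```

Both loaders produce the same join result; only the way the seek is served differs, which is the separation the issue asks for.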

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.