Posted to issues@tez.apache.org by "Jeff Zhang (JIRA)" <ji...@apache.org> on 2014/08/26 08:01:02 UTC

[jira] [Created] (TEZ-1499) Add OrderedJoinExample to tez-examples

Jeff Zhang created TEZ-1499:
-------------------------------

             Summary: Add OrderedJoinExample to tez-examples
                 Key: TEZ-1499
                 URL: https://issues.apache.org/jira/browse/TEZ-1499
             Project: Apache Tez
          Issue Type: Bug
            Reporter: Jeff Zhang
            Assignee: Jeff Zhang


In the current join example, the inputs of JoinProcessor are unordered, so it always needs to load one input into memory and stream the other. This only works when one dataset is small enough to fit into memory (even without broadcast, memory may not be enough). So I'd like to add another join example in which the inputs of JoinProcessor are ordered (using OrderedPartitionedKVEdgeConfig). This kind of join can be used when both datasets are large.
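
To illustrate why ordered inputs remove the memory constraint, here is a minimal sketch of a merge join over two key streams that are already sorted. It is plain Java and does not use any Tez API: the class name MergeJoinSketch and the method mergeJoin are hypothetical, and the assumption that the join emits each key present in both inputs once is made only for illustration. In a real processor the result would be written to an output as it is produced rather than collected in a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; not part of tez-examples.
public class MergeJoinSketch {

    // Joins two key streams that are each sorted in ascending order and
    // contain no null elements. Only the current key of each side is held
    // in memory while scanning, so neither input has to fit in memory.
    public static List<String> mergeJoin(Iterator<String> left, Iterator<String> right) {
        // Collected in a list to keep the sketch small and testable.
        List<String> joined = new ArrayList<>();
        String l = left.hasNext() ? left.next() : null;
        String r = right.hasNext() ? right.next() : null;
        while (l != null && r != null) {
            int cmp = l.compareTo(r);
            if (cmp < 0) {
                // Left key is smaller: it cannot match anything later on the right.
                l = left.hasNext() ? left.next() : null;
            } else if (cmp > 0) {
                // Right key is smaller: advance the right side.
                r = right.hasNext() ? right.next() : null;
            } else {
                // Keys match: emit once, then skip repeats of this key on both sides.
                String key = l;
                joined.add(key);
                while (l != null && l.equals(key)) {
                    l = left.hasNext() ? left.next() : null;
                }
                while (r != null && r.equals(key)) {
                    r = right.hasNext() ? right.next() : null;
                }
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("apple", "banana", "cherry", "cherry", "fig");
        List<String> b = Arrays.asList("banana", "cherry", "date", "fig", "grape");
        System.out.println(mergeJoin(a.iterator(), b.iterator())); // prints [banana, cherry, fig]
    }
}
```

By contrast, a join over unordered inputs has to build an in-memory set or map from one whole side before it can stream the other, which is the limitation described above.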



--
This message was sent by Atlassian JIRA
(v6.2#6252)