Posted to issues@tez.apache.org by "Siddharth Seth (JIRA)" <ji...@apache.org> on 2014/03/17 19:27:43 UTC

[jira] [Commented] (TEZ-938) Shuffle Phase - Reduce number of http connections where there is large number of empty partitions

    [ https://issues.apache.org/jira/browse/TEZ-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938169#comment-13938169 ] 

Siddharth Seth commented on TEZ-938:
------------------------------------

Thanks for taking this up, Rajesh - avoiding the unnecessary connections and reads in case of shuffle is useful. Comments:

Data written by OnFileSortedOutput can be read downstream by ShuffledMergedInput and by ShuffledUnorderedKVInput (when OnFileSortedOutput is used primarily for partitioning). That should change as well to make use of the bitfield - but can be a separate jira if you prefer. 

DataViaEvents is really used to send the entire payload (in case of broadcast) - I don't think we should be re-using the same config parameter to indicate whether the bit field is available (especially since users don't necessarily configure settings on a per vertex level - so this ends up actually turning on dataViaEvents). I think a new config option for this is a better approach - with enough information in the payload itself for the consumer to know whether to try parsing the bitset or to read directly (I think this exists already in the patch). A new field in the proto definition will likely be required to store the bitset - and the existence of the field should be sufficient for consumers to determine whether they need to read it.
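
A minimal sketch of the "field presence decides" idea, for illustration only. ShufflePayloadSketch, hasEmptyPartitions and shouldFetch are hypothetical names, not the actual Tez proto class or the code in the patch; the null check stands in for the has-style accessor that protobuf generates for an optional field.

```java
import java.util.BitSet;

// Hypothetical stand-in for the shuffle event payload; NOT the real Tez proto class.
final class ShufflePayloadSketch {
    // Serialized bitset of empty partitions; null when the producer did not set the field.
    private final byte[] emptyPartitions;

    ShufflePayloadSketch(byte[] emptyPartitions) {
        this.emptyPartitions = emptyPartitions;
    }

    // Plays the role of a generated has-accessor for an optional proto field.
    boolean hasEmptyPartitions() {
        return emptyPartitions != null;
    }

    // Consumer-side decision: skip the fetch only when the field exists
    // and the bit for this partition is set.
    boolean shouldFetch(int partition) {
        if (!hasEmptyPartitions()) {
            return true; // older producer or feature disabled: read directly
        }
        return !BitSet.valueOf(emptyPartitions).get(partition);
    }
}
```

With this shape, a consumer needs no separate config flag to decide how to interpret the payload: an absent field simply means "fetch as before".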

In ShuffleInputEventHandler - logging whether data is available needn't be at DEBUG level. It's 1 log line per upstream partition and useful to have. Also, I don't think the catch(Throwable) is required. If there are any runtime exceptions - let them show up as is.

ShuffleScheduler - a lot of the code in addEmptyPartition is common to copySucceeded. Can that be consolidated ?

TezIndexRecord - the dataPresent field isn't needed for now. Also, could you please convert the TODOs into a separate jira and include that in the code. (Avoid writing out empty partitions)
{code}
if (partLength != 6) {
{code}
Will this always be correct? partLength represents the compressed size of the data. I think a check on rawLength==2 is needed instead (which corresponds to the size of the EOF marker).
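
A rough sketch of why the raw length is the safer thing to test. EmptyPartitionCheckSketch and its methods are hypothetical, and java.util.zip.Deflater is used only as a stand-in for whatever codec is configured: the same input produces different compressed sizes under raw and zlib-wrapped DEFLATE, while its raw length stays fixed.

```java
import java.util.zip.Deflater;

// Hypothetical helper, not code from the patch.
final class EmptyPartitionCheckSketch {
    // Assumption taken from the comment above: an empty partition holds
    // only the EOF marker, which is 2 bytes before compression.
    static final long EOF_MARKER_RAW_LENGTH = 2;

    // Codec-independent emptiness test based on the uncompressed length.
    static boolean isEmptyPartition(long rawLength) {
        return rawLength == EOF_MARKER_RAW_LENGTH;
    }

    // Compressed size of the input under raw DEFLATE (nowrap = true)
    // or zlib-wrapped DEFLATE (nowrap = false).
    static int compressedLength(byte[] raw, boolean nowrap) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, nowrap);
        try {
            deflater.setInput(raw);
            deflater.finish();
            byte[] buf = new byte[64];
            int total = 0;
            while (!deflater.finished()) {
                total += deflater.deflate(buf);
            }
            return total;
        } finally {
            deflater.end();
        }
    }
}
```

Comparing compressedLength for the same 2-byte input with nowrap set to true and to false gives two different sizes, so a fixed constant for the compressed length only holds for one particular codec and framing.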

The bit array, in many cases, is likely to be very compressible. We should be able to change the event to be always compressed to reduce memory pressure within the AM (The AM should not be decompressing the events - it mainly just forwards the events to tasks). The downstream consumers and other producers which use this payload will also need to change - OnFileUnorderedKVOutput, ShuffledMergedInput, ShuffledUnorderedKVInput.
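
As an illustration of how compressible such a bit array can be, here is a small sketch using java.util.BitSet with the JDK's Deflater/Inflater. EmptyPartitionBitsCodecSketch is a hypothetical name and the choice of DEFLATE is an assumption; the patch may well use a different codec or helper.

```java
import java.io.ByteArrayOutputStream;
import java.util.BitSet;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not code from the patch.
final class EmptyPartitionBitsCodecSketch {

    // Producer side: serialize the bitset and compress it before it goes into the event.
    static byte[] compress(BitSet bits) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(bits.toByteArray());
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            while (!deflater.finished()) {
                out.write(buf, 0, deflater.deflate(buf));
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    // Consumer side: only the downstream task decompresses; anything in between
    // can forward the compressed bytes untouched.
    static BitSet decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                out.write(buf, 0, n);
                if (n == 0 && !inflater.finished()) {
                    throw new IllegalArgumentException("truncated or unsupported input");
                }
            }
            return BitSet.valueOf(out.toByteArray());
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt compressed bitset", e);
        } finally {
            inflater.end();
        }
    }
}
```

A bitset with long runs of set bits (many empty partitions) shrinks to a small fraction of its serialized size this way, and the round trip through compress and decompress returns an equal BitSet.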






> Shuffle Phase - Reduce number of http connections where there is large number of empty partitions 
> --------------------------------------------------------------------------------------------------
>
>                 Key: TEZ-938
>                 URL: https://issues.apache.org/jira/browse/TEZ-938
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>              Labels: 0.4
>             Fix For: 0.4.0
>
>         Attachments: TEZ-938-v1.patch
>
>
> Lots of TPC queries make thousands of 2 byte reads.  This is due to the large number of empty partitions.  We need to reduce the number of http calls made in such scenarios.



--
This message was sent by Atlassian JIRA
(v6.2#6252)