Posted to dev@hive.apache.org by "Rajesh Balamohan (JIRA)" <ji...@apache.org> on 2017/07/26 00:57:00 UTC

[jira] [Created] (HIVE-17174) LLAP: ShuffleHandler: optimize fadvise calls for broadcast edge

Rajesh Balamohan created HIVE-17174:
---------------------------------------

             Summary: LLAP: ShuffleHandler: optimize fadvise calls for broadcast edge
                 Key: HIVE-17174
                 URL: https://issues.apache.org/jira/browse/HIVE-17174
             Project: Hive
          Issue Type: Bug
            Reporter: Rajesh Balamohan
            Assignee: Rajesh Balamohan
            Priority: Minor



Currently, once the data is transferred, an `fadvise` call is invoked to throw away the pages. This may not be very helpful for broadcast edges, where the same data tends to be transferred to multiple downstream tasks.

e.g. Q50 at 1 TB scale:

{noformat}
      Edges:
        Map 1 <- Map 5 (BROADCAST_EDGE)
        Map 6 <- Reducer 2 (BROADCAST_EDGE), Reducer 3 (BROADCAST_EDGE), Reducer 4 (BROADCAST_EDGE)
        Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
        Reducer 3 <- Map 1 (CUSTOM_SIMPLE_EDGE)
        Reducer 4 <- Map 1 (CUSTOM_SIMPLE_EDGE)
        Reducer 7 <- Map 1 (CUSTOM_SIMPLE_EDGE), Map 10 (BROADCAST_EDGE), Map 11 (BROADCAST_EDGE), Map 6 (CUSTOM_SIMPLE_EDGE)
        Reducer 8 <- Reducer 7 (SIMPLE_EDGE)
        Reducer 9 <- Reducer 8 (SIMPLE_EDGE)



Status: Running (Executing on YARN cluster with App id application_1490656001509_6084)

----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 5 ..........      llap     SUCCEEDED      1          1        0        0       0       0
Map 1 ..........      llap     SUCCEEDED     11         11        0        0       0       0
Reducer 4 ......      llap     SUCCEEDED      1          1        0        0       0       0
Reducer 2 ......      llap     SUCCEEDED      1          1        0        0       0       0
Reducer 3 ......      llap     SUCCEEDED      1          1        0        0       0       0
Map 6 ..........      llap     SUCCEEDED    139        139        0        0       0       0
Map 10 .........      llap     SUCCEEDED      1          1        0        0       0       0
Map 11 .........      llap     SUCCEEDED      1          1        0        0       0       0
Reducer 7 ......      llap     SUCCEEDED    834        834        0        0       0       0
Reducer 8 ......      llap     SUCCEEDED     24         24        0        0       0       0
Reducer 9 ......      llap     SUCCEEDED      1          1        0        0       0       0
----------------------------------------------------------------------------------------------

e.g. count of evictions per file:

139 /grid/3/hadoop/yarn/local/usercache/rbalamohan/appcache/application_1490656001509_6084/1/output/attempt_1490656001509_6084_1_05_000000_0_18387/file.out
834 /grid/3/hadoop/yarn/local/usercache/rbalamohan/appcache/application_1490656001509_6084/1/output/attempt_1490656001509_6084_1_07_000000_0_18420_1/file.out
834 /grid/3/hadoop/yarn/local/usercache/rbalamohan/appcache/application_1490656001509_6084/1/output/attempt_1490656001509_6084_1_07_000000_0_18420_2/file.out
   
{noformat}


It would be good to call fadvise only in cases where "partition != 0". This would help retain the pages for broadcast.
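
A minimal sketch of the proposed condition, for illustration only. The ShuffleHandler code itself is not shown in this issue, so every name below (FadvisePolicySketch, PageCacheAdvisor, RecordingAdvisor, onTransferComplete) is hypothetical, and the real fadvise call is replaced by a stand-in that only records which files would have had their pages dropped.

```java
import java.util.ArrayList;
import java.util.List;

public class FadvisePolicySketch {

    /** Hypothetical stand-in for whatever actually issues the fadvise call. */
    interface PageCacheAdvisor {
        void dropPages(String path);
    }

    /** Records the paths for which a page drop was requested; makes no system call. */
    static final class RecordingAdvisor implements PageCacheAdvisor {
        final List<String> dropped = new ArrayList<>();

        @Override
        public void dropPages(String path) {
            dropped.add(path);
        }
    }

    /** The rule suggested in this issue: advise only when partition != 0. */
    static boolean shouldDropPages(int partition) {
        return partition != 0;
    }

    /** Hypothetical hook invoked after a shuffle transfer completes. */
    static void onTransferComplete(PageCacheAdvisor advisor, String path, int partition) {
        if (shouldDropPages(partition)) {
            advisor.dropPages(path);
        }
    }

    public static void main(String[] args) {
        RecordingAdvisor advisor = new RecordingAdvisor();
        // The same output fetched by 139 downstream tasks, always as partition 0.
        for (int i = 0; i < 139; i++) {
            onTransferComplete(advisor, "file.out", 0);
        }
        // A transfer of a non-zero partition still requests a page drop.
        onTransferComplete(advisor, "other.out", 3);
        System.out.println("Page drops requested for: " + advisor.dropped);
    }
}
```

With this rule, the repeated fetches of partition 0 never request a page drop, so the pages can stay cached for the next downstream task. Whether "partition != 0" is a sufficient test for a broadcast edge is the assumption made in this issue, not something the sketch verifies.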


