Posted to dev@hive.apache.org by "Jason Dere (JIRA)" <ji...@apache.org> on 2017/03/31 01:03:41 UTC
[jira] [Created] (HIVE-16341) Tez Task Execution Summary has incorrect input record counts on some operators
Jason Dere created HIVE-16341:
---------------------------------
Summary: Tez Task Execution Summary has incorrect input record counts on some operators
Key: HIVE-16341
URL: https://issues.apache.org/jira/browse/HIVE-16341
Project: Hive
Issue Type: Bug
Components: Tez
Reporter: Jason Dere
Assignee: Jason Dere
{noformat}
Task Execution Summary
------------------------------------------------------------------------------------------------------------------------------
VERTICES    TOTAL_TASKS  FAILED_ATTEMPTS  KILLED_TASKS  DURATION(ms)  CPU_TIME(ms)  GC_TIME(ms)  INPUT_RECORDS  OUTPUT_RECORDS
------------------------------------------------------------------------------------------------------------------------------
Map 1               167                0             0      17640.00     2,109,200       23,068    150,000,004      11,995,136
Map 11                5                0             0      10559.00        71,960          633      4,023,690         799,900
Map 13                1                0             0       2244.00         6,090           29             25               3
Map 3                 1                0             0       2849.00         7,080           99             25               3
Map 5               271                0             0      55834.00    12,934,890      358,376  1,500,000,001   1,500,000,161
Map 7               241                0             0      91243.00     5,020,860       71,182  1,827,250,341     652,413,443
Reducer 10            1                0             0       1010.00         1,900            0              4               0
Reducer 12            1                0             0       3854.00         1,320            0        799,900               1
Reducer 14            1                0             0       1420.00         3,790           45              3               1
Reducer 2             1                0             0       9720.00         6,220          122     11,995,136               1
Reducer 4             1                0             0        810.00         2,100          105              3               1
Reducer 6             1                0             0      24863.00         3,260            5  1,500,000,161               1
Reducer 8           412                0             0      88215.00    17,106,440      184,524  2,165,208,640           1,864
Reducer 9             2                0             0      29752.00         3,980            0          1,864               4
------------------------------------------------------------------------------------------------------------------------------
{noformat}
Seen on queries that use runtime filtering. The INPUT_RECORDS values look incorrect for the reducers responsible for aggregating the min/max/bloomfilter (Reducers 12, 14, 2, 6). For example, Reducer 2 shows about 12M input records (11,995,136, the same value as Map 1's OUTPUT_RECORDS), but the task logs for Reducer 2 show only 167 input records.
It looks like Map 1 has two output vertices (Reducer 2 and Reducer 8), and the total output row count for Map 1, rather than just the rows sent to each specific vertex, is being counted as input rows for both Reducer 2 and Reducer 8.
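
To make the suspected double counting concrete, here is a minimal Python sketch. It is not Hive or Tez code. The names edge_records, total_output, input_records_suspected and input_records_per_edge are hypothetical. Only Map 1's total of 11,995,136 output records and Reducer 2's 167 logged input records come from the report above; the count on the Map 1 -> Reducer 8 edge is an assumption (the remainder of Map 1's total).

```python
# Hypothetical per-edge record counts: (source vertex, destination vertex) -> records.
# Only the 167 figure (Reducer 2's input per its task logs) and Map 1's total of
# 11,995,136 come from the report; the share sent to Reducer 8 is assumed.
edge_records = {
    ("Map 1", "Reducer 2"): 167,
    ("Map 1", "Reducer 8"): 11_995_136 - 167,
}


def total_output(vertex):
    """Total records a vertex emits across all of its outgoing edges."""
    return sum(n for (src, _dst), n in edge_records.items() if src == vertex)


def input_records_suspected(vertex):
    """Suspected behaviour: add each upstream vertex's *total* output."""
    sources = {src for (src, dst) in edge_records if dst == vertex}
    return sum(total_output(src) for src in sources)


def input_records_per_edge(vertex):
    """Expected behaviour: add only the records on edges ending at this vertex."""
    return sum(n for (_src, dst), n in edge_records.items() if dst == vertex)


for v in ("Reducer 2", "Reducer 8"):
    print(v, input_records_suspected(v), input_records_per_edge(v))
```

Under these assumptions, the suspected computation reports Map 1's full output for both reducers, while the per-edge computation gives Reducer 2 only the 167 records seen in its task logs.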