You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@impala.apache.org by "HeLifu (JIRA)" <ji...@apache.org> on 2017/12/15 10:12:00 UTC

[jira] [Created] (IMPALA-6328) why kudu HashJoin node takes much more time than parquet on Q5 of TPC-H?

HeLifu created IMPALA-6328:
------------------------------

             Summary: why kudu HashJoin node takes much more time than parquet on Q5 of TPC-H?
                 Key: IMPALA-6328
                 URL: https://issues.apache.org/jira/browse/IMPALA-6328
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
    Affects Versions: Impala 2.9.0
            Reporter: HeLifu
         Attachments: TPCH-5-plan-kudu.jpg, TPCH-5-plan-parquet.jpg

hi, i ran a tpc-h test under 1T scale and found that kudu HashJoin node took much more time(double!) than parquet on query 5 of TPC-H.  could anybody help to explain why? 
Thanks in advance. 

*{color:red}Parquet:{color}*
{code:java}
        HASH_JOIN_NODE (id=6):(Total: 3m34s, non-child: 3m18s, % non-child: 92.17%)
          ExecOption: Probe Side Codegen Enabled, Join Build-Side Prepared Asynchronously
           - BuildRows: 227.55M (227546432)
           - BuildTime: 1m8s
           - NumHashTableBuildsSkipped: 0 (0)
           - PeakMemoryUsage: 17.14 GB (18399317760)
           - ProbeRows: 2.00B (2001992073)
           - ProbeRowsPartitioned: 0 (0)
           - ProbeTime: 2m6s
           - RowsReturned: 303.68M (303678704)
           - RowsReturnedRate: 1.41 M/sec
          Hash Join Builder (join_node_id=6):
            ExecOption: Build Side Codegen Enabled, Hash Table Construction Codegen Enabled
            Runtime filters: 0 of 1 Runtime Filter Published, 1 Disabled
             - BuildRowsPartitionTime: 47s054ms
             - BuildRowsPartitioned: 227.55M (227546432)
             - GetNewBlockTime: 6.349ms
             - HashBuckets: 536.87M (536870912)
             - HashCollisions: 0 (0)
             - HashTablesBuildTime: 20s919ms
             - LargestPartitionPercent: 6 (6)
             - MaxPartitionLevel: 0 (0)
             - NumRepartitions: 0 (0)
             - PartitionsCreated: 16 (16)
             - PeakMemoryUsage: 17.13 GB (18397287552)
             - PinTime: 0.000ns
             - RepartitionTime: 0.000ns
             - SpilledPartitions: 0 (0)
             - UnpinTime: 0.000ns
          EXCHANGE_NODE (id=13):(Total: 9s442ms, non-child: 9s048ms, % non-child: 95.83%)
            BytesReceived(8s000ms): 224.35 MB, 907.67 MB, 1.53 GB, 2.16 GB, 2.79 GB, 3.43 GB, 4.11 GB, 4.63 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB, 4.66 GB
             - BytesReceived: 4.66 GB (5004716793)
             - ConvertRowBatchTime: 4s469ms
             - DeserializeRowBatchTimer: 19s513ms
             - FirstBatchArrivalWaitTime: 0.000ns
             - PeakMemoryUsage: 0
             - RowsReturned: 227.55M (227546432)
             - RowsReturnedRate: 24.10 M/sec
             - SendersBlockedTimer: 11s573ms
             - SendersBlockedTotalTimer(*): 26s223ms
        HDFS_SCAN_NODE (id=2):(Total: 7s389ms, non-child: 7s389ms, % non-child: 100.00%)
          Hdfs split stats (<volume id>:<# splits>/<split lengths>): 4:46/11.37 GB 2:52/12.85 GB 5:49/12.11 GB 1:48/11.86 GB 0:48/11.86 GB 3:51/12.60 GB 
          ExecOption: PARQUET Codegen Enabled, Codegen enabled: 294 out of 294
          Runtime filters: All filters arrived. Waited 0
          Hdfs Read Thread Concurrency Bucket: 0:97.12% 1:1.37% 2:0.411% 3:0.274% 4:0.5479% 5:0.274% 6:0% 7:0% 8:0% 9:0% 10:0% 
          File Formats: PARQUET/SNAPPY:1176 
          BytesRead(8s000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0, 80.38 MB, 3.32 GB, 3.33 GB, 3.33 GB, 3.33 GB, 3.33 GB, 6.13 GB, 6.65 GB, 6.65 GB, 6.65 GB, 6.65 GB, 8.79 GB, 9.98 GB, 9.98 GB, 9.98 GB, 9.98 GB, 11.38 GB, 13.31 GB, 13.31 GB, 13.31 GB, 13.31 GB, 15.24 GB, 16.64 GB, 16.64 GB, 16.64 GB, 16.73 GB, 19.83 GB, 19.96 GB, 19.96 GB, 19.96 GB, 21.27 GB, 23.29 GB, 23.29 GB, 23.29 GB, 23.32 GB, 24.34 GB
           - FooterProcessingTime: (Avg: 13.307ms ; Min: 1.982ms ; Max: 141.238ms ; Number of samples: 294)
           - AverageHdfsReadThreadConcurrency: 0.07 
           - AverageScannerThreadConcurrency: 30.55 
           - BytesRead: 24.45 GB (26257935312)
           - BytesReadDataNodeCache: 0
           - BytesReadLocal: 24.45 GB (26257935312)
           - BytesReadRemoteUnexpected: 0
           - BytesReadShortCircuit: 24.45 GB (26257935312)
           - DecompressionTime: 2m21s
           - MaxCompressedTextFileLength: 0
           - NumColumns: 4 (4)
           - NumDictFilteredRowGroups: 0 (0)
           - NumDisksAccessed: 6 (6)
           - NumRowGroups: 294 (294)
           - NumScannerThreadsStarted: 40 (40)
           - NumScannersWithNoReads: 0 (0)
           - NumStatsFilteredRowGroups: 0 (0)
           - PeakMemoryUsage: 3.61 GB (3872112566)
           - PerReadThreadRawHdfsThroughput: 825.37 MB/sec
           - RemoteScanRanges: 0 (0)
           - RowBatchQueueGetWaitTime: 56.799ms
           - RowBatchQueuePutWaitTime: 2h58m
           - RowsRead: 2.00B (2001992073)
           - RowsReturned: 2.00B (2001992073)
           - RowsReturnedRate: 270.93 M/sec
           - ScanRangesComplete: 294 (294)
           - ScannerThreadsInvoluntaryContextSwitches: 4.09K (4089)
           - ScannerThreadsTotalWallClockTime: 3h5m
             - MaterializeTupleTime(*): 3m33s
             - ScannerThreadsSysTime: 1m17s
             - ScannerThreadsUserTime: 5m45s
           - ScannerThreadsVoluntaryContextSwitches: 2.01M (2014822)
           - TotalRawHdfsReadTime(*): 30s339ms
           - TotalReadThroughput: 68.33 MB/sec
          Filter 4 (4.00 MB):
             - Rows processed: 4.82M (4816896)
             - Rows rejected: 0 (0)
             - Rows total: 2.00B (2001992073)
          Filter 5 (16.00 MB):
             - Rows processed: 4.82M (4816896)
             - Rows rejected: 0 (0)
             - Rows total: 2.00B (2001992073)
{code}

{color:red}*kudu:*{color}
{code:java}
        HASH_JOIN_NODE (id=6):(Total: 6m12s, non-child: 5m50s, % non-child: 94.13%)
          ExecOption: Probe Side Codegen Enabled, Join Build-Side Prepared Asynchronously
           - BuildRows: 227.55M (227546432)
           - BuildTime: 29s641ms
           - NumHashTableBuildsSkipped: 0 (0)
           - PeakMemoryUsage: 11.63 GB (12491762944)
           - ProbeRows: 2.00B (2000027549)
           - ProbeRowsPartitioned: 0 (0)
           - ProbeTime: 5m16s
           - RowsReturned: 303.41M (303410631)
           - RowsReturnedRate: 815.04 K/sec
          Hash Join Builder (join_node_id=6):
            ExecOption: Build Side Codegen Enabled, Hash Table Construction Codegen Enabled
             - BuildRowsPartitionTime: 10s944ms
             - BuildRowsPartitioned: 227.55M (227546432)
             - GetNewBlockTime: 4.046ms
             - HashBuckets: 536.87M (536870912)
             - HashCollisions: 0 (0)
             - HashTablesBuildTime: 18s659ms
             - LargestPartitionPercent: 6 (6)
             - MaxPartitionLevel: 0 (0)
             - NumRepartitions: 0 (0)
             - PartitionsCreated: 16 (16)
             - PeakMemoryUsage: 11.63 GB (12491707520)
             - PinTime: 0.000ns
             - RepartitionTime: 0.000ns
             - SpilledPartitions: 0 (0)
             - UnpinTime: 0.000ns
          EXCHANGE_NODE (id=13):(Total: 13s166ms, non-child: 8s365ms, % non-child: 63.54%)
            BytesReceived(16s000ms): 934.36 MB, 2.80 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB, 3.15 GB
             - BytesReceived: 3.15 GB (3383413889)
             - ConvertRowBatchTime: 6s115ms
             - DeserializeRowBatchTimer: 7s268ms
             - FirstBatchArrivalWaitTime: 0.000ns
             - PeakMemoryUsage: 0
             - RowsReturned: 227.55M (227546432)
             - RowsReturnedRate: 17.28 M/sec
             - SendersBlockedTimer: 957.765ms
             - SendersBlockedTotalTimer(*): 1s915ms
        KUDU_SCAN_NODE (id=2):(Total: 8s687ms, non-child: 8s687ms, % non-child: 100.00%)
          BytesRead(16s000ms): 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
           - BytesRead: 0
           - KuduRemoteScanTokens: 0 (0)
           - NumScannerThreadsStarted: 40 (40)
           - PeakMemoryUsage: 5.65 MB (5919744)
           - RowsRead: 2.00B (2000027549)
           - RowsReturned: 2.00B (2000027549)
           - RowsReturnedRate: 230.22 M/sec
           - ScanRangesComplete: 40 (40)
           - ScannerThreadsInvoluntaryContextSwitches: 2.79K (2794)
           - ScannerThreadsTotalWallClockTime: 6h
             - MaterializeTupleTime(*): 8s466ms
             - ScannerThreadsSysTime: 55s996ms
             - ScannerThreadsUserTime: 2m7s
           - ScannerThreadsVoluntaryContextSwitches: 1.99M (1994345)
           - TotalKuduScanRoundTrips: 7.88K (7880)
           - TotalReadThroughput: 0.00 /sec
{code}




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)