Posted to issues@impala.apache.org by "Dan Hecht (JIRA)" <ji...@apache.org> on 2018/03/06 23:52:00 UTC
[jira] [Resolved] (IMPALA-3559) Explain plan and profiles reference HDFS while query is running entirely against S3
[ https://issues.apache.org/jira/browse/IMPALA-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dan Hecht resolved IMPALA-3559.
-------------------------------
Resolution: Duplicate
Let's track using IMPALA-6050 which has more discussion.
> Explain plan and profiles reference HDFS while query is running entirely against S3
> -----------------------------------------------------------------------------------
>
> Key: IMPALA-3559
> URL: https://issues.apache.org/jira/browse/IMPALA-3559
> Project: IMPALA
> Issue Type: Task
> Components: Frontend
> Affects Versions: Impala 2.6.0
> Reporter: Mostafa Mokhtar
> Priority: Critical
> Labels: s3, supportability
>
> When running queries against databases stored entirely in S3, the query profile and plan still mention HDFS:
> {code}
> WRITE TO HDFS [tpch_300_parquet_partitioned.lineitem, OVERWRITE=false, PARTITION-KEYS=(L_SHIPDATE)]
> | partitions=2526
> | hosts=32 per-host-mem=12.53GB
> |
> 01:EXCHANGE [HASH(L_SHIPDATE)]
> | hosts=32 per-host-mem=0B
> | tuple-ids=0 row-size=256B cardinality=1682537668
> |
> 00:SCAN HDFS [tpch_300_text_partitioned.lineitem, RANDOM]
> partitions=2526/2526 files=2526 size=210.05GB
> table stats: 1682537668 rows total (74 partition(s) missing stats)
> column stats: all
> hosts=32 per-host-mem=960.00MB
> tuple-ids=0 row-size=256B cardinality=1682537668
> {code}
> {code}
> - TotalIntegrityCheckTime: 0.000ns
> - TotalReadBlockTime: 0.000ns
> HdfsTableSink:(Total: 45s194ms, non-child: 45s194ms, % non-child: 100.00%)
> - BytesWritten: 231.00 B (231)
> - CompressTimer: 5s055ms
> - EncodeTimer: 25s434ms
> - FilesCreated: 57 (57)
> - HdfsWriteTimer: 0.000ns
> - PartitionsCreated: 57 (57)
> - PeakMemoryUsage: 1011.51 MB (1060647936)
> - RowsInserted: 0 (0)
> - TmpFileCreateTimer: 4s703ms
> {code}
> {code}
> HDFS_SCAN_NODE (id=0):(Total: 714.997ms, non-child: 714.997ms, % non-child: 100.00%)
> - AverageHdfsReadThreadConcurrency: 0.36
> - AverageScannerThreadConcurrency: 10.88
> - BytesRead: 2.43 GB (2611737608)
> - BytesReadDataNodeCache: 0
> - BytesReadLocal: 0
> - BytesReadRemoteUnexpected: 0
> - BytesReadShortCircuit: 0
> - DecompressionTime: 0.000ns
> - MaxCompressedTextFileLength: 0
> - NumDisksAccessed: 0 (0)
> - NumScannerThreadsStarted: 10 (10)
> - PeakMemoryUsage: 337.63 MB (354027648)
> - PerReadThreadRawHdfsThroughput: 36.42 MB/sec
> - RemoteScanRanges: 0 (0)
> - RowsRead: 19.48M (19483458)
> - RowsReturned: 19.43M (19431740)
> - RowsReturnedRate: 27.21 M/sec
> - ScanRangesComplete: 74 (74)
> - ScannerThreadsInvoluntaryContextSwitches: 0 (0)
> - ScannerThreadsTotalWallClockTime: 0.000ns
> - DelimiterParseTime: 3s596ms
> - MaterializeTupleTime(*): 7s094ms
> - ScannerThreadsSysTime: 0.000ns
> - ScannerThreadsUserTime: 0.000ns
> - ScannerThreadsVoluntaryContextSwitches: 0 (0)
> - TotalRawHdfsReadTime(*): 1m8s
> - TotalReadThroughput: 13.26 MB/sec
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)