Posted to commits@impala.apache.org by jr...@apache.org on 2017/10/06 04:45:35 UTC
incubator-impala git commit: IMPALA-3316: [DOCS] Add known issue for timezone conversion slowdown
Repository: incubator-impala
Updated Branches:
refs/heads/master ec957456d -> 1e581a66d
IMPALA-3316: [DOCS] Add known issue for timezone conversion slowdown
Change-Id: I9933ced07e339d589f7f74173cfebe938084e65c
Reviewed-on: http://gerrit.cloudera.org:8080/8165
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Reviewed-by: Alex Behm <al...@cloudera.com>
Tested-by: Impala Public Jenkins
Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/1e581a66
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/1e581a66
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/1e581a66
Branch: refs/heads/master
Commit: 1e581a66dddae5b400e50e440063d16de868bb63
Parents: ec95745
Author: John Russell <jr...@cloudera.com>
Authored: Thu Sep 28 10:36:39 2017 -0700
Committer: Impala Public Jenkins <im...@gerrit.cloudera.org>
Committed: Fri Oct 6 04:42:15 2017 +0000
----------------------------------------------------------------------
docs/topics/impala_known_issues.xml | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/1e581a66/docs/topics/impala_known_issues.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_known_issues.xml b/docs/topics/impala_known_issues.xml
index 14ff4e3..28196f5 100644
--- a/docs/topics/impala_known_issues.xml
+++ b/docs/topics/impala_known_issues.xml
@@ -305,6 +305,32 @@ https://issues.apache.org/jira/browse/IMPALA-2144 - Don't have
</conbody>
+ <concept id="IMPALA-3316">
+ <title>Slow queries for Parquet tables with convert_legacy_hive_parquet_utc_timestamps=true</title>
+ <conbody>
+ <p>
+ The configuration setting <codeph>convert_legacy_hive_parquet_utc_timestamps=true</codeph>
+ uses an underlying function that can become a bottleneck for high-volume, highly concurrent
+ queries because it takes a global lock while loading time zone information. This bottleneck
+ can slow queries against Parquet tables by up to 30x for scan-heavy queries. The degree
+ of slowdown depends on factors such as the number of cores and the number of threads involved in the query.
+ </p>
+ <note>
+ <p>
+ The slowdown only occurs when accessing <codeph>TIMESTAMP</codeph> columns within Parquet files that
+ were generated by Hive, and therefore require the on-the-fly timezone conversion processing.
+ </p>
+ </note>
+ <p><b>Bug:</b> <xref keyref="IMPALA-3316">IMPALA-3316</xref></p>
+ <p><b>Severity:</b> High</p>
+ <p><b>Workaround:</b> If the <codeph>TIMESTAMP</codeph> values stored in the table represent dates only,
+ with no time portion, consider storing them as strings in <codeph>yyyy-MM-dd</codeph> format.
+ Impala implicitly converts such string values to <codeph>TIMESTAMP</codeph> in calls to date/time
+ functions.
+ </p>
+ </conbody>
+ </concept>
+
<concept id="IMPALA-1480" rev="IMPALA-1480">
<!-- Not part of Alex's spreadsheet. Spreadsheet has IMPALA-1423 which mentions it's similar to this one but not a duplicate. -->