Posted to commits@impala.apache.org by jr...@apache.org on 2017/10/06 04:45:35 UTC

incubator-impala git commit: IMPALA-3316: [DOCS] Add known issue for timezone conversion slowdown

Repository: incubator-impala
Updated Branches:
  refs/heads/master ec957456d -> 1e581a66d


IMPALA-3316: [DOCS] Add known issue for timezone conversion slowdown

Change-Id: I9933ced07e339d589f7f74173cfebe938084e65c
Reviewed-on: http://gerrit.cloudera.org:8080/8165
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Reviewed-by: Alex Behm <al...@cloudera.com>
Tested-by: Impala Public Jenkins


Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/1e581a66
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/1e581a66
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/1e581a66

Branch: refs/heads/master
Commit: 1e581a66dddae5b400e50e440063d16de868bb63
Parents: ec95745
Author: John Russell <jr...@cloudera.com>
Authored: Thu Sep 28 10:36:39 2017 -0700
Committer: Impala Public Jenkins <im...@gerrit.cloudera.org>
Committed: Fri Oct 6 04:42:15 2017 +0000

----------------------------------------------------------------------
 docs/topics/impala_known_issues.xml | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/1e581a66/docs/topics/impala_known_issues.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_known_issues.xml b/docs/topics/impala_known_issues.xml
index 14ff4e3..28196f5 100644
--- a/docs/topics/impala_known_issues.xml
+++ b/docs/topics/impala_known_issues.xml
@@ -305,6 +305,32 @@ https://issues.apache.org/jira/browse/IMPALA-2144 - Don't have
 
     </conbody>
 
+    <concept id="IMPALA-3316">
+      <title>Slow queries for Parquet tables with convert_legacy_hive_parquet_utc_timestamps=true</title>
+      <conbody>
+        <p>
+          The configuration setting <codeph>convert_legacy_hive_parquet_utc_timestamps=true</codeph>
+          uses an underlying function that can become a bottleneck for high-volume, highly concurrent
+          queries because it takes a global lock while loading time zone information. This bottleneck
+          can slow down queries against Parquet tables by up to 30x for scan-heavy queries. The size
+          of the slowdown depends on factors such as the number of cores and the number of threads involved in the query.
+        </p>
+        <note>
+          <p>
+            The slowdown occurs only when accessing <codeph>TIMESTAMP</codeph> columns within Parquet files that
+            were generated by Hive, and that therefore require on-the-fly time zone conversion.
+          </p>
+        </note>
+        <p><b>Bug:</b> <xref keyref="IMPALA-3316">IMPALA-3316</xref></p>
+        <p><b>Severity:</b> High</p>
+        <p><b>Workaround:</b> If the <codeph>TIMESTAMP</codeph> values stored in the table represent dates only,
+          with no time portion, consider storing them as strings in <codeph>yyyy-MM-dd</codeph> format.
+          Impala implicitly converts such string values to <codeph>TIMESTAMP</codeph> in calls to date/time
+          functions.
+        </p>
+      </conbody>
+    </concept>
+
     <concept id="IMPALA-1480" rev="IMPALA-1480">
 
 <!-- Not part of Alex's spreadsheet. Spreadsheet has IMPALA-1423 which mentions it's similar to this one but not a duplicate. -->
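
Illustration (not part of the commit): the contention described in the new known issue can be sketched in a few lines of Python. This is a toy model, not Impala's implementation. The class name LockedConverter, the fixed UTC-7 offset, and the row counts are all hypothetical; the only thing taken from the text above is that each on-the-fly conversion passes through one process-wide lock, so scanner threads serialize on it.

```python
import threading
from datetime import datetime, timedelta, timezone


class LockedConverter:
    """Toy model: every timestamp conversion takes one shared (global) lock."""

    def __init__(self, offset_hours):
        # Assumption: a fixed offset stands in for a real time zone database lookup.
        self._tz = timezone(timedelta(hours=offset_hours))
        self._lock = threading.Lock()
        self.acquisitions = 0  # how many times the shared lock was taken

    def utc_to_local(self, ts_utc):
        # All threads funnel through this lock, one value at a time.
        with self._lock:
            self.acquisitions += 1
            return ts_utc.astimezone(self._tz)


def scan(converter, rows, out):
    """Simulate one scanner thread converting every TIMESTAMP value it reads."""
    for ts in rows:
        out.append(converter.utc_to_local(ts))


def run_scan(num_threads, rows_per_thread, offset_hours=-7):
    """Run several scanner threads against one converter; return results and converter."""
    converter = LockedConverter(offset_hours)
    base = datetime(2017, 10, 6, 4, 45, 35, tzinfo=timezone.utc)
    rows = [base + timedelta(seconds=i) for i in range(rows_per_thread)]
    results = [[] for _ in range(num_threads)]
    threads = [
        threading.Thread(target=scan, args=(converter, rows, results[i]))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, converter


if __name__ == "__main__":
    results, converter = run_scan(num_threads=4, rows_per_thread=100)
    # One lock acquisition per converted value, regardless of thread count.
    print(converter.acquisitions)
```

In this model the number of lock acquisitions grows with rows times threads, which is why adding cores or threads does not help once the lock is the limiting factor. Storing date-only values as strings, as the workaround suggests, avoids the per-value conversion path altogether.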