You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@impala.apache.org by ar...@apache.org on 2018/04/30 18:44:03 UTC
impala git commit: IMPALA-6522: [DOCS] Document Decimal V2
Repository: impala
Updated Branches:
refs/heads/master 8a609b351 -> 0833408fd
IMPALA-6522: [DOCS] Document Decimal V2
Change-Id: Ic436ff80c9ad05cfada97280cd47552879214a3d
Cherry-picks: not for 2.x.
Reviewed-on: http://gerrit.cloudera.org:8080/10066
Tested-by: Impala Public Jenkins <im...@cloudera.com>
Reviewed-by: Alex Rodoni <ar...@cloudera.com>
Project: http://git-wip-us.apache.org/repos/asf/impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/impala/commit/0833408f
Tree: http://git-wip-us.apache.org/repos/asf/impala/tree/0833408f
Diff: http://git-wip-us.apache.org/repos/asf/impala/diff/0833408f
Branch: refs/heads/master
Commit: 0833408fd769a7225cbdd21ff9b1555f6031e796
Parents: 8a609b3
Author: Alex Rodoni <ar...@cloudera.com>
Authored: Mon Mar 26 18:11:23 2018 -0700
Committer: Alex Rodoni <ar...@cloudera.com>
Committed: Mon Apr 30 18:13:34 2018 +0000
----------------------------------------------------------------------
docs/impala_keydefs.ditamap | 1 -
docs/topics/impala_decimal.xml | 1535 +++++++++++++++++++----------------
2 files changed, 839 insertions(+), 697 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/impala/blob/0833408f/docs/impala_keydefs.ditamap
----------------------------------------------------------------------
diff --git a/docs/impala_keydefs.ditamap b/docs/impala_keydefs.ditamap
index 21c4e3e..08a53ff 100644
--- a/docs/impala_keydefs.ditamap
+++ b/docs/impala_keydefs.ditamap
@@ -10581,7 +10581,6 @@ under the License.
<keydef keys="impala132"><topicmeta><keywords><keyword>Impala 1.3.2</keyword></keywords></topicmeta></keydef>
<keydef keys="impala130"><topicmeta><keywords><keyword>Impala 1.3.0</keyword></keywords></topicmeta></keydef>
-<!-- Long form of mapping from Impala release to vendor-specific releases, for use in running text. -->
<keydef keys="impala30_full"><topicmeta><keywords><keyword>Impala 3.0</keyword></keywords></topicmeta></keydef>
<keydef keys="impala212_full"><topicmeta><keywords><keyword>Impala 2.12</keyword></keywords></topicmeta></keydef>
<keydef keys="impala211_full"><topicmeta><keywords><keyword>Impala 2.11</keyword></keywords></topicmeta></keydef>
http://git-wip-us.apache.org/repos/asf/impala/blob/0833408f/docs/topics/impala_decimal.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_decimal.xml b/docs/topics/impala_decimal.xml
index 6aa8b1e..00f4e84 100644
--- a/docs/topics/impala_decimal.xml
+++ b/docs/topics/impala_decimal.xml
@@ -20,8 +20,14 @@ under the License.
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept rev="1.4.0" id="decimal">
- <title>DECIMAL Data Type (<keyword keyref="impala14"/> or higher only)</title>
- <titlealts audience="PDF"><navtitle>DECIMAL</navtitle></titlealts>
+ <title>DECIMAL Data Type (<keyword keyref="impala30_full"/> or higher only)</title>
+
+ <titlealts audience="PDF">
+
+ <navtitle>DECIMAL</navtitle>
+
+ </titlealts>
+
<prolog>
<metadata>
<data name="Category" value="Impala"/>
@@ -36,62 +42,54 @@ under the License.
<conbody>
<p>
- A numeric data type with fixed scale and precision, used in <codeph>CREATE TABLE</codeph> and <codeph>ALTER
- TABLE</codeph> statements. Suitable for financial and other arithmetic calculations where the imprecise
- representation and rounding behavior of <codeph>FLOAT</codeph> and <codeph>DOUBLE</codeph> make those types
- impractical.
+ The <codeph>DECIMAL</codeph> data type is a numeric data type with fixed scale and
+ precision.
+ </p>
+
+ <p>
+ The data type is useful for storing and doing operations on precise decimal values.
</p>
<p conref="../shared/impala_common.xml#common/syntax_blurb"/>
+<codeblock>DECIMAL[(<varname>precision</varname>[, <varname>scale</varname>])]</codeblock>
+
<p>
- In the column definition of a <codeph>CREATE TABLE</codeph> statement:
+ <b>Precision:</b>
</p>
-<codeblock><varname>column_name</varname> DECIMAL[(<varname>precision</varname>[,<varname>scale</varname>])]</codeblock>
+ <p>
+ <varname>precision</varname> represents the total number of digits that can be represented
+ regardless of the location of the decimal point.
+ </p>
<p>
- <codeph>DECIMAL</codeph> with no precision or scale values is equivalent to <codeph>DECIMAL(9,0)</codeph>.
+ This value must be between 1 and 38, specified as an integer literal.
</p>
<p>
- <b>Precision and Scale:</b>
+ The default precision is 9.
</p>
<p>
- <varname>precision</varname> represents the total number of digits that can be represented by the column,
- regardless of the location of the decimal point. This value must be between 1 and 38. For example,
- representing integer values up to 9999, and floating-point values up to 99.99, both require a precision of 4.
- You can also represent corresponding negative values, without any change in the precision. For example, the
- range -9999 to 9999 still only requires a precision of 4.
+ <b>Scale:</b>
</p>
<p>
- <varname>scale</varname> represents the number of fractional digits. This value must be less than or equal to
- <varname>precision</varname>. A scale of 0 produces integral values, with no fractional part. If precision
- and scale are equal, all the digits come after the decimal point, making all the values between 0 and
- 0.999... or 0 and -0.999...
+ <varname>scale</varname> represents the number of fractional digits.
</p>
<p>
- When <varname>precision</varname> and <varname>scale</varname> are omitted, a <codeph>DECIMAL</codeph> value
- is treated as <codeph>DECIMAL(9,0)</codeph>, that is, an integer value ranging from
- <codeph>-999,999,999</codeph> to <codeph>999,999,999</codeph>. This is the largest <codeph>DECIMAL</codeph>
- value that can still be represented in 4 bytes. If precision is specified but scale is omitted, Impala uses a
- value of zero for the scale.
+ This value must be less than or equal to the precision, specified as an integer literal.
</p>
<p>
- Both <varname>precision</varname> and <varname>scale</varname> must be specified as integer literals, not any
- other kind of constant expressions.
+ The default scale is 0.
</p>
<p>
- To check the precision or scale for arbitrary values, you can call the
- <xref href="impala_math_functions.xml#math_functions"><codeph>precision()</codeph> and
- <codeph>scale()</codeph> built-in functions</xref>. For example, you might use these values to figure out how
- many characters are required for various fields in a report, or to understand the rounding characteristics of
- a formula as applied to a particular <codeph>DECIMAL</codeph> column.
+ When the precision and the scale are omitted, a <codeph>DECIMAL</codeph> is treated as
+ <codeph>DECIMAL(9, 0)</codeph>.
</p>
<p>
@@ -99,732 +97,877 @@ under the License.
</p>
<p>
- The maximum precision value is 38. Thus, the largest integral value is represented by
- <codeph>DECIMAL(38,0)</codeph> (999... with 9 repeated 38 times). The most precise fractional value (between
- 0 and 1, or 0 and -1) is represented by <codeph>DECIMAL(38,38)</codeph>, with 38 digits to the right of the
- decimal point. The value closest to 0 would be .0000...1 (37 zeros and the final 1). The value closest to 1
+ The range of <codeph>DECIMAL</codeph> type is -10^38 +1 through 10^38 –1.
+ </p>
+
+ <p>
+ The largest value is represented by <codeph>DECIMAL(38, 0)</codeph>.
+ </p>
+
+ <p>
+ The most precise fractional value (between 0 and 1, or 0 and -1) is represented by
+ <codeph>DECIMAL(38, 38)</codeph>, with 38 digits to the right of the decimal point. The
+ value closest to 0 would be .0000...1 (37 zeros and the final 1). The value closest to 1
would be .999... (9 repeated 38 times).
</p>
<p>
- For a given precision and scale, the range of <codeph>DECIMAL</codeph> values is the same in the positive and
- negative directions. For example, <codeph>DECIMAL(4,2)</codeph> can represent from -99.99 to 99.99. This is
- different from other integral numeric types where the positive and negative bounds differ slightly.
- </p>
-
- <p>
- When you use <codeph>DECIMAL</codeph> values in arithmetic expressions, the precision and scale of the result
- value are determined as follows:
- </p>
-
- <ul>
- <li>
- <p>
- For addition and subtraction, the precision and scale are based on the maximum possible result, that is,
- if all the digits of the input values were 9s and the absolute values were added together.
- </p>
-<!-- Seems like buggy output from this first query, so hiding the example for the time being. -->
-<codeblock audience="hidden"><![CDATA[[localhost:21000] > select 50000.5 + 12.444, precision(50000.5 + 12.444), scale(50000.5 + 12.444);
-+------------------+-----------------------------+-------------------------+
-| 50000.5 + 12.444 | precision(50000.5 + 12.444) | scale(50000.5 + 12.444) |
-+------------------+-----------------------------+-------------------------+
-| 50012.944 | 9 | 3 |
-+------------------+-----------------------------+-------------------------+
-[localhost:21000] > select 99999.9 + 99.999, precision(99999.9 + 99.999), scale(99999.9 + 99.999);
-+------------------+-----------------------------+-------------------------+
-| 99999.9 + 99.999 | precision(99999.9 + 99.999) | scale(99999.9 + 99.999) |
-+------------------+-----------------------------+-------------------------+
-| 100099.899 | 9 | 3 |
-+------------------+-----------------------------+-------------------------+
-]]>
-</codeblock>
- </li>
-
- <li>
- <p>
- For multiplication, the precision is the sum of the precisions of the input values. The scale is the sum
- of the scales of the input values.
- </p>
- </li>
-
-<!-- Need to add some specifics to discussion of division. Details here: http://blogs.msdn.com/b/sqlprogrammability/archive/2006/03/29/564110.aspx -->
-
- <li>
- <p>
- For division, Impala sets the precision and scale to values large enough to represent the whole and
- fractional parts of the result.
- </p>
- </li>
-
- <li>
- <p>
- For <codeph>UNION</codeph>, the scale is the larger of the scales of the input values, and the precision
- is increased if necessary to accommodate any additional fractional digits. If the same input value has
- the largest precision and the largest scale, the result value has the same precision and scale. If one
- value has a larger precision but smaller scale, the scale of the result value is increased. For example,
- <codeph>DECIMAL(20,2) UNION DECIMAL(8,6)</codeph> produces a result of type
- <codeph>DECIMAL(24,6)</codeph>. The extra 4 fractional digits of scale (6-2) are accommodated by
- extending the precision by the same amount (20+4).
- </p>
- </li>
-
- <li>
- <p>
- To doublecheck, you can always call the <codeph>PRECISION()</codeph> and <codeph>SCALE()</codeph>
- functions on the results of an arithmetic expression to see the relevant values, or use a <codeph>CREATE
- TABLE AS SELECT</codeph> statement to define a column based on the return type of the expression.
- </p>
- </li>
- </ul>
+ <b>Memory and disk storage:</b>
+ </p>
- <p conref="../shared/impala_common.xml#common/compatibility_blurb"/>
+ <p>
+ Only the precision determines the storage size for <codeph>DECIMAL</codeph> values, and
+ the scale setting has no effect on the storage size. The following table describes the
+ in-memory storage once the values are loaded into memory.
+ </p>
- <ul>
- <li>
- Using the <codeph>DECIMAL</codeph> type is only supported under <keyword keyref="impala14_full"/> and higher.
- </li>
+ <p>
+ <simpletable frame="all" relcolwidth="1* 1*" id="simpletable_tty_3y2_mdb">
+
+ <sthead>
+
+ <stentry>Precision</stentry>
+
+ <stentry>In-memory Storage</stentry>
+
+ </sthead>
+
+ <strow>
+
+ <stentry>1 - 9</stentry>
+
+ <stentry>4 bytes</stentry>
+
+ </strow>
+
+ <strow>
- <li>
- Use the <codeph>DECIMAL</codeph> data type in Impala for applications where you used the
- <codeph>NUMBER</codeph> data type in Oracle. The Impala <codeph>DECIMAL</codeph> type does not support the
- Oracle idioms of <codeph>*</codeph> for scale or negative values for precision.
- </li>
- </ul>
+ <stentry>10 - 18</stentry>
+
+ <stentry>8 bytes</stentry>
+
+ </strow>
+
+ <strow>
+
+ <stentry>19 - 38</stentry>
+
+ <stentry>16 bytes</stentry>
+
+ </strow>
+
+ </simpletable>
+ </p>
<p>
- <b>Conversions and casting:</b>
+ The on-disk representation varies depending on the file format of the table.
</p>
<p>
- <ph conref="../shared/impala_common.xml#common/cast_int_to_timestamp"/>
+ Text, RCFile, and SequenceFile tables use ASCII-based formats as below:
</p>
<p>
- Impala automatically converts between <codeph>DECIMAL</codeph> and other numeric types where possible. A
- <codeph>DECIMAL</codeph> with zero scale is converted to or from the smallest appropriate integral type. A
- <codeph>DECIMAL</codeph> with a fractional part is automatically converted to or from the smallest
- appropriate floating-point type. If the destination type does not have sufficient precision or scale to hold
- all possible values of the source type, Impala raises an error and does not convert the value.
+ <ul>
+ <li>
+ Leading zeros are not stored.
+ </li>
+
+ <li>
+ Trailing zeros are stored.
+ </li>
+
+ <li>
+ <p>
+ Each <codeph>DECIMAL</codeph> value takes up as many bytes as the precision of the
+ value, plus:
+ </p>
+ <ul>
+ <li>
+ One extra byte if the decimal point is present.
+ </li>
+
+ <li>
+ One extra byte for negative values.
+ </li>
+ </ul>
+ </li>
+ </ul>
</p>
<p>
- For example, these statements show how expressions of <codeph>DECIMAL</codeph> and other types are reconciled
- to the same type in the context of <codeph>UNION</codeph> queries and <codeph>INSERT</codeph> statements:
+ Parquet and Avro tables use binary formats and offer more compact storage for
+ <codeph>DECIMAL</codeph> values. In these tables, Impala stores each value in fewer bytes
+ where possible depending on the precision specified for the <codeph>DECIMAL</codeph>
+ column. To conserve space in large tables, use the smallest-precision
+ <codeph>DECIMAL</codeph> type.
</p>
-<codeblock><![CDATA[[localhost:21000] > select cast(1 as int) as x union select cast(1.5 as decimal(9,4)) as x;
-+----------------+
-| x |
-+----------------+
-| 1.5000 |
-| 1.0000 |
-+----------------+
-[localhost:21000] > create table int_vs_decimal as select cast(1 as int) as x union select cast(1.5 as decimal(9,4)) as x;
-+-------------------+
-| summary |
-+-------------------+
-| Inserted 2 row(s) |
-+-------------------+
-[localhost:21000] > desc int_vs_decimal;
-+------+---------------+---------+
-| name | type | comment |
-+------+---------------+---------+
-| x | decimal(14,4) | |
-+------+---------------+---------+
-]]>
-</codeblock>
-
- <p>
- To avoid potential conversion errors, you can use <codeph>CAST()</codeph> to convert <codeph>DECIMAL</codeph>
- values to <codeph>FLOAT</codeph>, <codeph>TINYINT</codeph>, <codeph>SMALLINT</codeph>, <codeph>INT</codeph>,
- <codeph>BIGINT</codeph>, <codeph>STRING</codeph>, <codeph>TIMESTAMP</codeph>, or <codeph>BOOLEAN</codeph>.
- You can use exponential notation in <codeph>DECIMAL</codeph> literals or when casting from
- <codeph>STRING</codeph>, for example <codeph>1.0e6</codeph> to represent one million.
- </p>
-
- <p>
- If you cast a value with more fractional digits than the scale of the destination type, any extra fractional
- digits are truncated (not rounded). Casting a value to a target type with not enough precision produces a
- result of <codeph>NULL</codeph> and displays a runtime warning.
- </p>
-
-<codeblock><![CDATA[[localhost:21000] > select cast(1.239 as decimal(3,2));
-+-----------------------------+
-| cast(1.239 as decimal(3,2)) |
-+-----------------------------+
-| 1.23 |
-+-----------------------------+
-[localhost:21000] > select cast(1234 as decimal(3));
-+----------------------------+
-| cast(1234 as decimal(3,0)) |
-+----------------------------+
-| NULL |
-+----------------------------+
-WARNINGS: Expression overflowed, returning NULL
-]]>
-</codeblock>
-
- <p>
- When you specify integer literals, for example in <codeph>INSERT ... VALUES</codeph> statements or arithmetic
- expressions, those numbers are interpreted as the smallest applicable integer type. You must use
- <codeph>CAST()</codeph> calls for some combinations of integer literals and <codeph>DECIMAL</codeph>
- precision. For example, <codeph>INT</codeph> has a maximum value that is 10 digits long,
- <codeph>TINYINT</codeph> has a maximum value that is 3 digits long, and so on. If you specify a value such as
- 123456 to go into a <codeph>DECIMAL</codeph> column, Impala checks if the column has enough precision to
- represent the largest value of that integer type, and raises an error if not. Therefore, use an expression
- like <codeph>CAST(123456 TO DECIMAL(9,0))</codeph> for <codeph>DECIMAL</codeph> columns with precision 9 or
- less, <codeph>CAST(50 TO DECIMAL(2,0))</codeph> for <codeph>DECIMAL</codeph> columns with precision 2 or
- less, and so on. For <codeph>DECIMAL</codeph> columns with precision 10 or greater, Impala automatically
- interprets the value as the correct <codeph>DECIMAL</codeph> type; however, because
- <codeph>DECIMAL(10)</codeph> requires 8 bytes of storage while <codeph>DECIMAL(9)</codeph> requires only 4
- bytes, only use precision of 10 or higher when actually needed.
- </p>
-
-<codeblock><![CDATA[[localhost:21000] > create table decimals_9_0 (x decimal);
-[localhost:21000] > insert into decimals_9_0 values (1), (2), (4), (8), (16), (1024), (32768), (65536), (1000000);
-ERROR: AnalysisException: Possible loss of precision for target table 'decimal_testing.decimals_9_0'.
-Expression '1' (type: INT) would need to be cast to DECIMAL(9,0) for column 'x'
-[localhost:21000] > insert into decimals_9_0 values (cast(1 as decimal)), (cast(2 as decimal)), (cast(4 as decimal)), (cast(8 as decimal)), (cast(16 as decimal)), (cast(1024 as decimal)), (cast(32768 as decimal)), (cast(65536 as decimal)), (cast(1000000 as decimal));
-
-[localhost:21000] > create table decimals_10_0 (x decimal(10,0));
-[localhost:21000] > insert into decimals_10_0 values (1), (2), (4), (8), (16), (1024), (32768), (65536), (1000000);
-]]>
-</codeblock>
-
- <p>
- Be aware that in memory and for binary file formats such as Parquet or Avro, <codeph>DECIMAL(10)</codeph> or
- higher consumes 8 bytes while <codeph>DECIMAL(9)</codeph> (the default for <codeph>DECIMAL</codeph>) or lower
- consumes 4 bytes. Therefore, to conserve space in large tables, use the smallest-precision
- <codeph>DECIMAL</codeph> type that is appropriate and <codeph>CAST()</codeph> literal values where necessary,
- rather than declaring <codeph>DECIMAL</codeph> columns with high precision for convenience.
- </p>
-
- <p>
- To represent a very large or precise <codeph>DECIMAL</codeph> value as a literal, for example one that
- contains more digits than can be represented by a <codeph>BIGINT</codeph> literal, use a quoted string or a
- floating-point value for the number, and <codeph>CAST()</codeph> to the desired <codeph>DECIMAL</codeph>
- type:
- </p>
-
-<codeblock>insert into decimals_38_5 values (1), (2), (4), (8), (16), (1024), (32768), (65536), (1000000),
- (cast("999999999999999999999999999999" as decimal(38,5))),
- (cast(999999999999999999999999999999. as decimal(38,5)));
-</codeblock>
-
- <ul>
- <li>
- <p> The result of the <codeph>SUM()</codeph> aggregate function on
- <codeph>DECIMAL</codeph> values is promoted to a precision of 38,
- with the same precision as the underlying column. Thus, the result can
- represent the largest possible value at that particular precision. </p>
- </li>
-
- <li>
- <p>
- <codeph>STRING</codeph> columns, literals, or expressions can be converted to <codeph>DECIMAL</codeph> as
- long as the overall number of digits and digits to the right of the decimal point fit within the
- specified precision and scale for the declared <codeph>DECIMAL</codeph> type. By default, a
- <codeph>DECIMAL</codeph> value with no specified scale or precision can hold a maximum of 9 digits of an
- integer value. If there are more digits in the string value than are allowed by the
- <codeph>DECIMAL</codeph> scale and precision, the result is <codeph>NULL</codeph>.
- </p>
- <p>
- The following examples demonstrate how <codeph>STRING</codeph> values with integer and fractional parts
- are represented when converted to <codeph>DECIMAL</codeph>. If the scale is 0, the number is treated
- as an integer value with a maximum of <varname>precision</varname> digits. If the precision is greater than
- 0, the scale must be increased to account for the digits both to the left and right of the decimal point.
- As the precision increases, output values are printed with additional trailing zeros after the decimal
- point if needed. Any trailing zeros after the decimal point in the <codeph>STRING</codeph> value must fit
- within the number of digits specified by the precision.
- </p>
-<codeblock><![CDATA[[localhost:21000] > select cast('100' as decimal); -- Small integer value fits within 9 digits of scale.
-+-----------------------------+
-| cast('100' as decimal(9,0)) |
-+-----------------------------+
-| 100 |
-+-----------------------------+
-[localhost:21000] > select cast('100' as decimal(3,0)); -- Small integer value fits within 3 digits of scale.
-+-----------------------------+
-| cast('100' as decimal(3,0)) |
-+-----------------------------+
-| 100 |
-+-----------------------------+
-[localhost:21000] > select cast('100' as decimal(2,0)); -- 2 digits of scale is not enough!
-+-----------------------------+
-| cast('100' as decimal(2,0)) |
-+-----------------------------+
-| NULL |
-+-----------------------------+
-[localhost:21000] > select cast('100' as decimal(3,1)); -- (3,1) = 2 digits left of the decimal point, 1 to the right. Not enough.
-+-----------------------------+
-| cast('100' as decimal(3,1)) |
-+-----------------------------+
-| NULL |
-+-----------------------------+
-[localhost:21000] > select cast('100' as decimal(4,1)); -- 4 digits total, 1 to the right of the decimal point.
-+-----------------------------+
-| cast('100' as decimal(4,1)) |
-+-----------------------------+
-| 100.0 |
-+-----------------------------+
-[localhost:21000] > select cast('98.6' as decimal(3,1)); -- (3,1) can hold a 3 digit number with 1 fractional digit.
-+------------------------------+
-| cast('98.6' as decimal(3,1)) |
-+------------------------------+
-| 98.6 |
-+------------------------------+
-[localhost:21000] > select cast('98.6' as decimal(15,1)); -- Larger scale allows bigger numbers but still only 1 fractional digit.
-+-------------------------------+
-| cast('98.6' as decimal(15,1)) |
-+-------------------------------+
-| 98.6 |
-+-------------------------------+
-[localhost:21000] > select cast('98.6' as decimal(15,5)); -- Larger precision allows more fractional digits, outputs trailing zeros.
-+-------------------------------+
-| cast('98.6' as decimal(15,5)) |
-+-------------------------------+
-| 98.60000 |
-+-------------------------------+
-[localhost:21000] > select cast('98.60000' as decimal(15,1)); -- Trailing zeros in the string must fit within 'scale' digits (1 in this case).
-+-----------------------------------+
-| cast('98.60000' as decimal(15,1)) |
-+-----------------------------------+
-| NULL |
-+-----------------------------------+
-]]>
-</codeblock>
- </li>
-
- <li>
- Most built-in arithmetic functions such as <codeph>SIN()</codeph> and <codeph>COS()</codeph> continue to
- accept only <codeph>DOUBLE</codeph> values because they are so commonly used in scientific context for
- calculations of IEEE 954-compliant values. The built-in functions that accept and return
- <codeph>DECIMAL</codeph> are:
-<!-- List from Skye: positive, negative, least, greatest, fnv_hash, if, nullif, zeroifnull, isnull, coalesce -->
-<!-- Nong had already told me about abs, ceil, floor, round, truncate -->
- <ul>
- <li>
- <codeph>ABS()</codeph>
- </li>
-
- <li>
- <codeph>CEIL()</codeph>
- </li>
-
- <li>
- <codeph>COALESCE()</codeph>
- </li>
-
- <li>
- <codeph>FLOOR()</codeph>
- </li>
-
- <li>
- <codeph>FNV_HASH()</codeph>
- </li>
-
- <li>
- <codeph>GREATEST()</codeph>
- </li>
-
- <li>
- <codeph>IF()</codeph>
- </li>
-
- <li>
- <codeph>ISNULL()</codeph>
- </li>
-
- <li>
- <codeph>LEAST()</codeph>
- </li>
-
- <li>
- <codeph>NEGATIVE()</codeph>
- </li>
-
- <li>
- <codeph>NULLIF()</codeph>
- </li>
-
- <li>
- <codeph>POSITIVE()</codeph>
- </li>
-
- <li>
- <codeph>PRECISION()</codeph>
- </li>
-
- <li>
- <codeph>ROUND()</codeph>
- </li>
-
- <li>
- <codeph>SCALE()</codeph>
- </li>
-
- <li>
- <codeph>TRUNCATE()</codeph>
- </li>
-
- <li>
- <codeph>ZEROIFNULL()</codeph>
- </li>
- </ul>
- See <xref href="impala_functions.xml#builtins"/> for details.
- </li>
-
- <li>
- <p>
- <codeph>BIGINT</codeph>, <codeph>INT</codeph>, <codeph>SMALLINT</codeph>, and <codeph>TINYINT</codeph>
- values can all be cast to <codeph>DECIMAL</codeph>. The number of digits to the left of the decimal point
- in the <codeph>DECIMAL</codeph> type must be sufficient to hold the largest value of the corresponding
- integer type. Note that integer literals are treated as the smallest appropriate integer type, meaning
- there is sometimes a range of values that require one more digit of <codeph>DECIMAL</codeph> scale than
- you might expect. For integer values, the precision of the <codeph>DECIMAL</codeph> type can be zero; if
- the precision is greater than zero, remember to increase the scale value by an equivalent amount to hold
- the required number of digits to the left of the decimal point.
- </p>
- <p>
- The following examples show how different integer types are converted to <codeph>DECIMAL</codeph>.
- </p>
-<!-- According to Nong, it's a bug that so many integer digits can be converted to a DECIMAL
- value with small (s,p) spec. So expect to re-do this example. -->
-<codeblock><![CDATA[[localhost:21000] > select cast(1 as decimal(1,0));
-+-------------------------+
-| cast(1 as decimal(1,0)) |
-+-------------------------+
-| 1 |
-+-------------------------+
-[localhost:21000] > select cast(9 as decimal(1,0));
-+-------------------------+
-| cast(9 as decimal(1,0)) |
-+-------------------------+
-| 9 |
-+-------------------------+
-[localhost:21000] > select cast(10 as decimal(1,0));
-+--------------------------+
-| cast(10 as decimal(1,0)) |
-+--------------------------+
-| 10 |
-+--------------------------+
-[localhost:21000] > select cast(10 as decimal(1,1));
-+--------------------------+
-| cast(10 as decimal(1,1)) |
-+--------------------------+
-| 10.0 |
-+--------------------------+
-[localhost:21000] > select cast(100 as decimal(1,1));
-+---------------------------+
-| cast(100 as decimal(1,1)) |
-+---------------------------+
-| 100.0 |
-+---------------------------+
-[localhost:21000] > select cast(1000 as decimal(1,1));
-+----------------------------+
-| cast(1000 as decimal(1,1)) |
-+----------------------------+
-| 1000.0 |
-+----------------------------+
-]]>
-</codeblock>
- </li>
-
- <li>
- <p>
- When a <codeph>DECIMAL</codeph> value is converted to any of the integer types, any fractional part is
- truncated (that is, rounded towards zero):
- </p>
-<codeblock><![CDATA[[localhost:21000] > create table num_dec_days (x decimal(4,1));
-[localhost:21000] > insert into num_dec_days values (1), (2), (cast(4.5 as decimal(4,1)));
-[localhost:21000] > insert into num_dec_days values (cast(0.1 as decimal(4,1))), (cast(.9 as decimal(4,1))), (cast(9.1 as decimal(4,1))), (cast(9.9 as decimal(4,1)));
-[localhost:21000] > select cast(x as int) from num_dec_days;
-+----------------+
-| cast(x as int) |
-+----------------+
-| 1 |
-| 2 |
-| 4 |
-| 0 |
-| 0 |
-| 9 |
-| 9 |
-+----------------+
-]]>
-</codeblock>
- </li>
-
- <li>
- <p>
- You cannot directly cast <codeph>TIMESTAMP</codeph> or <codeph>BOOLEAN</codeph> values to or from
- <codeph>DECIMAL</codeph> values. You can turn a <codeph>DECIMAL</codeph> value into a time-related
- representation using a two-step process, by converting it to an integer value and then using that result
- in a call to a date and time function such as <codeph>from_unixtime()</codeph>.
- </p>
-<codeblock><![CDATA[[localhost:21000] > select from_unixtime(cast(cast(1000.0 as decimal) as bigint));
-+-------------------------------------------------------------+
-| from_unixtime(cast(cast(1000.0 as decimal(9,0)) as bigint)) |
-+-------------------------------------------------------------+
-| 1970-01-01 00:16:40 |
-+-------------------------------------------------------------+
-[localhost:21000] > select now() + interval cast(x as int) days from num_dec_days; -- x is a DECIMAL column.
-
-[localhost:21000] > create table num_dec_days (x decimal(4,1));
-[localhost:21000] > insert into num_dec_days values (1), (2), (cast(4.5 as decimal(4,1)));
-[localhost:21000] > select now() + interval cast(x as int) days from num_dec_days; -- The 4.5 value is truncated to 4 and becomes '4 days'.
-+--------------------------------------+
-| now() + interval cast(x as int) days |
-+--------------------------------------+
-| 2014-05-13 23:11:55.163284000 |
-| 2014-05-14 23:11:55.163284000 |
-| 2014-05-16 23:11:55.163284000 |
-+--------------------------------------+
-]]>
-</codeblock>
- </li>
-
- <li>
- <p>
- Because values in <codeph>INSERT</codeph> statements are checked rigorously for type compatibility, be
- prepared to use <codeph>CAST()</codeph> function calls around literals, column references, or other
- expressions that you are inserting into a <codeph>DECIMAL</codeph> column.
- </p>
- </li>
- </ul>
-
- <p conref="../shared/impala_common.xml#common/null_bad_numeric_cast"/>
-
- <p>
- <b>DECIMAL differences from integer and floating-point types:</b>
- </p>
-
- <p>
- With the <codeph>DECIMAL</codeph> type, you are concerned with the number of overall digits of a number
- rather than powers of 2 (as in <codeph>TINYINT</codeph>, <codeph>SMALLINT</codeph>, and so on). Therefore,
- the limits with integral values of <codeph>DECIMAL</codeph> types fall around 99, 999, 9999, and so on rather
- than 32767, 65535, 2
- <sup>32</sup>
- -1, and so on. For fractional values, you do not need to account for imprecise representation of the
- fractional part according to the IEEE-954 standard (as in <codeph>FLOAT</codeph> and
- <codeph>DOUBLE</codeph>). Therefore, when you insert a fractional value into a <codeph>DECIMAL</codeph>
- column, you can compare, sum, query, <codeph>GROUP BY</codeph>, and so on that column and get back the
- original values rather than some <q>close but not identical</q> value.
- </p>
-
- <p>
- <codeph>FLOAT</codeph> and <codeph>DOUBLE</codeph> can cause problems or unexpected behavior due to inability
- to precisely represent certain fractional values, for example dollar and cents values for currency. You might
- find output values slightly different than you inserted, equality tests that do not match precisely, or
- unexpected values for <codeph>GROUP BY</codeph> columns. <codeph>DECIMAL</codeph> can help reduce unexpected
- behavior and rounding errors, at the expense of some performance overhead for assignments and comparisons.
+ <p>
+ <b>Precision and scale in arithmetic operations:</b>
</p>
<p>
- <b>Literals and expressions:</b>
+ For all arithmetic options, the resulting precision is at most 38.
+ </p>
+
+ <p>
+ If the precision of the result would be greater than 38, Impala truncates the result from
+ the back, but keeps at least 6 fractional digits in scale and rounds.
+ </p>
+
+ <p>
+ For example, <codeph>DECIMAL(38, 20) * DECIMAL(38, 20)</codeph> returns
+ <codeph>DECIMAL(38, 6)</codeph>. According to the table below, the resulting precision and
+ scale would be <codeph>(77, 40)</codeph>, but they are higher than the maximum precision
+ and scale. So, Impala sets the precision to the maximum allowed 38, and truncates the
+ scale to 6.
+ </p>
+
+ <p>
+ When you use <codeph>DECIMAL</codeph> values in arithmetic operations, the precision and
+ scale of the result value are determined as follows. For better readability, the following
+ terms are used in the table below:
+ <ul>
+ <li dir="ltr">
+ <p dir="ltr">
+ P1, P2: Input precisions
+ </p>
+ </li>
+
+ <li dir="ltr">
+ <p dir="ltr">
+ S1, S2: Input scales
+ </p>
+ </li>
+
+ <li dir="ltr">
+ <p dir="ltr">
+ L1, L2: Leading digits in input <codeph>DECIMAL</codeph>s, i.e., L1 = P1 - S1 and L2
+ = P2 - S2
+ </p>
+ </li>
+ </ul>
+ </p>
+
+ <p>
+ <table id="table_inl_sz2_mdb" colsep="1" rowsep="1" frame="all">
+ <tgroup cols="3" align="left">
+ <colspec colnum="1" colname="col1"/>
+ <colspec colnum="2" colname="col2"/>
+ <colspec colnum="3" colname="col3"/>
+ <tbody>
+ <row>
+ <entry>
+ <b>Operation</b>
+ </entry>
+ <entry>
+ <b>Resulting Precision</b>
+ </entry>
+ <entry>
+ <b>Resulting Scale</b>
+ </entry>
+ </row>
+ <row>
+ <entry>
+ Addition and Subtraction
+ </entry>
+ <entry>
+ <p>
+ max (L1, L2) + max (S1, S2) + 1
+ </p>
+
+
+
+ <p>
+ 1 is for carry-over.
+ </p>
+ </entry>
+ <entry>
+ max (S1, S2)
+ </entry>
+ </row>
+ <row>
+ <entry>
+ Multiplication
+ </entry>
+ <entry>
+ P1 + P2 + 1
+ </entry>
+ <entry>
+ S1 + S2
+ </entry>
+ </row>
+ <row>
+ <entry>
+ Division
+ </entry>
+ <entry>
+ L1 + S2 + max (S1 + P2 + 1, 6)
+ </entry>
+ <entry>
+ max (S1 + P2 + 1, 6)
+ </entry>
+ </row>
+ <row>
+ <entry>
+ Modulo
+ </entry>
+ <entry>
+ min (L1, L2) + max (S1, S2)
+ </entry>
+ <entry>
+ max (S1, S2)
+ </entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+ </p>
+
+ <p>
+ <b>Precision and scale in functions:</b>
+ </p>
+
+ <p>
+ When you use <codeph>DECIMAL</codeph> values in built-in functions, the precision and
+ scale of the result value are determined as follows:
<ul>
+ <li dir="ltr">
+ The result of the <codeph>SUM</codeph> aggregate function on a
+ <codeph>DECIMAL</codeph> value is:
+ <ul>
+ <li>
+ <p dir="ltr">
+ Precision: 38
+ </p>
+ </li>
+
+ <li>
+ <p dir="ltr">
+ Scale: The same scale as the input column
+ </p>
+ </li>
+ </ul>
+ </li>
+
+ <li dir="ltr">
+ <p dir="ltr">
+ The result of <codeph>AVG</codeph> aggregate function on a <codeph>DECIMAL</codeph>
+ value is:
+ </p>
+ <ul>
+ <li>
+ <p dir="ltr">
+ Precision: 38
+ </p>
+ </li>
+
+ <li>
+ <p dir="ltr">
+ Scale: max(Scale of input column, 6)
+ </p>
+ </li>
+ </ul>
+ </li>
+ </ul>
+ </p>
+
+ <p>
+ <b>Implicit conversions in DECIMAL assignments:</b>
+ </p>
+
+ <p>
+ Impala enforces strict conversion rules in decimal assignments like in
+ <codeph>INSERT</codeph> and <codeph>UNION</codeph> statements, or in functions like
+ <codeph>COALESCE</codeph>.
+ </p>
+
+ <p>
+ If there is not enough precision and scale in the destination, Impala fails with an error.
+ </p>
+
+ <p>
+ Impala performs implicit conversions between <codeph>DECIMAL</codeph> and other numeric
+ types as below:
+ <ul>
+ <li>
+ <codeph>DECIMAL</codeph> is implicitly converted to <codeph>DOUBLE</codeph> or
+ <codeph>FLOAT</codeph> when necessary even with a loss of precision. It can be
+ necessary, for example when inserting a <codeph>DECIMAL</codeph> value into a
+ <codeph>DOUBLE</codeph> column. For example:
+<codeblock>CREATE TABLE flt(c FLOAT);
+INSERT INTO flt SELECT CAST(1e37 AS DECIMAL(38, 0));
+SELECT CAST(c AS DECIMAL(38, 0)) FROM flt;
+
+Result: 9999999933815812510711506376257961984</codeblock>
+ <p
+ dir="ltr">
+ The result has a loss of information due to implicit casting. This is why we
+ discourage using the <codeph>DOUBLE</codeph> and <codeph>FLOAT</codeph> types in
+ general.
+ </p>
+ </li>
+
<li>
+ <codeph>DOUBLE</codeph> and <codeph>FLOAT</codeph> cannot be implicitly converted to
+ <codeph>DECIMAL</codeph>. An error is returned.
+ </li>
+
+ <li>
+ <codeph>DECIMAL</codeph> is implicitly converted to <codeph>DECIMAL</codeph> if all
+ digits fit in the resulting <codeph>DECIMAL</codeph>.
<p>
- When you use an integer literal such as <codeph>1</codeph> or <codeph>999</codeph> in a SQL statement,
- depending on the context, Impala will treat it as either the smallest appropriate
- <codeph>DECIMAL</codeph> type, or the smallest integer type (<codeph>TINYINT</codeph>,
- <codeph>SMALLINT</codeph>, <codeph>INT</codeph>, or <codeph>BIGINT</codeph>). To minimize memory usage,
- Impala prefers to treat the literal as the smallest appropriate integer type.
+ For example, the following query returns an error because the resulting type that
+ guarantees that all digits fit cannot be determined .
+<codeblock>SELECT GREATEST (CAST(1 AS DECIMAL(38, 0)), CAST(2 AS DECIMAL(38, 37)));</codeblock>
</p>
</li>
<li>
+ Integer values can be implicitly converted to <codeph>DECIMAL</codeph> when there is
+ enough room in the <codeph>DECIMAL</codeph> to guarantee that all digits will fit. The
+ integer types require the following numbers of digits to the left of the decimal point
+ when converted to <codeph>DECIMAL</codeph>:
+ <ul>
+ <li>
+ <p dir="ltr">
+ <codeph>BIGINT</codeph>: 19 digits
+ </p>
+ </li>
+
+ <li>
+ <p dir="ltr">
+ <codeph>INT</codeph>: 10 digits
+ </p>
+ </li>
+
+ <li>
+ <p dir="ltr">
+ <codeph>SMALLINT</codeph>: 5 digits
+ </p>
+ </li>
+
+ <li>
+ <p dir="ltr">
+ <codeph>TINYINT</codeph>: 3 digits
+ </p>
+ </li>
+ </ul>
<p>
- When you use a floating-point literal such as <codeph>1.1</codeph> or <codeph>999.44</codeph> in a SQL
- statement, depending on the context, Impala will treat it as either the smallest appropriate
- <codeph>DECIMAL</codeph> type, or the smallest floating-point type (<codeph>FLOAT</codeph> or
- <codeph>DOUBLE</codeph>). To avoid loss of accuracy, Impala prefers to treat the literal as a
+ For example:
+ </p>
+
+ <p>
+<codeblock>CREATE TABLE decimals_10_8 (x DECIMAL(10, 8));
+INSERT INTO decimals_10_8 VALUES (CAST(1 AS TINYINT));</codeblock>
+ </p>
+
+ <p>
+ The above <codeph>INSERT</codeph> statement fails because <codeph>TINYINT</codeph>
+ requires room for 3 digits to the left of the decimal point in the
<codeph>DECIMAL</codeph>.
</p>
+
+ <p>
+<codeblock>CREATE TABLE decimals_11_8(x DECIMAL(11, 8));
+INSERT INTO decimals_11_8 VALUES (CAST(1 AS TINYINT));</codeblock>
+ </p>
+
+ <p>
+ The above <codeph>INSERT</codeph> statement succeeds because there is enough room
+ for 3 digits to the left of the decimal point that <codeph>TINYINT</codeph>
+ requires.
+ </p>
+ </li>
+ </ul>
+ </p>
+
+ <p>
+ In <codeph>UNION</codeph>, the resulting precision and scales are determined as follows.
+ <ul>
+ <li>
+ Precision: max (L1, L2) + max (S1, S2)
+ <p>
+ If the resulting type does not fit in the <codeph>DECIMAL</codeph> type, an error is
+ returned. See the first example below.
+ </p>
+ </li>
+
+ <li>
+ Scale: max (S1, S2)
+ </li>
+ </ul>
+ </p>
+
+ <p>
+ Examples for <codeph>UNION</codeph>:
+ <ul>
+ <li>
+ <codeph>DECIMAL(20, 0) UNION DECIMAL(20, 20)</codeph> would require a
+ <codeph>DECIMAL(40, 20)</codeph> to fit all the digits. Since this is larger than the
+ max precision for <codeph>DECIMAL</codeph>, Impala returns an error. One way to fix
+ the error is to cast both operands to the desired type, for example
+ <codeph>DECIMAL(38, 18)</codeph>.
+ </li>
+
+ <li dir="ltr">
+ <p dir="ltr">
+ <codeph>DECIMAL(20, 2) UNION DECIMAL(8, 6)</codeph> returns <codeph>DECIMAL(24,
+ 6)</codeph>.
+ </p>
+ </li>
+
+ <li dir="ltr">
+ <p dir="ltr">
+ <codeph>INT UNION DECIMAL(9, 4)</codeph> returns <codeph>DECIMAL(14, 4)</codeph>.
+ </p>
+
+ <p>
+ <codeph>INT</codeph> has the precision 10 and the scale 0, so it is treated as
+ <codeph>DECIMAL(10, 0) UNION DECIMAL(9. 4)</codeph>.
+ </p>
+ </li>
+ </ul>
+ </p>
+
+ <p>
+ <b>Casting between DECIMAL and other data types:</b>
+ </p>
+
+ <p>
+ To avoid potential conversion errors, use <codeph>CAST</codeph> to explicitly convert
+ between <codeph>DECIMAL</codeph> and other types in decimal assignments like in
+ <codeph>INSERT</codeph> and <codeph>UNION</codeph> statements, or in functions like
+ <codeph>COALESCE</codeph>:
+ <ul>
+ <li dir="ltr">
+ <p dir="ltr">
+ You can cast the following types to <codeph>DECIMAL</codeph>:
+ <codeph>FLOAT</codeph>, <codeph>TINYINT</codeph>, <codeph>SMALLINT</codeph>,
+ <codeph>INT</codeph>, <codeph>BIGINT</codeph>, <codeph>STRING</codeph>
+ </p>
+ </li>
+
+ <li dir="ltr">
+ <p dir="ltr">
+ You can cast <codeph>DECIMAL</codeph> to the following types:
+ <codeph>FLOAT</codeph>, <codeph>TINYINT</codeph>, <codeph>SMALLINT</codeph>,
+ <codeph>INT</codeph>, <codeph>BIGINT</codeph>, <codeph>STRING</codeph>,
+ <codeph>BOOLEAN</codeph>, <codeph>TIMESTAMP</codeph>
+ </p>
</li>
</ul>
</p>
<p>
- <b>Storage considerations:</b>
+ Impala performs <codeph>CAST</codeph> between <codeph>DECIMAL</codeph> and other numeric
+ types as below:
+ <ul>
+ <li dir="ltr">
+ <p dir="ltr">
+ Precision: If you cast a value with bigger precision than the precision of the
+ destination type, Impala returns an error. For example, <codeph>CAST(123456 AS
+ DECIMAL(3,0))</codeph> returns an error because all digits do not fit into
+ <codeph>DECIMAL(3, 0)</codeph>
+ </p>
+ </li>
+
+ <li dir="ltr">
+ <p dir="ltr">
+ Scale: If you cast a value with more fractional digits than the scale of the
+ destination type, the fractional digits are rounded. For example, <codeph>CAST(1.239
+ AS DECIMAL(3, 2))</codeph> returns <codeph>1.24</codeph>.
+ </p>
+ </li>
+ </ul>
</p>
- <ul>
- <li>
- Only the precision determines the storage size for <codeph>DECIMAL</codeph> values; the scale setting has
- no effect on the storage size.
- </li>
+ <p>
+ <b>Casting STRING to DECIMAL:</b>
+ </p>
- <li>
- Text, RCFile, and SequenceFile tables all use ASCII-based formats. In these text-based file formats,
- leading zeros are not stored, but trailing zeros are stored. In these tables, each <codeph>DECIMAL</codeph>
- value takes up as many bytes as there are digits in the value, plus an extra byte if the decimal point is
- present and an extra byte for negative values. Once the values are loaded into memory, they are represented
- in 4, 8, or 16 bytes as described in the following list items. The on-disk representation varies depending
- on the file format of the table.
- </li>
+ <p>
+ You can cast <codeph>STRING</codeph> of numeric characters in columns, literals, or
+ expressions to <codeph>DECIMAL</codeph> as long as number fits within the specified target
+ <codeph>DECIMAL</codeph> type without overflow.
+ <ul>
+ <li dir="ltr">
+ <p dir="ltr">
+ If scale in <codeph>STRING</codeph> > scale in <codeph>DECIMAL</codeph>:
+ </p>
-<!-- Next couple of points can be conref'ed with identical list bullets farther down under File Format Considerations. -->
+ <p dir="ltr">
+ The fractional digits are rounded to the <codeph>DECIMAL</codeph> scale.
+ </p>
- <li>
- Parquet and Avro tables use binary formats, In these tables, Impala stores each value in as few bytes as
- possible
-<!-- 4, 8, or 16 bytes -->
- depending on the precision specified for the <codeph>DECIMAL</codeph> column.
- <ul>
- <li>
- In memory, <codeph>DECIMAL</codeph> values with precision of 9 or less are stored in 4 bytes.
- </li>
+ <p dir="ltr">
+ For example, <codeph>CAST('98.678912' AS DECIMAL(15, 1))</codeph> returns
+ <codeph>98.7</codeph>.
+ </p>
+ </li>
- <li>
- In memory, <codeph>DECIMAL</codeph> values with precision of 10 through 18 are stored in 8 bytes.
- </li>
+ <li dir="ltr">
+ <p dir="ltr">
+ If # leading digits in <codeph>STRING</codeph> > # leading digits in
+ <codeph>DECIMAL</codeph>, an error is returned.
+ </p>
- <li>
- In memory, <codeph>DECIMAL</codeph> values with precision greater than 18 are stored in 16 bytes.
- </li>
- </ul>
- </li>
- </ul>
+ <p dir="ltr">
+ For example, <codeph>CAST('123.45' AS DECIMAL(2, 2))</codeph> returns an error.
+ </p>
+ </li>
+ </ul>
+ </p>
+
+ <p>
+ Exponential notation is supported when casting from <codeph>STRING</codeph>.
+ </p>
+
+ <p>
+ For example, <codeph>CAST('1.0e6' AS DECIMAL(32, 0))</codeph> returns
+ <codeph>1000000</codeph>.
+ </p>
+
+ <p>
+ Casting any non-numeric value, such as <codeph>'ABC'</codeph> to the
+ <codeph>DECIMAL</codeph> type returns an error.
+ </p>
+
+ <p>
+ <b>Casting DECIMAL to TIMESTAMP:</b>
+ </p>
+
+ <p>
+ Casting a <codeph>DECIMAL</codeph> value N to <codeph>TIMESTAMP</codeph> produces a value
+ that is N seconds past the start of the epoch date (January 1, 1970).
+ </p>
+
+ <p>
+ <b>DECIMAL vs FLOAT consideration:</b>
+ </p>
+
+ <p>
+ The <codeph>FLOAT</codeph> and <codeph>DOUBLE</codeph> types can cause problems or
+ unexpected behavior due to inability to precisely represent certain fractional values, for
+ example dollar and cents values for currency. You might find output values slightly
+ different than you inserted, equality tests that do not match precisely, or unexpected
+ values for <codeph>GROUP BY</codeph> columns. The <codeph>DECIMAL</codeph> type can help
+ reduce unexpected behavior and rounding errors, but at the expense of some performance
+ overhead for assignments and comparisons.
+ </p>
+
+ <p>
+ <b>Literals and expressions:</b>
+ </p>
+
+ <p>
+ <ul>
+ <li dir="ltr">
+ <p dir="ltr">
+ Numeric literals without a decimal point
+ </p>
+ <ul>
+ <li>
+ The literals are treated as the smallest integer that would fit the literal. For
+ example, <codeph>111</codeph> is a <codeph>TINYINT</codeph>, and
+ <codeph>1111</codeph> is a <codeph>SMALLINT</codeph>.
+ </li>
+
+ <li>
+ Large literals that do not fit into any integer type are treated as
+ <codeph>DECIMAL</codeph>.
+ </li>
+
+ <li>
+ The literals too large to fit into a <codeph>DECIMAL(38, 0)</codeph> are treated
+ as <codeph>DOUBLE</codeph>.
+ </li>
+ </ul>
+ </li>
+
+ <li dir="ltr">
+ <p dir="ltr">
+ Numeric literals with a decimal point
+ </p>
+ <ul>
+ <li>
+ The literal with less than 38 digits are treated as <codeph>DECIMAL</codeph>.
+ </li>
+
+ <li>
+ The literals with 38 or more digits are treated as a <codeph>DOUBLE</codeph>.
+ </li>
+ </ul>
+ </li>
+
+ <li>
+ Exponential notation is supported in <codeph>DECIMAL</codeph> literals.
+ </li>
+
+ <li dir="ltr">
+ <p>
+ To represent a very large or precise <codeph>DECIMAL</codeph> value as a literal,
+ for example one that contains more digits than can be represented by a
+ <codeph>BIGINT</codeph> literal, use a quoted string or a floating-point value for
+ the number and <codeph>CAST</codeph> the string to the desired
+ <codeph>DECIMAL</codeph> type.
+ </p>
+
+ <p>
+ For example: <codeph>CAST('999999999999999999999999999999' AS DECIMAL(38,
+ 5)))</codeph>
+ </p>
+ </li>
+ </ul>
+ </p>
<p conref="../shared/impala_common.xml#common/file_format_blurb"/>
- <ul>
- <li>
- The <codeph>DECIMAL</codeph> data type can be stored in any of the file formats supported by Impala, as
- described in <xref href="impala_file_formats.xml#file_formats"/>. Impala only writes to tables that use the
- Parquet and text formats, so those formats are the focus for file format compatibility.
- </li>
+ <p>
+ <ul>
+ <li dir="ltr">
+ <p dir="ltr">
+ The <codeph>DECIMAL</codeph> data type can be stored in any of the file formats
+ supported by Impala.
+ </p>
+ </li>
- <li>
- Impala can query Avro, RCFile, or SequenceFile tables containing <codeph>DECIMAL</codeph> columns, created
- by other Hadoop components.
- </li>
+ <li dir="ltr">
+ <p dir="ltr">
+ Impala can query Avro, RCFile, or SequenceFile tables that contain
+ <codeph>DECIMAL</codeph> columns, created by other Hadoop components.
+ </p>
+ </li>
+
+ <li dir="ltr">
+ <p dir="ltr">
+ Impala can query and insert into Kudu tables that contain <codeph>DECIMAL</codeph>
+ columns.
+ </p>
+ </li>
- <li>
- You can use <codeph>DECIMAL</codeph> columns in Impala tables that are mapped to HBase tables. Impala can
- query and insert into such tables.
- </li>
+ <li dir="ltr">
+ <p dir="ltr">
+ The <codeph>DECIMAL</codeph> data type is fully compatible with HBase tables.
+ </p>
+ </li>
- <li>
- Text, RCFile, and SequenceFile tables all use ASCII-based formats. In these tables, each
- <codeph>DECIMAL</codeph> value takes up as many bytes as there are digits in the value, plus an extra byte
- if the decimal point is present. The binary format of Parquet or Avro files offers more compact storage for
- <codeph>DECIMAL</codeph> columns.
- </li>
+ <li dir="ltr">
+ <p dir="ltr">
+ The <codeph>DECIMAL</codeph> data type is fully compatible with Parquet tables.
+ </p>
+ </li>
- <li>
- Parquet and Avro tables use binary formats, In these tables, Impala stores each value in 4, 8, or 16 bytes
- depending on the precision specified for the <codeph>DECIMAL</codeph> column.
- </li>
+ <li dir="ltr">
+ <p dir="ltr">
+ Values of the <codeph>DECIMAL</codeph> data type are potentially larger in text
+ tables than in tables using Parquet or other binary formats.
+ </p>
+ </li>
+ </ul>
+ </p>
- </ul>
+ <p>
+ <b>UDF consideration:</b>
+ </p>
<p>
- <b>UDF considerations:</b> When writing a C++ UDF, use the <codeph>DecimalVal</codeph> data type defined in
+ When writing a C++ UDF, use the <codeph>DecimalVal</codeph> data type defined in
<filepath>/usr/include/impala_udf/udf.h</filepath>.
</p>
+ <p>
+ <b>Changing precision and scale:</b>
+ </p>
+
+ <p>
+ You can issue an <codeph>ALTER TABLE ... REPLACE COLUMNS</codeph> statement to change the
+ precision and scale of an existing <codeph>DECIMAL</codeph> column.
+ <ul>
+ <li dir="ltr">
+ <p dir="ltr">
+ For text-based formats (text, RCFile, and SequenceFile tables)
+ </p>
+ <ul>
+ <li>
+ <p dir="ltr">
+ If the values in the column fit within the new precision and scale, they are
+ returned correctly by a query.
+ </p>
+ </li>
+
+ <li>
+ <p dir="ltr">
+ If any values that do not fit within the new precision and scale:
+ <ul>
+ <li>
+ Impala returns an error if the query option <codeph>ABORT_ON_ERROR</codeph>
+ is set to <codeph>true</codeph>.
+ </li>
+
+ <li>
+ Impala returns a <codeph>NULL</codeph> and warning that conversion failed if
+ the query option <codeph>ABORT_ON_ERROR</codeph> is set to
+ <codeph>false</codeph>.
+ </li>
+ </ul>
+ </p>
+ </li>
+
+ <li>
+ <p>
+ Leading zeros do not count against the precision value, but trailing zeros after
+ the decimal point do.
+ </p>
+ </li>
+ </ul>
+ </li>
+
+ <li dir="ltr">
+ <p dir="ltr">
+ For binary formats (Parquet and Avro tables)
+ </p>
+ <ul>
+ <li>
+ <p dir="ltr">
+ Although an <codeph>ALTER TABLE ... REPLACE COLUMNS</codeph> statement that
+ changes the precision or scale of a <codeph>DECIMAL</codeph> column succeeds,
+ any subsequent attempt to query the changed column results in a fatal error.
+ (The other columns can still be queried successfully.) This is because the
+ metadata about the columns is stored in the data files themselves, and
+ <codeph>ALTER TABLE</codeph> does not actually make any updates to the data
+ files.
+ </p>
+ </li>
+
+ <li>
+ <p dir="ltr">
+ If the metadata in the data files disagrees with the metadata in the metastore
+ database, Impala cancels the query.
+ </p>
+ </li>
+ </ul>
+ </li>
+ </ul>
+ </p>
+
<p conref="../shared/impala_common.xml#common/partitioning_blurb"/>
<p>
- You can use a <codeph>DECIMAL</codeph> column as a partition key. Doing so provides a better match between
- the partition key values and the HDFS directory names than using a <codeph>DOUBLE</codeph> or
- <codeph>FLOAT</codeph> partitioning column:
- </p>
-
- <p conref="../shared/impala_common.xml#common/schema_evolution_blurb"/>
-
- <ul>
- <li>
- For text-based formats (text, RCFile, and SequenceFile tables), you can issue an <codeph>ALTER TABLE ...
- REPLACE COLUMNS</codeph> statement to change the precision and scale of an existing
- <codeph>DECIMAL</codeph> column. As long as the values in the column fit within the new precision and
- scale, they are returned correctly by a query. Any values that do not fit within the new precision and
- scale are returned as <codeph>NULL</codeph>, and Impala reports the conversion error. Leading zeros do not
- count against the precision value, but trailing zeros after the decimal point do.
-<codeblock><![CDATA[[localhost:21000] > create table text_decimals (x string);
-[localhost:21000] > insert into text_decimals values ("1"), ("2"), ("99.99"), ("1.234"), ("000001"), ("1.000000000");
-[localhost:21000] > select * from text_decimals;
-+-------------+
-| x |
-+-------------+
-| 1 |
-| 2 |
-| 99.99 |
-| 1.234 |
-| 000001 |
-| 1.000000000 |
-+-------------+
-[localhost:21000] > alter table text_decimals replace columns (x decimal(4,2));
-[localhost:21000] > select * from text_decimals;
-+-------+
-| x |
-+-------+
-| 1.00 |
-| 2.00 |
-| 99.99 |
-| NULL |
-| 1.00 |
-| NULL |
-+-------+
-ERRORS:
-Backend 0:Error converting column: 0 TO DECIMAL(4, 2) (Data is: 1.234)
-file: hdfs://127.0.0.1:8020/user/hive/warehouse/decimal_testing.db/text_decimals/634d4bd3aa0
-e8420-b4b13bab7f1be787_56794587_data.0
-record: 1.234
-Error converting column: 0 TO DECIMAL(4, 2) (Data is: 1.000000000)
-file: hdfs://127.0.0.1:8020/user/hive/warehouse/decimal_testing.db/text_decimals/cd40dc68e20
-c565a-cc4bd86c724c96ba_311873428_data.0
-record: 1.000000000
-]]>
-</codeblock>
- </li>
-
- <li>
- For binary formats (Parquet and Avro tables), although an <codeph>ALTER TABLE ... REPLACE COLUMNS</codeph>
- statement that changes the precision or scale of a <codeph>DECIMAL</codeph> column succeeds, any subsequent
- attempt to query the changed column results in a fatal error. (The other columns can still be queried
- successfully.) This is because the metadata about the columns is stored in the data files themselves, and
- <codeph>ALTER TABLE</codeph> does not actually make any updates to the data files. If the metadata in the
- data files disagrees with the metadata in the metastore database, Impala cancels the query.
- </li>
- </ul>
-
- <p conref="../shared/impala_common.xml#common/example_blurb"/>
-
-<codeblock>CREATE TABLE t1 (x DECIMAL, y DECIMAL(5,2), z DECIMAL(25,0));
-INSERT INTO t1 VALUES (5, 99.44, 123456), (300, 6.7, 999999999);
-SELECT x+y, ROUND(y,1), z/98.6 FROM t1;
-SELECT CAST(1000.5 AS DECIMAL);
-</codeblock>
-
-<!-- <p conref="../shared/impala_common.xml#common/partitioning_good"/> -->
-
- <p conref="../shared/impala_common.xml#common/hbase_ok"/>
-
- <p conref="../shared/impala_common.xml#common/parquet_ok"/>
-
- <p conref="../shared/impala_common.xml#common/text_bulky"/>
-
-<!-- <p conref="../shared/impala_common.xml#common/compatibility_blurb"/> -->
-
-<!-- <p conref="../shared/impala_common.xml#common/internals_blurb"/> -->
-
-<!-- <p conref="../shared/impala_common.xml#common/added_in_20"/> -->
-
- <p conref="../shared/impala_common.xml#common/column_stats_constant"/>
-
- <p conref="../shared/impala_common.xml#common/kudu_blurb"/>
- <p conref="../shared/impala_common.xml#common/kudu_unsupported_data_type"/>
-
- <p conref="../shared/impala_common.xml#common/related_info"/>
-
- <p>
- <xref href="impala_literals.xml#numeric_literals"/>, <xref href="impala_tinyint.xml#tinyint"/>,
- <xref href="impala_smallint.xml#smallint"/>, <xref href="impala_int.xml#int"/>,
- <xref href="impala_bigint.xml#bigint"/>, <xref href="impala_decimal.xml#decimal"/>,
- <xref href="impala_math_functions.xml#math_functions"/> (especially <codeph>PRECISION()</codeph> and
- <codeph>SCALE()</codeph>)
+ Using a <codeph>DECIMAL</codeph> column as a partition key provides you a better match
+ between the partition key values and the HDFS directory names than using a
+ <codeph>DOUBLE</codeph> or <codeph>FLOAT</codeph> partitioning column.
+ </p>
+
+ <p>
+ <b>Column statistics considerations:</b>
+ </p>
+
+ <p>
+ Because the <codeph>DECIMAL</codeph> type has a fixed size, the maximum and average size
+ fields are always filled in for column statistics, even before you run the <codeph>COMPUTE
+ STATS</codeph> statement.
</p>
+
+ <p conref="../shared/impala_common.xml#common/compatibility_blurb"/>
+
+ <p>
+ <ul>
+ <li dir="ltr">
+ <p dir="ltr">
+ This version of <codeph>DECIMAL</codeph> type is the default in
+ <keyword keyref="impala30_full"/> and higher. The key differences between this
+ version of <codeph>DECIMAL</codeph> and the previous <codeph>DECIMAL</codeph> V1 in
+ Impala 2.x include the following.
+ </p>
+ <simpletable frame="all" relcolwidth="1* 1* 1*"
+ id="simpletable_bdr_rzc_qdb">
+
+ <sthead>
+
+ <stentry/>
+
+ <stentry>DECIMAL in <keyword keyref="impala30_full"/> or
+ higher</stentry>
+
+ <stentry>DECIMAL in <keyword keyref="impala212_full"> or
+ lower</keyword>
+
+ </stentry>
+
+ </sthead>
+
+ <strow>
+
+ <stentry>Overall behavior</stentry>
+
+ <stentry>Returns either the result or an error.</stentry>
+
+ <stentry>Returns either the result or <codeph>NULL</codeph> with a
+ warning.</stentry>
+
+ </strow>
+
+ <strow>
+
+ <stentry>Overflow behavior</stentry>
+
+ <stentry>Aborts with an error.</stentry>
+
+ <stentry>Issues a warning and returns
+ <codeph>NULL</codeph>.</stentry>
+
+ </strow>
+
+ <strow>
+
+ <stentry>Truncation / rounding behavior in arithmetic</stentry>
+
+ <stentry>Truncates and rounds digits from the back.</stentry>
+
+ <stentry>Truncates digits from the front.</stentry>
+
+ </strow>
+
+ <strow>
+
+ <stentry>String cast</stentry>
+
+ <stentry>Truncates from the back and rounds.</stentry>
+
+ <stentry>Truncates from the back.</stentry>
+
+ </strow>
+
+ </simpletable>
+ <p>
+ If you need to continue using the first version of the <codeph>DECIMAL</codeph> type
+ for the backward compatibility of your queries, set the <codeph>DECIMAL_V2</codeph>
+ query option to <codeph>FALSE</codeph>:
+<codeblock>SET DECIMAL_V2=FALSE;</codeblock>
+ </p>
+ </li>
+
+ <li dir="ltr">
+ <p dir="ltr">
+ Use the <codeph>DECIMAL</codeph> data type in Impala for applications where you used
+ the <codeph>NUMBER</codeph> data type in Oracle.
+ </p>
+
+ <p dir="ltr">
+ The Impala <codeph>DECIMAL</codeph> type does not support the Oracle idioms of
+ <codeph>*</codeph> for scale.
+ </p>
+
+ <p dir="ltr">
+ The Impala <codeph>DECIMAL</codeph> type does not support negative values for
+ precision.
+ </p>
+ </li>
+ </ul>
+ </p>
+
</conbody>
+
</concept>