Posted to issues@drill.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/09/06 14:57:20 UTC

[jira] [Commented] (DRILL-4834) decimal implementation is vulnerable to overflow errors, and extremely complex

    [ https://issues.apache.org/jira/browse/DRILL-4834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15467615#comment-15467615 ] 

ASF GitHub Bot commented on DRILL-4834:
---------------------------------------

Github user daveoshinsky commented on the issue:

    https://github.com/apache/drill/pull/570
  
    Please review the "one-size-fits-all" VARDECIMAL implementation in this pull request.  Initially, it exists alongside the other decimal types and is used only for variable-width BINARY decimals in Parquet files.  If the reviews look good, we can extend it to eventually replace the other decimal types.
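
To illustrate what a variable-width decimal encoding can look like, here is a minimal Java sketch. It is not the code from the pull request: the class name VarDecimalSketch and its encode/decode helpers are hypothetical, and it assumes the value is stored as the big-endian two's-complement bytes of the unscaled integer (the layout Parquet uses for DECIMAL on BINARY), with the scale kept separately as column metadata.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical sketch, not Drill's VARDECIMAL implementation.
final class VarDecimalSketch {
    private VarDecimalSketch() {}

    // Encode the unscaled value as minimal big-endian two's-complement bytes.
    // The scale is assumed to be stored elsewhere (e.g. in column metadata).
    static byte[] encode(BigDecimal value) {
        return value.unscaledValue().toByteArray();
    }

    // Rebuild the decimal from its unscaled bytes and the separately stored scale.
    static BigDecimal decode(byte[] unscaledBytes, int scale) {
        return new BigDecimal(new BigInteger(unscaledBytes), scale);
    }

    public static void main(String[] args) {
        BigDecimal small = new BigDecimal("1.23");
        BigDecimal large = new BigDecimal("12345678901234567890.12");
        for (BigDecimal v : new BigDecimal[] {small, large, small.negate()}) {
            byte[] bytes = encode(v);
            System.out.println(v + " -> " + bytes.length + " byte(s), round trip ok: "
                + decode(bytes, v.scale()).equals(v));
        }
    }
}
```

In this sketch a small value occupies a single byte while a 22-digit value occupies nine, so the storage follows the magnitude of each value instead of a declared precision.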


> decimal implementation is vulnerable to overflow errors, and extremely complex
> ------------------------------------------------------------------------------
>
>                 Key: DRILL-4834
>                 URL: https://issues.apache.org/jira/browse/DRILL-4834
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Data Types
>    Affects Versions: 1.6.0
>         Environment: Drill 1.7 on any platform
>            Reporter: Dave Oshinsky
>             Fix For: Future
>
>
> While working on a fix for DRILL-4704, logic was added to the CastIntDecimal.java template to handle the case where no precision is supplied (i.e., the supplied precision is zero) for an integer value that is to be cast to a decimal.  The Drill decimal implementation uses a limited selection of fixed-precision decimal data types (Decimal9, 18, 28, 38, where the number is the total number of decimal digits) to represent decimal values.  If the destination precision is too small to represent the input integer being cast, there is no clean way to handle the overflow error.
> Fixed decimal precisions, as used currently, can save memory, but they often waste it instead (when the specified fixed precision is significantly larger than the numbers actually need), and they greatly increase the complexity of the code.  For each fixed precision (only a limited set is available: 9, 18, 28, and 38, which itself leads to memory inefficiency), a separate set of code is generated from templates.  For each pairwise combination of decimal or non-decimal numeric types, there are multiple places in the code where conversions must be handled, or where conditions must be included to handle the difference in precision between the two types.  A one-size-fits-all approach (using a variable-width vector to represent any decimal precision) would usually be more memory-efficient (since precisions are often over-specified) and would greatly simplify the code.
> Also see the DRILL-4184 issue, which is related.
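
The overflow condition described above can be sketched in Java as follows. This is an illustration only, not the logic in CastIntDecimal.java: the class DecimalPrecisionSketch and its methods are hypothetical names, and the only detail taken from the issue is the set of fixed precisions (9, 18, 28, 38).

```java
import java.math.BigDecimal;

// Hypothetical sketch of the precision check, not Drill's cast template.
final class DecimalPrecisionSketch {
    private DecimalPrecisionSketch() {}

    // The fixed precisions named in the issue description.
    static final int[] FIXED_PRECISIONS = {9, 18, 28, 38};

    // Number of decimal digits needed to represent the integer value.
    static int digitsNeeded(long value) {
        return BigDecimal.valueOf(value).precision();
    }

    // True if the value can be represented with the given total digit count.
    static boolean fits(long value, int precision) {
        return digitsNeeded(value) <= precision;
    }

    // Smallest fixed precision that can hold the value without overflow.
    static int smallestFixedPrecision(long value) {
        for (int p : FIXED_PRECISIONS) {
            if (fits(value, p)) {
                return p;
            }
        }
        // Not reachable for a long (at most 19 digits); kept for completeness.
        throw new ArithmeticException("value needs more than 38 digits: " + value);
    }

    public static void main(String[] args) {
        long[] samples = {0L, 999_999_999L, Integer.MAX_VALUE, Long.MAX_VALUE};
        for (long v : samples) {
            System.out.println(v + " needs " + digitsNeeded(v)
                + " digit(s); fits precision 9: " + fits(v, 9)
                + "; smallest fixed precision: " + smallestFixedPrecision(v));
        }
    }
}
```

Under these assumptions, Integer.MAX_VALUE (10 digits) does not fit a 9-digit decimal, so even a 32-bit integer cast to the smallest fixed type can overflow, and a value just over a boundary is promoted to a much wider type than it needs.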



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)