Posted to dev@drill.apache.org by "Jason Altekruse (JIRA)" <ji...@apache.org> on 2014/04/14 20:31:18 UTC

[jira] [Commented] (DRILL-502) Avg function against decimal data returns incorrect result from drill

    [ https://issues.apache.org/jira/browse/DRILL-502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968632#comment-13968632 ] 

Jason Altekruse commented on DRILL-502:
---------------------------------------

I'm guessing that the average function is dividing by the total number of values in the dataset, including nulls, rather than by the number of defined (non-null) values. I'll look into this; if that is the case, it should be an easy fix.
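
To illustrate the hypothesis, here is a minimal Python sketch. It is not Drill's actual implementation: the function names and the sample GPA values are made up for illustration. It only shows how dividing by the total row count, nulls included, pulls the average below the value obtained when nulls are excluded from the count.

```python
from typing import Optional, Sequence


def avg_counting_nulls(values: Sequence[Optional[float]]) -> Optional[float]:
    """Suspected buggy behaviour (hypothetical): nulls are skipped when
    summing but are still included in the divisor."""
    if not values:
        return None
    total = sum(v for v in values if v is not None)
    return total / len(values)


def avg_skipping_nulls(values: Sequence[Optional[float]]) -> Optional[float]:
    """Standard SQL AVG semantics: nulls are ignored in both the sum
    and the count; an input with no defined values yields null."""
    defined = [v for v in values if v is not None]
    if not defined:
        return None
    return sum(defined) / len(defined)


# Made-up sample data, not taken from student.parquet.
gpas = [2.0, None, 3.0, None, 2.5]

print(avg_counting_nulls(gpas))  # 7.5 / 5 rows
print(avg_skipping_nulls(gpas))  # 7.5 / 3 defined values
```

If this guess is right, the ratio between the wrong and the expected result should roughly match the fraction of non-null gpa values in the dataset, which can be checked by comparing count(gpa) with count(*).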

> Avg function against decimal data returns incorrect result from drill
> ---------------------------------------------------------------------
>
>                 Key: DRILL-502
>                 URL: https://issues.apache.org/jira/browse/DRILL-502
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Krystal
>         Attachments: student.parquet
>
>
> Ran the following SQL query:
> 0: jdbc:drill:schema=dfs> select avg(gpa) from dfs.`student`;
> +------------+
> |   EXPR$0   |
> +------------+
> | 1.999      |
> +------------+
> 1 row selected (0.22 seconds)
> The same query against the same dataset in Oracle, Postgres and Impala returns a value of 2.49826.
> Here is the schema of the student table:
> message pig_schema {
>   optional int32 rownum;
>   optional binary name;
>   optional int32 age;
>   optional double gpa;
>   optional int64 studentnum;
>   optional binary create_time;
> }
> The same avg function against int data types returns the correct result.


