Posted to dev@hive.apache.org by "Min Zhou (JIRA)" <ji...@apache.org> on 2009/05/21 14:01:45 UTC
[jira] Updated: (HIVE-503) improvement on distinct: distinguish distinct aggregate function from distinct

     [ https://issues.apache.org/jira/browse/HIVE-503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Min Zhou updated HIVE-503:
--------------------------

    Description: 
distinct
# OK
{code:sql}
select 
   col
from 
  tbl
{code}
# FAILED
{code:sql}
select 
   col1,
   col2
from 
  tbl
{code}

 distinguish distinct aggregate function
# OK
{code:sql}
select 
   count(distinct col % 10)
from 
  tbl
{code}
# OK
{code:sql}
select 
   count(distinct col1 % 10),
   count(distinct col1 % 9)
from 
  tbl
{code}
# OK
{code:sql}
select 
   count(distinct col1 % 10),
   count(distinct col2 % 9)
from 
  tbl
{code}
# OK
{code:sql}
select 
  sum(distinct col1 % 10),
  count(distinct col2 % 9)
from 
  tbl
{code}
# OK
{code:sql}
select 
  max(distinct substr(col1, 1, 10)),
  count(distinct col2 % 9)
from 
  tbl
{code}

A distinct aggregate function is the counterpart of an ALL aggregate function; it is essentially still an aggregate function.
Each aggregate function produces only one result, so a single MapReduce job should be able to evaluate two different aggregate expressions simultaneously.
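
To illustrate the point above, here is a minimal sketch in plain Python (not Hive code, and not Hive's actual execution plan) of evaluating sum(distinct col1 % 10) and count(distinct col2 % 9) in a single scan. The function name aggregate_in_one_pass and the sample rows are hypothetical, and the sketch ignores NULL handling and how a real job would partition rows across reducers.

```python
from typing import Iterable, Tuple


def aggregate_in_one_pass(rows: Iterable[Tuple[int, int]]) -> Tuple[int, int]:
    """Evaluate sum(distinct col1 % 10) and count(distinct col2 % 9) in one scan.

    Hypothetical helper for illustration only; each distinct aggregate
    keeps its own set of seen values and yields exactly one result.
    """
    seen_for_sum = set()    # distinct values of col1 % 10
    seen_for_count = set()  # distinct values of col2 % 9
    for col1, col2 in rows:
        seen_for_sum.add(col1 % 10)
        seen_for_count.add(col2 % 9)
    # Note: for empty input this returns a sum of 0, whereas SQL would return NULL.
    return sum(seen_for_sum), len(seen_for_count)


# Made-up sample rows of (col1, col2).
rows = [(11, 9), (21, 18), (32, 10), (42, 19)]
result = aggregate_in_one_pass(rows)
print(result)
```

Because the two aggregates keep independent state, neither needs a separate pass over the table; whether this maps cleanly onto one MapReduce job depends on how the distinct keys are shuffled, which this sketch does not model.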


  was:
distinct
# OK
{code:sql}
select 
   col
from 
  tbl
{code}
# FAILED
{code:sql}
select 
   col1,
   col2
from 
  tbl
{code}

 distinguish distinct aggregate function
# OK
{code:sql}
select 
   count(distinct col% 10)
from 
  tbl
{code}
# OK
{code:sql}
select 
   count(distinct col1% 10)
   count(distinct col1% 9)
from 
  tbl
{code}
# OK
{code:sql}
select 
   count(distinct col1% 10)
   count(distinct col2 % 9)
from 
  tbl
{code}


> improvement on distinct: distinguish distinct aggregate function from distinct
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-503
>                 URL: https://issues.apache.org/jira/browse/HIVE-503
>             Project: Hadoop Hive
>          Issue Type: Improvement
>            Reporter: Min Zhou
>
> distinct
> # OK
> {code:sql}
> select 
>    col
> from 
>   tbl
> {code}
> # FAILED
> {code:sql}
> select 
>    col1,
>    col2
> from 
>   tbl
> {code}
>  distinguish distinct aggregate function
> # OK
> {code:sql}
> select 
>    count(distinct col % 10)
> from 
>   tbl
> {code}
> # OK
> {code:sql}
> select 
>    count(distinct col1 % 10),
>    count(distinct col1 % 9)
> from 
>   tbl
> {code}
> # OK
> {code:sql}
> select 
>    count(distinct col1 % 10),
>    count(distinct col2 % 9)
> from 
>   tbl
> {code}
> # OK
> {code:sql}
> select 
>   sum(distinct col1 % 10),
>   count(distinct col2 % 9)
> from 
>   tbl
> {code}
> # OK
> {code:sql}
> select 
>   max(distinct substr(col1, 1, 10)),
>   count(distinct col2 % 9)
> from 
>   tbl
> {code}
> A distinct aggregate function is the counterpart of an ALL aggregate function; it is essentially still an aggregate function.
> Each aggregate function produces only one result, so a single MapReduce job should be able to evaluate two different aggregate expressions simultaneously.
