Posted to user@hive.apache.org by Richard <co...@163.com> on 2013/10/16 13:09:21 UTC

histogram_numeric find the most frequent value

I want to find the most frequent value of a column. I noticed histogram_numeric,
but I cannot specify the bin boundaries, so the result is not what I want.


As an example, I want something like


select gid, most_frequent(category) from mytable group by gid.


where category is a column with discretized values.


Thanks.
Richard

Re:Re: histogram_numeric find the most frequent value

Posted by Richard <co...@163.com>.
Good idea, I will try it. Thanks.

At 2013-10-16 19:12:30,"Ed Soniat" <es...@liveperson.com> wrote:

You could use modular arithmetic in a subselect to map each value to a single representative value for the range (bin) it falls in, with the ranges defined by your own boundaries.





This message may contain confidential and/or privileged information. 
If you are not the addressee or authorized to receive this on behalf of the addressee you must not use, copy, disclose or take action based on this message or any information herein. 
If you have received this message in error, please advise the sender immediately by reply email and delete this message. Thank you.

Re: histogram_numeric find the most frequent value

Posted by Ed Soniat <es...@liveperson.com>.
You could use modular arithmetic in a subselect to map each value to a
single representative value for the range (bin) it falls in, with the
ranges defined by your own boundaries.
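
A rough sketch of how this might look, not a tested solution. It assumes a
numeric column named value (hypothetical, not in the original question), a bin
width of 10 (also just an illustration), and a Hive version with windowing
functions (0.11 or later). mytable and gid are taken from the question. If
category is already discretized, it can be used directly in place of bin and
the innermost subselect can be dropped.

```sql
-- Sketch only: `value` and the bin width 10 are assumptions for illustration.
SELECT gid, bin AS most_frequent_bin
FROM (
  -- Rank the bins within each gid by how often they occur.
  -- Ties are broken by the smaller bin so the result is deterministic.
  SELECT gid, bin,
         ROW_NUMBER() OVER (PARTITION BY gid ORDER BY cnt DESC, bin) AS rn
  FROM (
    -- Count the rows that fall into each bin, per gid.
    SELECT gid, bin, COUNT(*) AS cnt
    FROM (
      -- Modular arithmetic maps each value to the lower edge of its bin,
      -- e.g. 37 -> 30 for a width of 10. pmod keeps the result correct
      -- for negative values as well.
      SELECT gid, value - pmod(value, 10) AS bin
      FROM mytable
    ) binned
    GROUP BY gid, bin
  ) counted
) ranked
WHERE rn = 1;
```

Unlike histogram_numeric, the bin edges here are fixed by the expression in
the innermost subselect, so they can be changed by adjusting the width or by
replacing the expression with a CASE over explicit boundaries.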



