You are viewing a plain text version of this content. The canonical link for it is here.

Posted to user@hive.apache.org by shan s <my...@gmail.com> on 2012/05/25 17:14:28 UTC

subquery in select

Hi All,
Does hive support subquery in select statement?

Given below data, I need counts and percentage counts per category. The
divisor in my case is not the total count, but something that is stored in
another table.. In T-SQL, I can do subquery in select statement to get my
divisor.

Select id, count(*), count(*)/ (select val from tableY where name like
‘xyzdivisor’)
>From data
Group by category

What am I missing? This looks like a simple case..
Id Category         Count Output   Percentage Output
1    A                    A      4             A     40 (4/10)*100
2    A                    B      3             B     30
3    B                    C      3             C     30
4    A
5    C
6    C
7    B
8    B
9    A
10  C
Thanks, Prashant

Re: subquery in select

Posted by wd <wd...@wdicc.com>.

I didn't have run it. I think the sql I write should work in every db....

On Mon, May 28, 2012 at 4:54 PM, shan s <my...@gmail.com> wrote:
> That was a typo in the email, but it still errors after the typo is
> corrected.
> Did you try to run it?
>
> Here is the entire script after creating the file, gt.txt from data
> below....
>
> CREATE EXTERNAL TABLE IF NOT EXISTS gt (id INT,  category STRING)
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE
> LOCATION '/user/user1/ht/gt';
> LOAD DATA LOCAL INPATH 'gt.txt' OVERWRITE INTO TABLE gt;
> CREATE TABLE IF NOT EXISTS Res (category STRING, count INT, perVal INT)
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE;
> INSERT OVERWRITE TABLE  Res
> select a.category, a.count, b.totalCount
> from
> (select category, count(*) as count from gt group by category) a,
>

Maybe you should delete this white line?

> (select count(*) as totalCount from gt) b ;
>
> On Mon, May 28, 2012 at 1:55 PM, wd <wd...@wdicc.com> wrote:
>>
>> group by category
>>
>> On Mon, May 28, 2012 at 2:20 PM, shan s <my...@gmail.com> wrote:
>> > (select category, count(*) as count from gt group by cat) a,
>
>

Re: subquery in select

Posted by shan s <my...@gmail.com>.

That was a typo in the email, but it still errors after the typo is
corrected.
Did you try to run it?

Here is the entire script after creating the file, gt.txt from data
below....

CREATE EXTERNAL TABLE IF NOT EXISTS gt (id INT,  category STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE
LOCATION '/user/user1/ht/gt';
LOAD DATA LOCAL INPATH 'gt.txt' OVERWRITE INTO TABLE gt;
CREATE TABLE IF NOT EXISTS Res (category STRING, count INT, perVal INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE;
INSERT OVERWRITE TABLE  Res
select a.category, a.count, b.totalCount
from
(select category, count(*) as count from gt group by category) a,
(select count(*) as totalCount from gt) b ;

On Mon, May 28, 2012 at 1:55 PM, wd <wd...@wdicc.com> wrote:

> group by category
>
> On Mon, May 28, 2012 at 2:20 PM, shan s <my...@gmail.com> wrote:
> > (select category, count(*) as count from gt group by cat) a,
>

Re: subquery in select

Posted by wd <wd...@wdicc.com>.

group by category

On Mon, May 28, 2012 at 2:20 PM, shan s <my...@gmail.com> wrote:
> (select category, count(*) as count from gt group by cat) a,

Re: subquery in select

Posted by shan s <my...@gmail.com>.

I transled it to below, but I am getting an error.
FAILED: Parse Error: line 6:50 mismatched input ',' expecting EOF near 'a'

 select a.category, a.count, a.count/b.totalCount as percentageValue
from
(select category, count(*) as count from gt group by cat) a,
(select count(*) as totalCount from gt) b ;
Thanks for your help!
On Mon, May 28, 2012 at 9:37 AM, wd <wd...@wdicc.com> wrote:

> select a.id, a.count, a.count/b.val
> from (select id, count(*) as count from data group by id) a,
> (select val from tableY ....) b
>
> You don't have a join condition, this maybe output very large data.
>
> On Mon, May 28, 2012 at 11:45 AM, shan s <my...@gmail.com> wrote:
> > Thanks Edward. But I didn't get the trick yet.
> > I was able to use FROM with multiple group bys. But failed to see what to
> > replace the subquery with...
> >
> > Could you please give an example for my use case below.
> >
> > Select id, count(*), count(*)/ (select val from tableY where name like
> > ‘xyzdivisor’)
> > From data
> > Group by category
> >
> > Thank You,
> > Prashant.
> > On Fri, May 25, 2012 at 8:56 PM, Edward Capriolo <ed...@gmail.com>
> > wrote:
> >>
> >> No. But hive does support nested selects so must queries can be
> >> re-written to accomplish the same thing. Actually I love hives
> >>
> >> "FROM (  ) SELECT" syntax
> >>
> >> It takes a while to get used to but it is much more clear then SQL
> >> standard switch allows supqueries in all kinds of places and has about
> >> 40 difference ways to express the same query.
> >>
> >> https://cwiki.apache.org/Hive/languagemanual.html
> >>
> >> On Fri, May 25, 2012 at 11:14 AM, shan s <my...@gmail.com> wrote:
> >> > Hi All,
> >> > Does hive support subquery in select statement?
> >> >
> >> > Given below data, I need counts and percentage counts per category.
> The
> >> > divisor in my case is not the total count, but something that is
> stored
> >> > in
> >> > another table.. In T-SQL, I can do subquery in select statement to get
> >> > my
> >> > divisor.
> >> >
> >> > Select id, count(*), count(*)/ (select val from tableY where name like
> >> > ‘xyzdivisor’)
> >> > From data
> >> > Group by category
> >> >
> >> > What am I missing? This looks like a simple case..
> >> >
> >> > Id Category         Count Output   Percentage Output
> >> > 1    A                    A      4             A     40 (4/10)*100
> >> > 2    A                    B      3             B     30
> >> > 3    B                    C      3             C     30
> >> > 4    A
> >> > 5    C
> >> > 6    C
> >> > 7    B
> >> > 8    B
> >> > 9    A
> >> > 10  C
> >> > Thanks, Prashant
> >
> >
>

Re: subquery in select

Posted by wd <wd...@wdicc.com>.

select a.id, a.count, a.count/b.val
from (select id, count(*) as count from data group by id) a,
(select val from tableY ....) b

You don't have a join condition, this maybe output very large data.

On Mon, May 28, 2012 at 11:45 AM, shan s <my...@gmail.com> wrote:
> Thanks Edward. But I didn't get the trick yet.
> I was able to use FROM with multiple group bys. But failed to see what to
> replace the subquery with...
>
> Could you please give an example for my use case below.
>
> Select id, count(*), count(*)/ (select val from tableY where name like
> ‘xyzdivisor’)
> From data
> Group by category
>
> Thank You,
> Prashant.
> On Fri, May 25, 2012 at 8:56 PM, Edward Capriolo <ed...@gmail.com>
> wrote:
>>
>> No. But hive does support nested selects so must queries can be
>> re-written to accomplish the same thing. Actually I love hives
>>
>> "FROM (  ) SELECT" syntax
>>
>> It takes a while to get used to but it is much more clear then SQL
>> standard switch allows supqueries in all kinds of places and has about
>> 40 difference ways to express the same query.
>>
>> https://cwiki.apache.org/Hive/languagemanual.html
>>
>> On Fri, May 25, 2012 at 11:14 AM, shan s <my...@gmail.com> wrote:
>> > Hi All,
>> > Does hive support subquery in select statement?
>> >
>> > Given below data, I need counts and percentage counts per category. The
>> > divisor in my case is not the total count, but something that is stored
>> > in
>> > another table.. In T-SQL, I can do subquery in select statement to get
>> > my
>> > divisor.
>> >
>> > Select id, count(*), count(*)/ (select val from tableY where name like
>> > ‘xyzdivisor’)
>> > From data
>> > Group by category
>> >
>> > What am I missing? This looks like a simple case..
>> >
>> > Id Category         Count Output   Percentage Output
>> > 1    A                    A      4             A     40 (4/10)*100
>> > 2    A                    B      3             B     30
>> > 3    B                    C      3             C     30
>> > 4    A
>> > 5    C
>> > 6    C
>> > 7    B
>> > 8    B
>> > 9    A
>> > 10  C
>> > Thanks, Prashant
>
>

Re: subquery in select

Posted by shan s <my...@gmail.com>.

Thanks Edward. But I didn't get the trick yet.
I was able to use FROM with multiple group bys. But failed to see what to
replace the subquery with...

Could you please give an example for my use case below.

Select id, count(*), count(*)/ (select val from tableY where name like
‘xyzdivisor’)
>From data
Group by category

Thank You,
Prashant.
On Fri, May 25, 2012 at 8:56 PM, Edward Capriolo <ed...@gmail.com>wrote:

> No. But hive does support nested selects so must queries can be
> re-written to accomplish the same thing. Actually I love hives
>
> "FROM (  ) SELECT" syntax
>
> It takes a while to get used to but it is much more clear then SQL
> standard switch allows supqueries in all kinds of places and has about
> 40 difference ways to express the same query.
>
> https://cwiki.apache.org/Hive/languagemanual.html
>
> On Fri, May 25, 2012 at 11:14 AM, shan s <my...@gmail.com> wrote:
> > Hi All,
> > Does hive support subquery in select statement?
> >
> > Given below data, I need counts and percentage counts per category. The
> > divisor in my case is not the total count, but something that is stored
> in
> > another table.. In T-SQL, I can do subquery in select statement to get my
> > divisor.
> >
> > Select id, count(*), count(*)/ (select val from tableY where name like
> > ‘xyzdivisor’)
> > From data
> > Group by category
> >
> > What am I missing? This looks like a simple case..
> >
> > Id Category         Count Output   Percentage Output
> > 1    A                    A      4             A     40 (4/10)*100
> > 2    A                    B      3             B     30
> > 3    B                    C      3             C     30
> > 4    A
> > 5    C
> > 6    C
> > 7    B
> > 8    B
> > 9    A
> > 10  C
> > Thanks, Prashant
>

Re: subquery in select

Posted by Edward Capriolo <ed...@gmail.com>.

No. But hive does support nested selects so must queries can be
re-written to accomplish the same thing. Actually I love hives

"FROM (  ) SELECT" syntax

It takes a while to get used to but it is much more clear then SQL
standard switch allows supqueries in all kinds of places and has about
40 difference ways to express the same query.

https://cwiki.apache.org/Hive/languagemanual.html

On Fri, May 25, 2012 at 11:14 AM, shan s <my...@gmail.com> wrote:
> Hi All,
> Does hive support subquery in select statement?
>
> Given below data, I need counts and percentage counts per category. The
> divisor in my case is not the total count, but something that is stored in
> another table.. In T-SQL, I can do subquery in select statement to get my
> divisor.
>
> Select id, count(*), count(*)/ (select val from tableY where name like
> ‘xyzdivisor’)
> From data
> Group by category
>
> What am I missing? This looks like a simple case..
>
> Id Category         Count Output   Percentage Output
> 1    A                    A      4             A     40 (4/10)*100
> 2    A                    B      3             B     30
> 3    B                    C      3             C     30
> 4    A
> 5    C
> 6    C
> 7    B
> 8    B
> 9    A
> 10  C
> Thanks, Prashant