You are viewing a plain text version of this content. The canonical link for it is here.

Posted to user@hive.apache.org by Ronak Bhatt <ro...@gmail.com> on 2010/08/20 16:37:40 UTC

question - how to handle OR with HIVe

Hi - I've a following SQL query. What is the way to convert it into HIVE
runnable format -

Select a.id, count(b.url)
from a, b
where a.id = b.id
and (b.url  like '%google.com%' or b.url like '%h.google.com%' or b.url like
'%bing%')
and a.exec_date = date(now()) ;  ----- get today's data
group by a.id


*thanks, ronak*
*
*
*
*

RE: question - how to handle OR with HIVe

Posted by Namit Jain <nj...@facebook.com>.

Currently, only equality joins are supported.
But, you can rewrite your query as:


 select substr(CB.EXEC_DATE,1,10), count(CB.ID<http://CB.ID>)
     from callbacks CB JOIN   
       (select * from  pages p where 
                p.page like '%google.com/search%<http://google.com/search%>'
             or p.page like '%google.com/custom%<http://google.com/custom%>'
             or p.page like '%google.com/#hl%<http://google.com/#hl%>'
             or p.page like '%google.com/cse%<http://google.com/cse%>'
             or p.page like '%search.yahoo.com/search%<http://search.yahoo.com/search%>'
             or p.page like '%bing.com/search%<http://bing.com/search%>'
             or p.page like '%google.com/product%<http://google.com/product%>' ) s
ON s.id
     group by substr(CB.EXEC_DATE,1,10);


________________________________________
From: Ronak Bhatt [ronakbaps@gmail.com]
Sent: Friday, August 20, 2010 10:02 AM
To: hive-user@hadoop.apache.org
Subject: Re: question - how to handle OR with HIVe

Here is an example of what error I'm referring to.... I'm running HIVE in local mode by saying "SET mapred.job.tracker=local;"

any pointers to address the problem would be greatly appreciated....


hive> select substr(CB.EXEC_DATE,1,10), count(CB.ID<http://CB.ID>)
    > from callbacks CB JOIN pages p ON
    > (
    > CB.page_id = p.id<http://p.id>
    > and (p.page like '%google.com/search%<http://google.com/search%>'
    >         or p.page like '%google.com/custom%<http://google.com/custom%>'
    >         or p.page like '%google.com/#hl%<http://google.com/#hl%>'
    >         or p.page like '%google.com/cse%<http://google.com/cse%>'
    >         or p.page like '%search.yahoo.com/search%<http://search.yahoo.com/search%>'
    >         or p.page like '%bing.com/search%<http://bing.com/search%>'
    >         or p.page like '%google.com/product%<http://google.com/product%>' )
    > )
    > group by substr(CB.EXEC_DATE,1,10);

FAILED: Error in semantic analysis: line 5:5 OR not supported in Join currently '%google.com/product%<http://google.com/product%>'



thanks, ronak

408 504 4847
My Blog : http://ronakbaps.posterous.com





On Fri, Aug 20, 2010 at 9:46 AM, Alexey Kovyrin <al...@kovyrin.net>> wrote:
Who told you those can't be used in where clauses? Here is a sample
query I did on one of our servers right now:

------------------------------------------------------------------------------------
# hive
Hive history file=/tmp/root/hive_job_log_root_201008201144_357004789.txt
hive> select * from doc_this_week_views where object_id = 1 OR
object_id = 33054283;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201007222311_3438, Tracking URL =
http://dfs01.local:50030/jobdetails.jsp?jobid=job_201007222311_3438
Kill Command = /usr/lib/hadoop/bin/hadoop job
-Dmapred.job.tracker=dfs01.local:9001 -kill job_201007222311_3438
2010-08-20 11:44:51,874 Stage-1 map = 0%,  reduce = 0%
2010-08-20 11:45:03,962 Stage-1 map = 100%,  reduce = 0%
2010-08-20 11:45:06,990 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201007222311_3438
OK

33054283        10
1       87609

Time taken: 18.468 seconds
------------------------------------------------------------------------------------

Apparently it works as expected.

On Fri, Aug 20, 2010 at 11:58 AM, Ronak Bhatt <ro...@gmail.com>> wrote:
> This OR and AND are for use in select or condition..for example,
> select (a OR B) as end_result_trur_or_false
> from table
> where id > 20;
> What I'm looking for is,
> select sum(amount)
> from table
> where id > 20 and
> (a > 5  OR b < 10)
> as you can see, I want to use OR in where condition...I could possibly use
> UNION ALL to simulate the OR condition, but it is lot of writing
> (duplicating the code) and also not efficient from performance perspective
> as the table will be scanned multiple times for each select of the union...
> hope this helps to clarify what kind of OR I'm looking for....
>
>
> thanks, ronak
> 408 504 4847
> My Blog : http://ronakbaps.posterous.com
>
>
>
>
> On Fri, Aug 20, 2010 at 8:50 AM, Alexey Kovyrin <al...@kovyrin.net>> wrote:
>>
>> It definitely does support it.
>> http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Logical_Operators
>>
>> On Fri, Aug 20, 2010 at 11:37 AM, Ronak Bhatt <ro...@gmail.com>> wrote:
>> > I read somewhere that hive does NOT support OR condition....that's what
>> > I
>> > was wondering as to how others handle the situation.
>> >
>> >
>> >
>> > On Fri, Aug 20, 2010 at 8:30 AM, Jacob R Rideout
>> > <ap...@jacobrideout.net>>
>> > wrote:
>> >>
>> >> On Fri, Aug 20, 2010 at 8:37 AM, Ronak Bhatt <ro...@gmail.com>>
>> >> wrote:
>> >> > Hi - I've a following SQL query. What is the way to convert it into
>> >> > HIVE
>> >> > runnable format -
>> >> > Select a.id<http://a.id>, count(b.url)
>> >> > from a, b
>> >> > where a.id<http://a.id> = b.id<http://b.id>
>> >> > and (b.url  like '%google.com<http://google.com>%' or b.url like '%h.google.com<http://h.google.com>%' or
>> >> > b.url
>> >> > like
>> >> > '%bing%')
>> >> > and a.exec_date = date(now()) ;  ----- get today's data
>> >> > group by a.id<http://a.id>
>> >> >
>> >> > thanks, ronak
>> >> >
>> >>
>> >> I think something like:
>> >>
>> >> SELECT a.id<http://a.id>, COUNT(distinct b.url)
>> >> FROM a
>> >> JOIN b on a.id<http://a.id> = b.id<http://b.id>
>> >> WHERE
>> >>    (b.url  LIKE '%google.com<http://google.com>%' OR b.url LIKE '%h.google.com<http://h.google.com>%' OR
>> >> b.url LIKE '%bing%')
>> >> AND a.exec_date = from_unixtime(unix_timestamp())
>> >> GROUP BY a.id<http://a.id>;
>> >>
>> >> should work, but I didn't test it. You will probably need to format
>> >> the data string to match your data - hive doesn't really have a date
>> >> type as such - but instead has date manipulation functions for
>> >> strings.
>> >>
>> >> Take a look at
>> >> http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Date_Functions
>> >>
>> >>
>> >> Jacob Rideout
>> >
>> >
>>
>>
>>
>> --
>> Alexey Kovyrin
>> http://kovyrin.net/
>
>



--
Alexey Kovyrin
http://kovyrin.net/

Re: question - how to handle OR with HIVe

Posted by Ronak Bhatt <ro...@gmail.com>.

Here is an example of what error I'm referring to.... I'm running HIVE in
local mode by saying "SET mapred.job.tracker=local;"

any pointers to address the problem would be greatly appreciated....


hive> select substr(CB.EXEC_DATE,1,10), count(CB.ID)
    > from callbacks CB JOIN pages p ON
    > (
    > CB.page_id = p.id
    > and (p.page like '%google.com/search%'
    >         or p.page like '%google.com/custom%'
    >         or p.page like '%google.com/#hl%'
    >         or p.page like '%google.com/cse%'
    >         or p.page like '%search.yahoo.com/search%'
    >         or p.page like '%bing.com/search%'
    >         or p.page like '%google.com/product%' )
    > )
    > group by substr(CB.EXEC_DATE,1,10);

*FAILED: Error in semantic analysis: line 5:5 OR not supported in Join
currently '%google.com/product%'*



*thanks, ronak*
*
*
*408 504 4847*
*My Blog : http://ronakbaps.posterous.com*
*
*
*
*



On Fri, Aug 20, 2010 at 9:46 AM, Alexey Kovyrin <al...@kovyrin.net> wrote:

> Who told you those can't be used in where clauses? Here is a sample
> query I did on one of our servers right now:
>
>
> ------------------------------------------------------------------------------------
> # hive
> Hive history file=/tmp/root/hive_job_log_root_201008201144_357004789.txt
> hive> select * from doc_this_week_views where object_id = 1 OR
> object_id = 33054283;
> Total MapReduce jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_201007222311_3438, Tracking URL =
> http://dfs01.local:50030/jobdetails.jsp?jobid=job_201007222311_3438
> Kill Command = /usr/lib/hadoop/bin/hadoop job
> -Dmapred.job.tracker=dfs01.local:9001 -kill job_201007222311_3438
> 2010-08-20 11:44:51,874 Stage-1 map = 0%,  reduce = 0%
> 2010-08-20 11:45:03,962 Stage-1 map = 100%,  reduce = 0%
> 2010-08-20 11:45:06,990 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_201007222311_3438
> OK
>
> 33054283        10
> 1       87609
>
> Time taken: 18.468 seconds
>
> ------------------------------------------------------------------------------------
>
> Apparently it works as expected.
>
> On Fri, Aug 20, 2010 at 11:58 AM, Ronak Bhatt <ro...@gmail.com> wrote:
> > This OR and AND are for use in select or condition..for example,
> > select (a OR B) as end_result_trur_or_false
> > from table
> > where id > 20;
> > What I'm looking for is,
> > select sum(amount)
> > from table
> > where id > 20 and
> > (a > 5  OR b < 10)
> > as you can see, I want to use OR in where condition...I could possibly
> use
> > UNION ALL to simulate the OR condition, but it is lot of writing
> > (duplicating the code) and also not efficient from performance
> perspective
> > as the table will be scanned multiple times for each select of the
> union...
> > hope this helps to clarify what kind of OR I'm looking for....
> >
> >
> > thanks, ronak
> > 408 504 4847
> > My Blog : http://ronakbaps.posterous.com
> >
> >
> >
> >
> > On Fri, Aug 20, 2010 at 8:50 AM, Alexey Kovyrin <al...@kovyrin.net>
> wrote:
> >>
> >> It definitely does support it.
> >> http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Logical_Operators
> >>
> >> On Fri, Aug 20, 2010 at 11:37 AM, Ronak Bhatt <ro...@gmail.com>
> wrote:
> >> > I read somewhere that hive does NOT support OR condition....that's
> what
> >> > I
> >> > was wondering as to how others handle the situation.
> >> >
> >> >
> >> >
> >> > On Fri, Aug 20, 2010 at 8:30 AM, Jacob R Rideout
> >> > <ap...@jacobrideout.net>
> >> > wrote:
> >> >>
> >> >> On Fri, Aug 20, 2010 at 8:37 AM, Ronak Bhatt <ro...@gmail.com>
> >> >> wrote:
> >> >> > Hi - I've a following SQL query. What is the way to convert it into
> >> >> > HIVE
> >> >> > runnable format -
> >> >> > Select a.id, count(b.url)
> >> >> > from a, b
> >> >> > where a.id = b.id
> >> >> > and (b.url  like '%google.com%' or b.url like '%h.google.com%' or
> >> >> > b.url
> >> >> > like
> >> >> > '%bing%')
> >> >> > and a.exec_date = date(now()) ;  ----- get today's data
> >> >> > group by a.id
> >> >> >
> >> >> > thanks, ronak
> >> >> >
> >> >>
> >> >> I think something like:
> >> >>
> >> >> SELECT a.id, COUNT(distinct b.url)
> >> >> FROM a
> >> >> JOIN b on a.id = b.id
> >> >> WHERE
> >> >>    (b.url  LIKE '%google.com%' OR b.url LIKE '%h.google.com%' OR
> >> >> b.url LIKE '%bing%')
> >> >> AND a.exec_date = from_unixtime(unix_timestamp())
> >> >> GROUP BY a.id;
> >> >>
> >> >> should work, but I didn't test it. You will probably need to format
> >> >> the data string to match your data - hive doesn't really have a date
> >> >> type as such - but instead has date manipulation functions for
> >> >> strings.
> >> >>
> >> >> Take a look at
> >> >> http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Date_Functions
> >> >>
> >> >>
> >> >> Jacob Rideout
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Alexey Kovyrin
> >> http://kovyrin.net/
> >
> >
>
>
>
> --
> Alexey Kovyrin
> http://kovyrin.net/
>

Re: question - how to handle OR with HIVe

Posted by Alexey Kovyrin <al...@kovyrin.net>.

Who told you those can't be used in where clauses? Here is a sample
query I did on one of our servers right now:

------------------------------------------------------------------------------------
# hive
Hive history file=/tmp/root/hive_job_log_root_201008201144_357004789.txt
hive> select * from doc_this_week_views where object_id = 1 OR
object_id = 33054283;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201007222311_3438, Tracking URL =
http://dfs01.local:50030/jobdetails.jsp?jobid=job_201007222311_3438
Kill Command = /usr/lib/hadoop/bin/hadoop job
-Dmapred.job.tracker=dfs01.local:9001 -kill job_201007222311_3438
2010-08-20 11:44:51,874 Stage-1 map = 0%,  reduce = 0%
2010-08-20 11:45:03,962 Stage-1 map = 100%,  reduce = 0%
2010-08-20 11:45:06,990 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201007222311_3438
OK

33054283	10
1	87609

Time taken: 18.468 seconds
------------------------------------------------------------------------------------

Apparently it works as expected.

On Fri, Aug 20, 2010 at 11:58 AM, Ronak Bhatt <ro...@gmail.com> wrote:
> This OR and AND are for use in select or condition..for example,
> select (a OR B) as end_result_trur_or_false
> from table
> where id > 20;
> What I'm looking for is,
> select sum(amount)
> from table
> where id > 20 and
> (a > 5  OR b < 10)
> as you can see, I want to use OR in where condition...I could possibly use
> UNION ALL to simulate the OR condition, but it is lot of writing
> (duplicating the code) and also not efficient from performance perspective
> as the table will be scanned multiple times for each select of the union...
> hope this helps to clarify what kind of OR I'm looking for....
>
>
> thanks, ronak
> 408 504 4847
> My Blog : http://ronakbaps.posterous.com
>
>
>
>
> On Fri, Aug 20, 2010 at 8:50 AM, Alexey Kovyrin <al...@kovyrin.net> wrote:
>>
>> It definitely does support it.
>> http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Logical_Operators
>>
>> On Fri, Aug 20, 2010 at 11:37 AM, Ronak Bhatt <ro...@gmail.com> wrote:
>> > I read somewhere that hive does NOT support OR condition....that's what
>> > I
>> > was wondering as to how others handle the situation.
>> >
>> >
>> >
>> > On Fri, Aug 20, 2010 at 8:30 AM, Jacob R Rideout
>> > <ap...@jacobrideout.net>
>> > wrote:
>> >>
>> >> On Fri, Aug 20, 2010 at 8:37 AM, Ronak Bhatt <ro...@gmail.com>
>> >> wrote:
>> >> > Hi - I've a following SQL query. What is the way to convert it into
>> >> > HIVE
>> >> > runnable format -
>> >> > Select a.id, count(b.url)
>> >> > from a, b
>> >> > where a.id = b.id
>> >> > and (b.url  like '%google.com%' or b.url like '%h.google.com%' or
>> >> > b.url
>> >> > like
>> >> > '%bing%')
>> >> > and a.exec_date = date(now()) ;  ----- get today's data
>> >> > group by a.id
>> >> >
>> >> > thanks, ronak
>> >> >
>> >>
>> >> I think something like:
>> >>
>> >> SELECT a.id, COUNT(distinct b.url)
>> >> FROM a
>> >> JOIN b on a.id = b.id
>> >> WHERE
>> >>    (b.url  LIKE '%google.com%' OR b.url LIKE '%h.google.com%' OR
>> >> b.url LIKE '%bing%')
>> >> AND a.exec_date = from_unixtime(unix_timestamp())
>> >> GROUP BY a.id;
>> >>
>> >> should work, but I didn't test it. You will probably need to format
>> >> the data string to match your data - hive doesn't really have a date
>> >> type as such - but instead has date manipulation functions for
>> >> strings.
>> >>
>> >> Take a look at
>> >> http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Date_Functions
>> >>
>> >>
>> >> Jacob Rideout
>> >
>> >
>>
>>
>>
>> --
>> Alexey Kovyrin
>> http://kovyrin.net/
>
>



-- 
Alexey Kovyrin
http://kovyrin.net/

Re: question - how to handle OR with HIVe

Posted by Ronak Bhatt <ro...@gmail.com>.

This OR and AND are for use in select or condition..for example,

select (a OR B) as end_result_trur_or_false
from table
where id > 20;

What I'm looking for is,

select sum(amount)
from table
where id > 20 and
(a > 5  OR b < 10)

as you can see, I want to use OR in where condition...I could possibly use
UNION ALL to simulate the OR condition, but it is lot of writing
(duplicating the code) and also not efficient from performance perspective
as the table will be scanned multiple times for each select of the union...

hope this helps to clarify what kind of OR I'm looking for....



*thanks, ronak*
*
*
*408 504 4847*
*My Blog : http://ronakbaps.posterous.com*
*
*
*
*



On Fri, Aug 20, 2010 at 8:50 AM, Alexey Kovyrin <al...@kovyrin.net> wrote:

> It definitely does support it.
> http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Logical_Operators
>
> On Fri, Aug 20, 2010 at 11:37 AM, Ronak Bhatt <ro...@gmail.com> wrote:
> > I read somewhere that hive does NOT support OR condition....that's what I
> > was wondering as to how others handle the situation.
> >
> >
> >
> > On Fri, Aug 20, 2010 at 8:30 AM, Jacob R Rideout <
> apache@jacobrideout.net>
> > wrote:
> >>
> >> On Fri, Aug 20, 2010 at 8:37 AM, Ronak Bhatt <ro...@gmail.com>
> wrote:
> >> > Hi - I've a following SQL query. What is the way to convert it into
> HIVE
> >> > runnable format -
> >> > Select a.id, count(b.url)
> >> > from a, b
> >> > where a.id = b.id
> >> > and (b.url  like '%google.com%' or b.url like '%h.google.com%' or
> b.url
> >> > like
> >> > '%bing%')
> >> > and a.exec_date = date(now()) ;  ----- get today's data
> >> > group by a.id
> >> >
> >> > thanks, ronak
> >> >
> >>
> >> I think something like:
> >>
> >> SELECT a.id, COUNT(distinct b.url)
> >> FROM a
> >> JOIN b on a.id = b.id
> >> WHERE
> >>    (b.url  LIKE '%google.com%' OR b.url LIKE '%h.google.com%' OR
> >> b.url LIKE '%bing%')
> >> AND a.exec_date = from_unixtime(unix_timestamp())
> >> GROUP BY a.id;
> >>
> >> should work, but I didn't test it. You will probably need to format
> >> the data string to match your data - hive doesn't really have a date
> >> type as such - but instead has date manipulation functions for
> >> strings.
> >>
> >> Take a look at
> >> http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Date_Functions
> >>
> >>
> >> Jacob Rideout
> >
> >
>
>
>
> --
> Alexey Kovyrin
> http://kovyrin.net/
>

Re: question - how to handle OR with HIVe

Posted by Alexey Kovyrin <al...@kovyrin.net>.

It definitely does support it.
http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Logical_Operators

On Fri, Aug 20, 2010 at 11:37 AM, Ronak Bhatt <ro...@gmail.com> wrote:
> I read somewhere that hive does NOT support OR condition....that's what I
> was wondering as to how others handle the situation.
>
>
>
> On Fri, Aug 20, 2010 at 8:30 AM, Jacob R Rideout <ap...@jacobrideout.net>
> wrote:
>>
>> On Fri, Aug 20, 2010 at 8:37 AM, Ronak Bhatt <ro...@gmail.com> wrote:
>> > Hi - I've a following SQL query. What is the way to convert it into HIVE
>> > runnable format -
>> > Select a.id, count(b.url)
>> > from a, b
>> > where a.id = b.id
>> > and (b.url  like '%google.com%' or b.url like '%h.google.com%' or b.url
>> > like
>> > '%bing%')
>> > and a.exec_date = date(now()) ;  ----- get today's data
>> > group by a.id
>> >
>> > thanks, ronak
>> >
>>
>> I think something like:
>>
>> SELECT a.id, COUNT(distinct b.url)
>> FROM a
>> JOIN b on a.id = b.id
>> WHERE
>>    (b.url  LIKE '%google.com%' OR b.url LIKE '%h.google.com%' OR
>> b.url LIKE '%bing%')
>> AND a.exec_date = from_unixtime(unix_timestamp())
>> GROUP BY a.id;
>>
>> should work, but I didn't test it. You will probably need to format
>> the data string to match your data - hive doesn't really have a date
>> type as such - but instead has date manipulation functions for
>> strings.
>>
>> Take a look at
>> http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Date_Functions
>>
>>
>> Jacob Rideout
>
>



-- 
Alexey Kovyrin
http://kovyrin.net/

Re: question - how to handle OR with HIVe

Posted by Ronak Bhatt <ro...@gmail.com>.

*I read somewhere that hive does NOT support OR condition....that's what I
was wondering as to how others handle the situation.*
*
*



On Fri, Aug 20, 2010 at 8:30 AM, Jacob R Rideout <ap...@jacobrideout.net>wrote:

> On Fri, Aug 20, 2010 at 8:37 AM, Ronak Bhatt <ro...@gmail.com> wrote:
> > Hi - I've a following SQL query. What is the way to convert it into HIVE
> > runnable format -
> > Select a.id, count(b.url)
> > from a, b
> > where a.id = b.id
> > and (b.url  like '%google.com%' or b.url like '%h.google.com%' or b.url
> like
> > '%bing%')
> > and a.exec_date = date(now()) ;  ----- get today's data
> > group by a.id
> >
> > thanks, ronak
> >
>
> I think something like:
>
> SELECT a.id, COUNT(distinct b.url)
> FROM a
> JOIN b on a.id = b.id
> WHERE
>    (b.url  LIKE '%google.com%' OR b.url LIKE '%h.google.com%' OR
> b.url LIKE '%bing%')
> AND a.exec_date = from_unixtime(unix_timestamp())
> GROUP BY a.id;
>
> should work, but I didn't test it. You will probably need to format
> the data string to match your data - hive doesn't really have a date
> type as such - but instead has date manipulation functions for
> strings.
>
> Take a look at
> http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Date_Functions
>
>
> Jacob Rideout
>

Re: question - how to handle OR with HIVe

Posted by Jacob R Rideout <ap...@jacobrideout.net>.

On Fri, Aug 20, 2010 at 8:37 AM, Ronak Bhatt <ro...@gmail.com> wrote:
> Hi - I've a following SQL query. What is the way to convert it into HIVE
> runnable format -
> Select a.id, count(b.url)
> from a, b
> where a.id = b.id
> and (b.url  like '%google.com%' or b.url like '%h.google.com%' or b.url like
> '%bing%')
> and a.exec_date = date(now()) ;  ----- get today's data
> group by a.id
>
> thanks, ronak
>

I think something like:

SELECT a.id, COUNT(distinct b.url)
FROM a
JOIN b on a.id = b.id
WHERE
    (b.url  LIKE '%google.com%' OR b.url LIKE '%h.google.com%' OR
b.url LIKE '%bing%')
AND a.exec_date = from_unixtime(unix_timestamp())
GROUP BY a.id;

should work, but I didn't test it. You will probably need to format
the data string to match your data - hive doesn't really have a date
type as such - but instead has date manipulation functions for
strings.

Take a look at http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Date_Functions


Jacob Rideout