Posted to user@hive.apache.org by sagar naik <sn...@attributor.com> on 2012/06/02 06:20:54 UTC

Querying HBase Records with null valued-columns using hive

Hi,
I am seeing some very odd behaviour when querying HBase from Hive.
I have an HBase table mapped into Hive as an external table.


select creation_ts, length(url), isnull(url), ! (isnull(url)) from
task_table limit 10;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201206011557_0023, Tracking URL =
http://xxxxxxxx:50030/jobdetails.jsp?jobid=job_201206011557_0023
Kill Command = /xxxxxx/xxxxx/xxxxx/../bin/hadoop job
-Dmapred.job.tracker=xxxxxx:54311 -kill job_201206011557_0023
2012-06-01 20:37:09,878 Stage-1 map = 0%,  reduce = 0%
2012-06-01 20:37:15,920 Stage-1 map = 100%,  reduce = 0%
2012-06-01 20:37:16,929 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201206011557_0023
OK
1337061992484 NULL false true
1334307650105 184 false true
1336532379103 229 false true
1335226875331 NULL false true
1335746654565 NULL false true
1335400140889 NULL false true
1338419117954 NULL false true
1338425256315 NULL false true
1336554120401 NULL false true
1338002526497 NULL false true
Time taken: 10.528 seconds

Notice that isnull(url) is false for every row, even where length(url) is reported as NULL.

My ultimate aim is to get the number of records where url is null and
join those records with another table
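
For illustration, that goal could be sketched in HiveQL as below. This is only a sketch, and it gives the intended results only once NULL values in url are detected correctly. The second table other_table, its column some_col, and the choice of creation_ts as the join key are all hypothetical; only task_table, creation_ts and url come from the query above.

```sql
-- Count the records whose url is NULL.
SELECT count(1)
FROM task_table
WHERE url IS NULL;

-- Join the NULL-url records with another table.
-- other_table, some_col and the join key are hypothetical placeholders.
SELECT t.creation_ts, o.some_col
FROM task_table t
JOIN other_table o ON (t.creation_ts = o.creation_ts)
WHERE t.url IS NULL;
```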



I noticed that FilterOperator passes the row (returns TRUE);
however, when the row is forwarded (forward(...,...)) it returns FALSE.

Any pointers or help would be highly appreciated.


-Sagar

Re: Querying HBase Records with null valued-columns using hive

Posted by sagar naik <sn...@attributor.com>.
Sorry for the late reply, Ashutosh.

Thanks for the pointers.
I will try it out with that Hive version soon.

-Sagar

On Sat, Jun 2, 2012 at 10:30 AM, Ashutosh Chauhan <ha...@apache.org> wrote:

> Hey Sagar,
>
> It seems you inserted data into your HBase table directly through the
> HBase client rather than through Hive. If so, you need
> https://issues.apache.org/jira/browse/HIVE-1634 to read pre-existing
> data in HBase correctly. HIVE-1634 is included in the 0.9 release, so
> upgrading to 0.9 should make your problem go away.
>
> Hope it helps,
> Ashutosh
>

Re: Querying HBase Records with null valued-columns using hive

Posted by Ashutosh Chauhan <ha...@apache.org>.
Hey Sagar,

It seems you inserted data into your HBase table directly through the
HBase client rather than through Hive. If so, you need
https://issues.apache.org/jira/browse/HIVE-1634 to read pre-existing
data in HBase correctly. HIVE-1634 is included in the 0.9 release, so
upgrading to 0.9 should make your problem go away.
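
As a rough sketch of what HIVE-1634 makes possible after the upgrade: each entry in hbase.columns.mapping can carry a storage-type suffix (#b for binary, #s for string), so values written natively by the HBase client can be decoded. The column family name cf, the key column, and the assumption that creation_ts was written as a binary long are all hypothetical and need to be adapted to the real table.

```sql
-- Sketch only: the column family "cf", the key column and the storage
-- types are assumptions, not taken from the actual task_table definition.
CREATE EXTERNAL TABLE task_table (
  key         string,
  creation_ts bigint,
  url         string
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  -- #b = value stored in binary form, #s = value stored as a string
  "hbase.columns.mapping" = ":key,cf:creation_ts#b,cf:url#s"
)
TBLPROPERTIES ("hbase.table.name" = "task_table");
```

If most columns were written in binary form, setting the table-wide default through "hbase.table.default.storage.type" = "binary" may be more convenient than suffixing every column.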

Hope it helps,
Ashutosh
On Fri, Jun 1, 2012 at 9:31 PM, sagar naik <sn...@attributor.com> wrote:

> I am using hive-0.7-cdh3u0
>
> Thanks Again

Re: Querying HBase Records with null valued-columns using hive

Posted by sagar naik <sn...@attributor.com>.
I am using hive-0.7-cdh3u0

Thanks Again
