Posted to user@hive.apache.org by Jason Michael <jm...@videoegg.com> on 2010/02/19 23:23:11 UTC

Having trouble with lateral view

I'm currently running a hive build from trunk, revision number 911889.  I've built a UDTF called map_explode which just emits the key and value of each entry in a map as a row in the result table.  The table I'm running it against looks like:

hive> describe mytable;
product    string    from deserializer
...
interactions    map<string,int>    from deserializer

If I use map_explode in the select clause, I get the expected results:

hive> select map_explode(interactions) as (key, value) from mytable where day = '2010-02-18' and hour = 1 limit 10;
...
OK
invite_impression    1
invite_impression    1
invite_impression    1
invite_impression    1
rollout    12
invite_impression    1
invite_impression    1
invite_impression    1
rollout    4
invite_impression    1
Time taken: 22.11 seconds

However, if I try to use LATERAL VIEW to relate the exploded values back to the parent table, like so:

hive> select product, key, sum(value) from mytable LATERAL VIEW map_explode(interactions) interacts as key, value where day = '2010-02-18' and hour = 1 group by product, key;

I get the following error:

FAILED: Unknown exception: null

Looking in hive.log, I see the following stack trace:

2010-02-19 14:15:17,215 ERROR ql.Driver (SessionState.java:printError(255)) - FAILED: Unknown exception: null
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.ppd.ExprWalkerProcFactory$ColumnExprProcessor.process(ExprWalkerProcFactory.java:87)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:129)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:103)
    at org.apache.hadoop.hive.ql.ppd.ExprWalkerProcFactory.extractPushdownPreds(ExprWalkerProcFactory.java:273)
    at org.apache.hadoop.hive.ql.ppd.OpProcFactory$DefaultPPD.mergeWithChildrenPred(OpProcFactory.java:317)
    at org.apache.hadoop.hive.ql.ppd.OpProcFactory$DefaultPPD.process(OpProcFactory.java:258)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:129)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:103)
    at org.apache.hadoop.hive.ql.ppd.PredicatePushDown.transform(PredicatePushDown.java:103)
    at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:74)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5758)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:125)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:304)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:377)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:138)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:197)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:303)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

I peeked at ExprWalkerProcFactory, but couldn't readily see what was causing the problem.  Any ideas?

Jason

Re: Having trouble with lateral view

Posted by Zheng Shao <zs...@gmail.com>.
Jason,

Do you want to open a JIRA and contribute your map_explode function to Hive?
That would be greatly appreciated.


Zheng


Re: Having trouble with lateral view

Posted by Yongqiang He <he...@software.ict.ac.cn>.
Hi Jason,

This is a known bug, see https://issues.apache.org/jira/browse/HIVE-1056

You can first disable ppd with "set hive.optimize.ppd=false;"
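
As a rough sketch, the workaround applied to the query from the original post would look something like the session below. It assumes map_explode is already registered in the current session. The final line, which turns predicate pushdown back on afterwards, is an extra step added here for illustration and is optional.

hive> set hive.optimize.ppd=false;
hive> select product, key, sum(value) from mytable LATERAL VIEW map_explode(interactions) interacts as key, value where day = '2010-02-18' and hour = 1 group by product, key;
hive> set hive.optimize.ppd=true;

With ppd disabled the optimizer no longer tries to push the where clause below the lateral view, so the query may scan more data than it otherwise would.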

Thanks
Yongqiang