Posted to dev@hive.apache.org by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org> on 2013/02/13 10:40:12 UTC
[jira] [Commented] (HIVE-4018) MapJoin failing with Distributed Cache error
[ https://issues.apache.org/jira/browse/HIVE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577438#comment-13577438 ]
Amareshwari Sriramadasu commented on HIVE-4018:
-----------------------------------------------
The setup is as follows:
We have 7 dimension tables, dim1 through dim7, with these row counts respectively: 1009530, 3, 227358, 238514, 519, 203841, 47.
The query is:
{noformat}
Select SUM(msr1), SUM(msr2) , ....
from fact
Left outer join dim1 on fact.d1= dim1.id
Left outer join dim2 on dim1.id2 = dim2.id
Left outer Join dim3 on fact.d3= dim3.id1
Left outer Join dim4 on dim3.id3= dim4.id4
Left outer join dim5 on dim4.id5= dim5.id
Left outer Join dim6 on dim3.id6= dim6.id
Left outer Join dim7 on dim6.id7 = dim7.id;
{noformat}
Here is the log of the local task loading the hash tables. I'm seeing an NPE while loading one of the tables:
{noformat}
2013-02-13 09:04:47 Starting to launch local task to process map join; maximum memory = 1004929024
2013-02-13 09:04:48 Processing rows: 519 Hashtable size: 519 Memory usage: 11845496 rate: 0.012
2013-02-13 09:04:48 Dump the hashtable into file: file:/tmp/ubuntu/hive_2013-02-13_09-04-35_481_9216097600487630659/-local-10008/HashTable-Stage-19/MapJoin-mapfile21--.hashtable
2013-02-13 09:04:48 Upload 1 File to: file:/tmp/ubuntu/hive_2013-02-13_09-04-35_481_9216097600487630659/-local-10008/HashTable-Stage-19/MapJoin-mapfile21--.hashtable File size: 31191
2013-02-13 09:04:49 Processing rows: 200000 Hashtable size: 199999 Memory usage: 60980296 rate: 0.061
2013-02-13 09:04:54 Processing rows: 200000 Hashtable size: 199999 Memory usage: 156217016 rate: 0.155
2013-02-13 09:05:01 Processing rows: 300000 Hashtable size: 299999 Memory usage: 202205440 rate: 0.201
2013-02-13 09:05:05 Processing rows: 400000 Hashtable size: 399999 Memory usage: 260133024 rate: 0.259
2013-02-13 09:05:10 Processing rows: 500000 Hashtable size: 499999 Memory usage: 293007176 rate: 0.292
2013-02-13 09:05:14 Processing rows: 600000 Hashtable size: 599999 Memory usage: 347795184 rate: 0.346
2013-02-13 09:05:22 Processing rows: 700000 Hashtable size: 699999 Memory usage: 388323912 rate: 0.386
2013-02-13 09:05:28 Processing rows: 800000 Hashtable size: 799999 Memory usage: 453952824 rate: 0.452
2013-02-13 09:05:34 Processing rows: 900000 Hashtable size: 899999 Memory usage: 482001544 rate: 0.48
2013-02-13 09:05:43 Processing rows: 1000000 Hashtable size: 999999 Memory usage: 539703480 rate: 0.537
2013-02-13 09:05:47 Processing rows: 1009530 Hashtable size: 1009530 Memory usage: 530473664 rate: 0.528
2013-02-13 09:05:47 Dump the hashtable into file: file:/tmp/ubuntu/hive_2013-02-13_09-04-35_481_9216097600487630659/-local-10008/HashTable-Stage-19/MapJoin-mapfile61--.hashtable
2013-02-13 09:06:29 Upload 1 File to: file:/tmp/ubuntu/hive_2013-02-13_09-04-35_481_9216097600487630659/-local-10008/HashTable-Stage-19/MapJoin-mapfile61--.hashtable File size: 148246102
2013-02-13 09:06:31 Processing rows: 258054 Hashtable size: 54213 Memory usage: 111883448 rate: 0.111
2013-02-13 09:06:31 Dump the hashtable into file: file:/tmp/ubuntu/hive_2013-02-13_09-04-35_481_9216097600487630659/-local-10008/HashTable-Stage-19/MapJoin-mapfile31--.hashtable
2013-02-13 09:06:33 Upload 1 File to: file:/tmp/ubuntu/hive_2013-02-13_09-04-35_481_9216097600487630659/-local-10008/HashTable-Stage-19/MapJoin-mapfile31--.hashtable File size: 4251559
2013-02-13 09:06:34 Processing rows: 258054 Hashtable size: 203841 Memory usage: 72276192 rate: 0.072
2013-02-13 09:06:34 Dump the hashtable into file: file:/tmp/ubuntu/hive_2013-02-13_09-04-35_481_9216097600487630659/-local-10008/HashTable-Stage-19/MapJoin-mapfile32--.hashtable
java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.persistence.MapJoinObjectValue.writeExternal(MapJoinObjectValue.java:138)
at java.io.ObjectOutputStream.writeExternalData(ObjectOutputStream.java:1443)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1414)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:346)
at java.util.HashMap.writeObject(HashMap.java:1018)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:959)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1480)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1416)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1174)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:346)
at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.flushMemoryCacheToPersistent(HashMapWrapper.java:116)
at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.closeOp(HashTableSinkOperator.java:415)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:607)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:616)
at org.apache.hadoop.hive.ql.exec.MapredLocalTask.startForward(MapredLocalTask.java:324)
at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:276)
at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:677)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
2013-02-13 09:06:34 Processing rows: 47 Hashtable size: 47 Memory usage: 72554224 rate: 0.072
2013-02-13 09:06:34 Dump the hashtable into file: file:/tmp/ubuntu/hive_2013-02-13_09-04-35_481_9216097600487630659/-local-10008/HashTable-Stage-19/MapJoin-mapfile11--.hashtable
2013-02-13 09:06:34 Upload 1 File to: file:/tmp/ubuntu/hive_2013-02-13_09-04-35_481_9216097600487630659/-local-10008/HashTable-Stage-19/MapJoin-mapfile11--.hashtable File size: 2908
2013-02-13 09:06:37 Processing rows: 200000 Hashtable size: 199999 Memory usage: 154624680 rate: 0.154
2013-02-13 09:06:38 Processing rows: 227358 Hashtable size: 227358 Memory usage: 165643352 rate: 0.165
2013-02-13 09:06:38 Dump the hashtable into file: file:/tmp/ubuntu/hive_2013-02-13_09-04-35_481_9216097600487630659/-local-10008/HashTable-Stage-19/MapJoin-mapfile41--.hashtable
2013-02-13 09:06:46 Upload 1 File to: file:/tmp/ubuntu/hive_2013-02-13_09-04-35_481_9216097600487630659/-local-10008/HashTable-Stage-19/MapJoin-mapfile41--.hashtable File size: 34351618
2013-02-13 09:06:47 Processing rows: 3 Hashtable size: 3 Memory usage: 74456192 rate: 0.074
2013-02-13 09:06:47 Dump the hashtable into file: file:/tmp/ubuntu/hive_2013-02-13_09-04-35_481_9216097600487630659/-local-10008/HashTable-Stage-19/MapJoin-mapfile51--.hashtable
2013-02-13 09:06:47 Upload 1 File to: file:/tmp/ubuntu/hive_2013-02-13_09-04-35_481_9216097600487630659/-local-10008/HashTable-Stage-19/MapJoin-mapfile51--.hashtable File size: 457
2013-02-13 09:06:47 End of local task; Time Taken: 119.326 sec.
Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
Mapred Local Task Succeeded . Convert the Join into MapJoin
{noformat}
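Note that in the log above there is no "Upload 1 File" line for MapJoin-mapfile32--.hashtable after the NPE, yet the local task still reports success. One plausible reading, not confirmed here, is that the interrupted dump leaves a truncated file behind, and a mapper that later reads it hits the EOFException from the issue description. The sketch below reproduces only that general failure mode with plain Java serialization. The Value class and the dumpAndReload helper are hypothetical stand-ins and are not Hive's MapJoinObjectValue or HashMapWrapper code.

```java
import java.io.EOFException;
import java.io.Externalizable;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.util.HashMap;

final class TruncatedDumpSketch {

    // Hypothetical stand-in for a map-join value object (assumption, not Hive code).
    public static class Value implements Externalizable {
        private String payload; // left null to provoke the NPE during the dump

        public Value() { } // public no-arg constructor required by Externalizable

        public Value(String payload) { this.payload = payload; }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            // No null check: a null payload throws NullPointerException mid-dump.
            out.writeInt(payload.length());
            out.writeUTF(payload);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            in.readInt();
            payload = in.readUTF();
        }
    }

    // Dumps the table to a temp file, then reads it back.
    // Returns "<write outcome>/<read outcome>".
    static String dumpAndReload(HashMap<String, Value> table) {
        String writeResult = "ok";
        String readResult = "ok";
        try {
            File file = File.createTempFile("hashtable-sketch", ".bin");
            file.deleteOnExit();
            try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
                out.writeObject(table);
            } catch (NullPointerException e) {
                // The stream has already been closed here, so the partial bytes are on disk.
                writeResult = "NPE";
            }
            try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
                in.readObject();
            } catch (EOFException e) {
                // The reader runs off the end of the truncated stream.
                readResult = "EOF";
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return writeResult + "/" + readResult;
    }

    public static void main(String[] args) {
        HashMap<String, Value> table = new HashMap<>();
        table.put("k", new Value(null));
        System.out.println(dumpAndReload(table));
    }
}
```

If the same thing is happening in the local task, the reader-side EOFException would be a symptom and the NPE in writeExternal the thing to chase first.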
> MapJoin failing with Distributed Cache error
> --------------------------------------------
>
> Key: HIVE-4018
> URL: https://issues.apache.org/jira/browse/HIVE-4018
> Project: Hive
> Issue Type: Bug
> Components: SQL
> Affects Versions: 0.11.0
> Reporter: Amareshwari Sriramadasu
> Fix For: 0.11.0
>
>
> When I'm running a star join query after HIVE-3784, it fails with the following error:
> 2013-02-13 08:36:04,584 ERROR org.apache.hadoop.hive.ql.exec.MapJoinOperator: Load Distributed Cache Error
> 2013-02-13 08:36:04,585 FATAL ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException
> at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:189)
> at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:203)
> at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1421)
> at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1425)
> at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1425)
> at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1425)
> at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1425)
> at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1425)
> at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1425)
> at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1425)
> at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1425)
> at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:614)
> at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:416)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
> at org.apache.hadoop.mapred.Child.main(Child.java:260)