Posted to dev@hive.apache.org by Chao Sun <ch...@cloudera.com> on 2014/11/18 02:49:36 UTC

Review Request 28145: HIVE-8883 - Investigate test failures on auto_join30.q [Spark Branch]

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28145/
-----------------------------------------------------------

Review request for hive, Jimmy Xiang and Szehon Ho.


Bugs: HIVE-8883
    https://issues.apache.org/jira/browse/HIVE-8883


Repository: hive-git


Description
-------

This test fails with the following stack trace:
java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:257)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:319)
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:276)
  at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:48)
  at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
  at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:96)
  at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
  at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:214)
  at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
  at org.apache.spark.scheduler.Task.run(Task.scala:56)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:186)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
2014-11-14 17:05:09,206 ERROR [Executor task launch worker-4]: spark.SparkReduceRecordHandler (SparkReduceRecordHandler.java:processRow(285)) - org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":"val_0"},"value":{"_col0":"0"}}
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:328)
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:276)
  at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:48)
  at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
  at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:96)
  at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
  at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:214)
  at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
  at org.apache.spark.scheduler.Task.run(Task.scala:56)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:186)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception: null
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:318)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:319)
  ... 14 more
Caused by: java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:257)
  ... 17 more
auto_join27.q and auto_join31.q seem to fail with the same error.


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java 2895d80 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkReduceRecordHandler.java 141ae6f 

Diff: https://reviews.apache.org/r/28145/diff/


Testing
-------

Tested with auto_join30.q, auto_join31.q, and auto_join27.q. They now generate correct results.


Thanks,

Chao Sun


Re: Review Request 28145: HIVE-8883 - Investigate test failures on auto_join30.q [Spark Branch]

Posted by Szehon Ho <sz...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28145/#review62295
-----------------------------------------------------------

Ship It!

- Szehon Ho


Re: Review Request 28145: HIVE-8883 - Investigate test failures on auto_join30.q [Spark Branch]

Posted by Chao Sun <ch...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28145/
-----------------------------------------------------------

(Updated Nov. 19, 2014, 11:57 p.m.)


Review request for hive, Jimmy Xiang and Szehon Ho.


Bugs: HIVE-8883
    https://issues.apache.org/jira/browse/HIVE-8883


Repository: hive-git


Description
-------

This test fails with the following stack trace:
java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:257)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:319)
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:276)
  at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:48)
  at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
  at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:96)
  at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
  at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:214)
  at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
  at org.apache.spark.scheduler.Task.run(Task.scala:56)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:186)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
2014-11-14 17:05:09,206 ERROR [Executor task launch worker-4]: spark.SparkReduceRecordHandler (SparkReduceRecordHandler.java:processRow(285)) - org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":"val_0"},"value":{"_col0":"0"}}
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:328)
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:276)
  at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:48)
  at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
  at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:96)
  at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
  at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:214)
  at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
  at org.apache.spark.scheduler.Task.run(Task.scala:56)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:186)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception: null
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:318)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:319)
  ... 14 more
Caused by: java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:257)
  ... 17 more
auto_join27.q and auto_join31.q seem to fail with the same error.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java 2895d80 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkReduceRecordHandler.java 141ae6f 

Diff: https://reviews.apache.org/r/28145/diff/


Testing
-------

Tested with auto_join30.q, auto_join31.q, and auto_join27.q. They now generate correct results.


Thanks,

Chao Sun


Re: Review Request 28145: HIVE-8883 - Investigate test failures on auto_join30.q [Spark Branch]

Posted by Chao Sun <ch...@cloudera.com>.

> On Nov. 19, 2014, 11:50 p.m., Jimmy Xiang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java, line 74
> > <https://reviews.apache.org/r/28145/diff/3/?file=770558#file770558line74>
> >
> >     We don't need this any more?

I was thinking about cleaning it and then restoring the code in the non-staged map join JIRA. But, after talking with Szehon, I decided to keep it anyway.


- Chao


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28145/#review62285
-----------------------------------------------------------


Re: Review Request 28145: HIVE-8883 - Investigate test failures on auto_join30.q [Spark Branch]

Posted by Jimmy Xiang <jx...@cloudera.com>.

> On Nov. 19, 2014, 11:50 p.m., Jimmy Xiang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java, line 74
> > <https://reviews.apache.org/r/28145/diff/3/?file=770558#file770558line74>
> >
> >     We don't need this any more?
> 
> Chao Sun wrote:
>     I was thinking about cleaning it and then restoring the code in the non-staged map join JIRA. But, after talking with Szehon, I decided to keep it anyway.

I see. Perhaps you can move it around in the non-staged map join JIRA.


- Jimmy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28145/#review62285
-----------------------------------------------------------


Re: Review Request 28145: HIVE-8883 - Investigate test failures on auto_join30.q [Spark Branch]

Posted by Jimmy Xiang <jx...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28145/#review62285
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java
<https://reviews.apache.org/r/28145/#comment104311>

    We don't need this any more?


- Jimmy Xiang


On Nov. 19, 2014, 11:35 p.m., Chao Sun wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/28145/
> -----------------------------------------------------------
> 
> (Updated Nov. 19, 2014, 11:35 p.m.)
> 
> 
> Review request for hive, Jimmy Xiang and Szehon Ho.
> 
> 
> Bugs: HIVE-8883
>     https://issues.apache.org/jira/browse/HIVE-8883
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This test fails with the following stack trace:
> java.lang.NullPointerException
>   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:257)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
>   at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:319)
>   at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:276)
>   at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:48)
>   at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
>   at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:96)
>   at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:214)
>   at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:186)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> 2014-11-14 17:05:09,206 ERROR [Executor task launch worker-4]: spark.SparkReduceRecordHandler (SparkReduceRecordHandler.java:processRow(285)) - org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":"val_0"},"value":{"_col0":"0"}}
>   at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:328)
>   at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:276)
>   at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:48)
>   at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
>   at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:96)
>   at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>   at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:214)
>   at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:56)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:186)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception: null
>   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:318)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>   at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
>   at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:319)
>   ... 14 more
> Caused by: java.lang.NullPointerException
>   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:257)
>   ... 17 more
> auto_join27.q and auto_join31.q seem to fail with the same error.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java 2895d80 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkReduceRecordHandler.java 141ae6f 
> 
> Diff: https://reviews.apache.org/r/28145/diff/
> 
> 
> Testing
> -------
> 
> Tested with auto_join30.q, auto_join31.q, and auto_join27.q. They now generate correct results.
> 
> 
> Thanks,
> 
> Chao Sun
> 
>


Re: Review Request 28145: HIVE-8883 - Investigate test failures on auto_join30.q [Spark Branch]

Posted by Chao Sun <ch...@cloudera.com>.

(Updated Nov. 19, 2014, 11:35 p.m.)


Review request for hive, Jimmy Xiang and Szehon Ho.


Bugs: HIVE-8883
    https://issues.apache.org/jira/browse/HIVE-8883


Repository: hive-git


Description
-------

This test fails with the following stack trace:
java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:257)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:319)
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:276)
  at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:48)
  at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
  at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:96)
  at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
  at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:214)
  at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
  at org.apache.spark.scheduler.Task.run(Task.scala:56)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:186)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
2014-11-14 17:05:09,206 ERROR [Executor task launch worker-4]: spark.SparkReduceRecordHandler (SparkReduceRecordHandler.java:processRow(285)) - org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":"val_0"},"value":{"_col0":"0"}}
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:328)
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:276)
  at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:48)
  at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
  at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:96)
  at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
  at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:214)
  at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
  at org.apache.spark.scheduler.Task.run(Task.scala:56)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:186)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception: null
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:318)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:319)
  ... 14 more
Caused by: java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:257)
  ... 17 more
auto_join27.q and auto_join31.q seem to fail with the same error.
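
For readers following the trace above, here is a minimal sketch of how a chain like this might come about and how to unwind it to the root cause. It is not Hive code: `CauseChainSketch`, `SketchHiveException`, and the two methods named after the frames in the trace are hypothetical stand-ins, and the assumption that the wrapper message is built from `getMessage()` is inferred only from the text "Unexpected exception: null" in the log. A NullPointerException constructed without a message returns null from `getMessage()`, which would explain the literal "null" in that line.

```java
public class CauseChainSketch {

    // Hypothetical stand-in for org.apache.hadoop.hive.ql.metadata.HiveException.
    public static class SketchHiveException extends Exception {
        public SketchHiveException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Assumed inner layer: any failure is wrapped, with the message built
    // from the original exception's getMessage().
    public static void processOp(Object hashTable) throws SketchHiveException {
        try {
            if (hashTable == null) {
                // Explicitly constructed NPE: getMessage() is null.
                throw new NullPointerException();
            }
        } catch (Exception e) {
            throw new SketchHiveException("Unexpected exception: " + e.getMessage(), e);
        }
    }

    // Assumed outer layer: wraps the failure again with row context.
    public static void processRow(Object hashTable) throws SketchHiveException {
        try {
            processOp(hashTable);
        } catch (Exception e) {
            throw new SketchHiveException("Hive Runtime Error while processing row", e);
        }
    }

    // Follows getCause() links until the innermost exception is reached.
    public static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    // Runs the failing path and returns the outermost exception, or null if none.
    public static Throwable simulateFailure() {
        try {
            processRow(null); // a hash table that was never loaded
            return null;
        } catch (SketchHiveException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        Throwable outer = simulateFailure();
        System.out.println("outer: " + outer.getMessage());
        System.out.println("inner: " + outer.getCause().getMessage());
        System.out.println("root:  " + rootCause(outer).getClass().getName());
    }
}
```

Read this way, the two HiveException layers in the log only add context; the frame to investigate is the innermost one, MapJoinOperator.java:257.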


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java 2895d80 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkReduceRecordHandler.java 141ae6f 

Diff: https://reviews.apache.org/r/28145/diff/


Testing
-------

Tested with auto_join30.q, auto_join31.q, and auto_join27.q. They now generate correct results.


Thanks,

Chao Sun


Re: Review Request 28145: HIVE-8883 - Investigate test failures on auto_join30.q [Spark Branch]

Posted by Chao Sun <ch...@cloudera.com>.

(Updated Nov. 18, 2014, 2:51 a.m.)


Review request for hive, Jimmy Xiang and Szehon Ho.


Changes
-------

The last patch failed because of an upstream change to HashTableLoader#load(). This is now fixed.


Bugs: HIVE-8883
    https://issues.apache.org/jira/browse/HIVE-8883


Repository: hive-git


Description
-------

This test fails with the following stack trace:
java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:257)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:319)
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:276)
  at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:48)
  at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
  at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:96)
  at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
  at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:214)
  at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
  at org.apache.spark.scheduler.Task.run(Task.scala:56)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:186)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
2014-11-14 17:05:09,206 ERROR [Executor task launch worker-4]: spark.SparkReduceRecordHandler (SparkReduceRecordHandler.java:processRow(285)) - org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":"val_0"},"value":{"_col0":"0"}}
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:328)
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processRow(SparkReduceRecordHandler.java:276)
  at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:48)
  at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.processNextRecord(HiveReduceFunctionResultList.java:28)
  at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:96)
  at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
  at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:214)
  at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
  at org.apache.spark.scheduler.Task.run(Task.scala:56)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:186)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception: null
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:318)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
  at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.processKeyValues(SparkReduceRecordHandler.java:319)
  ... 14 more
Caused by: java.lang.NullPointerException
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:257)
  ... 17 more
auto_join27.q and auto_join31.q seem to fail with the same error.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HashTableLoader.java 2895d80 
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkReduceRecordHandler.java 141ae6f 

Diff: https://reviews.apache.org/r/28145/diff/


Testing
-------

Tested with auto_join30.q, auto_join31.q, and auto_join27.q. They now generate correct results.


Thanks,

Chao Sun