Posted to dev@hive.apache.org by "Szehon Ho (JIRA)" <ji...@apache.org> on 2014/08/09 02:14:12 UTC

[jira] [Commented] (HIVE-7382) Create a MiniSparkCluster and set up a testing framework

    [ https://issues.apache.org/jira/browse/HIVE-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14091486#comment-14091486 ] 

Szehon Ho commented on HIVE-7382:
---------------------------------

This would be similar to HIVE-7665 and would be done by setting spark.master=local-cluster, as opposed to local.  
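
As a rough illustration of the two settings being compared, the sketch below builds the corresponding spark.master values. The local[N] and local-cluster[workers,cores,memoryMB] forms are Spark's master URL syntax; the helper names and the example sizes are hypothetical and not part of Hive or Spark.

```python
def local_master(threads=None):
    """Master string for local mode: everything runs inside one JVM."""
    return "local" if threads is None else f"local[{threads}]"


def local_cluster_master(workers, cores_per_worker, memory_mb_per_worker):
    """Master string for local-cluster mode: workers are separate
    processes on the same machine (sizes here are illustrative)."""
    sizes = {
        "workers": workers,
        "cores_per_worker": cores_per_worker,
        "memory_mb_per_worker": memory_mb_per_worker,
    }
    for name, value in sizes.items():
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    return f"local-cluster[{workers},{cores_per_worker},{memory_mb_per_worker}]"


def spark_conf_line(master):
    """Render the setting as a key=value property line."""
    return f"spark.master={master}"


if __name__ == "__main__":
    print(spark_conf_line(local_master()))
    print(spark_conf_line(local_cluster_master(2, 2, 1024)))
```

The only difference visible in configuration is the master string; the behavioural difference is that local-cluster starts separate worker processes, which is what makes it closer to a real cluster.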

However, although local-cluster mode is used by Spark's own unit tests, it is not publicly exposed in Spark.  I tried setting it and got the following error in Master.scala:

{noformat}java.io.IOException: Cannot run program "/home/szehon/repos/apache-hive/hive/itests/spark-qtest/./bin/compute-classpath.sh" (in directory "."): error=2, No such file or directory{noformat}

The error that surfaces is "ApplicationRemoved(FAILED)".
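
The failure above comes down to a script missing at a path ending in bin/compute-classpath.sh. A pre-flight check along the lines of the sketch below could make that visible before the cluster is started. It assumes, which this comment does not confirm, that the script is looked up under whatever directory Spark treats as its home; missing_launch_scripts and its parameters are hypothetical names.

```python
import os


def missing_launch_scripts(spark_home, scripts=("bin/compute-classpath.sh",)):
    """Return the expected script paths that do not exist under spark_home.

    spark_home is an assumed base directory; the default script list only
    contains the path seen in the error message.
    """
    missing = []
    for relative_path in scripts:
        full_path = os.path.join(spark_home, relative_path)
        if not os.path.isfile(full_path):
            missing.append(full_path)
    return missing


if __name__ == "__main__":
    # Check the current directory, mirroring the "./bin/..." path in the error.
    for path in missing_launch_scripts("."):
        print(f"missing: {path}")
```

An empty result only says the file is present; it does not show that local-cluster mode would then start successfully.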

I think HIVE-7665 may serve our use case for now and unblock testing, since getting local-cluster to work might be a bit more involved.  From talking with folks, it seems even a local Spark cluster will catch most of the issues (including serialization issues).

> Create a MiniSparkCluster and set up a testing framework
> --------------------------------------------------------
>
>                 Key: HIVE-7382
>                 URL: https://issues.apache.org/jira/browse/HIVE-7382
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Szehon Ho
>
> To automatically test Hive functionality over the Spark execution engine, we need to create a test framework that can execute Hive queries with Spark as the backend. For that, we should create a MiniSparkCluster, similar to what exists for other execution engines.
> Spark has a way to create a local cluster with a few processes on the local machine, where each process is a worker node. It's fairly close to a real Spark cluster. Our mini cluster can be based on that.
> For more info, please refer to the design doc on wiki.


