You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@spark.apache.org by "Carlo.Allocca" <ca...@open.ac.uk> on 2016/07/28 09:44:59 UTC

SPARK Exception thrown in awaitResult

Hi All,

I am running SPARK locally, and when running d3=join(d1,d2) and d5=(d3, d4) am getting  the following exception "org.apache.spark.SparkException: Exception thrown in awaitResult”.
Googling for it, I found that the closed is the answer reported https://issues.apache.org/jira/browse/SPARK-16522 which mention that it is bug of the SPARK 2.0.0.

Is it correct or am I missing anything?

Many Thanks for your answer and help.

Best Regards,
Carlo

-- The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England & Wales and a charity registered in Scotland (SC 038302). The Open University is authorised and regulated by the Financial Conduct Authority.

Re: SPARK Exception thrown in awaitResult

Posted by "Carlo.Allocca" <ca...@open.ac.uk>.
Solved!!
The solution is using date_format with the “u” option.

Thank you very much.
Best,
Carlo

On 28 Jul 2016, at 18:59, carlo allocca <ca...@open.ac.uk>> wrote:

Hi Mark,

Thanks for the suggestion.
I changed the maven entries as follows

<artifactId>spark-core_2.10</artifactId>
            <version>2.0.0</version>

and
<artifactId>spark-sql_2.10</artifactId>
            <version>2.0.0</version>

As result, it worked when I removed the following line of code to compute DAYOFWEEK (Monday—>1 etc.):

Dataset<Row> tmp6=tmp5.withColumn("ORD_DAYOFWEEK", callUDF("computeDayOfWeek", tmp5.col("ORD_time_window_per_hour#3").getItem("start").cast(DataTypes.StringType)));

 this.spark.udf().register("computeDayOfWeek", new UDF1<String,Integer>() {
        @Override
          public Integer call(String myDate) throws Exception {
                Date date = new SimpleDateFormat(dateFormat).parse(myDate);
                Calendar c = Calendar.getInstance();
                c.setTime(date);
                int dayOfWeek = c.get(Calendar.DAY_OF_WEEK);
          return dayOfWeek;//myDate.length();
        }
      }, DataTypes.IntegerType);



And the full stack is reported below.

Is there another way to compute DAYOFWEEK from a dateFormat="yyyy-MM-dd HH:mm:ss" by using built-in function? I have  checked date_format but it does not do it.

Any Suggestion?

Many Thanks,
Carlo



====
Test set: org.mksmart.amaretto.ml.DatasetPerHourVerOneTest
-------------------------------------------------------------------------------
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 32.658 sec <<< FAILURE!
testBuildDatasetNew(org.mksmart.amaretto.ml.DatasetPerHourVerOneTest)  Time elapsed: 32.581 sec  <<< ERROR!
org.apache.spark.SparkException: Task not serializable
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
at org.apache.spark.SparkContext.clean(SparkContext.scala:2037)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1.apply(RDD.scala:798)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1.apply(RDD.scala:797)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
at org.apache.spark.rdd.RDD.mapPartitionsWithIndex(RDD.scala:797)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:364)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
at org.apache.spark.sql.execution.TakeOrderedAndProjectExec.executeCollect(limit.scala:128)
at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2183)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2532)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2182)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2189)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1925)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1924)
at org.apache.spark.sql.Dataset.withTypedCallback(Dataset.scala:2562)
at org.apache.spark.sql.Dataset.head(Dataset.scala:1924)
at org.apache.spark.sql.Dataset.take(Dataset.scala:2139)
at org.mksmart.amaretto.ml.DatasetPerHourVerOneTest.testBuildDatasetNew(DatasetPerHourVerOneTest.java:202)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:175)
at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:107)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:68)
Caused by: java.io.NotSerializableException: org.mksmart.amaretto.etl.GetOrderInfoAsNew
Serialization stack:
- object not serializable (class: org.mksmart.amaretto.etl.GetOrderInfoAsNew, value: org.mksmart.amaretto.etl.GetOrderInfoAsNew@7d49fe37)
- field (class: org.mksmart.amaretto.etl.GetOrderInfoAsNew$4, name: this$0, type: class org.mksmart.amaretto.etl.GetOrderInfoAsNew)
- object (class org.mksmart.amaretto.etl.GetOrderInfoAsNew$4, org.mksmart.amaretto.etl.GetOrderInfoAsNew$4@690e7b89)
- field (class: org.apache.spark.sql.UDFRegistration$$anonfun$register$25, name: f$1, type: interface org.apache.spark.sql.api.java.UDF1)
- object (class org.apache.spark.sql.UDFRegistration$$anonfun$register$25, <function1>)
- field (class: org.apache.spark.sql.UDFRegistration$$anonfun$register$25$$anonfun$apply$1, name: $outer, type: class org.apache.spark.sql.UDFRegistration$$anonfun$register$25)
- object (class org.apache.spark.sql.UDFRegistration$$anonfun$register$25$$anonfun$apply$1, <function1>)
- field (class: org.apache.spark.sql.catalyst.expressions.ScalaUDF$$anonfun$2, name: func$2, type: interface scala.Function1)
- object (class org.apache.spark.sql.catalyst.expressions.ScalaUDF$$anonfun$2, <function1>)
- field (class: org.apache.spark.sql.catalyst.expressions.ScalaUDF, name: f, type: interface scala.Function1)
- object (class org.apache.spark.sql.catalyst.expressions.ScalaUDF, UDF(cast(time_window_per_hour#3#739.start as string)))
- field (class: org.apache.spark.sql.catalyst.expressions.Alias, name: child, type: class org.apache.spark.sql.catalyst.expressions.Expression)
- object (class org.apache.spark.sql.catalyst.expressions.Alias, UDF(cast(time_window_per_hour#3#739.start as string)) AS ORD_DAYOFWEEK#787)
- element of array (index: 15)
- array (class [Ljava.lang.Object;, size 16)
- field (class: scala.collection.mutable.ArrayBuffer, name: array, type: class [Ljava.lang.Object;)
- object (class scala.collection.mutable.ArrayBuffer, ArrayBuffer(time_window_per_hour#3#739 AS ORD_time_window_per_hour#3#748, asin#94 AS ORD_asin#749, count(time_window_per_hour#3#739)#756L AS ORD_num_of_order#758L, count(qty_shipped#98)#759L AS ORD_tot_qty_shipped#761L, count(qty_ordered#97)#762L AS ORD_tot_qty_ordered#764L, min(item_price#99)#765 AS ORD_min_ord_item_price#766, max(item_price#99)#767 AS ORD_max_ord_item_price#768, round(avg(cast(item_price#99 as double))#769, 2) AS ORD_avg_ord_item_price#770, stddev_pop(cast(item_price#99 as double))#779 AS ORD_std_ord_item_price#780, year(cast(cast(time_window_per_hour#3#739.start as string) as date)) AS ORD_YEAR#781, month(cast(cast(time_window_per_hour#3#739.start as string) as date)) AS ORD_MONTH#782, dayofmonth(cast(cast(time_window_per_hour#3#739.start as string) as date)) AS ORD_DAYOFMONTH#783, hour(cast(cast(time_window_per_hour#3#739.start as string) as timestamp)) AS ORD_HOUR#784, dayofyear(cast(cast(time_window_per_hour#3#739.start as string) as date)) AS ORD_DAYOFYEAR#785, weekofyear(cast(cast(time_window_per_hour#3#739.start as string) as date)) AS ORD_WEEKOFYEAR#786, UDF(cast(time_window_per_hour#3#739.start as string)) AS ORD_DAYOFWEEK#787))
- field (class: org.apache.spark.sql.execution.aggregate.HashAggregateExec, name: resultExpressions, type: interface scala.collection.Seq)
- object (class org.apache.spark.sql.execution.aggregate.HashAggregateExec, HashAggregate(keys=[time_window_per_hour#3#739, asin#94], functions=[count(time_window_per_hour#3#739), count(qty_shipped#98), count(qty_ordered#97), min(item_price#99), max(item_price#99), avg(cast(item_price#99 as double)), stddev_pop(cast(item_price#99 as double))], output=[ORD_time_window_per_hour#3#748, ORD_asin#749, ORD_num_of_order#758L, ORD_tot_qty_shipped#761L, ORD_tot_qty_ordered#764L, ORD_min_ord_item_price#766, ORD_max_ord_item_price#768, ORD_avg_ord_item_price#770, ORD_std_ord_item_price#780, ORD_YEAR#781, ORD_MONTH#782, ORD_DAYOFMONTH#783, ORD_HOUR#784, ORD_DAYOFYEAR#785, ORD_WEEKOFYEAR#786, ORD_DAYOFWEEK#787])
+- Exchange hashpartitioning(time_window_per_hour#3#739, asin#94, 200)
   +- *HashAggregate(keys=[time_window_per_hour#3#739, asin#94], functions=[partial_count(time_window_per_hour#3#739), partial_count(qty_shipped#98), partial_count(qty_ordered#97), partial_min(item_price#99), partial_max(item_price#99), partial_avg(cast(item_price#99 as double)), partial_stddev_pop(cast(item_price#99 as double))], output=[time_window_per_hour#3#739, asin#94, count#846L, count#847L, count#848L, min#849, max#850, sum#851, count#852L, n#833, avg#834, m2#835])
      +- *Project [window#746 AS time_window_per_hour#3#739, asin#94, qty_shipped#98, item_price#99, qty_ordered#97]
         +- *Filter ((isnotnull(purchased#102) && (cast(purchased#102 as timestamp) >= window#746.start)) && (cast(purchased#102 as timestamp) < window#746.end))
            +- *Expand [List(named_struct(start, ((((CEIL((cast((precisetimestamp(cast(purchased#102 as timestamp)) - 0) as double) / 3.6E9)) + 0) - 1) * 3600000000) + 0), end, (((((CEIL((cast((precisetimestamp(cast(purchased#102 as timestamp)) - 0) as double) / 3.6E9)) + 0) - 1) * 3600000000) + 0) + 3600000000)), asin#94, qty_ordered#97, qty_shipped#98, item_price#99, purchased#102), List(named_struct(start, ((((CEIL((cast((precisetimestamp(cast(purchased#102 as timestamp)) - 0) as double) / 3.6E9)) + 1) - 1) * 3600000000) + 0), end, (((((CEIL((cast((precisetimestamp(cast(purchased#102 as timestamp)) - 0) as double) / 3.6E9)) + 1) - 1) * 3600000000) + 0) + 3600000000)), asin#94, qty_ordered#97, qty_shipped#98, item_price#99, purchased#102)], [window#746, asin#94, qty_ordered#97, qty_shipped#98, item_price#99, purchased#102]
               +- *Project [asin#94, qty_ordered#97, qty_shipped#98, item_price#99, purchased#102]
                  +- *Filter (((isnotnull(grade#96) && isnotnull(item_price#99)) && (grade#96 = New Item)) && isnotnull(purchased#102))
                     +- *Scan csv [asin#94,grade#96,qty_ordered#97,qty_shipped#98,item_price#99,purchased#102] Format: CSV, InputPaths: file:/Users/carloallocca/Desktop/NSLDataset/20160706/order.csv, PushedFilters: [IsNotNull(grade), IsNotNull(item_price), EqualTo(grade,New Item), IsNotNull(purchased)], ReadSchema: struct<asin:string,grade:string,qty_ordered:int,qty_shipped:string,item_price:float,purchased:str...
)
- element of array (index: 0)
- array (class [Ljava.lang.Object;, size 6)
- field (class: org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8, name: references$1, type: class [Ljava.lang.Object;)
- object (class org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8, <function2>)
at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:295)
... 61 more






On 28 Jul 2016, at 17:39, Mark Hamstra <ma...@clearstorydata.com>> wrote:

Don't use Spark 2.0.0-preview.  That was a preview release with known issues, and was intended to be used only for early, pre-release testing purpose.  Spark 2.0.0 is now released, and you should be using that.

On Thu, Jul 28, 2016 at 3:48 AM, Carlo.Allocca <ca...@open.ac.uk>> wrote:
and, of course I am using

    <dependency> <!-- Spark dependency -->
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.0.0-preview</version>
        </dependency>


        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>2.0.0-preview</version>
            <type>jar</type>
        </dependency>


Is the below problem/issue related to the experimental version of SPARK 2.0.0.

Many Thanks for your help and support.

Best Regards,
carlo

On 28 Jul 2016, at 11:14, Carlo.Allocca <ca...@open.ac.uk>> wrote:

I have also found the following two related links:

1) https://github.com/apache/spark/commit/947b9020b0d621bc97661a0a056297e6889936d3
2) https://github.com/apache/spark/pull/12433

which both explain why it happens but nothing about what to do to solve it.

Do you have any suggestion/recommendation?

Many thanks.
Carlo

On 28 Jul 2016, at 11:06, carlo allocca <ca...@open.ac.uk>> wrote:

Hi Rui,

Thanks for the promptly reply.
No, I am not using Mesos.

Ok. I am writing a code to build a suitable dataset for my needs as in the following:

== Session configuration:

 SparkSession spark = SparkSession
                .builder()
                .master("local[6]") //
                .appName("DatasetForCaseNew")
                .config("spark.executor.memory", "4g")
                .config("spark.shuffle.blockTransferService", "nio")
                .getOrCreate();


public Dataset<Row> buildDataset(){
...

// STEP A
// Join prdDS with cmpDS
Dataset<Row> prdDS_Join_cmpDS
                = res1
                  .join(res2, (res1.col("PRD_asin#100")).equalTo(res2.col("CMP_asin")), "inner");

        prdDS_Join_cmpDS.take(1);

// STEP B
// Join prdDS with cmpDS
Dataset<Row> prdDS_Join_cmpDS_Join
                = prdDS_Join_cmpDS
                  .join(res3, prdDS_Join_cmpDS.col("PRD_asin#100").equalTo(res3.col("ORD_asin")), "inner");
        prdDS_Join_cmpDS_Join.take(1);
        prdDS_Join_cmpDS_Join.show();

}


The exception is thrown when the computation reach the STEP B, until STEP A is fine.

Is there anything wrong or missing?

Thanks for your help in advance.

Best Regards,
Carlo





=== STACK TRACE

Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 422.102 sec <<< FAILURE!
testBuildDataset(org.mksmart.amaretto.ml.DatasetPerHourVerOneTest)  Time elapsed: 421.994 sec  <<< ERROR!
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:194)
at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:102)
at org.apache.spark.sql.execution.InputAdapter.doExecuteBroadcast(WholeStageCodegenExec.scala:229)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:125)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:125)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.SparkPlan.executeBroadcast(SparkPlan.scala:124)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.prepareBroadcast(BroadcastHashJoinExec.scala:98)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.codegenInner(BroadcastHashJoinExec.scala:197)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doConsume(BroadcastHashJoinExec.scala:82)
at org.apache.spark.sql.execution.CodegenSupport$class.consume(WholeStageCodegenExec.scala:153)
at org.apache.spark.sql.execution.joins.SortMergeJoinExec.consume(SortMergeJoinExec.scala:35)
at org.apache.spark.sql.execution.joins.SortMergeJoinExec.doProduce(SortMergeJoinExec.scala:565)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.joins.SortMergeJoinExec.produce(SortMergeJoinExec.scala:35)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doProduce(BroadcastHashJoinExec.scala:77)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.produce(BroadcastHashJoinExec.scala:38)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doCodeGen(WholeStageCodegenExec.scala:304)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:343)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:240)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:323)
at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2122)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2436)
at org.apache.spark.sql.Dataset.org<http://org.apache.spark.sql.dataset.org/>$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2121)
at org.apache.spark.sql.Dataset.org<http://org.apache.spark.sql.dataset.org/>$apache$spark$sql$Dataset$$collect(Dataset.scala:2128)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1862)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1861)
at org.apache.spark.sql.Dataset.withTypedCallback(Dataset.scala:2466)
at org.apache.spark.sql.Dataset.head(Dataset.scala:1861)
at org.apache.spark.sql.Dataset.take(Dataset.scala:2078)
at org.mksmart.amaretto.ml.DatasetPerHourVerOne.buildDataset(DatasetPerHourVerOne.java:115)
at org.mksmart.amaretto.ml.DatasetPerHourVerOneTest.testBuildDataset(DatasetPerHourVerOneTest.java:76)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:175)
at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:107)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:68)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [300 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:190)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:190)
... 85 more



On 28 Jul 2016, at 10:55, Sun Rui <su...@163.com>> wrote:

Are you using Mesos? if not , https://issues.apache.org/jira/browse/SPARK-16522  is not relevant

You may describe more information about your Spark environment, and the full stack trace.
On Jul 28, 2016, at 17:44, Carlo.Allocca <ca...@open.ac.uk>> wrote:

Hi All,

I am running SPARK locally, and when running d3=join(d1,d2) and d5=(d3, d4) am getting  the following exception "org.apache.spark.SparkException: Exception thrown in awaitResult”.
Googling for it, I found that the closed is the answer reported https://issues.apache.org/jira/browse/SPARK-16522 which mention that it is bug of the SPARK 2.0.0.

Is it correct or am I missing anything?

Many Thanks for your answer and help.

Best Regards,
Carlo

-- The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England & Wales and a charity registered in Scotland (SC 038302). The Open University is authorised and regulated by the Financial Conduct Authority.








Re: SPARK Exception thrown in awaitResult

Posted by "Carlo.Allocca" <ca...@open.ac.uk>.
Hi Mark,

Thanks for the suggestion.
I changed the maven entries as follows

<artifactId>spark-core_2.10</artifactId>
            <version>2.0.0</version>

and
<artifactId>spark-sql_2.10</artifactId>
            <version>2.0.0</version>

As result, it worked when I removed the following line of code to compute DAYOFWEEK (Monday—>1 etc.):

Dataset<Row> tmp6=tmp5.withColumn("ORD_DAYOFWEEK", callUDF("computeDayOfWeek", tmp5.col("ORD_time_window_per_hour#3").getItem("start").cast(DataTypes.StringType)));

 this.spark.udf().register("computeDayOfWeek", new UDF1<String,Integer>() {
        @Override
          public Integer call(String myDate) throws Exception {
                Date date = new SimpleDateFormat(dateFormat).parse(myDate);
                Calendar c = Calendar.getInstance();
                c.setTime(date);
                int dayOfWeek = c.get(Calendar.DAY_OF_WEEK);
          return dayOfWeek;//myDate.length();
        }
      }, DataTypes.IntegerType);



And the full stack is reported below.

Is there another way to compute DAYOFWEEK from a dateFormat="yyyy-MM-dd HH:mm:ss" by using built-in function? I have  checked date_format but it does not do it.

Any Suggestion?

Many Thanks,
Carlo



====
Test set: org.mksmart.amaretto.ml.DatasetPerHourVerOneTest
-------------------------------------------------------------------------------
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 32.658 sec <<< FAILURE!
testBuildDatasetNew(org.mksmart.amaretto.ml.DatasetPerHourVerOneTest)  Time elapsed: 32.581 sec  <<< ERROR!
org.apache.spark.SparkException: Task not serializable
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
at org.apache.spark.SparkContext.clean(SparkContext.scala:2037)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1.apply(RDD.scala:798)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1.apply(RDD.scala:797)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
at org.apache.spark.rdd.RDD.mapPartitionsWithIndex(RDD.scala:797)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:364)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
at org.apache.spark.sql.execution.TakeOrderedAndProjectExec.executeCollect(limit.scala:128)
at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2183)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2532)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2182)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2189)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1925)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1924)
at org.apache.spark.sql.Dataset.withTypedCallback(Dataset.scala:2562)
at org.apache.spark.sql.Dataset.head(Dataset.scala:1924)
at org.apache.spark.sql.Dataset.take(Dataset.scala:2139)
at org.mksmart.amaretto.ml.DatasetPerHourVerOneTest.testBuildDatasetNew(DatasetPerHourVerOneTest.java:202)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:175)
at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:107)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:68)
Caused by: java.io.NotSerializableException: org.mksmart.amaretto.etl.GetOrderInfoAsNew
Serialization stack:
- object not serializable (class: org.mksmart.amaretto.etl.GetOrderInfoAsNew, value: org.mksmart.amaretto.etl.GetOrderInfoAsNew@7d49fe37)
- field (class: org.mksmart.amaretto.etl.GetOrderInfoAsNew$4, name: this$0, type: class org.mksmart.amaretto.etl.GetOrderInfoAsNew)
- object (class org.mksmart.amaretto.etl.GetOrderInfoAsNew$4, org.mksmart.amaretto.etl.GetOrderInfoAsNew$4@690e7b89)
- field (class: org.apache.spark.sql.UDFRegistration$$anonfun$register$25, name: f$1, type: interface org.apache.spark.sql.api.java.UDF1)
- object (class org.apache.spark.sql.UDFRegistration$$anonfun$register$25, <function1>)
- field (class: org.apache.spark.sql.UDFRegistration$$anonfun$register$25$$anonfun$apply$1, name: $outer, type: class org.apache.spark.sql.UDFRegistration$$anonfun$register$25)
- object (class org.apache.spark.sql.UDFRegistration$$anonfun$register$25$$anonfun$apply$1, <function1>)
- field (class: org.apache.spark.sql.catalyst.expressions.ScalaUDF$$anonfun$2, name: func$2, type: interface scala.Function1)
- object (class org.apache.spark.sql.catalyst.expressions.ScalaUDF$$anonfun$2, <function1>)
- field (class: org.apache.spark.sql.catalyst.expressions.ScalaUDF, name: f, type: interface scala.Function1)
- object (class org.apache.spark.sql.catalyst.expressions.ScalaUDF, UDF(cast(time_window_per_hour#3#739.start as string)))
- field (class: org.apache.spark.sql.catalyst.expressions.Alias, name: child, type: class org.apache.spark.sql.catalyst.expressions.Expression)
- object (class org.apache.spark.sql.catalyst.expressions.Alias, UDF(cast(time_window_per_hour#3#739.start as string)) AS ORD_DAYOFWEEK#787)
- element of array (index: 15)
- array (class [Ljava.lang.Object;, size 16)
- field (class: scala.collection.mutable.ArrayBuffer, name: array, type: class [Ljava.lang.Object;)
- object (class scala.collection.mutable.ArrayBuffer, ArrayBuffer(time_window_per_hour#3#739 AS ORD_time_window_per_hour#3#748, asin#94 AS ORD_asin#749, count(time_window_per_hour#3#739)#756L AS ORD_num_of_order#758L, count(qty_shipped#98)#759L AS ORD_tot_qty_shipped#761L, count(qty_ordered#97)#762L AS ORD_tot_qty_ordered#764L, min(item_price#99)#765 AS ORD_min_ord_item_price#766, max(item_price#99)#767 AS ORD_max_ord_item_price#768, round(avg(cast(item_price#99 as double))#769, 2) AS ORD_avg_ord_item_price#770, stddev_pop(cast(item_price#99 as double))#779 AS ORD_std_ord_item_price#780, year(cast(cast(time_window_per_hour#3#739.start as string) as date)) AS ORD_YEAR#781, month(cast(cast(time_window_per_hour#3#739.start as string) as date)) AS ORD_MONTH#782, dayofmonth(cast(cast(time_window_per_hour#3#739.start as string) as date)) AS ORD_DAYOFMONTH#783, hour(cast(cast(time_window_per_hour#3#739.start as string) as timestamp)) AS ORD_HOUR#784, dayofyear(cast(cast(time_window_per_hour#3#739.start as string) as date)) AS ORD_DAYOFYEAR#785, weekofyear(cast(cast(time_window_per_hour#3#739.start as string) as date)) AS ORD_WEEKOFYEAR#786, UDF(cast(time_window_per_hour#3#739.start as string)) AS ORD_DAYOFWEEK#787))
- field (class: org.apache.spark.sql.execution.aggregate.HashAggregateExec, name: resultExpressions, type: interface scala.collection.Seq)
- object (class org.apache.spark.sql.execution.aggregate.HashAggregateExec, HashAggregate(keys=[time_window_per_hour#3#739, asin#94], functions=[count(time_window_per_hour#3#739), count(qty_shipped#98), count(qty_ordered#97), min(item_price#99), max(item_price#99), avg(cast(item_price#99 as double)), stddev_pop(cast(item_price#99 as double))], output=[ORD_time_window_per_hour#3#748, ORD_asin#749, ORD_num_of_order#758L, ORD_tot_qty_shipped#761L, ORD_tot_qty_ordered#764L, ORD_min_ord_item_price#766, ORD_max_ord_item_price#768, ORD_avg_ord_item_price#770, ORD_std_ord_item_price#780, ORD_YEAR#781, ORD_MONTH#782, ORD_DAYOFMONTH#783, ORD_HOUR#784, ORD_DAYOFYEAR#785, ORD_WEEKOFYEAR#786, ORD_DAYOFWEEK#787])
+- Exchange hashpartitioning(time_window_per_hour#3#739, asin#94, 200)
   +- *HashAggregate(keys=[time_window_per_hour#3#739, asin#94], functions=[partial_count(time_window_per_hour#3#739), partial_count(qty_shipped#98), partial_count(qty_ordered#97), partial_min(item_price#99), partial_max(item_price#99), partial_avg(cast(item_price#99 as double)), partial_stddev_pop(cast(item_price#99 as double))], output=[time_window_per_hour#3#739, asin#94, count#846L, count#847L, count#848L, min#849, max#850, sum#851, count#852L, n#833, avg#834, m2#835])
      +- *Project [window#746 AS time_window_per_hour#3#739, asin#94, qty_shipped#98, item_price#99, qty_ordered#97]
         +- *Filter ((isnotnull(purchased#102) && (cast(purchased#102 as timestamp) >= window#746.start)) && (cast(purchased#102 as timestamp) < window#746.end))
            +- *Expand [List(named_struct(start, ((((CEIL((cast((precisetimestamp(cast(purchased#102 as timestamp)) - 0) as double) / 3.6E9)) + 0) - 1) * 3600000000) + 0), end, (((((CEIL((cast((precisetimestamp(cast(purchased#102 as timestamp)) - 0) as double) / 3.6E9)) + 0) - 1) * 3600000000) + 0) + 3600000000)), asin#94, qty_ordered#97, qty_shipped#98, item_price#99, purchased#102), List(named_struct(start, ((((CEIL((cast((precisetimestamp(cast(purchased#102 as timestamp)) - 0) as double) / 3.6E9)) + 1) - 1) * 3600000000) + 0), end, (((((CEIL((cast((precisetimestamp(cast(purchased#102 as timestamp)) - 0) as double) / 3.6E9)) + 1) - 1) * 3600000000) + 0) + 3600000000)), asin#94, qty_ordered#97, qty_shipped#98, item_price#99, purchased#102)], [window#746, asin#94, qty_ordered#97, qty_shipped#98, item_price#99, purchased#102]
               +- *Project [asin#94, qty_ordered#97, qty_shipped#98, item_price#99, purchased#102]
                  +- *Filter (((isnotnull(grade#96) && isnotnull(item_price#99)) && (grade#96 = New Item)) && isnotnull(purchased#102))
                     +- *Scan csv [asin#94,grade#96,qty_ordered#97,qty_shipped#98,item_price#99,purchased#102] Format: CSV, InputPaths: file:/Users/carloallocca/Desktop/NSLDataset/20160706/order.csv, PushedFilters: [IsNotNull(grade), IsNotNull(item_price), EqualTo(grade,New Item), IsNotNull(purchased)], ReadSchema: struct<asin:string,grade:string,qty_ordered:int,qty_shipped:string,item_price:float,purchased:str...
)
- element of array (index: 0)
- array (class [Ljava.lang.Object;, size 6)
- field (class: org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8, name: references$1, type: class [Ljava.lang.Object;)
- object (class org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8, <function2>)
at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:100)
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:295)
... 61 more






On 28 Jul 2016, at 17:39, Mark Hamstra <ma...@clearstorydata.com>> wrote:

Don't use Spark 2.0.0-preview.  That was a preview release with known issues, and was intended to be used only for early, pre-release testing purpose.  Spark 2.0.0 is now released, and you should be using that.

On Thu, Jul 28, 2016 at 3:48 AM, Carlo.Allocca <ca...@open.ac.uk>> wrote:
and, of course I am using

    <dependency> <!-- Spark dependency -->
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.0.0-preview</version>
        </dependency>


        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>2.0.0-preview</version>
            <type>jar</type>
        </dependency>


Is the below problem/issue related to the experimental version of SPARK 2.0.0.

Many Thanks for your help and support.

Best Regards,
carlo

On 28 Jul 2016, at 11:14, Carlo.Allocca <ca...@open.ac.uk>> wrote:

I have also found the following two related links:

1) https://github.com/apache/spark/commit/947b9020b0d621bc97661a0a056297e6889936d3
2) https://github.com/apache/spark/pull/12433

which both explain why it happens but nothing about what to do to solve it.

Do you have any suggestion/recommendation?

Many thanks.
Carlo

On 28 Jul 2016, at 11:06, carlo allocca <ca...@open.ac.uk>> wrote:

Hi Rui,

Thanks for the promptly reply.
No, I am not using Mesos.

Ok. I am writing a code to build a suitable dataset for my needs as in the following:

== Session configuration:

 SparkSession spark = SparkSession
                .builder()
                .master("local[6]") //
                .appName("DatasetForCaseNew")
                .config("spark.executor.memory", "4g")
                .config("spark.shuffle.blockTransferService", "nio")
                .getOrCreate();


public Dataset<Row> buildDataset(){
...

// STEP A
// Join prdDS with cmpDS
Dataset<Row> prdDS_Join_cmpDS
                = res1
                  .join(res2, (res1.col("PRD_asin#100")).equalTo(res2.col("CMP_asin")), "inner");

        prdDS_Join_cmpDS.take(1);

// STEP B
// Join prdDS with cmpDS
Dataset<Row> prdDS_Join_cmpDS_Join
                = prdDS_Join_cmpDS
                  .join(res3, prdDS_Join_cmpDS.col("PRD_asin#100").equalTo(res3.col("ORD_asin")), "inner");
        prdDS_Join_cmpDS_Join.take(1);
        prdDS_Join_cmpDS_Join.show();

}


The exception is thrown when the computation reach the STEP B, until STEP A is fine.

Is there anything wrong or missing?

Thanks for your help in advance.

Best Regards,
Carlo





=== STACK TRACE

Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 422.102 sec <<< FAILURE!
testBuildDataset(org.mksmart.amaretto.ml.DatasetPerHourVerOneTest)  Time elapsed: 421.994 sec  <<< ERROR!
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:194)
at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:102)
at org.apache.spark.sql.execution.InputAdapter.doExecuteBroadcast(WholeStageCodegenExec.scala:229)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:125)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:125)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.SparkPlan.executeBroadcast(SparkPlan.scala:124)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.prepareBroadcast(BroadcastHashJoinExec.scala:98)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.codegenInner(BroadcastHashJoinExec.scala:197)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doConsume(BroadcastHashJoinExec.scala:82)
at org.apache.spark.sql.execution.CodegenSupport$class.consume(WholeStageCodegenExec.scala:153)
at org.apache.spark.sql.execution.joins.SortMergeJoinExec.consume(SortMergeJoinExec.scala:35)
at org.apache.spark.sql.execution.joins.SortMergeJoinExec.doProduce(SortMergeJoinExec.scala:565)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.joins.SortMergeJoinExec.produce(SortMergeJoinExec.scala:35)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doProduce(BroadcastHashJoinExec.scala:77)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.produce(BroadcastHashJoinExec.scala:38)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doCodeGen(WholeStageCodegenExec.scala:304)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:343)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:240)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:323)
at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2122)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2436)
at org.apache.spark.sql.Dataset.org<http://org.apache.spark.sql.dataset.org/>$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2121)
at org.apache.spark.sql.Dataset.org<http://org.apache.spark.sql.dataset.org/>$apache$spark$sql$Dataset$$collect(Dataset.scala:2128)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1862)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1861)
at org.apache.spark.sql.Dataset.withTypedCallback(Dataset.scala:2466)
at org.apache.spark.sql.Dataset.head(Dataset.scala:1861)
at org.apache.spark.sql.Dataset.take(Dataset.scala:2078)
at org.mksmart.amaretto.ml.DatasetPerHourVerOne.buildDataset(DatasetPerHourVerOne.java:115)
at org.mksmart.amaretto.ml.DatasetPerHourVerOneTest.testBuildDataset(DatasetPerHourVerOneTest.java:76)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:175)
at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:107)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:68)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [300 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:190)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:190)
... 85 more



On 28 Jul 2016, at 10:55, Sun Rui <su...@163.com>> wrote:

Are you using Mesos? if not , https://issues.apache.org/jira/browse/SPARK-16522  is not relevant

You may describe more information about your Spark environment, and the full stack trace.
On Jul 28, 2016, at 17:44, Carlo.Allocca <ca...@open.ac.uk>> wrote:

Hi All,

I am running SPARK locally, and when running d3=join(d1,d2) and d5=(d3, d4) am getting  the following exception "org.apache.spark.SparkException: Exception thrown in awaitResult”.
Googling for it, I found that the closed is the answer reported https://issues.apache.org/jira/browse/SPARK-16522 which mention that it is bug of the SPARK 2.0.0.

Is it correct or am I missing anything?

Many Thanks for your answer and help.

Best Regards,
Carlo

-- The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England & Wales and a charity registered in Scotland (SC 038302). The Open University is authorised and regulated by the Financial Conduct Authority.







Re: SPARK Exception thrown in awaitResult

Posted by Mark Hamstra <ma...@clearstorydata.com>.
Don't use Spark 2.0.0-preview.  That was a preview release with known
issues, and was intended to be used only for early, pre-release testing
purpose.  Spark 2.0.0 is now released, and you should be using that.

On Thu, Jul 28, 2016 at 3:48 AM, Carlo.Allocca <ca...@open.ac.uk>
wrote:

> and, of course I am using
>
>     <dependency> <!-- Spark dependency -->
>             <groupId>org.apache.spark</groupId>
>             <artifactId>spark-core_2.11</artifactId>
>             <version>2.0.0-preview</version>
>         </dependency>
>
>
>         <dependency>
>             <groupId>org.apache.spark</groupId>
>             <artifactId>spark-sql_2.11</artifactId>
>             <version>2.0.0-preview</version>
>             <type>jar</type>
>         </dependency>
>
>
> Is the below problem/issue related to the experimental version of SPARK
> 2.0.0.
>
> Many Thanks for your help and support.
>
> Best Regards,
> carlo
>
> On 28 Jul 2016, at 11:14, Carlo.Allocca <ca...@open.ac.uk> wrote:
>
> I have also found the following two related links:
>
> 1)
> https://github.com/apache/spark/commit/947b9020b0d621bc97661a0a056297e6889936d3
> 2) https://github.com/apache/spark/pull/12433
>
> which both explain why it happens but nothing about what to do to solve
> it.
>
> Do you have any suggestion/recommendation?
>
> Many thanks.
> Carlo
>
> On 28 Jul 2016, at 11:06, carlo allocca <ca...@open.ac.uk> wrote:
>
> Hi Rui,
>
> Thanks for the promptly reply.
> No, I am not using Mesos.
>
> Ok. I am writing a code to build a suitable dataset for my needs as in the
> following:
>
> == Session configuration:
>
>  SparkSession spark = SparkSession
>                 .builder()
>                 .master("local[6]") //
>                 .appName("DatasetForCaseNew")
>                 .config("spark.executor.memory", "4g")
>                 .config("spark.shuffle.blockTransferService", "nio")
>                 .getOrCreate();
>
>
> public Dataset<Row> buildDataset(){
> ...
>
> // STEP A
> // Join prdDS with cmpDS
> Dataset<Row> prdDS_Join_cmpDS
>                 = res1
>                   .join(res2,
> (res1.col("PRD_asin#100")).equalTo(res2.col("CMP_asin")), "inner");
>
>         prdDS_Join_cmpDS.take(1);
>
> // STEP B
> // Join prdDS with cmpDS
> Dataset<Row> prdDS_Join_cmpDS_Join
>                 = prdDS_Join_cmpDS
>                   .join(res3,
> prdDS_Join_cmpDS.col("PRD_asin#100").equalTo(res3.col("ORD_asin")),
> "inner");
>         prdDS_Join_cmpDS_Join.take(1);
>         prdDS_Join_cmpDS_Join.show();
>
> }
>
>
> The exception is thrown when the computation reach the STEP B, until STEP
> A is fine.
>
> Is there anything wrong or missing?
>
> Thanks for your help in advance.
>
> Best Regards,
> Carlo
>
>
>
>
>
> === STACK TRACE
>
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 422.102
> sec <<< FAILURE!
> testBuildDataset(org.mksmart.amaretto.ml.DatasetPerHourVerOneTest)  Time
> elapsed: 421.994 sec  <<< ERROR!
> org.apache.spark.SparkException: Exception thrown in awaitResult:
> at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:194)
> at
> org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:102)
> at
> org.apache.spark.sql.execution.InputAdapter.doExecuteBroadcast(WholeStageCodegenExec.scala:229)
> at
> org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:125)
> at
> org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:125)
> at
> org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
> at
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at
> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
> at
> org.apache.spark.sql.execution.SparkPlan.executeBroadcast(SparkPlan.scala:124)
> at
> org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.prepareBroadcast(BroadcastHashJoinExec.scala:98)
> at
> org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.codegenInner(BroadcastHashJoinExec.scala:197)
> at
> org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doConsume(BroadcastHashJoinExec.scala:82)
> at
> org.apache.spark.sql.execution.CodegenSupport$class.consume(WholeStageCodegenExec.scala:153)
> at
> org.apache.spark.sql.execution.joins.SortMergeJoinExec.consume(SortMergeJoinExec.scala:35)
> at
> org.apache.spark.sql.execution.joins.SortMergeJoinExec.doProduce(SortMergeJoinExec.scala:565)
> at
> org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
> at
> org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:78)
> at
> org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
> at
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at
> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
> at
> org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:78)
> at
> org.apache.spark.sql.execution.joins.SortMergeJoinExec.produce(SortMergeJoinExec.scala:35)
> at
> org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doProduce(BroadcastHashJoinExec.scala:77)
> at
> org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
> at
> org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:78)
> at
> org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
> at
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at
> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
> at
> org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:78)
> at
> org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.produce(BroadcastHashJoinExec.scala:38)
> at
> org.apache.spark.sql.execution.WholeStageCodegenExec.doCodeGen(WholeStageCodegenExec.scala:304)
> at
> org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:343)
> at
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
> at
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
> at
> org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
> at
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at
> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
> at
> org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:240)
> at
> org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:323)
> at
> org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
> at
> org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2122)
> at
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
> at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2436)
> at org.apache.spark.sql.Dataset.org
> $apache$spark$sql$Dataset$$execute$1(Dataset.scala:2121)
> at org.apache.spark.sql.Dataset.org
> $apache$spark$sql$Dataset$$collect(Dataset.scala:2128)
> at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1862)
> at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1861)
> at org.apache.spark.sql.Dataset.withTypedCallback(Dataset.scala:2466)
> at org.apache.spark.sql.Dataset.head(Dataset.scala:1861)
> at org.apache.spark.sql.Dataset.take(Dataset.scala:2078)
> at
> org.mksmart.amaretto.ml.DatasetPerHourVerOne.buildDataset(DatasetPerHourVerOne.java:115)
> at
> org.mksmart.amaretto.ml.DatasetPerHourVerOneTest.testBuildDataset(DatasetPerHourVerOneTest.java:76)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> at
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> at
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> at
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> at
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> at
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> at
> org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
> at
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
> at
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at
> org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
> at
> org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
> at
> org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:175)
> at
> org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:107)
> at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:68)
> Caused by: java.util.concurrent.TimeoutException: Futures timed out after
> [300 seconds]
> at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
> at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
> at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
> at
> scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
> at scala.concurrent.Await$.result(package.scala:190)
> at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:190)
> ... 85 more
>
>
>
> On 28 Jul 2016, at 10:55, Sun Rui <su...@163.com> wrote:
>
> Are you using Mesos? if not ,
> https://issues.apache.org/jira/browse/SPARK-16522  is not relevant
>
> You may describe more information about your Spark environment, and the
> full stack trace.
>
> On Jul 28, 2016, at 17:44, Carlo.Allocca <ca...@open.ac.uk> wrote:
>
> Hi All,
>
> I am running SPARK locally, and when running d3=join(d1,d2) and d5=(d3,
> d4) am getting  the following exception "org.apache.spark.SparkException:
> Exception thrown in awaitResult”.
> Googling for it, I found that the closed is the answer reported
> https://issues.apache.org/jira/browse/SPARK-16522 which mention that it
> is bug of the SPARK 2.0.0.
>
> Is it correct or am I missing anything?
>
> Many Thanks for your answer and help.
>
> Best Regards,
> Carlo
>
> -- The Open University is incorporated by Royal Charter (RC 000391), an
> exempt charity in England & Wales and a charity registered in Scotland (SC
> 038302). The Open University is authorised and regulated by the Financial
> Conduct Authority.
>
>
>
>
>
>

Re: SPARK Exception thrown in awaitResult

Posted by "Carlo.Allocca" <ca...@open.ac.uk>.
and, of course I am using

    <dependency> <!-- Spark dependency -->
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.0.0-preview</version>
        </dependency>


        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>2.0.0-preview</version>
            <type>jar</type>
        </dependency>


Is the below problem/issue related to the experimental version of SPARK 2.0.0.

Many Thanks for your help and support.

Best Regards,
carlo

On 28 Jul 2016, at 11:14, Carlo.Allocca <ca...@open.ac.uk>> wrote:

I have also found the following two related links:

1) https://github.com/apache/spark/commit/947b9020b0d621bc97661a0a056297e6889936d3
2) https://github.com/apache/spark/pull/12433

which both explain why it happens but nothing about what to do to solve it.

Do you have any suggestion/recommendation?

Many thanks.
Carlo

On 28 Jul 2016, at 11:06, carlo allocca <ca...@open.ac.uk>> wrote:

Hi Rui,

Thanks for the promptly reply.
No, I am not using Mesos.

Ok. I am writing a code to build a suitable dataset for my needs as in the following:

== Session configuration:

 SparkSession spark = SparkSession
                .builder()
                .master("local[6]") //
                .appName("DatasetForCaseNew")
                .config("spark.executor.memory", "4g")
                .config("spark.shuffle.blockTransferService", "nio")
                .getOrCreate();


public Dataset<Row> buildDataset(){
...

// STEP A
// Join prdDS with cmpDS
Dataset<Row> prdDS_Join_cmpDS
                = res1
                  .join(res2, (res1.col("PRD_asin#100")).equalTo(res2.col("CMP_asin")), "inner");

        prdDS_Join_cmpDS.take(1);

// STEP B
// Join prdDS with cmpDS
Dataset<Row> prdDS_Join_cmpDS_Join
                = prdDS_Join_cmpDS
                  .join(res3, prdDS_Join_cmpDS.col("PRD_asin#100").equalTo(res3.col("ORD_asin")), "inner");
        prdDS_Join_cmpDS_Join.take(1);
        prdDS_Join_cmpDS_Join.show();

}


The exception is thrown when the computation reach the STEP B, until STEP A is fine.

Is there anything wrong or missing?

Thanks for your help in advance.

Best Regards,
Carlo





=== STACK TRACE

Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 422.102 sec <<< FAILURE!
testBuildDataset(org.mksmart.amaretto.ml.DatasetPerHourVerOneTest)  Time elapsed: 421.994 sec  <<< ERROR!
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:194)
at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:102)
at org.apache.spark.sql.execution.InputAdapter.doExecuteBroadcast(WholeStageCodegenExec.scala:229)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:125)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:125)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.SparkPlan.executeBroadcast(SparkPlan.scala:124)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.prepareBroadcast(BroadcastHashJoinExec.scala:98)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.codegenInner(BroadcastHashJoinExec.scala:197)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doConsume(BroadcastHashJoinExec.scala:82)
at org.apache.spark.sql.execution.CodegenSupport$class.consume(WholeStageCodegenExec.scala:153)
at org.apache.spark.sql.execution.joins.SortMergeJoinExec.consume(SortMergeJoinExec.scala:35)
at org.apache.spark.sql.execution.joins.SortMergeJoinExec.doProduce(SortMergeJoinExec.scala:565)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.joins.SortMergeJoinExec.produce(SortMergeJoinExec.scala:35)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doProduce(BroadcastHashJoinExec.scala:77)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.produce(BroadcastHashJoinExec.scala:38)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doCodeGen(WholeStageCodegenExec.scala:304)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:343)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:240)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:323)
at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2122)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2436)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2121)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2128)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1862)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1861)
at org.apache.spark.sql.Dataset.withTypedCallback(Dataset.scala:2466)
at org.apache.spark.sql.Dataset.head(Dataset.scala:1861)
at org.apache.spark.sql.Dataset.take(Dataset.scala:2078)
at org.mksmart.amaretto.ml.DatasetPerHourVerOne.buildDataset(DatasetPerHourVerOne.java:115)
at org.mksmart.amaretto.ml.DatasetPerHourVerOneTest.testBuildDataset(DatasetPerHourVerOneTest.java:76)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:175)
at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:107)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:68)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [300 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:190)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:190)
... 85 more



On 28 Jul 2016, at 10:55, Sun Rui <su...@163.com>> wrote:

Are you using Mesos? if not , https://issues.apache.org/jira/browse/SPARK-16522  is not relevant

You may describe more information about your Spark environment, and the full stack trace.
On Jul 28, 2016, at 17:44, Carlo.Allocca <ca...@open.ac.uk>> wrote:

Hi All,

I am running SPARK locally, and when running d3=join(d1,d2) and d5=(d3, d4) am getting  the following exception "org.apache.spark.SparkException: Exception thrown in awaitResult”.
Googling for it, I found that the closed is the answer reported https://issues.apache.org/jira/browse/SPARK-16522 which mention that it is bug of the SPARK 2.0.0.

Is it correct or am I missing anything?

Many Thanks for your answer and help.

Best Regards,
Carlo

-- The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England & Wales and a charity registered in Scotland (SC 038302). The Open University is authorised and regulated by the Financial Conduct Authority.





Re: SPARK Exception thrown in awaitResult

Posted by "Carlo.Allocca" <ca...@open.ac.uk>.
I have also found the following two related links:

1) https://github.com/apache/spark/commit/947b9020b0d621bc97661a0a056297e6889936d3
2) https://github.com/apache/spark/pull/12433

which both explain why it happens but nothing about what to do to solve it.

Do you have any suggestion/recommendation?

Many thanks.
Carlo

On 28 Jul 2016, at 11:06, carlo allocca <ca...@open.ac.uk>> wrote:

Hi Rui,

Thanks for the promptly reply.
No, I am not using Mesos.

Ok. I am writing a code to build a suitable dataset for my needs as in the following:

== Session configuration:

 SparkSession spark = SparkSession
                .builder()
                .master("local[6]") //
                .appName("DatasetForCaseNew")
                .config("spark.executor.memory", "4g")
                .config("spark.shuffle.blockTransferService", "nio")
                .getOrCreate();


public Dataset<Row> buildDataset(){
...

// STEP A
// Join prdDS with cmpDS
Dataset<Row> prdDS_Join_cmpDS
                = res1
                  .join(res2, (res1.col("PRD_asin#100")).equalTo(res2.col("CMP_asin")), "inner");

        prdDS_Join_cmpDS.take(1);

// STEP B
// Join prdDS with cmpDS
Dataset<Row> prdDS_Join_cmpDS_Join
                = prdDS_Join_cmpDS
                  .join(res3, prdDS_Join_cmpDS.col("PRD_asin#100").equalTo(res3.col("ORD_asin")), "inner");
        prdDS_Join_cmpDS_Join.take(1);
        prdDS_Join_cmpDS_Join.show();

}


The exception is thrown when the computation reach the STEP B, until STEP A is fine.

Is there anything wrong or missing?

Thanks for your help in advance.

Best Regards,
Carlo





=== STACK TRACE

Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 422.102 sec <<< FAILURE!
testBuildDataset(org.mksmart.amaretto.ml.DatasetPerHourVerOneTest)  Time elapsed: 421.994 sec  <<< ERROR!
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:194)
at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:102)
at org.apache.spark.sql.execution.InputAdapter.doExecuteBroadcast(WholeStageCodegenExec.scala:229)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:125)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:125)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.SparkPlan.executeBroadcast(SparkPlan.scala:124)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.prepareBroadcast(BroadcastHashJoinExec.scala:98)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.codegenInner(BroadcastHashJoinExec.scala:197)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doConsume(BroadcastHashJoinExec.scala:82)
at org.apache.spark.sql.execution.CodegenSupport$class.consume(WholeStageCodegenExec.scala:153)
at org.apache.spark.sql.execution.joins.SortMergeJoinExec.consume(SortMergeJoinExec.scala:35)
at org.apache.spark.sql.execution.joins.SortMergeJoinExec.doProduce(SortMergeJoinExec.scala:565)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.joins.SortMergeJoinExec.produce(SortMergeJoinExec.scala:35)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doProduce(BroadcastHashJoinExec.scala:77)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.produce(BroadcastHashJoinExec.scala:38)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doCodeGen(WholeStageCodegenExec.scala:304)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:343)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:240)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:323)
at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2122)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2436)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2121)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2128)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1862)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1861)
at org.apache.spark.sql.Dataset.withTypedCallback(Dataset.scala:2466)
at org.apache.spark.sql.Dataset.head(Dataset.scala:1861)
at org.apache.spark.sql.Dataset.take(Dataset.scala:2078)
at org.mksmart.amaretto.ml.DatasetPerHourVerOne.buildDataset(DatasetPerHourVerOne.java:115)
at org.mksmart.amaretto.ml.DatasetPerHourVerOneTest.testBuildDataset(DatasetPerHourVerOneTest.java:76)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:175)
at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:107)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:68)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [300 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:190)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:190)
... 85 more



On 28 Jul 2016, at 10:55, Sun Rui <su...@163.com>> wrote:

Are you using Mesos? if not , https://issues.apache.org/jira/browse/SPARK-16522  is not relevant

You may describe more information about your Spark environment, and the full stack trace.
On Jul 28, 2016, at 17:44, Carlo.Allocca <ca...@open.ac.uk>> wrote:

Hi All,

I am running SPARK locally, and when running d3=join(d1,d2) and d5=(d3, d4) am getting  the following exception "org.apache.spark.SparkException: Exception thrown in awaitResult”.
Googling for it, I found that the closed is the answer reported https://issues.apache.org/jira/browse/SPARK-16522 which mention that it is bug of the SPARK 2.0.0.

Is it correct or am I missing anything?

Many Thanks for your answer and help.

Best Regards,
Carlo

-- The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England & Wales and a charity registered in Scotland (SC 038302). The Open University is authorised and regulated by the Financial Conduct Authority.




Re: SPARK Exception thrown in awaitResult

Posted by "Carlo.Allocca" <ca...@open.ac.uk>.
Hi Rui,

Thanks for the promptly reply.
No, I am not using Mesos.

Ok. I am writing a code to build a suitable dataset for my needs as in the following:

== Session configuration:

 SparkSession spark = SparkSession
                .builder()
                .master("local[6]") //
                .appName("DatasetForCaseNew")
                .config("spark.executor.memory", "4g")
                .config("spark.shuffle.blockTransferService", "nio")
                .getOrCreate();


public Dataset<Row> buildDataset(){
...

// STEP A
// Join prdDS with cmpDS
Dataset<Row> prdDS_Join_cmpDS
                = res1
                  .join(res2, (res1.col("PRD_asin#100")).equalTo(res2.col("CMP_asin")), "inner");

        prdDS_Join_cmpDS.take(1);

// STEP B
// Join prdDS with cmpDS
Dataset<Row> prdDS_Join_cmpDS_Join
                = prdDS_Join_cmpDS
                  .join(res3, prdDS_Join_cmpDS.col("PRD_asin#100").equalTo(res3.col("ORD_asin")), "inner");
        prdDS_Join_cmpDS_Join.take(1);
        prdDS_Join_cmpDS_Join.show();

}


The exception is thrown when the computation reach the STEP B, until STEP A is fine.

Is there anything wrong or missing?

Thanks for your help in advance.

Best Regards,
Carlo





=== STACK TRACE

Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 422.102 sec <<< FAILURE!
testBuildDataset(org.mksmart.amaretto.ml.DatasetPerHourVerOneTest)  Time elapsed: 421.994 sec  <<< ERROR!
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:194)
at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:102)
at org.apache.spark.sql.execution.InputAdapter.doExecuteBroadcast(WholeStageCodegenExec.scala:229)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:125)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:125)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.SparkPlan.executeBroadcast(SparkPlan.scala:124)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.prepareBroadcast(BroadcastHashJoinExec.scala:98)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.codegenInner(BroadcastHashJoinExec.scala:197)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doConsume(BroadcastHashJoinExec.scala:82)
at org.apache.spark.sql.execution.CodegenSupport$class.consume(WholeStageCodegenExec.scala:153)
at org.apache.spark.sql.execution.joins.SortMergeJoinExec.consume(SortMergeJoinExec.scala:35)
at org.apache.spark.sql.execution.joins.SortMergeJoinExec.doProduce(SortMergeJoinExec.scala:565)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.joins.SortMergeJoinExec.produce(SortMergeJoinExec.scala:35)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.doProduce(BroadcastHashJoinExec.scala:77)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:83)
at org.apache.spark.sql.execution.CodegenSupport$$anonfun$produce$1.apply(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.CodegenSupport$class.produce(WholeStageCodegenExec.scala:78)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.produce(BroadcastHashJoinExec.scala:38)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doCodeGen(WholeStageCodegenExec.scala:304)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:343)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:240)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:323)
at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
at org.apache.spark.sql.Dataset$$anonfun$org$apache$spark$sql$Dataset$$execute$1$1.apply(Dataset.scala:2122)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
at org.apache.spark.sql.Dataset.withNewExecutionId(Dataset.scala:2436)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$execute$1(Dataset.scala:2121)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collect(Dataset.scala:2128)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1862)
at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:1861)
at org.apache.spark.sql.Dataset.withTypedCallback(Dataset.scala:2466)
at org.apache.spark.sql.Dataset.head(Dataset.scala:1861)
at org.apache.spark.sql.Dataset.take(Dataset.scala:2078)
at org.mksmart.amaretto.ml.DatasetPerHourVerOne.buildDataset(DatasetPerHourVerOne.java:115)
at org.mksmart.amaretto.ml.DatasetPerHourVerOneTest.testBuildDataset(DatasetPerHourVerOneTest.java:76)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:175)
at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:107)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:68)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [300 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:190)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:190)
... 85 more



On 28 Jul 2016, at 10:55, Sun Rui <su...@163.com>> wrote:

Are you using Mesos? if not , https://issues.apache.org/jira/browse/SPARK-16522  is not relevant

You may describe more information about your Spark environment, and the full stack trace.
On Jul 28, 2016, at 17:44, Carlo.Allocca <ca...@open.ac.uk>> wrote:

Hi All,

I am running SPARK locally, and when running d3=join(d1,d2) and d5=(d3, d4) am getting  the following exception "org.apache.spark.SparkException: Exception thrown in awaitResult”.
Googling for it, I found that the closed is the answer reported https://issues.apache.org/jira/browse/SPARK-16522 which mention that it is bug of the SPARK 2.0.0.

Is it correct or am I missing anything?

Many Thanks for your answer and help.

Best Regards,
Carlo

-- The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England & Wales and a charity registered in Scotland (SC 038302). The Open University is authorised and regulated by the Financial Conduct Authority.



Re: SPARK Exception thrown in awaitResult

Posted by Sun Rui <su...@163.com>.
Are you using Mesos? if not , https://issues.apache.org/jira/browse/SPARK-16522 <https://issues.apache.org/jira/browse/SPARK-16522>  is not relevant

You may describe more information about your Spark environment, and the full stack trace.
> On Jul 28, 2016, at 17:44, Carlo.Allocca <ca...@open.ac.uk> wrote:
> 
> Hi All, 
> 
> I am running SPARK locally, and when running d3=join(d1,d2) and d5=(d3, d4) am getting  the following exception "org.apache.spark.SparkException: Exception thrown in awaitResult”. 
> Googling for it, I found that the closed is the answer reported https://issues.apache.org/jira/browse/SPARK-16522 <https://issues.apache.org/jira/browse/SPARK-16522> which mention that it is bug of the SPARK 2.0.0. 
> 
> Is it correct or am I missing anything? 
> 
> Many Thanks for your answer and help. 
> 
> Best Regards,
> Carlo
> 
> -- The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England & Wales and a charity registered in Scotland (SC 038302). The Open University is authorised and regulated by the Financial Conduct Authority.