Posted to user@mrql.apache.org by Etienne Dumoulin <et...@idiro.com> on 2015/01/07 13:53:15 UTC
Spark mode file not found
Hi MRQL users,
I am using MRQL 0.9.2 and Spark 1.0.2.
I have a small issue with the Spark mode:
when I try to execute a small test job that succeeds in MapReduce mode,
I get an error.
The RMAT and pagerank examples throw a lot of warnings, but I do get the
result at the end.
[hadoop@namenode ~]$ /home/hadoop/mrql-0.9.2-incubating-src/bin/mrql.spark
-dist -nodes 1 mrql-0.9.2-incubating-src/queries/RMAT.mrql 100 1000
Apache MRQL version 0.9.2 (compiled distributed Spark mode using 1 tasks)
Query type: ( int, int, int, int ) -> ( int, int )
Query type: !bag(( int, int ))
Physical plan:
MapReduce:
input: Generator
Run time: 6.165 secs
15/01/07 10:39:10 ERROR remote.EndpointWriter: AssociationError
[akka.tcp://spark@namenode:43173] <-
[akka.tcp://sparkExecutor@datanode2:40321]:
Error [Shut down address: akka.tcp://sparkExecutor@datanode2:40321] [
akka.remote.ShutDownAssociation: Shut down address:
akka.tcp://sparkExecutor@datanode2:40321
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The
remote system terminated the association because it is shutting down.
]
15/01/07 10:39:10 ERROR remote.EndpointWriter: AssociationError
[akka.tcp://spark@namenode:43173] <-
[akka.tcp://sparkExecutor@datanode3:39739]:
Error [Shut down address: akka.tcp://sparkExecutor@datanode3:39739] [
akka.remote.ShutDownAssociation: Shut down address:
akka.tcp://sparkExecutor@datanode3:39739
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The
remote system terminated the association because it is shutting down.
]
/home/hadoop/mrql-0.9.2-incubating-src/bin/mrql.spark -dist -nodes 1
mrql-0.9.2-incubating-src/queries/pagerank.mrql
Apache MRQL version 0.9.2 (compiled distributed Spark mode using 1 tasks)
Query type: long
Physical plan:
Aggregate:
input: MapAggregateReduce:
input: Source (binary): "tmp/graph.bin"
Run time: 6.352 secs
Query type: string
Result:
"*** number of nodes: 76"
Run time: 0.055 secs
Query type: !list(< node: int, rank: double >)
Physical plan:
MapReduce:
input: Repeat (x_55):
init: MapReduce:
input: Source (binary): "tmp/graph.bin"
step: MapReduce:
input: x_55
Repeat #1: 67 true results
15/01/07 12:32:08 WARN scheduler.TaskSetManager: Lost TID 6 (task 5.0:0)
15/01/07 12:32:08 WARN scheduler.TaskSetManager: Loss was due to
java.lang.Error
java.lang.Error: Cannot up-coerce the numerical value null
at org.apache.mrql.SystemFunctions.error(SystemFunctions.java:38)
at org.apache.mrql.SystemFunctions.coerce(SystemFunctions.java:359)
at org.apache.mrql.MRQL_Lambda_15.eval(UserFunctions_6.java from
JavaSourceFromString:32)
at org.apache.mrql.MapReduceAlgebra$1.hasNext(MapReduceAlgebra.java:49)
at org.apache.mrql.MapReduceAlgebra$1.hasNext(MapReduceAlgebra.java:50)
at
scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at
scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:107)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
at org.apache.spark.scheduler.Task.run(Task.scala:51)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Repeat #2: 61 true results
15/01/07 12:32:09 WARN scheduler.TaskSetManager: Lost TID 9 (task 9.0:0)
15/01/07 12:32:09 WARN scheduler.TaskSetManager: Loss was due to
java.lang.Error
java.lang.Error: Cannot up-coerce the numerical value null
at org.apache.mrql.SystemFunctions.error(SystemFunctions.java:38)
at org.apache.mrql.SystemFunctions.coerce(SystemFunctions.java:359)
at org.apache.mrql.MRQL_Lambda_15.eval(UserFunctions_6.java from
JavaSourceFromString:32)
at org.apache.mrql.MapReduceAlgebra$1.hasNext(MapReduceAlgebra.java:49)
at org.apache.mrql.MapReduceAlgebra$1.hasNext(MapReduceAlgebra.java:50)
at
scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at
scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:107)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
at org.apache.spark.scheduler.Task.run(Task.scala:51)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
15/01/07 12:32:09 WARN scheduler.TaskSetManager: Lost TID 10 (task 9.0:0)
Repeat #3: 41 true results
15/01/07 12:32:09 WARN scheduler.TaskSetManager: Lost TID 13 (task 14.0:0)
15/01/07 12:32:09 WARN scheduler.TaskSetManager: Loss was due to
java.lang.Error
java.lang.Error: Cannot up-coerce the numerical value null
at org.apache.mrql.SystemFunctions.error(SystemFunctions.java:38)
at org.apache.mrql.SystemFunctions.coerce(SystemFunctions.java:359)
at org.apache.mrql.MRQL_Lambda_15.eval(UserFunctions_6.java from
JavaSourceFromString:32)
at org.apache.mrql.MapReduceAlgebra$1.hasNext(MapReduceAlgebra.java:49)
at org.apache.mrql.MapReduceAlgebra$1.hasNext(MapReduceAlgebra.java:50)
at
scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at
scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at
scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:107)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
at org.apache.spark.scheduler.Task.run(Task.scala:51)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
15/01/07 12:32:09 WARN scheduler.TaskSetManager: Lost TID 14 (task 14.0:0)
15/01/07 12:32:09 WARN scheduler.TaskSetManager: Lost TID 15 (task 14.0:0)
Repeat #4: 6 true results
Repeat #5: 0 true results
Result:
[ < node: 0, rank: 0.07872101046749681 >, < node: 6, rank:
0.05065660974066882 >, < node: 3, rank: 0.0459596116701186 >, < node: 25,
rank: 0.04376273127207268 >, < node: 12, rank: 0.04188453600736429 >, <
node: 1, rank: 0.04040564312366409 >, < node: 50, rank:
0.033875663833967604 >, < node: 31, rank: 0.023745510555659932 >, < node:
28, rank: 0.022415986790041594 >, < node: 9, rank: 0.021938982955425616 >,
< node: 75, rank: 0.021586163633336066 >, < node: 18, rank:
0.02116068257012926 >, < node: 15, rank: 0.02026443639558245 >, < node: 62,
rank: 0.017767555008348448 >, < node: 53, rank: 0.017410210069182946 >, <
node: 7, rank: 0.017195867587074802 >, < node: 56, rank:
0.01699938245303271 >, < node: 10, rank: 0.01654553114240747 >, < node: 26,
rank: 0.01650909559014266 >, < node: 37, rank: 0.016466833143024252 >, ... ]
Run time: 5.379 secs
15/01/07 12:32:11 ERROR remote.EndpointWriter: AssociationError
[akka.tcp://spark@namenode:50514] <-
[akka.tcp://sparkExecutor@datanode2:32794]:
Error [Shut down address: akka.tcp://sparkExecutor@datanode2:32794] [
akka.remote.ShutDownAssociation: Shut down address:
akka.tcp://sparkExecutor@datanode2:32794
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The
remote system terminated the association because it is shutting down.
]
15/01/07 12:32:11 ERROR remote.EndpointWriter: AssociationError
[akka.tcp://spark@namenode:50514] <-
[akka.tcp://sparkExecutor@datanode3:35978]:
Error [Shut down address: akka.tcp://sparkExecutor@datanode3:35978] [
akka.remote.ShutDownAssociation: Shut down address:
akka.tcp://sparkExecutor@datanode3:35978
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The
remote system terminated the association because it is shutting down.
]
However, when I run a simple SELECT script (which works in MapReduce mode), it fails:
ds =
source(line,'/user/marcos/hdfs_file2/Text1Comma',',',type(<city:string,country:string,pop:string>));
SELECT (t.city, t.country) FROM t in ds;
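For reference, a couple of sample rows matching the type(<city:string,country:string,pop:string>) annotation in the source() call; the actual contents of /user/marcos/hdfs_file2/Text1Comma are not shown in this thread, so these rows are illustrative only:

```shell
# Illustrative rows only -- the real Text1Comma file is not shown in the thread.
cat > Text1Comma <<'EOF'
Dublin,Ireland,554554
Cork,Ireland,119230
EOF
head -n 1 Text1Comma    # prints Dublin,Ireland,554554
# Upload, assuming the HDFS path used by the query:
# hadoop fs -put Text1Comma /user/marcos/hdfs_file2/Text1Comma
```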
/home/hadoop/mrql-0.9.2-incubating-src/bin/mrql.spark -dist -nodes 1
/tmp/script_etienne.mrql
Apache MRQL version 0.9.2 (compiled distributed Spark mode using 1 tasks)
Query type: !bag(( string, string ))
Physical plan:
cMap:
input: Source (line): "/user/marcos/hdfs_file2/Text1Comma"
15/01/07 12:32:48 WARN scheduler.TaskSetManager: Lost TID 0 (task 0.0:0)
15/01/07 12:32:48 WARN scheduler.TaskSetManager: Loss was due to
java.io.FileNotFoundException
java.io.FileNotFoundException: File does not exist: /tmp/hadoop_data_source_dir.txt
at
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1843)
at
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1834)
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:578)
at
org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:154)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
at
org.apache.mrql.SparkEvaluator.load_source_dir(SparkEvaluator.java:155)
at
org.apache.mrql.SparkParsedInputFormat.getRecordReader(SparkParsedInputFormat.java:92)
at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:193)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:184)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:93)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
at org.apache.spark.rdd.FlatMappedRDD.compute(FlatMappedRDD.scala:33)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:77)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
at org.apache.spark.scheduler.Task.run(Task.scala:51)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
15/01/07 12:32:48 WARN scheduler.TaskSetManager: Lost TID 1 (task 0.0:0)
15/01/07 12:32:48 WARN scheduler.TaskSetManager: Lost TID 2 (task 0.0:0)
15/01/07 12:32:48 WARN scheduler.TaskSetManager: Lost TID 3 (task 0.0:0)
15/01/07 12:32:48 ERROR scheduler.TaskSetManager: Task 0.0:0 failed 4
times; aborting job
*** MRQL System Error at line 8: java.lang.Error:
org.apache.spark.SparkException: Job aborted due to stage failure: Task
0.0:0 failed 4 times, most recent failure: Exception failure in TID 3 on
host datanode3: java.io.FileNotFoundException: File does not exist:
/tmp/hadoop_data_source_dir.txt
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1843)
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1834)
org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:578)
org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:154)
org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
org.apache.mrql.SparkEvaluator.load_source_dir(SparkEvaluator.java:155)
org.apache.mrql.SparkParsedInputFormat.getRecordReader(SparkParsedInputFormat.java:92)
org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:193)
org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:184)
org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:93)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
org.apache.spark.rdd.FlatMappedRDD.compute(FlatMappedRDD.scala:33)
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:77)
org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
org.apache.spark.scheduler.Task.run(Task.scala:51)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
java.lang.Thread.run(Thread.java:662)
Driver stacktrace:
15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
[akka.tcp://spark@namenode:43495] <-
[akka.tcp://sparkExecutor@datanode2:47194]:
Error [Shut down address: akka.tcp://sparkExecutor@datanode2:47194] [
akka.remote.ShutDownAssociation: Shut down address:
akka.tcp://sparkExecutor@datanode2:47194
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The
remote system terminated the association because it is shutting down.
]
15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
[akka.tcp://spark@namenode:43495] <-
[akka.tcp://sparkExecutor@datanode3:60893]:
Error [Shut down address: akka.tcp://sparkExecutor@datanode3:60893] [
akka.remote.ShutDownAssociation: Shut down address:
akka.tcp://sparkExecutor@datanode3:60893
Caused by: akka.remote.transport.Transport$InvalidAssociationException: The
remote system terminated the association because it is shutting down.
]
15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
[akka.tcp://spark@namenode:43495] -> [akka.tcp://spark@datanode2:56494]:
Error [Association failed with [akka.tcp://spark@datanode2:56494]] [
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://spark@datanode2:56494]
Caused by:
akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
Connection refused: datanode2/192.168.11.90:56494
]
15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
[akka.tcp://spark@namenode:43495] -> [akka.tcp://spark@datanode2:56494]:
Error [Association failed with [akka.tcp://spark@datanode2:56494]] [
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://spark@datanode2:56494]
Caused by:
akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
Connection refused: datanode2/192.168.11.90:56494
]
15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
[akka.tcp://spark@namenode:43495] -> [akka.tcp://spark@datanode2:56494]:
Error [Association failed with [akka.tcp://spark@datanode2:56494]] [
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://spark@datanode2:56494]
Caused by:
akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
Connection refused: datanode2/192.168.11.90:56494
]
15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
[akka.tcp://spark@namenode:43495] -> [akka.tcp://spark@datanode3:42480]:
Error [Association failed with [akka.tcp://spark@datanode3:42480]] [
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://spark@datanode3:42480]
Caused by:
akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
Connection refused: datanode3/192.168.11.91:42480
]
15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
[akka.tcp://spark@namenode:43495] -> [akka.tcp://spark@datanode3:42480]:
Error [Association failed with [akka.tcp://spark@datanode3:42480]] [
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://spark@datanode3:42480]
Caused by:
akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
Connection refused: datanode3/192.168.11.91:42480
]
15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
[akka.tcp://spark@namenode:43495] -> [akka.tcp://spark@datanode3:42480]:
Error [Association failed with [akka.tcp://spark@datanode3:42480]] [
akka.remote.EndpointAssociationException: Association failed with
[akka.tcp://spark@datanode3:42480]
Caused by:
akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
Connection refused: datanode3/192.168.11.91:42480
]
The Spark installation seems fine, as I can use the Spark shell and Zeppelin
is working.
There are no useful Spark logs, and I am not sure where to start
troubleshooting.
Thanks for your help,
Étienne
--
Étienne Dumoulin
Head of product development
Idiro Technologies
Clarendon House,
34-37 Clarendon St,
Dublin 2
Ireland
Re: Spark mode file not found
Posted by Leonidas Fegaras <fe...@cse.uta.edu>.
Hi Etienne,
I think this is a file permission problem; it must be a Hadoop bug.
Please log in as the hadoop user and run:
hadoop fs -chmod -R 777 /tmp
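A local sketch of what this mode change does (the real command targets HDFS through hadoop fs; mode 777 grants read, write, and execute to owner, group, and others, so any user's job can create its scratch files under /tmp):

```shell
# Local illustration only; on the cluster the fix is `hadoop fs -chmod -R 777 /tmp`.
tmpdir=$(mktemp -d)        # created with restrictive permissions (typically 700)
chmod -R 777 "$tmpdir"     # open it up to all users
stat -c '%a' "$tmpdir"     # GNU stat assumed; prints 777
rmdir "$tmpdir"
```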
Best regards
Leonidas
On 01/08/2015 04:01 AM, Etienne Dumoulin wrote:
> Hi Leonidas,
>
> Thank you for your help. The newest version fixes my original problem;
> the result is now delivered.
>
> However, from a non-root user
> I get a similar exception:
> [marcos@namenode mrql]$ /home/hadoop/mrql-0.9.4/bin/mrql.spark -dist
> -nodes 1 /tmp/script_etienne.mrql
> .
> .
> java.io.FileNotFoundException: File does not exist:
> /user/marcos/tmp/hadoop_data_source_dir.txt
>
> Here hadoop is my Hadoop root user and marcos my regular user.
> If I ls the tmp directory I get:
> [marcos@namenode mrql]$ hadoop fs -ls /user/marcos/tmp/
> Found 10 items
> -rw-r--r-- 2 marcos users 213 2015-01-08 09:51
> /user/marcos/tmp/marcos_data_source_dir.txt
>
> Regards,
>
> Étienne
>
> On 7 January 2015 at 22:03, Leonidas Fegaras <fegaras@cse.uta.edu
> <ma...@cse.uta.edu>> wrote:
>
> Hi Etienne,
> Both problems were fixed after MRQL 0.9.2 was released. You can
> get the latest 0.9.4 src tarball using:
> git clone
> https://git-wip-us.apache.org/repos/asf/incubator-mrql.git mrql-0.9.4
> The EndpointWriter errors are Spark errors emitted during Spark shutdown;
> you may simply ignore them.
> Best regards,
> Leonidas
>> remote system terminated the association because it is shutting down.
>> ]
>> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
>> [akka.tcp://spark@namenode:43495] <-
>> [akka.tcp://sparkExecutor@datanode3:60893]: Error [Shut down
>> address: akka.tcp://sparkExecutor@datanode3:60893] [
>> akka.remote.ShutDownAssociation: Shut down address:
>> akka.tcp://sparkExecutor@datanode3:60893
>> Caused by:
>> akka.remote.transport.Transport$InvalidAssociationException: The
>> remote system terminated the association because it is shutting down.
>> ]
>> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
>> [akka.tcp://spark@namenode:43495] ->
>> [akka.tcp://spark@datanode2:56494]: Error [Association failed
>> with [akka.tcp://spark@datanode2:56494]] [
>> akka.remote.EndpointAssociationException: Association failed with
>> [akka.tcp://spark@datanode2:56494]
>> Caused by:
>> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
>> Connection refused: datanode2/192.168.11.90:56494
>> ]
>> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
>> [akka.tcp://spark@namenode:43495] ->
>> [akka.tcp://spark@datanode2:56494]: Error [Association failed
>> with [akka.tcp://spark@datanode2:56494]] [
>> akka.remote.EndpointAssociationException: Association failed with
>> [akka.tcp://spark@datanode2:56494]
>> Caused by:
>> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
>> Connection refused: datanode2/192.168.11.90:56494
>> ]
>> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
>> [akka.tcp://spark@namenode:43495] ->
>> [akka.tcp://spark@datanode2:56494]: Error [Association failed
>> with [akka.tcp://spark@datanode2:56494]] [
>> akka.remote.EndpointAssociationException: Association failed with
>> [akka.tcp://spark@datanode2:56494]
>> Caused by:
>> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
>> Connection refused: datanode2/192.168.11.90:56494
>> ]
>> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
>> [akka.tcp://spark@namenode:43495] ->
>> [akka.tcp://spark@datanode3:42480]: Error [Association failed
>> with [akka.tcp://spark@datanode3:42480]] [
>> akka.remote.EndpointAssociationException: Association failed with
>> [akka.tcp://spark@datanode3:42480]
>> Caused by:
>> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
>> Connection refused: datanode3/192.168.11.91:42480
>> ]
>> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
>> [akka.tcp://spark@namenode:43495] ->
>> [akka.tcp://spark@datanode3:42480]: Error [Association failed
>> with [akka.tcp://spark@datanode3:42480]] [
>> akka.remote.EndpointAssociationException: Association failed with
>> [akka.tcp://spark@datanode3:42480]
>> Caused by:
>> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
>> Connection refused: datanode3/192.168.11.91:42480
>> ]
>> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
>> [akka.tcp://spark@namenode:43495] ->
>> [akka.tcp://spark@datanode3:42480]: Error [Association failed
>> with [akka.tcp://spark@datanode3:42480]] [
>> akka.remote.EndpointAssociationException: Association failed with
>> [akka.tcp://spark@datanode3:42480]
>> Caused by:
>> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
>> Connection refused: datanode3/192.168.11.91:42480
>> ]
>>
>>
>> The spark installation seems fine as I can use the spark shell
>> and Zeppelin is working.
>> There are no useful spark logs and I am not sure where to start
>> the troubleshooting.
>>
>> Thanks for your help,
>>
>> Étienne
>> --
>> Étienne Dumoulin
>> Head of product development
>>
>> Idiro Technologies
>> Clarendon House,
>> 34-37 Clarendon St,
>> Dublin 2
>> Ireland
>
>
>
>
> --
> Étienne Dumoulin
> Head of product development
>
> Idiro Technologies
> Clarendon House,
> 34-37 Clarendon St,
> Dublin 2
> Ireland
> Email: etienne.dumoulin@idiro.com
Re: Spark mode file not found
Posted by Etienne Dumoulin <et...@idiro.com>.
Hi Leonidas,
Thank you for your help. The newest version fixes my original problem:
the result is now delivered.
However, from a non-root user I get a similar exception:
[marcos@namenode mrql]$ /home/hadoop/mrql-0.9.4/bin/mrql.spark -dist -nodes
1 /tmp/script_etienne.mrql
.
.
java.io.FileNotFoundException: File does not exist:
/user/marcos/tmp/hadoop_data_source_dir.txt
Here, hadoop is my Hadoop superuser and marcos my regular user.
If I *ls* the tmp directory I get:
[marcos@namenode mrql]$ hadoop fs -ls /user/marcos/tmp/
Found 10 items
-rw-r--r-- 2 marcos users 213 2015-01-08 09:51
/user/marcos/tmp/marcos_data_source_dir.txt
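
The listing above shows the mismatch: the file that exists is named after
the user marcos, while the failing lookup targets a file named after the
user hadoop. A minimal sketch of how such a mismatch can arise when a
scratch-file name embeds a user name (hypothetical path-building code for
illustration only, not MRQL's actual implementation):

```python
# Hypothetical sketch (not MRQL's actual code): the per-user scratch file
# name embeds a user name. If the writer and the reader resolve different
# user names, the reader looks up a file that was never created.
import posixpath

def data_source_file(home_dir: str, user: str) -> str:
    # e.g. data_source_file("/user/marcos", "marcos")
    #   -> "/user/marcos/tmp/marcos_data_source_dir.txt"
    return posixpath.join(home_dir, "tmp", user + "_data_source_dir.txt")

# Written by the job running as the submitting user "marcos":
written = data_source_file("/user/marcos", "marcos")
# Looked up with the user name resolved as "hadoop" instead:
looked_up = data_source_file("/user/marcos", "hadoop")

print(written)    # /user/marcos/tmp/marcos_data_source_dir.txt
print(looked_up)  # /user/marcos/tmp/hadoop_data_source_dir.txt
# The two names differ, so opening looked_up fails: on HDFS that surfaces
# as the java.io.FileNotFoundException seen in the trace above.
assert written != looked_up
```

This matches the symptom in the trace: the exception names a file keyed to
one user while the file actually written is keyed to another.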
Regards,
Étienne
On 7 January 2015 at 22:03, Leonidas Fegaras <fe...@cse.uta.edu> wrote:
> Hi Etienne,
> Both problems were fixed after MRQL 0.9.2 was released. You can get the
> latest 0.9.4 src tarball using:
> git clone https://git-wip-us.apache.org/repos/asf/incubator-mrql.git
> mrql-0.9.4
> The EndpointWriter messages are Spark errors during Spark shutdown; you
> may simply ignore them.
> Best regards,
> Leonidas
>
>
>
> On 01/07/2015 06:53 AM, Etienne Dumoulin wrote:
>
> Hi MRQL users,
>
> I am using mrql 0.9.2 and spark 1.0.2.
>
> I have a little issue with the spark mode.
> When I try to execute a small test job that is successful in MapReduce
> mode I get an error.
>
>
> RMAT and pagerank examples throw a lot of warnings but I get the result at
> the end.
>
> [hadoop@namenode ~]$
> /home/hadoop/mrql-0.9.2-incubating-src/bin/mrql.spark -dist -nodes 1
> mrql-0.9.2-incubating-src/queries/RMAT.mrql 100 1000
> Apache MRQL version 0.9.2 (compiled distributed Spark mode using 1 tasks)
> Query type: ( int, int, int, int ) -> ( int, int )
> Query type: !bag(( int, int ))
> Physical plan:
> MapReduce:
> input: Generator
> Run time: 6.165 secs
> 15/01/07 10:39:10 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:43173] <- [akka.tcp://sparkExecutor@datanode2:40321]:
> Error [Shut down address: akka.tcp://sparkExecutor@datanode2:40321] [
> akka.remote.ShutDownAssociation: Shut down address:
> akka.tcp://sparkExecutor@datanode2:40321
> Caused by: akka.remote.transport.Transport$InvalidAssociationException:
> The remote system terminated the association because it is shutting down.
> ]
> 15/01/07 10:39:10 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:43173] <- [akka.tcp://sparkExecutor@datanode3:39739]:
> Error [Shut down address: akka.tcp://sparkExecutor@datanode3:39739] [
> akka.remote.ShutDownAssociation: Shut down address:
> akka.tcp://sparkExecutor@datanode3:39739
> Caused by: akka.remote.transport.Transport$InvalidAssociationException:
> The remote system terminated the association because it is shutting down.
> ]
>
> /home/hadoop/mrql-0.9.2-incubating-src/bin/mrql.spark -dist -nodes 1
> mrql-0.9.2-incubating-src/queries/pagerank.mrql
> Apache MRQL version 0.9.2 (compiled distributed Spark mode using 1 tasks)
> Query type: long
> Physical plan:
> Aggregate:
> input: MapAggregateReduce:
> input: Source (binary): "tmp/graph.bin"
> Run time: 6.352 secs
> Query type: string
> Result:
> "*** number of nodes: 76"
> Run time: 0.055 secs
> Query type: !list(< node: int, rank: double >)
> Physical plan:
> MapReduce:
> input: Repeat (x_55):
> init: MapReduce:
> input: Source (binary): "tmp/graph.bin"
> step: MapReduce:
> input: x_55
> Repeat #1: 67 true results
> 15/01/07 12:32:08 WARN scheduler.TaskSetManager: Lost TID 6 (task 5.0:0)
> 15/01/07 12:32:08 WARN scheduler.TaskSetManager: Loss was due to
> java.lang.Error
> java.lang.Error: Cannot up-coerce the numerical value null
> at org.apache.mrql.SystemFunctions.error(SystemFunctions.java:38)
> at org.apache.mrql.SystemFunctions.coerce(SystemFunctions.java:359)
> at org.apache.mrql.MRQL_Lambda_15.eval(UserFunctions_6.java from
> JavaSourceFromString:32)
> at org.apache.mrql.MapReduceAlgebra$1.hasNext(MapReduceAlgebra.java:49)
> at org.apache.mrql.MapReduceAlgebra$1.hasNext(MapReduceAlgebra.java:50)
> at
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
> at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at
> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:107)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
> at org.apache.spark.scheduler.Task.run(Task.scala:51)
> at
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Repeat #2: 61 true results
> 15/01/07 12:32:09 WARN scheduler.TaskSetManager: Lost TID 9 (task 9.0:0)
> 15/01/07 12:32:09 WARN scheduler.TaskSetManager: Loss was due to
> java.lang.Error
> java.lang.Error: Cannot up-coerce the numerical value null
> at org.apache.mrql.SystemFunctions.error(SystemFunctions.java:38)
> at org.apache.mrql.SystemFunctions.coerce(SystemFunctions.java:359)
> at org.apache.mrql.MRQL_Lambda_15.eval(UserFunctions_6.java from
> JavaSourceFromString:32)
> at org.apache.mrql.MapReduceAlgebra$1.hasNext(MapReduceAlgebra.java:49)
> at org.apache.mrql.MapReduceAlgebra$1.hasNext(MapReduceAlgebra.java:50)
> at
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
> at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at
> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:107)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
> at org.apache.spark.scheduler.Task.run(Task.scala:51)
> at
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 15/01/07 12:32:09 WARN scheduler.TaskSetManager: Lost TID 10 (task 9.0:0)
> Repeat #3: 41 true results
> 15/01/07 12:32:09 WARN scheduler.TaskSetManager: Lost TID 13 (task 14.0:0)
> 15/01/07 12:32:09 WARN scheduler.TaskSetManager: Loss was due to
> java.lang.Error
> java.lang.Error: Cannot up-coerce the numerical value null
> at org.apache.mrql.SystemFunctions.error(SystemFunctions.java:38)
> at org.apache.mrql.SystemFunctions.coerce(SystemFunctions.java:359)
> at org.apache.mrql.MRQL_Lambda_15.eval(UserFunctions_6.java from
> JavaSourceFromString:32)
> at org.apache.mrql.MapReduceAlgebra$1.hasNext(MapReduceAlgebra.java:49)
> at org.apache.mrql.MapReduceAlgebra$1.hasNext(MapReduceAlgebra.java:50)
> at
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
> at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at
> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:107)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
> at org.apache.spark.scheduler.Task.run(Task.scala:51)
> at
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 15/01/07 12:32:09 WARN scheduler.TaskSetManager: Lost TID 14 (task 14.0:0)
> 15/01/07 12:32:09 WARN scheduler.TaskSetManager: Lost TID 15 (task 14.0:0)
> Repeat #4: 6 true results
> Repeat #5: 0 true results
> Result:
> [ < node: 0, rank: 0.07872101046749681 >, < node: 6, rank:
> 0.05065660974066882 >, < node: 3, rank: 0.0459596116701186 >, < node: 25,
> rank: 0.04376273127207268 >, < node: 12, rank: 0.04188453600736429 >, <
> node: 1, rank: 0.04040564312366409 >, < node: 50, rank:
> 0.033875663833967604 >, < node: 31, rank: 0.023745510555659932 >, < node:
> 28, rank: 0.022415986790041594 >, < node: 9, rank: 0.021938982955425616 >,
> < node: 75, rank: 0.021586163633336066 >, < node: 18, rank:
> 0.02116068257012926 >, < node: 15, rank: 0.02026443639558245 >, < node: 62,
> rank: 0.017767555008348448 >, < node: 53, rank: 0.017410210069182946 >, <
> node: 7, rank: 0.017195867587074802 >, < node: 56, rank:
> 0.01699938245303271 >, < node: 10, rank: 0.01654553114240747 >, < node: 26,
> rank: 0.01650909559014266 >, < node: 37, rank: 0.016466833143024252 >, ... ]
> Run time: 5.379 secs
> 15/01/07 12:32:11 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:50514] <- [akka.tcp://sparkExecutor@datanode2:32794]:
> Error [Shut down address: akka.tcp://sparkExecutor@datanode2:32794] [
> akka.remote.ShutDownAssociation: Shut down address:
> akka.tcp://sparkExecutor@datanode2:32794
> Caused by: akka.remote.transport.Transport$InvalidAssociationException:
> The remote system terminated the association because it is shutting down.
> ]
> 15/01/07 12:32:11 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:50514] <- [akka.tcp://sparkExecutor@datanode3:35978]:
> Error [Shut down address: akka.tcp://sparkExecutor@datanode3:35978] [
> akka.remote.ShutDownAssociation: Shut down address:
> akka.tcp://sparkExecutor@datanode3:35978
> Caused by: akka.remote.transport.Transport$InvalidAssociationException:
> The remote system terminated the association because it is shutting down.
> ]
>
>
> However when I run a simple select script (that works in MapReduce mode):
> ds =
> source(line,'/user/marcos/hdfs_file2/Text1Comma',',',type(<city:string,country:string,pop:string>));
> SELECT (t.city, t.country) FROM t in ds;
>
> /home/hadoop/mrql-0.9.2-incubating-src/bin/mrql.spark -dist -nodes 1
> /tmp/script_etienne.mrql
> Apache MRQL version 0.9.2 (compiled distributed Spark mode using 1 tasks)
> Query type: !bag(( string, string ))
> Physical plan:
> cMap:
> input: Source (line): "/user/marcos/hdfs_file2/Text1Comma"
> 15/01/07 12:32:48 WARN scheduler.TaskSetManager: Lost TID 0 (task 0.0:0)
> 15/01/07 12:32:48 WARN scheduler.TaskSetManager: Loss was due to
> java.io.FileNotFoundException
> java.io.FileNotFoundException: File does not exist:
> /tmp/hadoop_data_source_dir.txt
> at
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1843)
> at
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1834)
> at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:578)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:154)
> at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
> at
> org.apache.mrql.SparkEvaluator.load_source_dir(SparkEvaluator.java:155)
> at
> org.apache.mrql.SparkParsedInputFormat.getRecordReader(SparkParsedInputFormat.java:92)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:193)
> at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:184)
> at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:93)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
> at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
> at org.apache.spark.rdd.FlatMappedRDD.compute(FlatMappedRDD.scala:33)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:77)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
> at org.apache.spark.scheduler.Task.run(Task.scala:51)
> at
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 15/01/07 12:32:48 WARN scheduler.TaskSetManager: Lost TID 1 (task 0.0:0)
> 15/01/07 12:32:48 WARN scheduler.TaskSetManager: Lost TID 2 (task 0.0:0)
> 15/01/07 12:32:48 WARN scheduler.TaskSetManager: Lost TID 3 (task 0.0:0)
> 15/01/07 12:32:48 ERROR scheduler.TaskSetManager: Task 0.0:0 failed 4
> times; aborting job
> *** MRQL System Error at line 8: java.lang.Error:
> org.apache.spark.SparkException: Job aborted due to stage failure: Task
> 0.0:0 failed 4 times, most recent failure: Exception failure in TID 3 on
> host datanode3: java.io.FileNotFoundException: File does not exist:
> /tmp/hadoop_data_source_dir.txt
>
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1843)
>
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1834)
> org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:578)
>
> org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:154)
> org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
>
> org.apache.mrql.SparkEvaluator.load_source_dir(SparkEvaluator.java:155)
>
> org.apache.mrql.SparkParsedInputFormat.getRecordReader(SparkParsedInputFormat.java:92)
> org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:193)
> org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:184)
> org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:93)
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
> org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
> org.apache.spark.rdd.FlatMappedRDD.compute(FlatMappedRDD.scala:33)
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
> org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:77)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
> org.apache.spark.scheduler.Task.run(Task.scala:51)
>
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
>
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> java.lang.Thread.run(Thread.java:662)
> Driver stacktrace:
> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:43495] <- [akka.tcp://sparkExecutor@datanode2:47194]:
> Error [Shut down address: akka.tcp://sparkExecutor@datanode2:47194] [
> akka.remote.ShutDownAssociation: Shut down address:
> akka.tcp://sparkExecutor@datanode2:47194
> Caused by: akka.remote.transport.Transport$InvalidAssociationException:
> The remote system terminated the association because it is shutting down.
> ]
> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:43495] <- [akka.tcp://sparkExecutor@datanode3:60893]:
> Error [Shut down address: akka.tcp://sparkExecutor@datanode3:60893] [
> akka.remote.ShutDownAssociation: Shut down address:
> akka.tcp://sparkExecutor@datanode3:60893
> Caused by: akka.remote.transport.Transport$InvalidAssociationException:
> The remote system terminated the association because it is shutting down.
> ]
> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:43495] -> [akka.tcp://spark@datanode2:56494]:
> Error [Association failed with [akka.tcp://spark@datanode2:56494]] [
> akka.remote.EndpointAssociationException: Association failed with
> [akka.tcp://spark@datanode2:56494]
> Caused by:
> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
> Connection refused: datanode2/192.168.11.90:56494
> ]
> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:43495] -> [akka.tcp://spark@datanode2:56494]:
> Error [Association failed with [akka.tcp://spark@datanode2:56494]] [
> akka.remote.EndpointAssociationException: Association failed with
> [akka.tcp://spark@datanode2:56494]
> Caused by:
> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
> Connection refused: datanode2/192.168.11.90:56494
> ]
> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:43495] -> [akka.tcp://spark@datanode2:56494]:
> Error [Association failed with [akka.tcp://spark@datanode2:56494]] [
> akka.remote.EndpointAssociationException: Association failed with
> [akka.tcp://spark@datanode2:56494]
> Caused by:
> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
> Connection refused: datanode2/192.168.11.90:56494
> ]
> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:43495] -> [akka.tcp://spark@datanode3:42480]:
> Error [Association failed with [akka.tcp://spark@datanode3:42480]] [
> akka.remote.EndpointAssociationException: Association failed with
> [akka.tcp://spark@datanode3:42480]
> Caused by:
> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
> Connection refused: datanode3/192.168.11.91:42480
> ]
> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:43495] -> [akka.tcp://spark@datanode3:42480]:
> Error [Association failed with [akka.tcp://spark@datanode3:42480]] [
> akka.remote.EndpointAssociationException: Association failed with
> [akka.tcp://spark@datanode3:42480]
> Caused by:
> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
> Connection refused: datanode3/192.168.11.91:42480
> ]
> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:43495] -> [akka.tcp://spark@datanode3:42480]:
> Error [Association failed with [akka.tcp://spark@datanode3:42480]] [
> akka.remote.EndpointAssociationException: Association failed with
> [akka.tcp://spark@datanode3:42480]
> Caused by:
> akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2:
> Connection refused: datanode3/192.168.11.91:42480
> ]
>
>
> The spark installation seems fine as I can use the spark shell and
> Zeppelin is working.
> There are no useful spark logs and I am not sure where to start the
> troubleshooting.
>
> Thanks for your help,
>
> Étienne
> --
> Étienne Dumoulin
> Head of product development
>
> Idiro Technologies
> Clarendon House,
> 34-37 Clarendon St,
> Dublin 2
> Ireland
>
>
>
--
Étienne Dumoulin
Head of product development
Idiro Technologies
Clarendon House,
34-37 Clarendon St,
Dublin 2
Ireland
Email: etienne.dumoulin@idiro.com
Re: Spark mode file not found
Posted by Leonidas Fegaras <fe...@cse.uta.edu>.
Hi Etienne,
Both problems were fixed after MRQL 0.9.2 was released. You can get the
latest 0.9.4 src tarball using:
git clone https://git-wip-us.apache.org/repos/asf/incubator-mrql.git
mrql-0.9.4
The EndpointWriter messages are Spark errors during Spark shutdown; you may
simply ignore them.
Best regards,
Leonidas
On 01/07/2015 06:53 AM, Etienne Dumoulin wrote:
> Hi MRQL users,
>
> I am using mrql 0.9.2 and spark 1.0.2.
>
> I have a little issue with the spark mode.
> When I try to execute a small test job that is successful in MapReduce
> mode I get an error.
>
>
> RMAT and pagerank examples throw a lot of warnings but I get the
> result at the end.
>
> [hadoop@namenode ~]$
> /home/hadoop/mrql-0.9.2-incubating-src/bin/mrql.spark -dist -nodes 1
> mrql-0.9.2-incubating-src/queries/RMAT.mrql 100 1000
> Apache MRQL version 0.9.2 (compiled distributed Spark mode using 1 tasks)
> Query type: ( int, int, int, int ) -> ( int, int )
> Query type: !bag(( int, int ))
> Physical plan:
> MapReduce:
> input: Generator
> Run time: 6.165 secs
> 15/01/07 10:39:10 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:43173] <-
> [akka.tcp://sparkExecutor@datanode2:40321]: Error [Shut down address:
> akka.tcp://sparkExecutor@datanode2:40321] [
> akka.remote.ShutDownAssociation: Shut down address:
> akka.tcp://sparkExecutor@datanode2:40321
> Caused by:
> akka.remote.transport.Transport$InvalidAssociationException: The
> remote system terminated the association because it is shutting down.
> ]
> 15/01/07 10:39:10 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:43173] <-
> [akka.tcp://sparkExecutor@datanode3:39739]: Error [Shut down address:
> akka.tcp://sparkExecutor@datanode3:39739] [
> akka.remote.ShutDownAssociation: Shut down address:
> akka.tcp://sparkExecutor@datanode3:39739
> Caused by:
> akka.remote.transport.Transport$InvalidAssociationException: The
> remote system terminated the association because it is shutting down.
> ]
>
> /home/hadoop/mrql-0.9.2-incubating-src/bin/mrql.spark -dist -nodes 1
> mrql-0.9.2-incubating-src/queries/pagerank.mrql
> Apache MRQL version 0.9.2 (compiled distributed Spark mode using 1 tasks)
> Query type: long
> Physical plan:
> Aggregate:
> input: MapAggregateReduce:
> input: Source (binary): "tmp/graph.bin"
> Run time: 6.352 secs
> Query type: string
> Result:
> "*** number of nodes: 76"
> Run time: 0.055 secs
> Query type: !list(< node: int, rank: double >)
> Physical plan:
> MapReduce:
> input: Repeat (x_55):
> init: MapReduce:
> input: Source (binary): "tmp/graph.bin"
> step: MapReduce:
> input: x_55
> Repeat #1: 67 true results
> 15/01/07 12:32:08 WARN scheduler.TaskSetManager: Lost TID 6 (task 5.0:0)
> 15/01/07 12:32:08 WARN scheduler.TaskSetManager: Loss was due to
> java.lang.Error
> java.lang.Error: Cannot up-coerce the numerical value null
> at org.apache.mrql.SystemFunctions.error(SystemFunctions.java:38)
> at org.apache.mrql.SystemFunctions.coerce(SystemFunctions.java:359)
> at org.apache.mrql.MRQL_Lambda_15.eval(UserFunctions_6.java from
> JavaSourceFromString:32)
> at
> org.apache.mrql.MapReduceAlgebra$1.hasNext(MapReduceAlgebra.java:49)
> at
> org.apache.mrql.MapReduceAlgebra$1.hasNext(MapReduceAlgebra.java:50)
> at
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
> at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at
> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:107)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
> at org.apache.spark.scheduler.Task.run(Task.scala:51)
> at
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Repeat #2: 61 true results
> 15/01/07 12:32:09 WARN scheduler.TaskSetManager: Lost TID 9 (task 9.0:0)
> 15/01/07 12:32:09 WARN scheduler.TaskSetManager: Loss was due to
> java.lang.Error
> java.lang.Error: Cannot up-coerce the numerical value null
> at org.apache.mrql.SystemFunctions.error(SystemFunctions.java:38)
> at org.apache.mrql.SystemFunctions.coerce(SystemFunctions.java:359)
> at org.apache.mrql.MRQL_Lambda_15.eval(UserFunctions_6.java from
> JavaSourceFromString:32)
> at
> org.apache.mrql.MapReduceAlgebra$1.hasNext(MapReduceAlgebra.java:49)
> at
> org.apache.mrql.MapReduceAlgebra$1.hasNext(MapReduceAlgebra.java:50)
> at
> scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
> at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at
> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> at
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:107)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
> at org.apache.spark.scheduler.Task.run(Task.scala:51)
> at
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 15/01/07 12:32:09 WARN scheduler.TaskSetManager: Lost TID 10 (task 9.0:0)
> Repeat #3: 41 true results
> 15/01/07 12:32:09 WARN scheduler.TaskSetManager: Lost TID 13 (task 14.0:0)
> 15/01/07 12:32:09 WARN scheduler.TaskSetManager: Loss was due to
> java.lang.Error
> java.lang.Error: Cannot up-coerce the numerical value null
> at org.apache.mrql.SystemFunctions.error(SystemFunctions.java:38)
> at org.apache.mrql.SystemFunctions.coerce(SystemFunctions.java:359)
> at org.apache.mrql.MRQL_Lambda_15.eval(UserFunctions_6.java from JavaSourceFromString:32)
> at org.apache.mrql.MapReduceAlgebra$1.hasNext(MapReduceAlgebra.java:49)
> at org.apache.mrql.MapReduceAlgebra$1.hasNext(MapReduceAlgebra.java:50)
> at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
> at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
> at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:107)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
> at org.apache.spark.scheduler.Task.run(Task.scala:51)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 15/01/07 12:32:09 WARN scheduler.TaskSetManager: Lost TID 14 (task 14.0:0)
> 15/01/07 12:32:09 WARN scheduler.TaskSetManager: Lost TID 15 (task 14.0:0)
> Repeat #4: 6 true results
> Repeat #5: 0 true results
> Result:
> [ < node: 0, rank: 0.07872101046749681 >, < node: 6, rank: 0.05065660974066882 >,
>   < node: 3, rank: 0.0459596116701186 >, < node: 25, rank: 0.04376273127207268 >,
>   < node: 12, rank: 0.04188453600736429 >, < node: 1, rank: 0.04040564312366409 >,
>   < node: 50, rank: 0.033875663833967604 >, < node: 31, rank: 0.023745510555659932 >,
>   < node: 28, rank: 0.022415986790041594 >, < node: 9, rank: 0.021938982955425616 >,
>   < node: 75, rank: 0.021586163633336066 >, < node: 18, rank: 0.02116068257012926 >,
>   < node: 15, rank: 0.02026443639558245 >, < node: 62, rank: 0.017767555008348448 >,
>   < node: 53, rank: 0.017410210069182946 >, < node: 7, rank: 0.017195867587074802 >,
>   < node: 56, rank: 0.01699938245303271 >, < node: 10, rank: 0.01654553114240747 >,
>   < node: 26, rank: 0.01650909559014266 >, < node: 37, rank: 0.016466833143024252 >, ... ]
> Run time: 5.379 secs
> 15/01/07 12:32:11 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:50514] <- [akka.tcp://sparkExecutor@datanode2:32794]:
> Error [Shut down address: akka.tcp://sparkExecutor@datanode2:32794] [
> akka.remote.ShutDownAssociation: Shut down address: akka.tcp://sparkExecutor@datanode2:32794
> Caused by: akka.remote.transport.Transport$InvalidAssociationException: The remote system terminated the association because it is shutting down.
> ]
> 15/01/07 12:32:11 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:50514] <- [akka.tcp://sparkExecutor@datanode3:35978]:
> Error [Shut down address: akka.tcp://sparkExecutor@datanode3:35978] [
> akka.remote.ShutDownAssociation: Shut down address: akka.tcp://sparkExecutor@datanode3:35978
> Caused by: akka.remote.transport.Transport$InvalidAssociationException: The remote system terminated the association because it is shutting down.
> ]
>
>
> However, when I run a simple select script (one that works fine in MapReduce mode), it fails:
> ds =
> source(line,'/user/marcos/hdfs_file2/Text1Comma',',',type(<city:string,country:string,pop:string>));
> SELECT (t.city, t.country) FROM t in ds;
>
> /home/hadoop/mrql-0.9.2-incubating-src/bin/mrql.spark -dist -nodes 1
> /tmp/script_etienne.mrql
> Apache MRQL version 0.9.2 (compiled distributed Spark mode using 1 tasks)
> Query type: !bag(( string, string ))
> Physical plan:
> cMap:
> input: Source (line): "/user/marcos/hdfs_file2/Text1Comma"
> 15/01/07 12:32:48 WARN scheduler.TaskSetManager: Lost TID 0 (task 0.0:0)
> 15/01/07 12:32:48 WARN scheduler.TaskSetManager: *Loss was due to java.io.FileNotFoundException
> java.io.FileNotFoundException: File does not exist: /tmp/hadoop_data_source_dir.txt*
> at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1843)
> at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1834)
> at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:578)
> at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:154)
> at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
> at org.apache.mrql.SparkEvaluator.load_source_dir(SparkEvaluator.java:155)
> at org.apache.mrql.SparkParsedInputFormat.getRecordReader(SparkParsedInputFormat.java:92)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:193)
> at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:184)
> at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:93)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
> at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
> at org.apache.spark.rdd.FlatMappedRDD.compute(FlatMappedRDD.scala:33)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
> at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:77)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
> at org.apache.spark.scheduler.Task.run(Task.scala:51)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 15/01/07 12:32:48 WARN scheduler.TaskSetManager: Lost TID 1 (task 0.0:0)
> 15/01/07 12:32:48 WARN scheduler.TaskSetManager: Lost TID 2 (task 0.0:0)
> 15/01/07 12:32:48 WARN scheduler.TaskSetManager: Lost TID 3 (task 0.0:0)
> 15/01/07 12:32:48 ERROR scheduler.TaskSetManager: Task 0.0:0 failed 4
> times; aborting job
> *** MRQL System Error at line 8: java.lang.Error:
> org.apache.spark.SparkException: Job aborted due to stage failure:
> Task 0.0:0 failed 4 times, most recent failure: Exception failure in
> TID 3 on host datanode3: java.io.FileNotFoundException: *File does not
> exist: /tmp/hadoop_data_source_dir.txt*
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1843)
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1834)
> org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:578)
> org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:154)
> org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
> org.apache.mrql.SparkEvaluator.load_source_dir(SparkEvaluator.java:155)
> org.apache.mrql.SparkParsedInputFormat.getRecordReader(SparkParsedInputFormat.java:92)
> org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:193)
> org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:184)
> org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:93)
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
> org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
> org.apache.spark.rdd.FlatMappedRDD.compute(FlatMappedRDD.scala:33)
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
> org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:77)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
> org.apache.spark.scheduler.Task.run(Task.scala:51)
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> java.lang.Thread.run(Thread.java:662)
> Driver stacktrace:
> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:43495] <- [akka.tcp://sparkExecutor@datanode2:47194]:
> Error [Shut down address: akka.tcp://sparkExecutor@datanode2:47194] [
> akka.remote.ShutDownAssociation: Shut down address: akka.tcp://sparkExecutor@datanode2:47194
> Caused by: akka.remote.transport.Transport$InvalidAssociationException: The remote system terminated the association because it is shutting down.
> ]
> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:43495] <- [akka.tcp://sparkExecutor@datanode3:60893]:
> Error [Shut down address: akka.tcp://sparkExecutor@datanode3:60893] [
> akka.remote.ShutDownAssociation: Shut down address: akka.tcp://sparkExecutor@datanode3:60893
> Caused by: akka.remote.transport.Transport$InvalidAssociationException: The remote system terminated the association because it is shutting down.
> ]
> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:43495] -> [akka.tcp://spark@datanode2:56494]:
> Error [Association failed with [akka.tcp://spark@datanode2:56494]] [
> akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@datanode2:56494]
> Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: datanode2/192.168.11.90:56494
> ]
> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:43495] -> [akka.tcp://spark@datanode2:56494]:
> Error [Association failed with [akka.tcp://spark@datanode2:56494]] [
> akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@datanode2:56494]
> Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: datanode2/192.168.11.90:56494
> ]
> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:43495] -> [akka.tcp://spark@datanode2:56494]:
> Error [Association failed with [akka.tcp://spark@datanode2:56494]] [
> akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@datanode2:56494]
> Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: datanode2/192.168.11.90:56494
> ]
> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:43495] -> [akka.tcp://spark@datanode3:42480]:
> Error [Association failed with [akka.tcp://spark@datanode3:42480]] [
> akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@datanode3:42480]
> Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: datanode3/192.168.11.91:42480
> ]
> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:43495] -> [akka.tcp://spark@datanode3:42480]:
> Error [Association failed with [akka.tcp://spark@datanode3:42480]] [
> akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@datanode3:42480]
> Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: datanode3/192.168.11.91:42480
> ]
> 15/01/07 12:32:48 ERROR remote.EndpointWriter: AssociationError
> [akka.tcp://spark@namenode:43495] -> [akka.tcp://spark@datanode3:42480]:
> Error [Association failed with [akka.tcp://spark@datanode3:42480]] [
> akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@datanode3:42480]
> Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: datanode3/192.168.11.91:42480
> ]
>
>
> The Spark installation itself seems fine, as the Spark shell works and
> Zeppelin is running.
> The Spark logs contain nothing useful, and I am not sure where to start
> troubleshooting.
>
> Thanks for your help,
>
> Étienne
> --
> Étienne Dumoulin
> Head of product development
>
> Idiro Technologies
> Clarendon House,
> 34-37 Clarendon St,
> Dublin 2
> Ireland