Posted to user@spark.apache.org by Benoit Pasquereau <Be...@amdocs.com> on 2014/11/20 18:18:12 UTC

ClassNotFoundException in standalone mode

Hi Guys,

I'm having an issue in standalone mode (Spark 1.1, Hadoop 2.4, Windows Server 2008).

A very simple program runs fine in local mode but fails in standalone mode.

Here is the error:

14/11/20 17:01:53 INFO DAGScheduler: Failed to run count at SimpleApp.scala:22
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task
0.3 in stage 0.0 (TID 6, UK-RND-PN02.actixhost.eu): java.lang.ClassNotFoundException: SimpleApp$$anonfun$1
        java.net.URLClassLoader$1.run(URLClassLoader.java:202)

I have added the jar to the SparkConf() to be on the safe side, and it shows up in the classpath printed to standard output (copied after the code):

/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

import java.net.URLClassLoader

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "S:\\spark-1.1.0-bin-hadoop2.4\\README.md"
    val conf = new SparkConf()//.setJars(Seq("s:\\spark\\simple\\target\\scala-2.10\\simple-project_2.10-1.0.jar"))
     .setMaster("spark://UK-RND-PN02.actixhost.eu:7077")
     //.setMaster("local[4]")
     .setAppName("Simple Application")
    val sc = new SparkContext(conf)

    val cl = ClassLoader.getSystemClassLoader
    val urls = cl.asInstanceOf[URLClassLoader].getURLs
    urls.foreach(url => println("Executor classpath is:" + url.getFile))

    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
    sc.stop()
  }
}
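
For comparison, a minimal sketch of the same configuration with the application jar passed explicitly via setJars (the path is the one commented out above; in standalone mode this is one way to let the driver ship the jar to the executors):

// Sketch only: same SparkConf, but shipping the application jar explicitly.
val conf = new SparkConf()
  .setMaster("spark://UK-RND-PN02.actixhost.eu:7077")
  .setAppName("Simple Application")
  .setJars(Seq("s:\\spark\\simple\\target\\scala-2.10\\simple-project_2.10-1.0.jar"))

Equivalently, the jar can be supplied at launch time, for example through spark-submit's --jars option or as the main application jar argument.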

Simple-project is in the executor classpath list:
14/11/20 17:01:48 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
Executor classpath is:/S:/spark/simple/
Executor classpath is:/S:/spark/simple/target/scala-2.10/simple-project_2.10-1.0.jar
Executor classpath is:/S:/spark-1.1.0-bin-hadoop2.4/conf/
Executor classpath is:/S:/spark-1.1.0-bin-hadoop2.4/lib/spark-assembly-1.1.0-hadoop2.4.0.jar
Executor classpath is:/S:/spark/simple/
Executor classpath is:/S:/spark-1.1.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.1.jar
Executor classpath is:/S:/spark-1.1.0-bin-hadoop2.4/lib/datanucleus-core-3.2.2.jar
Executor classpath is:/S:/spark-1.1.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.1.jar
Executor classpath is:/S:/spark/simple/
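
Note that ClassLoader.getSystemClassLoader in the snippet above runs in the driver, so the list above reflects the driver JVM's classpath rather than what the executors actually see. A sketch of querying the executors themselves (it ships a closure of its own, so it only helps once class distribution is working):

// Sketch only: ask each executor which URLs its system classloader sees.
val executorClasspath = sc.parallelize(1 to 4, 4).mapPartitions { _ =>
  val urls = ClassLoader.getSystemClassLoader.asInstanceOf[URLClassLoader].getURLs
  urls.map(_.getFile).toIterator
}.collect().distinct
executorClasspath.foreach(entry => println("Actual executor classpath entry: " + entry))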

Would you have any idea how I could investigate further?

Thanks!
Benoit.


PS: I could attach a debugger to the Worker where the ClassNotFoundException happens, but it is a bit painful.
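
A possibly less painful alternative (a sketch, assuming spark.executor.extraJavaOptions reaches the standalone executors, as it does in Spark 1.x) is to have each executor JVM listen for a remote debugger and attach from the IDE:

// Sketch only: open a JDWP debug port on every executor (5005 is an arbitrary example port).
val conf = new SparkConf()
  .setMaster("spark://UK-RND-PN02.actixhost.eu:7077")
  .setAppName("Simple Application")
  .set("spark.executor.extraJavaOptions",
    "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005")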


Re: ClassNotFoundException in standalone mode

Posted by Yanbo Liang <ya...@gmail.com>.
Looks like it cannot find the class or jar on your driver machine.
Are you sure that the corresponding jar file exists on the driver machine
rather than only on your development machine?
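
One quick way to check is to list the jar on the machine that actually launches the driver (a sketch using the JDK's jar tool and Windows findstr, with the paths already quoted in this thread):

s:\spark\simple>"%JAVA_HOME%"\bin\jar tf s:\spark\simple\target\scala-2.10\simple-project_2.10-1.0.jar | findstr SimpleApp

If SimpleApp$$anonfun$1.class does not appear in the output on that machine, that would explain the ClassNotFoundException on the executors.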

2014-11-21 11:16 GMT+08:00 angel2014 <an...@gmail.com>:

> Can you make sure the class "SimpleApp$$anonfun$1" is included in your
> app jar?

RE: ClassNotFoundException in standalone mode

Posted by Benoit Pasquereau <Be...@amdocs.com>.
I finally managed to get the example working; here are the details, which may help other users.

I have two Windows nodes in the test system, PN01 and PN02. Both have access to the same shared drive S: (it is mapped to C:\source on PN02).

If I run the worker and master from S:\spark-1.1.0-bin-hadoop2.4, then the simple test fails with the ClassNotFoundException (even when a single node hosts both the master and the worker).

If I run the workers and masters from the local drive (c:\source\spark-1.1.0-bin-hadoop2.4), then the simple test runs OK (with one or two nodes).

I haven't found out why the class fails to load from the shared drive (I checked the permissions and they look OK), but at least the cluster is working now.
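
For reference, a sketch of how the standalone daemons can be started from the local copy (org.apache.spark.deploy.master.Master and org.apache.spark.deploy.worker.Worker are the standard standalone entry points, launched here via the bin\spark-class.cmd script from the Windows distribution):

C:\source\spark-1.1.0-bin-hadoop2.4>bin\spark-class.cmd org.apache.spark.deploy.master.Master
C:\source\spark-1.1.0-bin-hadoop2.4>bin\spark-class.cmd org.apache.spark.deploy.worker.Worker spark://UK-RND-PN02.actixhost.eu:7077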

If anyone has experience running Spark from a Windows shared drive, any advice is welcome!

Thanks,
Benoit.


PS: Yes, thanks Angel, I did check that:
s:\spark\simple>"%JAVA_HOME%"\bin\jar tvf s:\spark\simple\target\scala-2.10\simple-project_2.10-1.0.jar
   299 Thu Nov 20 17:29:40 GMT 2014 META-INF/MANIFEST.MF
  1070 Thu Nov 20 17:29:40 GMT 2014 SimpleApp$$anonfun$2.class
  1350 Thu Nov 20 17:29:40 GMT 2014 SimpleApp$$anonfun$main$1.class
  2581 Thu Nov 20 17:29:40 GMT 2014 SimpleApp$.class
  1070 Thu Nov 20 17:29:40 GMT 2014 SimpleApp$$anonfun$1.class
   710 Thu Nov 20 17:29:40 GMT 2014 SimpleApp.class


From: angel2014 [mailto:angel.alvarez.pascua@gmail.com]
Sent: Friday, November 21, 2014 3:16 AM
To: user@spark.incubator.apache.org
Subject: Re: ClassNotFoundException in standalone mode

Can you make sure the class "SimpleApp$$anonfun$1" is included in your app jar?


Re: ClassNotFoundException in standalone mode

Posted by angel2014 <an...@gmail.com>.
Can you make sure the class "SimpleApp$$anonfun$1" is included in your app
jar?
