Posted to user@spark.apache.org by VG <vl...@gmail.com> on 2016/06/17 09:08:08 UTC

java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

Failed to find data source: com.databricks.spark.xml

Any suggestions to resolve this?
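A common fix, spelled out later in this thread, is to supply the package at launch time with spark-submit's --packages flag. The sketch below is not a verified invocation: the Scala version (2.10) and package version (0.3.3) are assumptions taken from versions discussed downthread, the jar path is a placeholder, and the main class name comes from the stack traces quoted below.

```shell
# Sketch: fetch the spark-xml data source and its dependencies from Maven
# Central at launch time. Versions are assumptions (Scala 2.10, spark-xml
# 0.3.3, as discussed later in this thread); adjust to match your build.
spark-submit \
  --packages com.databricks:spark-xml_2.10:0.3.3 \
  --class org.ariba.spark.PostsProcessing \
  path/to/your-application.jar
```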

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

Posted by Siva A <si...@gmail.com>.
Try to import the class and see if you are getting compilation error

import com.databricks.spark.xml

Siva
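Since the code later in this thread is Java rather than Scala, the same check can also be done at runtime with reflection. The helper below is an illustrative sketch, not part of any Spark API; the two class names it probes are taken from the stack traces in this thread (in Spark 1.6, the format string "com.databricks.spark.xml" is resolved by appending ".DefaultSource", as the traces show).

```java
// Illustrative sketch: check at runtime whether the classes this thread's
// stack traces complain about are actually on the classpath.
public class ClasspathCheck {

    // Hypothetical helper, not a Spark API: true if the named class loads.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Spark resolves the format "com.databricks.spark.xml" to this class:
        System.out.println("spark-xml present:     "
                + isOnClasspath("com.databricks.spark.xml.DefaultSource"));
        // The NoClassDefFoundError later in the thread points at scala-library:
        System.out.println("scala-library present: "
                + isOnClasspath("scala.collection.GenTraversableOnce"));
    }
}
```

If either line prints false when run with the same classpath as the failing application, the corresponding jar is missing at runtime even though the code may compile.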

On Fri, Jun 17, 2016 at 4:02 PM, VG <vl...@gmail.com> wrote:

> nopes. eclipse.
>
>
> On Fri, Jun 17, 2016 at 3:58 PM, Siva A <si...@gmail.com> wrote:
>
>> If you are running from IDE, Are you using Intellij?
>>
>> On Fri, Jun 17, 2016 at 3:20 PM, Siva A <si...@gmail.com> wrote:
>>
>>> Can you try to package as a jar and run using spark-submit
>>>
>>> Siva
>>>
>>> On Fri, Jun 17, 2016 at 3:17 PM, VG <vl...@gmail.com> wrote:
>>>
>>>> I am trying to run from IDE and everything else is working fine.
>>>> I added spark-xml jar and now I ended up into this dependency
>>>>
>>>> 6/06/17 15:15:57 INFO BlockManagerMaster: Registered BlockManager
>>>> Exception in thread "main" *java.lang.NoClassDefFoundError:
>>>> scala/collection/GenTraversableOnce$class*
>>>> at
>>>> org.apache.spark.sql.execution.datasources.CaseInsensitiveMap.<init>(ddl.scala:150)
>>>> at
>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:154)
>>>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>>>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>>>> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>>>> Caused by:* java.lang.ClassNotFoundException:
>>>> scala.collection.GenTraversableOnce$class*
>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>> ... 5 more
>>>> 16/06/17 15:15:58 INFO SparkContext: Invoking stop() from shutdown hook
>>>>
>>>>
>>>>
>>>> On Fri, Jun 17, 2016 at 2:59 PM, Marco Mistroni <mm...@gmail.com>
>>>> wrote:
>>>>
>>>>> So you are using spark-submit  or spark-shell?
>>>>>
>>>>> you will need to launch either by passing the --packages option (like in
>>>>> the example below for spark-csv). You will need to know
>>>>>
>>>>> --packages com.databricks:spark-xml_<scala.version>:<package version>
>>>>>
>>>>> hth
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jun 17, 2016 at 10:20 AM, VG <vl...@gmail.com> wrote:
>>>>>
>>>>>> Apologies for that.
>>>>>> I am trying to use spark-xml to load data of a xml file.
>>>>>>
>>>>>> here is the exception
>>>>>>
>>>>>> 16/06/17 14:49:04 INFO BlockManagerMaster: Registered BlockManager
>>>>>> Exception in thread "main" java.lang.ClassNotFoundException: Failed
>>>>>> to find data source: org.apache.spark.xml. Please find packages at
>>>>>> http://spark-packages.org
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
>>>>>> at
>>>>>> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>>>>>> at
>>>>>> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>>>>>> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>> org.apache.spark.xml.DefaultSource
>>>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>>>>>> at scala.util.Try$.apply(Try.scala:192)
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
>>>>>> at scala.util.Try.orElse(Try.scala:84)
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62)
>>>>>> ... 4 more
>>>>>>
>>>>>> Code
>>>>>>         SQLContext sqlContext = new SQLContext(sc);
>>>>>>         DataFrame df = sqlContext.read()
>>>>>>             .format("org.apache.spark.xml")
>>>>>>             .option("rowTag", "row")
>>>>>>             .load("A.xml");
>>>>>>
>>>>>> Any suggestions please ..
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Jun 17, 2016 at 2:42 PM, Marco Mistroni <mm...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> too little info
>>>>>>> it'll help if you can post the exception and show your sbt file (if
>>>>>>> you are using sbt), and provide minimal details on what you are doing
>>>>>>> kr
>>>>>>>
>>>>>>> On Fri, Jun 17, 2016 at 10:08 AM, VG <vl...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Failed to find data source: com.databricks.spark.xml
>>>>>>>>
>>>>>>>> Any suggestions to resolve this
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

Posted by Siva A <si...@gmail.com>.
Use Spark XML version 0.3.3:
<dependency>
  <groupId>com.databricks</groupId>
  <artifactId>spark-xml_2.10</artifactId>
  <version>0.3.3</version>
</dependency>
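A related point worth checking while changing versions: the NoClassDefFoundError: scala/collection/GenTraversableOnce$class quoted earlier in this thread is the typical symptom of mixed Scala binary versions. Every `_2.xx` artifact suffix and the scala-library version should agree. A sketch of an aligned set, using only versions already mentioned in this thread:

```xml
<!-- Sketch: keep all Scala artifacts on one binary version (2.10 here).
     Mixing _2.10 and _2.11 artifacts typically produces the
     GenTraversableOnce$class error seen earlier in this thread. -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.10</artifactId>
  <version>1.6.1</version>
</dependency>
<dependency>
  <groupId>com.databricks</groupId>
  <artifactId>spark-xml_2.10</artifactId>
  <version>0.3.3</version>
</dependency>
<dependency>
  <groupId>org.scala-lang</groupId>
  <artifactId>scala-library</artifactId>
  <version>2.10.6</version>
</dependency>
```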

On Fri, Jun 17, 2016 at 4:25 PM, VG <vl...@gmail.com> wrote:

> Hi Siva
>
> This is what i have for jars. Did you manage to run with these or
> different versions ?
>
>
> <dependency>
> <groupId>org.apache.spark</groupId>
> <artifactId>spark-core_2.10</artifactId>
> <version>1.6.1</version>
> </dependency>
> <dependency>
> <groupId>org.apache.spark</groupId>
> <artifactId>spark-sql_2.10</artifactId>
> <version>1.6.1</version>
> </dependency>
> <dependency>
> <groupId>com.databricks</groupId>
> <artifactId>spark-xml_2.10</artifactId>
> <version>0.2.0</version>
> </dependency>
> <dependency>
> <groupId>org.scala-lang</groupId>
> <artifactId>scala-library</artifactId>
> <version>2.10.6</version>
> </dependency>
>
> Thanks
> VG
>
>
> On Fri, Jun 17, 2016 at 4:16 PM, Siva A <si...@gmail.com> wrote:
>
>> Hi Marco,
>>
>> I did run in IDE(Intellij) as well. It works fine.
>> VG, make sure the right jar is in classpath.
>>
>> --Siva
>>
>> On Fri, Jun 17, 2016 at 4:11 PM, Marco Mistroni <mm...@gmail.com>
>> wrote:
>>
>>> and  your eclipse path is correct?
>>> i suggest, as Siva did before, to build your jar and run it via
>>> spark-submit  by specifying the --packages option
>>> it's as simple as run this command
>>>
>>> spark-submit   --packages
>>> com.databricks:spark-xml_<scalaversion>:<packageversion>   --class <Name of
>>> your class containing main> <path to your jar>
>>>
>>> Indeed, if you have only these lines to run, why dont you try them in
>>> spark-shell ?
>>>
>>> hth
>>>

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

Posted by VG <vl...@gmail.com>.
It proceeded with the jars I mentioned.
However, no data is getting loaded into the data frame...

sob sob :(
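Two things commonly behind an empty DataFrame here, offered as guesses since the thread does not show the final code: the format string must be "com.databricks.spark.xml" (an earlier message in the thread used "org.apache.spark.xml"), and the rowTag option must exactly match the repeated element name in the file, case included. The plain-JDK sketch below (no Spark needed) counts matching elements; the sample XML string is invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class RowTagCheck {

    // Count elements with the given (case-sensitive) tag name in an XML stream.
    // If this returns 0 for your rowTag value, spark-xml will load no rows either.
    static int countElements(InputStream xml, String tag) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(xml);
        return doc.getElementsByTagName(tag).getLength();
    }

    public static void main(String[] args) throws Exception {
        // Invented sample standing in for the A.xml from this thread;
        // in practice, parse the real file instead.
        String sample = "<posts><row Id=\"1\"/><row Id=\"2\"/></posts>";
        InputStream in = new ByteArrayInputStream(sample.getBytes());
        System.out.println("elements named 'row': " + countElements(in, "row")); // prints 2
    }
}
```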


Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

Posted by VG <vl...@gmail.com>.
Hi Siva

This is what I have for jars. Did you manage to run with these or different
versions?


<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>1.6.1</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.10</artifactId>
  <version>1.6.1</version>
</dependency>
<dependency>
  <groupId>com.databricks</groupId>
  <artifactId>spark-xml_2.10</artifactId>
  <version>0.2.0</version>
</dependency>
<dependency>
  <groupId>org.scala-lang</groupId>
  <artifactId>scala-library</artifactId>
  <version>2.10.6</version>
</dependency>

Thanks
VG



Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

Posted by Siva A <si...@gmail.com>.
Hi Marco,

I did run it in the IDE (IntelliJ) as well. It works fine.
VG, make sure the right jar is on the classpath.

--Siva


Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

Posted by Marco Mistroni <mm...@gmail.com>.
And is your Eclipse classpath correct?
As Siva suggested earlier, build your jar and run it via spark-submit,
specifying the --packages option. It's as simple as running this command:

spark-submit   --packages
com.databricks:spark-xml_<scalaversion>:<packageversion>   --class <name of
the class containing your main method> <path to your jar>

Alternatively, if you only have these few lines to run, why don't you try
them in spark-shell?

hth
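To make the command above concrete, here is a sketch with hypothetical version numbers and a placeholder jar path; the Scala suffix (2.10 here) and the spark-xml release are assumptions you must match to your own Spark build:

```shell
# Hypothetical values: use the Scala binary version of your Spark build
# and a spark-xml release published for that Scala version.
SCALA_VERSION="2.10"
SPARK_XML_VERSION="0.3.3"
PACKAGE="com.databricks:spark-xml_${SCALA_VERSION}:${SPARK_XML_VERSION}"

# The main class comes from the stack trace in this thread; the jar path
# is a placeholder for your own build output.
CMD="spark-submit --packages ${PACKAGE} --class org.ariba.spark.PostsProcessing target/app.jar"
echo "${CMD}"
```

spark-submit then downloads the package and its dependencies from Maven Central and puts them on both the driver and executor classpaths, which avoids the ClassNotFoundException seen here.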

On Fri, Jun 17, 2016 at 11:32 AM, VG <vl...@gmail.com> wrote:

> nopes. eclipse.

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

Posted by VG <vl...@gmail.com>.
nopes. eclipse.


On Fri, Jun 17, 2016 at 3:58 PM, Siva A <si...@gmail.com> wrote:

> If you are running from IDE, Are you using Intellij?

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

Posted by Siva A <si...@gmail.com>.
If you are running from an IDE, are you using IntelliJ?

On Fri, Jun 17, 2016 at 3:20 PM, Siva A <si...@gmail.com> wrote:

> Can you try to package as a jar and run using spark-submit
>
> Siva

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

Posted by Siva A <si...@gmail.com>.
Can you try to package it as a jar and run it using spark-submit?

Siva

On Fri, Jun 17, 2016 at 3:17 PM, VG <vl...@gmail.com> wrote:

> I am trying to run from IDE and everything else is working fine.
> I added spark-xml jar and now I ended up into this dependency

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

Posted by VG <vl...@gmail.com>.
I am trying to run from the IDE and everything else is working fine.
I added the spark-xml jar and now I have run into this dependency issue:

6/06/17 15:15:57 INFO BlockManagerMaster: Registered BlockManager
Exception in thread "main" *java.lang.NoClassDefFoundError:
scala/collection/GenTraversableOnce$class*
at
org.apache.spark.sql.execution.datasources.CaseInsensitiveMap.<init>(ddl.scala:150)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:154)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
Caused by:* java.lang.ClassNotFoundException:
scala.collection.GenTraversableOnce$class*
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 5 more
16/06/17 15:15:58 INFO SparkContext: Invoking stop() from shutdown hook
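The NoClassDefFoundError for scala.collection.GenTraversableOnce$class above is the classic symptom of mixing artifacts built for different Scala binary versions (for example, a spark-xml_2.11 jar on a Scala 2.10 Spark). A hedged sketch of a quick consistency check over dependency coordinates; the artifact names below are hypothetical:

```python
import re

def scala_suffixes(artifacts):
    """Collect the Scala binary-version suffixes (_2.xx) found in artifact names."""
    found = set()
    for artifact in artifacts:
        match = re.search(r"_(2\.\d+)", artifact)
        if match:
            found.add(match.group(1))
    return found

# Hypothetical dependency list illustrating a mismatch:
deps = ["spark-core_2.10", "spark-sql_2.10", "spark-xml_2.11"]
suffixes = scala_suffixes(deps)
if len(suffixes) > 1:
    print("Scala version mismatch:", sorted(suffixes))  # → ['2.10', '2.11']
else:
    print("Scala versions consistent:", suffixes)
```

All `_2.xx` suffixes across spark-core, spark-sql, and spark-xml must agree; if they do not, replace the odd one out with the build published for your Spark's Scala version.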



On Fri, Jun 17, 2016 at 2:59 PM, Marco Mistroni <mm...@gmail.com> wrote:

> So you are using spark-submit  or spark-shell?
>
> you will need to launch either by passing --packages option (like in the
> example below for spark-csv). you will need to iknow
>
> --packages com.databricks:spark-xml_<scala.version>:<package version>
>
> hth
>

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

Posted by Marco Mistroni <mm...@gmail.com>.
So, are you using spark-submit or spark-shell?

You will need to launch either one with the --packages option, as in the
example below. You will need to know the Scala and package versions:

--packages com.databricks:spark-xml_<scala.version>:<package version>

hth



On Fri, Jun 17, 2016 at 10:20 AM, VG <vl...@gmail.com> wrote:

> Apologies for that.
> I am trying to use spark-xml to load data of a xml file.
>
> here is the exception
>
> 16/06/17 14:49:04 INFO BlockManagerMaster: Registered BlockManager
> Exception in thread "main" java.lang.ClassNotFoundException: Failed to
> find data source: org.apache.spark.xml. Please find packages at
> http://spark-packages.org
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.spark.xml.DefaultSource
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
> at scala.util.Try$.apply(Try.scala:192)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
> at scala.util.Try.orElse(Try.scala:84)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62)
> ... 4 more
>
> Code
>         SQLContext sqlContext = new SQLContext(sc);
>         DataFrame df = sqlContext.read()
>             .format("org.apache.spark.xml")
>             .option("rowTag", "row")
>             .load("A.xml");
>
> Any suggestions please ..

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

Posted by Siva A <si...@gmail.com>.
If it's still not working, add the package list while executing
spark-submit/spark-shell, like below:

$SPARK_HOME/bin/spark-shell --packages com.databricks:spark-xml_2.10:0.3.3

$SPARK_HOME/bin/spark-submit --packages com.databricks:spark-xml_2.10:0.3.3



On Fri, Jun 17, 2016 at 2:56 PM, Siva A <si...@gmail.com> wrote:

> Just try to use "xml" as format like below,
>
>         SQLContext sqlContext = new SQLContext(sc);
>         DataFrame df = sqlContext.read()
>             .format("xml")
>             .option("rowTag", "row")
>             .load("A.xml");
>
> FYR: https://github.com/databricks/spark-xml
>
> --Siva

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

Posted by VG <vl...@gmail.com>.
Hi Siva,

I still get a similar exception (see the highlighted section; it is now
looking for xml.DefaultSource):
16/06/17 15:11:37 INFO BlockManagerMaster: Registered BlockManager
Exception in thread "main" java.lang.ClassNotFoundException: Failed to find
data source: xml. Please find packages at http://spark-packages.org
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
*Caused by: java.lang.ClassNotFoundException: xml.DefaultSource*
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
at scala.util.Try$.apply(Try.scala:192)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
at scala.util.Try.orElse(Try.scala:84)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62)
... 4 more
16/06/17 15:11:38 INFO SparkContext: Invoking stop() from shutdown hook
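For context on why the message says it is looking for "xml.DefaultSource": Spark 1.x's ResolvedDataSource.lookupDataSource (visible in the trace above) first tries to load the format name itself as a class, and then the name with ".DefaultSource" appended. A minimal, self-contained sketch of that resolution order - the LookupSketch and resolve names are made up for illustration and are not Spark API:

```java
// Sketch of Spark 1.x's data-source name resolution (hypothetical helper,
// mirroring ResolvedDataSource.lookupDataSource): try the provider string
// as a class name, then provider + ".DefaultSource".
public class LookupSketch {

    static String resolve(String provider, ClassLoader loader) {
        String[] candidates = {provider, provider + ".DefaultSource"};
        for (String candidate : candidates) {
            try {
                loader.loadClass(candidate);
                return candidate; // found a loadable class for this format
            } catch (ClassNotFoundException e) {
                // fall through and try the next candidate
            }
        }
        return null; // Spark raises "Failed to find data source" at this point
    }

    public static void main(String[] args) {
        ClassLoader cl = LookupSketch.class.getClassLoader();
        // "org.apache.spark.xml" is not a real package, so both candidates fail:
        System.out.println(resolve("org.apache.spark.xml", cl));   // null
        // a class name that resolves directly:
        System.out.println(resolve("java.util.ArrayList", cl));    // java.util.ArrayList
    }
}
```

This is why "xml" or "com.databricks.spark.xml" only works when the spark-xml jar (which provides com.databricks.spark.xml.DefaultSource, registered under the short name "xml") is actually on the classpath.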



On Fri, Jun 17, 2016 at 2:56 PM, Siva A <si...@gmail.com> wrote:

> Just try using "xml" as the format, like below:
>
>         SQLContext sqlContext = new SQLContext(sc);
>         DataFrame df = sqlContext.read()
>             .format("xml")
>             .option("rowTag", "row")
>             .load("A.xml");
>
> FYR: https://github.com/databricks/spark-xml
>
> --Siva
>
> On Fri, Jun 17, 2016 at 2:50 PM, VG <vl...@gmail.com> wrote:
>
>> Apologies for that.
>> I am trying to use spark-xml to load data of a xml file.
>>
>> here is the exception
>>
>> 16/06/17 14:49:04 INFO BlockManagerMaster: Registered BlockManager
>> Exception in thread "main" java.lang.ClassNotFoundException: Failed to
>> find data source: org.apache.spark.xml. Please find packages at
>> http://spark-packages.org
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>> Caused by: java.lang.ClassNotFoundException:
>> org.apache.spark.xml.DefaultSource
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>> at scala.util.Try$.apply(Try.scala:192)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
>> at scala.util.Try.orElse(Try.scala:84)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62)
>> ... 4 more
>>
>> Code
>>         SQLContext sqlContext = new SQLContext(sc);
>>         DataFrame df = sqlContext.read()
>>             .format("org.apache.spark.xml")
>>             .option("rowTag", "row")
>>             .load("A.xml");
>>
>> Any suggestions please ..
>>
>>
>>
>>
>> On Fri, Jun 17, 2016 at 2:42 PM, Marco Mistroni <mm...@gmail.com>
>> wrote:
>>
>>> too little info
>>> it'll help if you can post the exception and show your sbt file (if you
>>> are using sbt), and provide minimal details on what you are doing
>>> kr
>>>
>>> On Fri, Jun 17, 2016 at 10:08 AM, VG <vl...@gmail.com> wrote:
>>>
>>>> Failed to find data source: com.databricks.spark.xml
>>>>
>>>> Any suggestions to resolve this
>>>>
>>>>
>>>>
>>>
>>
>

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

Posted by Siva A <si...@gmail.com>.
Just try using "xml" as the format, like below:

        SQLContext sqlContext = new SQLContext(sc);
        DataFrame df = sqlContext.read()
            .format("xml")
            .option("rowTag", "row")
            .load("A.xml");

FYR: https://github.com/databricks/spark-xml

--Siva
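The short name "xml" only resolves if the spark-xml jar is actually on the application classpath. Since the project is run from Eclipse, one way is to add the dependency to the build. A sketch of the Maven coordinates follows - the _2.10 suffix and the 0.3.3 version are assumptions; match them to the Scala version your Spark distribution was built with:

```xml
<!-- Hypothetical pom.xml fragment. Pick the artifact suffix (_2.10 vs _2.11)
     that matches your Spark build's Scala version; a mismatch typically shows
     up as NoClassDefFoundError: scala/collection/GenTraversableOnce$class. -->
<dependency>
  <groupId>com.databricks</groupId>
  <artifactId>spark-xml_2.10</artifactId>
  <version>0.3.3</version>
</dependency>
```

When submitting to a cluster instead of running in the IDE, the equivalent is to pass the same coordinates via spark-submit --packages com.databricks:spark-xml_2.10:0.3.3.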

On Fri, Jun 17, 2016 at 2:50 PM, VG <vl...@gmail.com> wrote:

> Apologies for that.
> I am trying to use spark-xml to load data from an XML file.
>
> Here is the exception:
>
> 16/06/17 14:49:04 INFO BlockManagerMaster: Registered BlockManager
> Exception in thread "main" java.lang.ClassNotFoundException: Failed to
> find data source: org.apache.spark.xml. Please find packages at
> http://spark-packages.org
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.spark.xml.DefaultSource
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
> at scala.util.Try$.apply(Try.scala:192)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
> at scala.util.Try.orElse(Try.scala:84)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62)
> ... 4 more
>
> Code
>         SQLContext sqlContext = new SQLContext(sc);
>         DataFrame df = sqlContext.read()
>             .format("org.apache.spark.xml")
>             .option("rowTag", "row")
>             .load("A.xml");
>
> Any suggestions, please?
>
>
>
>
> On Fri, Jun 17, 2016 at 2:42 PM, Marco Mistroni <mm...@gmail.com>
> wrote:
>
>> too little info
>> it'll help if you can post the exception and show your sbt file (if you
>> are using sbt), and provide minimal details on what you are doing
>> kr
>>
>> On Fri, Jun 17, 2016 at 10:08 AM, VG <vl...@gmail.com> wrote:
>>
>>> Failed to find data source: com.databricks.spark.xml
>>>
>>> Any suggestions to resolve this
>>>
>>>
>>>
>>
>

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

Posted by VG <vl...@gmail.com>.
Apologies for that.
I am trying to use spark-xml to load data from an XML file.

Here is the exception:

16/06/17 14:49:04 INFO BlockManagerMaster: Registered BlockManager
Exception in thread "main" java.lang.ClassNotFoundException: Failed to find
data source: org.apache.spark.xml. Please find packages at
http://spark-packages.org
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
Caused by: java.lang.ClassNotFoundException:
org.apache.spark.xml.DefaultSource
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
at scala.util.Try$.apply(Try.scala:192)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
at scala.util.Try.orElse(Try.scala:84)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62)
... 4 more

Code
        SQLContext sqlContext = new SQLContext(sc);
        DataFrame df = sqlContext.read()
            .format("org.apache.spark.xml")
            .option("rowTag", "row")
            .load("A.xml");

Any suggestions, please?




On Fri, Jun 17, 2016 at 2:42 PM, Marco Mistroni <mm...@gmail.com> wrote:

> Too little info. It'll help if you can post the exception, show your sbt
> file (if you are using sbt), and provide minimal details on what you are
> doing.
> kr
>
> On Fri, Jun 17, 2016 at 10:08 AM, VG <vl...@gmail.com> wrote:
>
>> Failed to find data source: com.databricks.spark.xml
>>
>> Any suggestions to resolve this
>>
>>
>>
>

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

Posted by Marco Mistroni <mm...@gmail.com>.
Too little info. It'll help if you can post the exception, show your sbt
file (if you are using sbt), and provide minimal details on what you are
doing.
kr

On Fri, Jun 17, 2016 at 10:08 AM, VG <vl...@gmail.com> wrote:

> Failed to find data source: com.databricks.spark.xml
>
> Any suggestions to resolve this
>
>
>