Posted to user@spark.apache.org by Aaron <aa...@gmail.com> on 2015/07/20 19:14:13 UTC
Spark 1.4.1, MySQL and DataFrameReader.read.jdbc fun
I have Spark 1.4.1 running on a YARN cluster. When I launch pyspark
in yarn-client mode:
pyspark --jars ~/dev/spark/lib/mysql-connector-java-5.1.36-bin.jar \
  --driver-class-path ~/dev/spark/lib/mysql-connector-java-5.1.36-bin.jar
and then run the equivalent of:
tbl = sqlContext.read.jdbc("jdbc:mysql://....", "tableName",
properties={"user":"blah", "password":"pw"})
I get a "No suitable driver found" error when I attempt a
tbl.show() or a tbl.describe(), etc. This happens in the
spark-shell too.
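For context, a workaround that is sometimes suggested for this exact error is to name the JDBC driver class explicitly in the connection properties, so Spark does not depend on DriverManager auto-discovery finding the JAR. A sketch (the URL, host, and credentials are placeholders; "com.mysql.jdbc.Driver" is the Connector/J 5.x class name; assumes a running pyspark session with sqlContext in scope):

```python
# Sketch of a commonly suggested workaround. Naming the driver class
# explicitly avoids relying on java.sql.DriverManager's auto-discovery,
# which may not see JARs distributed via --jars.
tbl = sqlContext.read.jdbc(
    url="jdbc:mysql://dbhost:3306/mydb",  # placeholder URL
    table="tableName",
    properties={
        "user": "blah",
        "password": "pw",
        "driver": "com.mysql.jdbc.Driver",  # Connector/J 5.x class name
    },
)
tbl.show()
```

Whether this alone is enough on 1.4.1, or whether the JAR must also be on the driver classpath at startup, is part of what I'm asking below.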
Currently I do NOT use SPARK_CLASSPATH (I've seen it discussed and
know it is deprecated). I also do NOT set the
spark.executor.extraClassPath property, because I thought that was the
whole point of the --jars option.
So, do I need to deploy the mysql connector to a known location on my
YARN node managers, and then reference that JAR location someplace?
If so, what cmd line options do I use, or properties do I set?
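If pre-deploying is the answer, a sketch of what I imagine it would look like (the /opt/jars path is a placeholder, and this assumes the JAR sits at the same path on every node manager):

```shell
# Sketch: connector JAR pre-deployed to an identical path on every node
# (path is a placeholder). extraClassPath entries go onto the JVM
# classpath at startup, so DriverManager can see the driver class.
pyspark \
  --conf spark.driver.extraClassPath=/opt/jars/mysql-connector-java-5.1.36-bin.jar \
  --conf spark.executor.extraClassPath=/opt/jars/mysql-connector-java-5.1.36-bin.jar
```

Is that the intended pattern, or is there a way to do it without touching every node?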
I thought the --jars option put the JARs on the classpath to be
used; is that not the case?
Another question: why do I need --driver-class-path <location of mysql
jar>? If I omit it, I get an error just from the
sqlContext.read.jdbc() assignment itself, before I attempt any
operation on the resulting DataFrame.
Cheers,
Aaron
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org