Posted to dev@spark.apache.org by Stephen Boesch <ja...@gmail.com> on 2014/12/30 01:55:36 UTC

Adding third party jars to classpath used by pyspark

What is the recommended way to do this? We have some native database
client libraries for which we are adding PySpark bindings.

The pyspark script invokes spark-submit. Do we add our libraries to
SPARK_SUBMIT_LIBRARY_PATH?

This issue relates back to an error we have been seeing, "Py4JError: Trying
to call a package"; the suspicion is that the third-party libraries may
not be available on the JVM side.
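
For context, the error appears when we try to reach one of our JVM
classes through the Py4J gateway. A minimal sketch of the failing
pattern (the package and class names are placeholders for our actual
bindings):

    from pyspark import SparkContext

    sc = SparkContext("local", "example")
    # If the jar containing com.example.NativeClient is not on the JVM
    # classpath, Py4J resolves the name to a package instead of a class,
    # and calling it raises "Py4JError: Trying to call a package".
    client = sc._jvm.com.example.NativeClient()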

Re: Adding third party jars to classpath used by pyspark

Posted by Davies Liu <da...@databricks.com>.
On Mon, Dec 29, 2014 at 7:39 PM, Jeremy Freeman
<fr...@gmail.com> wrote:
> Hi Stephen, it should be enough to include
>
>> --jars /path/to/file.jar
>
> in the command line call to either pyspark or spark-submit, as in
>
>> spark-submit --master local --jars /path/to/file.jar myfile.py

Unfortunately, you also need '--driver-class-path /path/to/file.jar'
to make the jar accessible to the driver. (This may be fixed in 1.3.)
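
Putting the two flags together, the full invocation would look roughly
like this (paths are placeholders):

    spark-submit --master local \
      --jars /path/to/file.jar \
      --driver-class-path /path/to/file.jar \
      myfile.py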

> and you can check the bottom of the Web UI's "Environment" tab to make sure the jar gets on your classpath. Let me know if you still see errors related to this.
>
> — Jeremy
>
> -------------------------
> jeremyfreeman.net
> @thefreemanlab
>
> On Dec 29, 2014, at 7:55 PM, Stephen Boesch <ja...@gmail.com> wrote:
>
>> What is the recommended way to do this? We have some native database
>> client libraries for which we are adding PySpark bindings.
>>
>> The pyspark script invokes spark-submit. Do we add our libraries to
>> SPARK_SUBMIT_LIBRARY_PATH?
>>
>> This issue relates back to an error we have been seeing, "Py4JError: Trying
>> to call a package"; the suspicion is that the third-party libraries may
>> not be available on the JVM side.
>



Re: Adding third party jars to classpath used by pyspark

Posted by Jeremy Freeman <fr...@gmail.com>.
Hi Stephen, it should be enough to include 

> --jars /path/to/file.jar

in the command line call to either pyspark or spark-submit, as in

> spark-submit --master local --jars /path/to/file.jar myfile.py

and you can check the bottom of the Web UI's "Environment" tab to make sure the jar gets on your classpath. Let me know if you still see errors related to this.
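
A quick programmatic check from the pyspark shell, beyond the Environment
tab, might look like this (the class name is a placeholder, and I'm
assuming SparkContext.getConf() is available in your version):

    # should list /path/to/file.jar among the configured jars
    print(sc.getConf().get("spark.jars"))

    # resolves only if the jar is on the driver's classpath
    sc._jvm.java.lang.Class.forName("com.example.NativeClient")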

— Jeremy

-------------------------
jeremyfreeman.net
@thefreemanlab

On Dec 29, 2014, at 7:55 PM, Stephen Boesch <ja...@gmail.com> wrote:

> What is the recommended way to do this? We have some native database
> client libraries for which we are adding PySpark bindings.
>
> The pyspark script invokes spark-submit. Do we add our libraries to
> SPARK_SUBMIT_LIBRARY_PATH?
>
> This issue relates back to an error we have been seeing, "Py4JError: Trying
> to call a package"; the suspicion is that the third-party libraries may
> not be available on the JVM side.