Posted to dev@bahir.apache.org by "Christian Kadner (JIRA)" <ji...@apache.org> on 2016/07/19 22:50:20 UTC
[jira] [Created] (BAHIR-35) Include Python code in the binary jars for use with "--packages ..."
Christian Kadner created BAHIR-35:
-------------------------------------
Summary: Include Python code in the binary jars for use with "--packages ..."
Key: BAHIR-35
URL: https://issues.apache.org/jira/browse/BAHIR-35
Project: Bahir
Issue Type: Task
Components: Build
Affects Versions: 2.0.0
Reporter: Christian Kadner
Currently, to make use of the PySpark code (i.e. streaming-mqtt/python), a user has to download the jar from Maven Central or clone the code from GitHub, find the individual *.py files, create a zip, and add that to the {{spark-submit}} command with the {{--py-files}} option, or add the files to the {{PYTHONPATH}} when running locally.
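For illustration, the manual workaround described above could be scripted roughly as follows. This is only a sketch: the helper names {{zip_python_sources}} and {{spark_submit_command}} are hypothetical, and the directory, zip, jar and application paths are placeholders rather than actual Bahir artifact names.

```python
import zipfile
from pathlib import Path


def zip_python_sources(source_dir, zip_path):
    """Bundle every *.py file under source_dir into zip_path, keeping relative paths."""
    source_dir = Path(source_dir)
    added = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for py_file in sorted(source_dir.rglob("*.py")):
            # Store paths relative to source_dir so package imports resolve.
            arcname = py_file.relative_to(source_dir).as_posix()
            archive.write(py_file, arcname)
            added.append(arcname)
    return added


def spark_submit_command(app, py_zip, jar):
    """Build the spark-submit argument list for the --py-files workaround."""
    return ["spark-submit", "--jars", str(jar), "--py-files", str(py_zip), str(app)]


if __name__ == "__main__":
    # Placeholder paths: adjust to wherever the sources were cloned.
    sources = Path("streaming-mqtt/python")
    if sources.is_dir():
        zip_python_sources(sources, "mqtt-python.zip")
    print(" ".join(spark_submit_command("my_app.py", "mqtt-python.zip", "mqtt.jar")))
```

Every user who wants the Python bindings has to repeat these steps today, which is the overhead this issue aims to remove.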
If we include the Python code in the binary build (in the jar that gets uploaded to Maven Central), then users need no acrobatics beyond passing the {{--packages ...}} option.
An example where the Python code is part of the binary jar is the [GraphFrames|https://spark-packages.org/package/graphframes/graphframes] package.
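The reason bundling can work is that a jar is an ordinary zip archive, and Python can import modules directly from a zip file on {{sys.path}}, whatever its file extension. The sketch below demonstrates only that mechanism with a throwaway jar. The module name {{bahir_demo}} and both helper functions are made up for the demonstration. It also assumes, without verifying it here, that {{spark-submit}} makes jars resolved through {{--packages}} visible on the Python path.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path


def build_demo_jar(jar_path):
    """Write a jar-like zip archive holding one hypothetical Python package."""
    with zipfile.ZipFile(jar_path, "w") as jar:
        jar.writestr("bahir_demo/__init__.py", "GREETING = 'imported from jar'\n")
    return jar_path


def import_from_jar(jar_path, module_name):
    """Put the jar on sys.path and import a module from it via zipimport."""
    sys.path.insert(0, str(jar_path))
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        jar = build_demo_jar(Path(tmp) / "demo.jar")
        module = import_from_jar(jar, "bahir_demo")
        print(module.GREETING)
```

For this to work with the real artifacts, the Python packages would need to sit at the root of the jar so that their import paths resolve.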