Posted to user@spark.apache.org by Manohar Rao <ma...@gmail.com> on 2018/10/30 09:05:22 UTC

Java Spark to Python Spark integration

I would like to know if it's possible to invoke Python Spark (PySpark) code from Java.

I have a Java-based framework where a SparkSession is created and some
DataFrames are passed as arguments to an API.

Transformation.java

import java.util.Set;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public interface Transformation {
    Dataset<Row> transform(Set<Dataset<Row>> inputDatasets, SparkSession spark);
}

A user of this framework can then implement a transformation, and the
framework will use this custom transformation along with the rest of the
standard transformations. This then integrates into a larger data pipeline.
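For concreteness, a user-side implementation might look something like this
(the class name and the "amount" column are just placeholders):

import java.util.Set;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Hypothetical user-supplied transformation: unions the input DataFrames
// and keeps only rows where the (assumed) "amount" column is positive.
public class PositiveAmountsTransformation implements Transformation {
    @Override
    public Dataset<Row> transform(Set<Dataset<Row>> inputDatasets, SparkSession spark) {
        Dataset<Row> combined = null;
        for (Dataset<Row> ds : inputDatasets) {
            combined = (combined == null) ? ds : combined.unionByName(ds);
        }
        return combined.filter("amount > 0");
    }
}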

Question.
================
Some users would like to write their business logic in Python (PySpark) code.

Is it possible to pass this Java Dataset (or RDD) via the framework to
Python code and then retrieve the Python RDD/Dataset back as the output
to the Java framework?
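One approach I can think of (I'm not sure it is the right one) is to exchange
the data through storage: the Java framework writes the input Dataset to
Parquet, launches the PySpark script as a separate spark-submit process, and
reads the script's output back. A rough sketch of the Java side, with
hypothetical paths and script name:

import java.io.IOException;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Rough sketch: hand a Dataset to a PySpark script via Parquet and read the result back.
// The paths and the script name "transform.py" are placeholders.
Dataset<Row> handOffToPython(Dataset<Row> input, SparkSession spark)
        throws IOException, InterruptedException {
    String inputPath = "/tmp/pipeline/python_input";
    String outputPath = "/tmp/pipeline/python_output";

    input.write().mode("overwrite").parquet(inputPath);

    // The PySpark script is expected to read inputPath, apply the business
    // logic, and write its result DataFrame to outputPath.
    Process p = new ProcessBuilder(
            "spark-submit", "transform.py", inputPath, outputPath)
            .inheritIO()
            .start();
    if (p.waitFor() != 0) {
        throw new IOException("PySpark transformation failed");
    }

    return spark.read().parquet(outputPath);
}

The drawback is that this runs a separate Spark application and goes through
disk instead of sharing the in-memory Dataset, which is why I am looking for
a better way.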

Any references to code snippets around this would be helpful.

Thanks

Manohar