Posted to user@spark.apache.org by Manohar Rao <ma...@gmail.com> on 2018/10/30 09:05:22 UTC
Java Spark to Python Spark integration
I would like to know if it's possible to invoke Python Spark (PySpark) code from Java.
I have a Java-based framework where
a SparkSession is created and some DataFrames are passed as arguments to
an API:
Transformation.java

import java.util.Set;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

interface Transformation
{
    Dataset<Row> transform(Set<Dataset<Row>> inputDatasets, SparkSession spark);
}
A user of this framework can then implement a transformation, and the
framework can use this custom transformation
alongside the rest of the standard transformations. This then integrates into
a larger data pipeline.
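For illustration, a custom transformation in my framework might look like the following (the class name and the "amount" column are hypothetical examples, not part of the framework):

```java
import java.util.Set;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Hypothetical example: union all input Datasets by column name and
// keep only rows where an assumed "amount" column is positive.
public class PositiveAmountsTransformation implements Transformation {
    @Override
    public Dataset<Row> transform(Set<Dataset<Row>> inputDatasets, SparkSession spark) {
        Dataset<Row> result = null;
        for (Dataset<Row> ds : inputDatasets) {
            result = (result == null) ? ds : result.unionByName(ds);
        }
        return result.filter("amount > 0");
    }
}
```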
Question.
================
Some users would like to write their business logic in Python (PySpark) code.
Is there a way to pass this Java Dataset (or RDD) from the
framework
to Python code, and then retrieve the resulting Python RDD/Dataset back as the
output to the Java framework?
Any references or code snippets around this would be helpful.
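For context, the only approach I have come up with so far is a file-based handoff: stage the Dataset as Parquet, run the user's PySpark script in a separate spark-submit process, and read its Parquet output back into the JVM session. This is a sketch, not a working integration; the staging paths and the script name user_transform.py are hypothetical:

```java
import java.io.IOException;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Sketch of a Java -> PySpark -> Java handoff through Parquet files.
// "/tmp/stage/input", "/tmp/stage/output", and "user_transform.py" are
// hypothetical; a real framework would generate unique staging paths.
public class PySparkBridge {
    public static Dataset<Row> runPythonTransform(Dataset<Row> input, SparkSession spark)
            throws IOException, InterruptedException {
        // Stage the input where the Python process can read it.
        input.write().mode("overwrite").parquet("/tmp/stage/input");

        // Run the user's PySpark script as a separate driver process.
        Process p = new ProcessBuilder(
                "spark-submit", "user_transform.py",
                "/tmp/stage/input", "/tmp/stage/output")
            .inheritIO()
            .start();
        if (p.waitFor() != 0) {
            throw new IOException("user_transform.py exited with an error");
        }

        // Read the Python side's output back into the JVM session.
        return spark.read().parquet("/tmp/stage/output");
    }
}
```

The Python script would read its input path with spark.read.parquet and write its result to the output path. I am aware that if the driver were Python rather than Java, the user could call the Java framework through PySpark's Py4J gateway (spark._jvm) and wrap a returned Java DataFrame with pyspark.sql.DataFrame; what I have not found is a supported in-process way for a Java driver to call into Python, which is why I fell back to the file-based sketch above.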
Thanks
Manohar