Posted to dev@spark.apache.org by Chetan Khatri <ch...@gmail.com> on 2018/07/03 11:58:46 UTC

Run Python User Defined Functions / code in Spark with Scala Codebase

Hello Dear Spark User / Dev,

I would like to pass a Python user-defined function to a Spark job developed
in Scala, and have that function's return value come back through the DF /
Dataset API.

Can someone please guide me on the best approach to do this? The Python
function would mostly be a transformation function. I would also like to
pass a Java function as a String to the Spark / Scala job, have it applied
to an RDD / DataFrame, and get an RDD / DataFrame back.

Thank you.

Re: Run Python User Defined Functions / code in Spark with Scala Codebase

Posted by Gourav Sengupta <go...@gmail.com>.
Hi,

I am not sure whether Spark DataFrames apply to your use case. If they do,
please try creating a UDF in Python, then check whether you can call it
from Scala using select and expr.

Regards,
Gourav Sengupta

On Mon, Jul 16, 2018 at 5:32 AM, Chetan Khatri <ch...@gmail.com>
wrote:

> Hello Jayant,
>
> Thanks for great OSS Contribution :)
>
> On Thu, Jul 12, 2018 at 1:36 PM, Jayant Shekhar <ja...@gmail.com>
> wrote:
>
>> Hello Chetan,
>>
>> Sorry missed replying earlier. You can find some sample code here :
>>
>> http://sparkflows.readthedocs.io/en/latest/user-guide/python/pipe-python.html
>>
>> We will continue adding more there.
>>
>> Feel free to ping me directly in case of questions.
>>
>> Thanks,
>> Jayant
>>
>>
>> On Mon, Jul 9, 2018 at 9:56 PM, Chetan Khatri <
>> chetan.opensource@gmail.com> wrote:
>>
>>> Hello Jayant,
>>>
>>> Thank you so much for suggestion. My view was to  use Python function as
>>> transformation which can take couple of column names and return object.
>>> which you explained. would that possible to point me to similiar codebase
>>> example.
>>>
>>> Thanks.
>>>
>>> On Fri, Jul 6, 2018 at 2:56 AM, Jayant Shekhar <ja...@gmail.com>
>>> wrote:
>>>
>>>> Hello Chetan,
>>>>
>>>> We have currently done it with .pipe(.py) as Prem suggested.
>>>>
>>>> That passes the RDD as CSV strings to the python script. The python
>>>> script can either process it line by line, create the result and return it
>>>> back. Or create things like Pandas Dataframe for processing and finally
>>>> write the results back.
>>>>
>>>> In the Spark/Scala/Java code, you get an RDD of string, which we
>>>> convert back to a Dataframe.
>>>>
>>>> Feel free to ping me directly in case of questions.
>>>>
>>>> Thanks,
>>>> Jayant
>>>>
>>>>
>>>> On Thu, Jul 5, 2018 at 3:39 AM, Chetan Khatri <
>>>> chetan.opensource@gmail.com> wrote:
>>>>
>>>>> Prem sure, Thanks for suggestion.
>>>>>
>>>>> On Wed, Jul 4, 2018 at 8:38 PM, Prem Sure <sp...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> try .pipe(.py) on RDD
>>>>>>
>>>>>> Thanks,
>>>>>> Prem
>>>>>>
>>>>>> On Wed, Jul 4, 2018 at 7:59 PM, Chetan Khatri <
>>>>>> chetan.opensource@gmail.com> wrote:
>>>>>>
>>>>>>> Can someone please suggest me , thanks
>>>>>>>
>>>>>>> On Tue 3 Jul, 2018, 5:28 PM Chetan Khatri, <
>>>>>>> chetan.opensource@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hello Dear Spark User / Dev,
>>>>>>>>
>>>>>>>> I would like to pass Python user defined function to Spark Job
>>>>>>>> developed using Scala and return value of that function would be returned
>>>>>>>> to DF / Dataset API.
>>>>>>>>
>>>>>>>> Can someone please guide me, which would be best approach to do
>>>>>>>> this. Python function would be mostly transformation function. Also would
>>>>>>>> like to pass Java Function as a String to Spark / Scala job and it applies
>>>>>>>> to RDD / Data Frame and should return RDD / Data Frame.
>>>>>>>>
>>>>>>>> Thank you.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Run Python User Defined Functions / code in Spark with Scala Codebase

Posted by Chetan Khatri <ch...@gmail.com>.
Hello Jayant,

Thanks for great OSS Contribution :)

On Thu, Jul 12, 2018 at 1:36 PM, Jayant Shekhar <ja...@gmail.com>
wrote:

> Hello Chetan,
>
> Sorry missed replying earlier. You can find some sample code here :
>
> http://sparkflows.readthedocs.io/en/latest/user-guide/python/pipe-python.html
>
> We will continue adding more there.
>
> Feel free to ping me directly in case of questions.
>
> Thanks,
> Jayant
>
>
> On Mon, Jul 9, 2018 at 9:56 PM, Chetan Khatri <chetan.opensource@gmail.com
> > wrote:
>
>> Hello Jayant,
>>
>> Thank you so much for suggestion. My view was to  use Python function as
>> transformation which can take couple of column names and return object.
>> which you explained. would that possible to point me to similiar codebase
>> example.
>>
>> Thanks.
>>
>> On Fri, Jul 6, 2018 at 2:56 AM, Jayant Shekhar <ja...@gmail.com>
>> wrote:
>>
>>> Hello Chetan,
>>>
>>> We have currently done it with .pipe(.py) as Prem suggested.
>>>
>>> That passes the RDD as CSV strings to the python script. The python
>>> script can either process it line by line, create the result and return it
>>> back. Or create things like Pandas Dataframe for processing and finally
>>> write the results back.
>>>
>>> In the Spark/Scala/Java code, you get an RDD of string, which we convert
>>> back to a Dataframe.
>>>
>>> Feel free to ping me directly in case of questions.
>>>
>>> Thanks,
>>> Jayant
>>>
>>>
>>> On Thu, Jul 5, 2018 at 3:39 AM, Chetan Khatri <
>>> chetan.opensource@gmail.com> wrote:
>>>
>>>> Prem sure, Thanks for suggestion.
>>>>
>>>> On Wed, Jul 4, 2018 at 8:38 PM, Prem Sure <sp...@gmail.com>
>>>> wrote:
>>>>
>>>>> try .pipe(.py) on RDD
>>>>>
>>>>> Thanks,
>>>>> Prem
>>>>>
>>>>> On Wed, Jul 4, 2018 at 7:59 PM, Chetan Khatri <
>>>>> chetan.opensource@gmail.com> wrote:
>>>>>
>>>>>> Can someone please suggest me , thanks
>>>>>>
>>>>>> On Tue 3 Jul, 2018, 5:28 PM Chetan Khatri, <
>>>>>> chetan.opensource@gmail.com> wrote:
>>>>>>
>>>>>>> Hello Dear Spark User / Dev,
>>>>>>>
>>>>>>> I would like to pass Python user defined function to Spark Job
>>>>>>> developed using Scala and return value of that function would be returned
>>>>>>> to DF / Dataset API.
>>>>>>>
>>>>>>> Can someone please guide me, which would be best approach to do
>>>>>>> this. Python function would be mostly transformation function. Also would
>>>>>>> like to pass Java Function as a String to Spark / Scala job and it applies
>>>>>>> to RDD / Data Frame and should return RDD / Data Frame.
>>>>>>>
>>>>>>> Thank you.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Run Python User Defined Functions / code in Spark with Scala Codebase

Posted by Jayant Shekhar <ja...@gmail.com>.
Hello Chetan,

Sorry, I missed replying earlier. You can find some sample code here:

http://sparkflows.readthedocs.io/en/latest/user-guide/python/pipe-python.html

We will continue adding more there.

Feel free to ping me directly in case of questions.

Thanks,
Jayant


On Mon, Jul 9, 2018 at 9:56 PM, Chetan Khatri <ch...@gmail.com>
wrote:

> Hello Jayant,
>
> Thank you so much for suggestion. My view was to  use Python function as
> transformation which can take couple of column names and return object.
> which you explained. would that possible to point me to similiar codebase
> example.
>
> Thanks.
>
> On Fri, Jul 6, 2018 at 2:56 AM, Jayant Shekhar <ja...@gmail.com>
> wrote:
>
>> Hello Chetan,
>>
>> We have currently done it with .pipe(.py) as Prem suggested.
>>
>> That passes the RDD as CSV strings to the python script. The python
>> script can either process it line by line, create the result and return it
>> back. Or create things like Pandas Dataframe for processing and finally
>> write the results back.
>>
>> In the Spark/Scala/Java code, you get an RDD of string, which we convert
>> back to a Dataframe.
>>
>> Feel free to ping me directly in case of questions.
>>
>> Thanks,
>> Jayant
>>
>>
>> On Thu, Jul 5, 2018 at 3:39 AM, Chetan Khatri <
>> chetan.opensource@gmail.com> wrote:
>>
>>> Prem sure, Thanks for suggestion.
>>>
>>> On Wed, Jul 4, 2018 at 8:38 PM, Prem Sure <sp...@gmail.com>
>>> wrote:
>>>
>>>> try .pipe(.py) on RDD
>>>>
>>>> Thanks,
>>>> Prem
>>>>
>>>> On Wed, Jul 4, 2018 at 7:59 PM, Chetan Khatri <
>>>> chetan.opensource@gmail.com> wrote:
>>>>
>>>>> Can someone please suggest me , thanks
>>>>>
>>>>> On Tue 3 Jul, 2018, 5:28 PM Chetan Khatri, <
>>>>> chetan.opensource@gmail.com> wrote:
>>>>>
>>>>>> Hello Dear Spark User / Dev,
>>>>>>
>>>>>> I would like to pass Python user defined function to Spark Job
>>>>>> developed using Scala and return value of that function would be returned
>>>>>> to DF / Dataset API.
>>>>>>
>>>>>> Can someone please guide me, which would be best approach to do this.
>>>>>> Python function would be mostly transformation function. Also would like to
>>>>>> pass Java Function as a String to Spark / Scala job and it applies to RDD /
>>>>>> Data Frame and should return RDD / Data Frame.
>>>>>>
>>>>>> Thank you.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
>>
>

Re: Run Python User Defined Functions / code in Spark with Scala Codebase

Posted by Chetan Khatri <ch...@gmail.com>.
Hello Jayant,

Thank you so much for the suggestion. My idea was to use a Python function
as a transformation which can take a couple of column names and return an
object, which is what you explained. Would it be possible to point me to a
similar codebase example?

Thanks.

On Fri, Jul 6, 2018 at 2:56 AM, Jayant Shekhar <ja...@gmail.com>
wrote:

> Hello Chetan,
>
> We have currently done it with .pipe(.py) as Prem suggested.
>
> That passes the RDD as CSV strings to the python script. The python script
> can either process it line by line, create the result and return it back.
> Or create things like Pandas Dataframe for processing and finally write the
> results back.
>
> In the Spark/Scala/Java code, you get an RDD of string, which we convert
> back to a Dataframe.
>
> Feel free to ping me directly in case of questions.
>
> Thanks,
> Jayant
>
>
> On Thu, Jul 5, 2018 at 3:39 AM, Chetan Khatri <chetan.opensource@gmail.com
> > wrote:
>
>> Prem sure, Thanks for suggestion.
>>
>> On Wed, Jul 4, 2018 at 8:38 PM, Prem Sure <sp...@gmail.com> wrote:
>>
>>> try .pipe(.py) on RDD
>>>
>>> Thanks,
>>> Prem
>>>
>>> On Wed, Jul 4, 2018 at 7:59 PM, Chetan Khatri <
>>> chetan.opensource@gmail.com> wrote:
>>>
>>>> Can someone please suggest me , thanks
>>>>
>>>> On Tue 3 Jul, 2018, 5:28 PM Chetan Khatri, <ch...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hello Dear Spark User / Dev,
>>>>>
>>>>> I would like to pass Python user defined function to Spark Job
>>>>> developed using Scala and return value of that function would be returned
>>>>> to DF / Dataset API.
>>>>>
>>>>> Can someone please guide me, which would be best approach to do this.
>>>>> Python function would be mostly transformation function. Also would like to
>>>>> pass Java Function as a String to Spark / Scala job and it applies to RDD /
>>>>> Data Frame and should return RDD / Data Frame.
>>>>>
>>>>> Thank you.
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>
>

Re: Run Python User Defined Functions / code in Spark with Scala Codebase

Posted by Jayant Shekhar <ja...@gmail.com>.
Hello Chetan,

We have currently done it with .pipe(.py), as Prem suggested.

That passes the RDD as CSV strings to the Python script. The script can
either process the records line by line, build the result, and write it
back, or load them into something like a pandas DataFrame for processing
and finally write the results back.

In the Spark/Scala/Java code, you get back an RDD of Strings, which we
convert to a DataFrame.
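For concreteness, a minimal sketch of the Python side of this pattern. The
script name, the two-column CSV layout, and the doubling transformation are
hypothetical, assumed purely for illustration:

```python
#!/usr/bin/env python3
# transform.py -- sketch of a script invoked via rdd.pipe("python transform.py").
# Spark streams each element of a partition to stdin as one line; every line
# this script writes to stdout becomes one element of the resulting RDD[String].
import sys

def transform_line(line):
    """Parse one 'name,amount' CSV record, transform it, and re-emit CSV."""
    name, amount = line.strip().split(",")
    # Hypothetical transformation: upper-case the name, double the amount.
    return "%s,%s" % (name.upper(), float(amount) * 2)

if __name__ == "__main__":
    for line in sys.stdin:
        if line.strip():
            sys.stdout.write(transform_line(line) + "\n")
```

On the Scala side you would then split each output line on the comma and
apply a schema to get back to a DataFrame.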

Feel free to ping me directly in case of questions.

Thanks,
Jayant


On Thu, Jul 5, 2018 at 3:39 AM, Chetan Khatri <ch...@gmail.com>
wrote:

> Prem sure, Thanks for suggestion.
>
> On Wed, Jul 4, 2018 at 8:38 PM, Prem Sure <sp...@gmail.com> wrote:
>
>> try .pipe(.py) on RDD
>>
>> Thanks,
>> Prem
>>
>> On Wed, Jul 4, 2018 at 7:59 PM, Chetan Khatri <
>> chetan.opensource@gmail.com> wrote:
>>
>>> Can someone please suggest me , thanks
>>>
>>> On Tue 3 Jul, 2018, 5:28 PM Chetan Khatri, <ch...@gmail.com>
>>> wrote:
>>>
>>>> Hello Dear Spark User / Dev,
>>>>
>>>> I would like to pass Python user defined function to Spark Job
>>>> developed using Scala and return value of that function would be returned
>>>> to DF / Dataset API.
>>>>
>>>> Can someone please guide me, which would be best approach to do this.
>>>> Python function would be mostly transformation function. Also would like to
>>>> pass Java Function as a String to Spark / Scala job and it applies to RDD /
>>>> Data Frame and should return RDD / Data Frame.
>>>>
>>>> Thank you.
>>>>
>>>>
>>>>
>>>>
>>
>

Re: Run Python User Defined Functions / code in Spark with Scala Codebase

Posted by Chetan Khatri <ch...@gmail.com>.
Prem, sure. Thanks for the suggestion.

On Wed, Jul 4, 2018 at 8:38 PM, Prem Sure <sp...@gmail.com> wrote:

> try .pipe(.py) on RDD
>
> Thanks,
> Prem
>
> On Wed, Jul 4, 2018 at 7:59 PM, Chetan Khatri <chetan.opensource@gmail.com
> > wrote:
>
>> Can someone please suggest me , thanks
>>
>> On Tue 3 Jul, 2018, 5:28 PM Chetan Khatri, <ch...@gmail.com>
>> wrote:
>>
>>> Hello Dear Spark User / Dev,
>>>
>>> I would like to pass Python user defined function to Spark Job developed
>>> using Scala and return value of that function would be returned to DF /
>>> Dataset API.
>>>
>>> Can someone please guide me, which would be best approach to do this.
>>> Python function would be mostly transformation function. Also would like to
>>> pass Java Function as a String to Spark / Scala job and it applies to RDD /
>>> Data Frame and should return RDD / Data Frame.
>>>
>>> Thank you.
>>>
>>>
>>>
>>>
>

Re: Run Python User Defined Functions / code in Spark with Scala Codebase

Posted by Prem Sure <sp...@gmail.com>.
try .pipe(.py) on RDD
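In a Scala job that would look something like rdd.pipe("python transform.py")
(script name hypothetical). To make the contract concrete without a Spark
cluster, here is a plain-Python sketch, not Spark code, of what pipe does
with one partition: stream elements as lines to the command's stdin and
collect its stdout lines.

```python
import subprocess

def pipe_partition(lines, command):
    """Feed `lines` to `command` via stdin; return its stdout, split into lines."""
    proc = subprocess.run(
        command,
        input="\n".join(lines) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return proc.stdout.splitlines()

# Pipe two CSV records through a tiny inline script that sums each row's fields.
script = (
    "import sys\n"
    "for l in sys.stdin:\n"
    "    a, b = l.split(',')\n"
    "    print(int(a) + int(b))"
)
result = pipe_partition(["1,2", "3,4"], ["python3", "-c", script])
```

In Spark the same streaming happens per partition on the executors, so the
external script must be present on every worker node.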

Thanks,
Prem

On Wed, Jul 4, 2018 at 7:59 PM, Chetan Khatri <ch...@gmail.com>
wrote:

> Can someone please suggest me , thanks
>
> On Tue 3 Jul, 2018, 5:28 PM Chetan Khatri, <ch...@gmail.com>
> wrote:
>
>> Hello Dear Spark User / Dev,
>>
>> I would like to pass Python user defined function to Spark Job developed
>> using Scala and return value of that function would be returned to DF /
>> Dataset API.
>>
>> Can someone please guide me, which would be best approach to do this.
>> Python function would be mostly transformation function. Also would like to
>> pass Java Function as a String to Spark / Scala job and it applies to RDD /
>> Data Frame and should return RDD / Data Frame.
>>
>> Thank you.
>>
>>
>>
>>

Re: Run Python User Defined Functions / code in Spark with Scala Codebase

Posted by Chetan Khatri <ch...@gmail.com>.
Can someone please suggest an approach? Thanks.

On Tue 3 Jul, 2018, 5:28 PM Chetan Khatri, <ch...@gmail.com>
wrote:

> Hello Dear Spark User / Dev,
>
> I would like to pass Python user defined function to Spark Job developed
> using Scala and return value of that function would be returned to DF /
> Dataset API.
>
> Can someone please guide me, which would be best approach to do this.
> Python function would be mostly transformation function. Also would like to
> pass Java Function as a String to Spark / Scala job and it applies to RDD /
> Data Frame and should return RDD / Data Frame.
>
> Thank you.
>
>
>
>
