Posted to user@spark.apache.org by Rishi Shah <ri...@gmail.com> on 2020/06/06 17:05:52 UTC

[pyspark 2.3+] Add scala library to pyspark app and use to derive columns

Hi All,

I have a use case where I need to use Java/Scala for regex mapping (lookbehinds
are not well supported in Python). However, our entire codebase is Python, so I
was wondering whether there's a recommended way to create a Scala/Java library
and use it within PySpark.
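To make the limitation concrete: the Python standard library's re module only
accepts fixed-width lookbehinds, while Java's regex engine handles bounded
variable-width ones. A minimal illustration with stdlib re (the patterns here
are just examples I made up):

```python
import re

# A fixed-width lookbehind works fine in stdlib re:
assert re.search(r"(?<=id-)\w+", "id-abc").group() == "abc"

# A variable-width lookbehind does not even compile -
# re.compile raises re.error for this pattern:
try:
    re.compile(r"(?<=\d+-)\w+")
    print("compiled (unexpected)")
except re.error as exc:
    print("re.error:", exc)
```

(The third-party `regex` package does support variable-width lookbehinds, but
that doesn't help if the rest of the pipeline has to run inside Spark SQL.)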

I came across this:
https://diogoalexandrefranco.github.io/scala-code-in-pyspark/ - I'll try it
out, but a colleague previously ran into serialization issues while trying to
use a Java library with PySpark.

The typical use case is calling library functions to derive columns.
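One way to do this without going through Python serialization at all is to
implement the function as a Java/Scala UDF, ship the jar with --jars, and
register it by class name from the Python side. The class name and pattern
below are hypothetical; registerJavaFunction is available on
SparkSession.udf as of Spark 2.3:

```python
# Hypothetical Scala side, compiled into regex-udfs.jar and passed to
# spark-submit via --jars regex-udfs.jar:
#
#   package com.example
#   import org.apache.spark.sql.api.java.UDF1
#   class LookbehindExtract extends UDF1[String, String] {
#     // Java regex allows bounded variable-width lookbehinds like \d{1,9}
#     override def call(s: String): String =
#       "(?<=\\d{1,9}-)\\w+".r.findFirstIn(s).getOrElse("")
#   }

def register_scala_udf(spark):
    """Register the hypothetical Scala UDF so DataFrame code can call it.

    Imports are deferred so this sketch can be read without a live
    PySpark installation.
    """
    from pyspark.sql.types import StringType

    # registerJavaFunction takes a SQL name, the JVM class name, and the
    # return type; the UDF then runs entirely on the JVM, so no Python
    # (de)serialization is involved per row.
    spark.udf.registerJavaFunction(
        "lookbehind_extract", "com.example.LookbehindExtract", StringType()
    )

# Usage on a live session (sketch):
#   register_scala_udf(spark)
#   df = df.withColumn("code", expr("lookbehind_extract(raw)"))
```

Because the column values never leave the JVM, this sidesteps the
serialization problems that come up when passing Python objects into a Java
library directly.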

Any input helps, appreciate it!

-- 
Regards,

Rishi Shah