Posted to issues@spark.apache.org by "Dhruve Ashar (JIRA)" <ji...@apache.org> on 2019/02/05 17:17:00 UTC

[jira] [Created] (SPARK-26827) Support importing python modules having shared objects(.so)

Dhruve Ashar created SPARK-26827:
------------------------------------

             Summary: Support importing python modules having shared objects(.so)
                 Key: SPARK-26827
                 URL: https://issues.apache.org/jira/browse/SPARK-26827
             Project: Spark
          Issue Type: New Feature
          Components: PySpark
    Affects Versions: 2.4.0, 2.3.2
            Reporter: Dhruve Ashar


If a user wants to import dynamic modules, specifically ones containing .so files, Python disallows importing them from a zip file ([https://docs.python.org/3/library/zipimport.html]), and Spark currently doesn't support this either.
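
For illustration, a minimal sketch of the failure (the archive and module names are hypothetical):

    import sys

    # Hypothetical: deps.zip contains a compiled extension module _native.so
    sys.path.insert(0, "deps.zip")

    # zipimport does not support extension modules, so the import fails even
    # though _native.so is present inside the archive.
    import _native  # raises ModuleNotFoundError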

Files passed using the py-files option are placed on the PYTHONPATH but are not extracted, while files passed as archives are extracted but not placed on the PYTHONPATH. Dynamic modules can only be loaded if they are both extracted and added to the PYTHONPATH.
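
To make that requirement concrete, here is a minimal sketch of what ultimately has to happen, again assuming a hypothetical deps.zip containing _native.so:

    import sys
    import tempfile
    import zipfile

    # Extension modules can only be loaded from the real filesystem, so first
    # extract the archive to a directory...
    target = tempfile.mkdtemp()
    with zipfile.ZipFile("deps.zip") as zf:
        zf.extractall(target)

    # ...and then make that directory visible to the interpreter.
    sys.path.insert(0, target)

    import _native  # succeeds: the .so now sits on disk and on sys.path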

 

Has anyone encountered this issue before, and what is the best way to go about it?

 

Some possible solutions:

1 - Work around the issue by passing the archive with both the py-files and archives options; this extracts the archive as well as adding it to the path. Gotcha: both have to be named the same. I have tested this and it works, but it's just a workaround (see the command sketch after this list).

2 - Add a new config like py-archives which takes all the files, extracts them, and also adds them to the PYTHONPATH. Or just examine the contents of the zip file and, if it has dynamic modules, do the same (a detection sketch follows below). I am happy to work on the fix.
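
For solution 1, the invocation would look roughly like the following (deps.zip and my_app.py are hypothetical names; note the same archive is passed to both options):

    spark-submit \
      --py-files deps.zip \
      --archives deps.zip \
      my_app.py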
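
For solution 2, the detection step could be as simple as this sketch (has_dynamic_modules is a hypothetical helper, not existing Spark code):

    import zipfile

    def has_dynamic_modules(path):
        """Return True if the zip archive contains compiled extension modules."""
        with zipfile.ZipFile(path) as zf:
            return any(name.endswith(".so") for name in zf.namelist())

    # Only extract and add to the PYTHONPATH when actually needed:
    if has_dynamic_modules("deps.zip"):
        ...  # extract and add to sys.path as sketched above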



