Posted to reviews@spark.apache.org by "mathewjacob1002 (via GitHub)" <gi...@apache.org> on 2023/07/12 01:26:36 UTC

[GitHub] [spark] mathewjacob1002 commented on a diff in pull request #41946: [WIP] FunctionPickler Class

mathewjacob1002 commented on code in PR #41946:
URL: https://github.com/apache/spark/pull/41946#discussion_r1260452728


##########
python/pyspark/ml/util.py:
##########
@@ -760,3 +762,127 @@ def _get_active_session(is_remote: bool) -> SparkSession:
     if spark is None:
         raise RuntimeError("An active SparkSession is required for the distributor.")
     return spark
+
+
+class FunctionPickler:
+    """ 
+        This class provides a way to pickle a function and its arguments.
+        It also provides a way to create a PyTorch script that can run a
+        function with arguments if they have been pickled to a file.
+    """
+    @staticmethod
+    def pickle_func_and_get_path(train_fn: Callable, file_path: str, save_dir: str, *args, **kwargs) -> str:
+        """
+            Given a training function and args, this function will pickle them to a file. 
+
+            Parameters
+            ----------
+            train_fn: Callable
+                The picklable function that will be pickled to a file.
+
+            file_path: str
+                The path at which to save the pickled function, args, and kwargs. If it's the
+                empty string, the function will decide on a random name.
+
+            save_dir: str
+                The directory in which to save the file with the pickled function and arguments.
+                Ignored if file_path is specified. If both file_path and save_dir are empty,
+                the function will write the file to the current working directory with a random
+                name.

Review Comment:
   Unfortunately, because of the *args and **kwargs parameters, default values for these parameters weren't working when I originally implemented it.
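
   For context, here is a minimal sketch of the kind of problem described above. The function names and signatures below are hypothetical and are not the PR's code. It assumes the defaults were placed before *args, in which case extra positional arguments meant for the training function fill those slots first. Declaring the parameters after *args makes them keyword-only, which is one possible way to keep the defaults.

```python
from typing import Any, Callable


def positional_defaults(
    fn: Callable, file_path: str = "", save_dir: str = "", *args: Any, **kwargs: Any
):
    # Hypothetical signature: defaults declared before *args.
    return file_path, save_dir, args, kwargs


def keyword_only_defaults(
    fn: Callable, *args: Any, file_path: str = "", save_dir: str = "", **kwargs: Any
):
    # Hypothetical alternative: parameters after *args are keyword-only.
    return file_path, save_dir, args, kwargs


# The first two arguments intended for fn land in file_path and save_dir.
swallowed = positional_defaults(print, 1, 2, 3)

# All three arguments reach *args, and the defaults are preserved.
kept = keyword_only_defaults(print, 1, 2, 3)

# The path can still be set explicitly by keyword.
explicit = keyword_only_defaults(print, 1, file_path="fn.pkl", lr=0.1)
```

   The trade-off with the keyword-only form is that the training function can no longer receive its own keyword arguments named file_path or save_dir, since the wrapper consumes them.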



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
