Posted to reviews@spark.apache.org by "wesmcouch (via GitHub)" <gi...@apache.org> on 2024/03/08 15:55:15 UTC

[PR] Allow spark-class script to be run in an environment where /dev/fd is unavailable (AWS Lambda) [spark]

wesmcouch opened a new pull request, #45441:
URL: https://github.com/apache/spark/pull/45441

   ### Why are the changes needed?
   * Running PySpark in an environment where /dev/fd is unavailable, such as AWS Lambda, fails because spark-class reads the launcher output through process substitution, which relies on /dev/fd (a simplified reproduction of the pattern follows the error output below):
   ```
   /var/lang/lib/python3.11/site-packages/pyspark/bin/spark-class: line 93: /dev/fd/62: No such file or directory
   /var/lang/lib/python3.11/site-packages/pyspark/bin/spark-class: line 97: CMD: bad array subscript
   ```
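   For context, spark-class builds the JVM command with a separate launcher invocation and reads the result back through process substitution. The block below is a simplified, hypothetical reproduction of that pattern (the launcher is replaced by a stub and the class name is made up), only to show why the read loop needs /dev/fd:
   ```
   #!/usr/bin/env bash
   # Simplified repro sketch, NOT the actual spark-class contents.
   # Process substitution is handed to the while-loop as a /dev/fd/N path,
   # so it fails where /dev/fd is not mounted (e.g. AWS Lambda).
   build_command() {
     # Stub for the real launcher; emits NUL-separated arguments.
     printf '%s\0' java -cp '/opt/spark/jars/*' org.example.Main
   }

   CMD=()
   while IFS= read -r -d '' ARG; do
     CMD+=("$ARG")
   done < <(build_command)   # <(...) expands to something like /dev/fd/62

   printf 'parsed %d args\n' "${#CMD[@]}"
   ```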
   
   ### What changes were proposed in this pull request?
   Using a temporary file instead of process substitution to pass the command allows spark-class to be run in environments where /dev/fd does not exist.
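   A minimal sketch of that idea, under the same simplified assumptions as the reproduction above (build_command is a stub, not the real launcher invocation):
   ```
   #!/usr/bin/env bash
   # Minimal sketch of the temp-file approach; illustrative, not the actual patch.
   build_command() {
     printf '%s\0' java -cp '/opt/spark/jars/*' org.example.Main
   }

   temp_file=$(mktemp)
   trap 'rm -f "$temp_file"' EXIT   # remove the temp file when the script exits

   build_command > "$temp_file"

   CMD=()
   while IFS= read -r -d '' ARG; do
     CMD+=("$ARG")
   done < "$temp_file"              # plain file redirection, no /dev/fd needed

   printf 'parsed %d args\n' "${#CMD[@]}"
   ```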
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   * Tested in an AWS Lambda Python 3.11 container-based image using PySpark
   * Tested in a macOS Python 3.11 environment using PySpark
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No



Re: [PR] Allow spark-class script to be run in an environment where /dev/fd is unavailable (AWS Lambda) [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #45441:
URL: https://github.com/apache/spark/pull/45441#issuecomment-1987424381

   Mind filing a JIRA at https://issues.apache.org/jira/projects/SPARK? See also https://spark.apache.org/contributing.html



Re: [PR] Allow spark-class script to be run in an environment where /dev/fd is unavailable (AWS Lambda) [spark]

Posted by "gerashegalov (via GitHub)" <gi...@apache.org>.
gerashegalov commented on code in PR #45441:
URL: https://github.com/apache/spark/pull/45441#discussion_r1532660388


##########
bin/spark-class:
##########
@@ -77,6 +77,10 @@ set +o posix
 CMD=()
 DELIM=$'\n'
 CMD_START_FLAG="false"
+
+temp_file=$(mktemp)

Review Comment:
   Is it possible to avoid creating a temp file? Maybe wrapping the rest of the file after L84 into another function, say run_command, so we have something like `build_command "$@" | run_command`?
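   A rough sketch of what that could look like; run_command is the name suggested above, while the body and the final exec are assumptions:
   ```
   # Illustrative only. The command list is read from stdin, so a plain pipe
   # works and neither process substitution nor a temp file is needed.
   run_command() {
     local ARG
     local CMD=()
     while IFS= read -r -d '' ARG; do
       CMD+=("$ARG")
     done
     exec "${CMD[@]}"
   }

   build_command "$@" | run_command
   ```
   One detail to keep in mind with this shape: in bash the last stage of a pipeline runs in a subshell, so the exec replaces that subshell rather than the spark-class process itself.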


