Posted to reviews@spark.apache.org by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/12/28 03:57:40 UTC
[PR] [SPARK-46530][PYTHON][SQL] Check Python executable when looking up available Data Sources [spark]
HyukjinKwon opened a new pull request, #44519:
URL: https://github.com/apache/spark/pull/44519
### What changes were proposed in this pull request?
This PR follows up on https://github.com/apache/spark/pull/44504 but addresses a separate issue. It proposes checking whether the Python executable exists when looking up available Python Data Sources.
### Why are the changes needed?
Some environments, such as Windows or minimal Docker containers, have no Python installed, and the lookup fails even when users only want to use Scala. We should check for the Python executable and skip the lookup if it does not exist.
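To illustrate the idea, here is a minimal sketch of such a check. It is not the code in this PR: the object and method names are hypothetical, and the assumption that the interpreter is resolved from `PYSPARK_DRIVER_PYTHON`, then `PYSPARK_PYTHON`, then `python3` is made for illustration only. The sketch searches `PATH` for the executable and returns an empty result instead of failing when none is found.

```scala
import java.io.File
import java.util.Locale

// Hypothetical helper, not part of Spark: illustrates "check, then skip".
object PythonExecutableCheck {

  /** True if `command` resolves to an executable file, directly or via `pathEnv`. */
  def commandAvailable(
      command: String,
      pathEnv: String = sys.env.getOrElse("PATH", "")): Boolean = {
    def isExecutable(f: File): Boolean = f.isFile && f.canExecute

    if (command.contains(File.separator)) {
      // An explicit path was given, so check it as-is.
      isExecutable(new File(command))
    } else {
      val isWindows =
        sys.props.getOrElse("os.name", "").toLowerCase(Locale.ROOT).startsWith("windows")
      // On Windows, executables usually carry an extension.
      val suffixes = if (isWindows) Seq("", ".exe", ".bat", ".cmd") else Seq("")
      pathEnv.split(File.pathSeparator).filter(_.nonEmpty).exists { dir =>
        suffixes.exists(s => isExecutable(new File(dir, command + s)))
      }
    }
  }

  /** Assumed resolution order for the interpreter name (an illustration only). */
  def pythonExec(env: Map[String, String] = sys.env): String =
    env.getOrElse("PYSPARK_DRIVER_PYTHON", env.getOrElse("PYSPARK_PYTHON", "python3"))

  /** Runs `load` only when the interpreter is available; otherwise skips the lookup. */
  def lookupPythonDataSources(
      load: () => Map[String, String],
      exec: String = pythonExec()): Map[String, String] =
    if (commandAvailable(exec)) load() else Map.empty
}
```

With a guard like this, a Scala-only user on a machine without Python gets an empty set of Python Data Sources rather than an error.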
### Does this PR introduce _any_ user-facing change?
No, because the main change has not been released yet.
### How was this patch tested?
Manually tested.
### Was this patch authored or co-authored using generative AI tooling?
No.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
Re: [PR] [SPARK-46530][PYTHON][SQL] Check Python executable when looking up available Data Sources [spark]
Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng closed pull request #44519: [SPARK-46530][PYTHON][SQL] Check Python executable when looking up available Data Sources
URL: https://github.com/apache/spark/pull/44519
Re: [PR] [SPARK-46530][PYTHON][SQL] Check Python executable when looking up available Data Sources [spark]
Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on PR #44519:
URL: https://github.com/apache/spark/pull/44519#issuecomment-1872455892
late LGTM, thanks @HyukjinKwon for fixing this
Re: [PR] [SPARK-46530][PYTHON][SQL] Check Python executable when looking up available Data Sources [spark]
Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #44519:
URL: https://github.com/apache/spark/pull/44519#issuecomment-1872405120
Thx thx
Re: [PR] [SPARK-46530][PYTHON][SQL] Check Python executable when looking up available Data Sources [spark]
Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #44519:
URL: https://github.com/apache/spark/pull/44519#issuecomment-1871652982
cc @ueshin 🙏
Re: [PR] [SPARK-46530][PYTHON][SQL] Check Python executable when looking up available Data Sources [spark]
Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on code in PR #44519:
URL: https://github.com/apache/spark/pull/44519#discussion_r1437995061
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceManager.scala:
##########
@@ -17,15 +17,19 @@
package org.apache.spark.sql.execution.datasources
+import java.io.File
import java.util.Locale
import java.util.concurrent.ConcurrentHashMap
+import java.util.regex.Pattern
import scala.jdk.CollectionConverters._
import org.apache.spark.api.python.PythonUtils
import org.apache.spark.internal.Logging
import org.apache.spark.sql.errors.QueryCompilationErrors
import org.apache.spark.sql.execution.python.UserDefinedPythonDataSource
+import org.apache.spark.util.Utils
+
Review Comment:
```suggestion
```
Re: [PR] [SPARK-46530][PYTHON][SQL] Check Python executable when looking up available Data Sources [spark]
Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on PR #44519:
URL: https://github.com/apache/spark/pull/44519#issuecomment-1871693436
I have reverted https://github.com/apache/spark/pull/44504 (CommitID: `229a4eaf547e5c263c749bd53f7f9a89f4a9bea9`). Based on the current results, the `Run Spark on Kubernetes Integration test` failure in GitHub Actions (GA) is related to it.
Re: [PR] [SPARK-46530][PYTHON][SQL] Check Python executable when looking up available Data Sources [spark]
Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #44519:
URL: https://github.com/apache/spark/pull/44519#issuecomment-1872401581
@LuciferYang @dongjoon-hyun @zhengruifeng if anyone is online, can you merge this one please? I will be away from keyboard today, and this technically fixes the build.
Re: [PR] [SPARK-46530][PYTHON][SQL] Check Python executable when looking up available Data Sources [spark]
Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #44519:
URL: https://github.com/apache/spark/pull/44519#issuecomment-1871737190
Thanks. Let me fix up here.
Re: [PR] [SPARK-46530][PYTHON][SQL] Check Python executable when looking up available Data Sources [spark]
Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on PR #44519:
URL: https://github.com/apache/spark/pull/44519#issuecomment-1872405056
merged to master