Posted to reviews@spark.apache.org by "HyukjinKwon (via GitHub)" <gi...@apache.org> on 2023/11/15 02:23:03 UTC

Re: [PR] [SPARK-45912][SQL] Enhancement of XSDToSchema API: Change to HDFS API for cloud storage accessibility [spark]

HyukjinKwon commented on code in PR #43789:
URL: https://github.com/apache/spark/pull/43789#discussion_r1393554560


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/xml/XSDToSchema.scala:
##########
@@ -35,34 +38,32 @@ import org.apache.spark.sql.types._
 object XSDToSchema {
 
   /**
-   * Reads a schema from an XSD file.
+   * Reads a schema from an XSD path.
    * Note that if the schema consists of one complex parent type which you want to use as
    * the row tag schema, then you will need to extract the schema of the single resulting
    * struct in the resulting StructType, and use its StructType as your schema.
    *
-   * @param xsdFile XSD file
+   * @param xsdPath XSD path
    * @return Spark-compatible schema
    */
-  def read(xsdFile: File): StructType = {
+  def read(xsdPath: Path): StructType = {
+    val in = try {
+      // Handle case where file exists as specified
+      val fs = xsdPath.getFileSystem(SparkHadoopUtil.get.conf)
+      fs.open(xsdPath)
+    } catch {
+      case _: Throwable =>
+        // Handle case where it was added with sc.addFile
+        val addFileUrl = SparkFiles.get(xsdPath.toString)

Review Comment:
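   Hm, this isn't actually a URL: `SparkFiles.get` returns a local file path, so the name `addFileUrl` is misleading. You should probably call `Utils.resolveURIs` to get a proper URI.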
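
To illustrate the distinction raised in the comment above, here is a minimal sketch of turning a plain path string into a URI. The object `PathToUri` and the helper `toUri` are hypothetical names written only for this illustration; they are not Spark's `Utils.resolveURIs`, and the example paths are made up. The sketch assumes that a string with no scheme should be treated as a local file path.

```scala
import java.io.File
import java.net.URI
import scala.util.Try

object PathToUri {
  // Hypothetical helper for illustration only (not Spark's implementation).
  // Strings that already carry a scheme (hdfs://, s3a://, file:/ ...) are kept
  // as they are; anything else is assumed to be a local file path.
  def toUri(path: String): URI =
    Try(new URI(path)).toOption        // strings that do not parse as a URI become None
      .filter(_.getScheme != null)     // keep only URIs that have an explicit scheme
      .getOrElse(new File(path).getAbsoluteFile.toURI)

  def main(args: Array[String]): Unit = {
    // A local path such as the one SparkFiles.get returns has no scheme,
    // so it is converted to a file: URI.
    println(toUri("/tmp/spark-files/schema.xsd"))
    // A string with a scheme is returned unchanged.
    println(toUri("hdfs://namenode/schemas/schema.xsd"))
  }
}
```

With a real URI in hand, the value could be wrapped in a Hadoop `Path` and opened through the same `FileSystem` code path as the first branch, which may be what the suggestion is aiming at.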



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

