Posted to reviews@spark.apache.org by "shujingyang-db (via GitHub)" <gi...@apache.org> on 2024/03/15 20:55:46 UTC

[PR] [SPARK-47309][SQL] XML: Add schema inference tests for value tags [spark]

shujingyang-db opened a new pull request, #45538:
URL: https://github.com/apache/spark/pull/45538

   
   ### What changes were proposed in this pull request?
   Add schema inference tests for corrupt records, null values, and value tags. For value tags, this PR adds tests for the following cases (a short sketch of the value-tag concept follows this list):
   1. Conflicts between primitive types
   2. Root-level value tags
   3. Empty value tags in some rows
   4. Arrays of value tags:
      1) values split across multiple lines
      2) value tags interspersed in nested structs, including empty and optional struct fields
      3) value tags interspersed in arrays, including empty and optional struct fields
      4) name conflicts
      5) CDATA sections and comments
      6) no spaces / some spaces / other whitespace between value tags and elements
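
   As a rough illustration of the "value tag" concept (a hypothetical sketch, not code from this PR): character data interleaved with child elements is collected under the value tag field, whose name defaults to `_VALUE`. Run in a Spark shell against the built-in XML source, something like the following shows the inferred shape:

   ```scala
   import org.apache.spark.sql.Encoders

   // Free text ("value1", "value2") is mixed with the <a> element inside the row tag.
   val xml = """<ROW>value1<a>1</a>value2</ROW>"""
   val ds = spark.createDataset(xml :: Nil)(Encoders.STRING)

   val df = spark.read.option("rowTag", "ROW").xml(ds)
   df.printSchema()
   // Expected, assuming the default valueTag name "_VALUE":
   // root
   //  |-- _VALUE: array (nullable = true)
   //  |    |-- element: string (containsNull = true)
   //  |-- a: long (nullable = true)
   ```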
   
   
   ### Why are the changes needed?
   This is a test-only change.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   This is a test-only change.
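
   For reference, the new cases live in XmlInferSchemaSuite (quoted in the review comments below), so a targeted run should look roughly like this (the standard sbt workflow from the developer-tools page linked above; exact syntax may vary):

   ```
   build/sbt "sql/testOnly *XmlInferSchemaSuite"
   ```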
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No




Re: [PR] [SPARK-47309][SQL] XML: Add schema inference tests for value tags [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #45538:
URL: https://github.com/apache/spark/pull/45538#issuecomment-2008486564

   Merged to master.




Re: [PR] [SPARK-47309][SQL] XML: Add schema inference tests for value tags [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon closed pull request #45538: [SPARK-47309][SQL] XML: Add schema inference tests for value tags
URL: https://github.com/apache/spark/pull/45538




Re: [PR] [SPARK-47309][SQL] XML: Add schema inference tests for value tags [spark]

Posted by "sandip-db (via GitHub)" <gi...@apache.org>.
sandip-db commented on code in PR #45538:
URL: https://github.com/apache/spark/pull/45538#discussion_r1529515496


##########
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/xml/XmlInferSchemaSuite.scala:
##########
@@ -293,4 +305,323 @@ class XmlInferSchemaSuite extends QueryTest with SharedSparkSession with TestXml
     assert(emptyDF.schema === expectedSchema)
   }
 
+  test("nulls in arrays") {
+    val expectedSchema = StructType(
+      StructField(
+        "field1",
+        ArrayType(
+          new StructType()
+            .add("array1", ArrayType(new StructType().add("array2", ArrayType(StringType))))
+        )
+      ) ::
+      StructField(
+        "field2",
+        ArrayType(
+          new StructType()
+            .add("array1", ArrayType(StructType(StructField("Test", LongType) :: Nil)))
+        )
+      ) :: Nil
+    )
+    val expectedAns = Seq(
+      Row(Seq(Row(Seq(Row(Seq("value1", "value2")), Row(null))), Row(null)), null),
+      Row(null, Seq(Row(null), Row(Seq(Row(1), Row(null))))),
+      Row(Seq(Row(null), Row(Seq(Row(null)))), Seq(Row(null)))
+    )
+    val xmlDF = readData(nullsInArrays)
+    assert(xmlDF.schema === expectedSchema)
+    checkAnswer(xmlDF, expectedAns)
+  }
+
+  test("corrupt records: fail fast mode") {
+    // Fail-fast mode is covered by the test case "DSL test for failing fast" in XmlSuite.
+    val schemaOne = StructType(
+      StructField("a", StringType, true) ::
+      StructField("b", StringType, true) ::
+      StructField("c", StringType, true) :: Nil
+    )
+    // `DROPMALFORMED` mode should skip corrupt records
+    val xmlDFOne = readData(corruptRecords, Map("mode" -> "DROPMALFORMED"))
+    checkAnswer(
+      xmlDFOne,
+      Row("1", "2", null) ::
+      Row("str_a_4", "str_b_4", "str_c_4") :: Nil
+    )
+    assert(xmlDFOne.schema === schemaOne)
+  }
+
+  test("turn non-nullable schema into a nullable schema") {
+    // XML field is missing.
+    val missingFieldInput = """<ROW><c1>1</c1></ROW>"""
+    val missingFieldInputDS =
+      spark.createDataset(spark.sparkContext.parallelize(missingFieldInput :: Nil))(Encoders.STRING)
+    // XML field is null.
+    val nullValueInput = """<ROW><c1>1</c1><c2/></ROW>"""
+    val nullValueInputDS =
+      spark.createDataset(spark.sparkContext.parallelize(nullValueInput :: Nil))(Encoders.STRING)
+
+    val schema = StructType(
+      Seq(
+        StructField("c1", IntegerType, nullable = false),
+        StructField("c2", IntegerType, nullable = false)
+      )
+    )
+    val expected = schema.asNullable
+
+    Seq(missingFieldInputDS, nullValueInputDS).foreach { xmlStringDS =>
+      Seq("DROPMALFORMED", "FAILFAST", "PERMISSIVE").foreach { mode =>
+        val df = spark.read
+          .option("mode", mode)
+          .option("rowTag", "ROW")
+          .schema(schema)
+          .xml(xmlStringDS)
+        assert(df.schema == expected)
+        checkAnswer(df, Row(1, null) :: Nil)
+      }
+      withSQLConf(SQLConf.LEGACY_RESPECT_NULLABILITY_IN_TEXT_DATASET_CONVERSION.key -> "true") {
+        checkAnswer(
+          spark.read
+            .schema(
+              StructType(
+                StructField("c1", LongType, nullable = false) ::
+                StructField("c2", LongType, nullable = false) :: Nil
+              )
+            )
+            .option("rowTag", "ROW")
+            .option("mode", "DROPMALFORMED")
+            .xml(xmlStringDS),
+          // This exercises the legacy configuration. It is technically a bug:
+          // the value should be `null`, but the non-nullable schema yields `0`.
+          Row(1, 0)
+        )
+      }
+    }
+  }
+
+  test("XML with partitions") {
+    def makePartition(rdd: RDD[String], parent: File, partName: String, partValue: Any): File = {
+      val p = new File(parent, s"$partName=${partValue.toString}")
+      rdd.saveAsTextFile(p.getCanonicalPath)
+      p
+    }
+
+    withTempPath(root => {
+      withTempView("test_myxml_with_part") {
+        val d1 = new File(root, "d1=1")
+        // root/dt=1/col1=abc
+        makePartition(
+          sparkContext.parallelize(2 to 5).map(i => s"""<ROW><a>1</a><b>str$i</b></ROW>"""),
+          d1,
+          "col1",
+          "abc"
+        )
+
+        // root/dt=1/col1=abd

Review Comment:
   nit:
   ```suggestion
           // root/d1=1/col1=abd
   ```



##########
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/xml/XmlInferSchemaSuite.scala:
##########
@@ -293,4 +305,323 @@ class XmlInferSchemaSuite extends QueryTest with SharedSparkSession with TestXml
   [95 lines of quoted context, duplicated from the first hunk above, omitted]
+  test("XML with partitions") {
+    def makePartition(rdd: RDD[String], parent: File, partName: String, partValue: Any): File = {
+      val p = new File(parent, s"$partName=${partValue.toString}")
+      rdd.saveAsTextFile(p.getCanonicalPath)
+      p
+    }
+
+    withTempPath(root => {
+      withTempView("test_myxml_with_part") {
+        val d1 = new File(root, "d1=1")
+        // root/dt=1/col1=abc
+        makePartition(
+          sparkContext.parallelize(2 to 5).map(i => s"""<ROW><a>1</a><b>str$i</b></ROW>"""),
+          d1,
+          "col1",
+          "abc"
+        )
+
+        // root/dt=1/col1=abd
+        makePartition(
+          sparkContext.parallelize(6 to 10).map(i => s"""<ROW><a>1</a><c>str$i</c></ROW>"""),
+          d1,
+          "col1",
+          "abd"
+        )
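+        // Partition discovery derives the d1 and col1 columns from the directory names.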
+        val expectedSchema = new StructType()
+          .add("a", LongType)
+          .add("b", StringType)
+          .add("c", StringType)
+          .add("d1", IntegerType)
+          .add("col1", StringType)
+
+        val df = spark.read.option("rowTag", "ROW").xml(root.getAbsolutePath)
+        assert(df.schema === expectedSchema)
+        assert(df.where(col("d1") === 1).where(col("col1") === "abc").select("a").count() == 4)
+        assert(df.where(col("d1") === 1).where(col("col1") === "abd").select("a").count() == 5)
+        assert(df.where(col("d1") === 1).select("a").count() == 9)
+      }
+    })
+  }
+
+  test("value tag - type conflict and root level value tags") {
+    val xmlDF = readData(valueTagsTypeConflict, ignoreSurroundingSpacesOptions)
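+    // Conflicting primitive value-tag types across rows (decimal, boolean, string)
+    // widen to StringType; purely integral value tags that fit in a long stay LongType.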
+    val expectedSchema = new StructType()
+      .add(valueTagName, ArrayType(StringType))
+      .add(
+        "a",
+        new StructType()
+          .add(valueTagName, LongType)
+          .add("b", new StructType().add(valueTagName, StringType).add("c", LongType))
+      )
+    assert(xmlDF.schema == expectedSchema)
+    val expectedAns = Seq(
+      Row(Seq("13.1", "string"), Row(11, Row("true", 1))),
+      Row(Seq("string", "true"), Row(21474836470L, Row("false", 2))),
+      Row(Seq("92233720368547758070"), Row(null, Row("12", 3)))
+    )
+    checkAnswer(xmlDF, expectedAns)
+  }
+
+  test("value tag - spaces and empty values") {
+    val expectedSchema = new StructType()
+      .add(valueTagName, ArrayType(StringType))
+      .add("a", new StructType().add(valueTagName, StringType).add("b", LongType))
+    // Even when the surrounding spaces of character data are not ignored,
+    // whitespace-only text is never captured as a value tag :)
+    val xmlDFWSpaces =
+      readData(emptyValueTags, notIgnoreSurroundingSpacesOptions)
+    val xmlDFWOSpaces = readData(emptyValueTags, ignoreSurroundingSpacesOptions)
+    assert(xmlDFWSpaces.schema == expectedSchema)
+    assert(xmlDFWOSpaces.schema == expectedSchema)
+
+    val expectedAnsWSpaces = Seq(
+      Row(Seq("\n    str1\n    ", "str2\n"), Row(null, 1)),
+      Row(null, Row(" value", null)),
+      Row(null, Row(null, 3)),
+      Row(Seq("\n    str3\n"), Row(null, 4))
+    )
+    checkAnswer(xmlDFWSpaces, expectedAnsWSpaces)
+    val expectedAnsWOSpaces = Seq(
+      Row(Seq("str1", "str2"), Row(null, 1)),
+      Row(null, Row("value", null)),
+      Row(null, Row(null, 3)),
+      Row(Seq("str3"), Row(null, 4))
+    )
+    checkAnswer(xmlDFWOSpaces, expectedAnsWOSpaces)
+  }
+
+  test("value tags - multiple lines") {
+    val xmlDF = readData(multilineValueTags, ignoreSurroundingSpacesOptions)
+    val expectedSchema =
+      new StructType().add(valueTagName, ArrayType(StringType)).add("a", LongType)
+    val expectedAns = Seq(
+      Row(Seq("value1", "value2"), 1),
+      Row(Seq("value3\n    value4"), 1)
+    )
+    assert(xmlDF.schema == expectedSchema)
+    checkAnswer(xmlDF, expectedAns)
+  }
+
+  test("value tags - around structs") {
+    val xmlDF = readData(valueTagsAroundStructs)
+    val expectedSchema = new StructType()
+      .add(valueTagName, ArrayType(StringType))
+      .add(
+        "a",
+        new StructType()
+          .add(valueTagName, ArrayType(StringType))
+          .add("b", new StructType().add(valueTagName, LongType).add("c", LongType))
+      )
+
+    assert(xmlDF.schema == expectedSchema)
+    val expectedAns = Seq(
+      Row(
+        Seq("value1", "value5"),
+        Row(Seq("value2", "value4"), Row(3, 1))
+      ),
+      Row(
+        Seq("value6"),
+        Row(Seq("value4", "value5"), Row(null, null))
+      ),
+      Row(
+        Seq("value1", "value5"),
+        Row(Seq("value2", "value4"), Row(3, null))
+      ),
+      Row(
+        Seq("value1"),
+        Row(Seq("value2", "value4"), Row(3, null))
+      )
+    )
+    checkAnswer(xmlDF, expectedAns)
+  }
+
+  test("value tags - around arrays") {
+    val xmlDF = readData(valueTagsAroundArrays)
+    val expectedSchema = new StructType()
+      .add(valueTagName, ArrayType(StringType))
+      .add(
+        "array1",
+        ArrayType(
+          new StructType()
+            .add(valueTagName, ArrayType(StringType))
+            .add(
+              "array2",
+              ArrayType(new StructType()
+                // The value tag is not of long type due to:
+                // When determining
+                .add(valueTagName, ArrayType(StringType))

Review Comment:
   Update the comment: why is it not LongType?



##########
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/xml/XmlInferSchemaSuite.scala:
##########
@@ -293,4 +305,323 @@ class XmlInferSchemaSuite extends QueryTest with SharedSparkSession with TestXml
   [95 lines of quoted context, duplicated from the first hunk above, omitted]
+  test("XML with partitions") {
+    def makePartition(rdd: RDD[String], parent: File, partName: String, partValue: Any): File = {
+      val p = new File(parent, s"$partName=${partValue.toString}")
+      rdd.saveAsTextFile(p.getCanonicalPath)
+      p
+    }
+
+    withTempPath(root => {
+      withTempView("test_myxml_with_part") {
+        val d1 = new File(root, "d1=1")
+        // root/dt=1/col1=abc

Review Comment:
   nit:
   ```suggestion
           // root/d1=1/col1=abc
   ```





Re: [PR] [SPARK-47309][SQL] XML: Add schema inference tests for value tags [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #45538:
URL: https://github.com/apache/spark/pull/45538#issuecomment-2008486755

   Merged to master.

