You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "panbingkun (via GitHub)" <gi...@apache.org> on 2023/10/31 11:59:43 UTC

[PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

panbingkun opened a new pull request, #43603:
URL: https://github.com/apache/spark/pull/43603

   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://spark.apache.org/contributing.html
     2. Ensure you have added or run the appropriate tests for your PR: https://spark.apache.org/developer-tools.html
     3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][SPARK-XXXX] Your PR title ...'.
     4. Be sure to keep the PR description updated to reflect all changes.
     5. Please write your PR title to summarize what this PR proposes.
     6. If possible, provide a concise example to reproduce the issue for a faster review.
     7. If you want to add a new configuration, please read the guideline first for naming configurations in
        'core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala'.
     8. If you want to add or modify an error type or message, please read the guideline first in
        'core/src/main/resources/error/README.md'.
   -->
   
   ### What changes were proposed in this pull request?
   <!--
   Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue. 
   If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
     1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
     2. If you fix some SQL features, you can provide some references of other DBMSes.
     3. If there is design documentation, please add the link.
     4. If there is a discussion in the mailing list, please add the link.
   -->
   
   
   ### Why are the changes needed?
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you propose a new API, clarify the use case for a new API.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   
   
   ### Does this PR introduce _any_ user-facing change?
   <!--
   Note that it means *any* user-facing change including all aspects such as the documentation fix.
   If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description and/or an example to show the behavior difference if possible.
   If possible, please also clarify if this is a user-facing change compared to the released Spark versions or within the unreleased branches such as master.
   If no, write 'No'.
   -->
   
   
   ### How was this patch tested?
   <!--
   If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
   If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
   If tests were not added, please describe why they were not added and/or why it was difficult to add.
   If benchmark tests were added, please run the benchmarks in GitHub Actions for the consistent environment, and the instructions could accord to: https://spark.apache.org/developer-tools.html#github-workflow-benchmarks.
   -->
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   <!--
   If generative AI tooling has been used in the process of authoring this patch, please include the
   phrase: 'Generated-by: ' followed by the name of the tool and its version.
   If no, write 'No'.
   Please refer to the [ASF Generative Tooling Guidance](https://www.apache.org/legal/generative-tooling.html) for details.
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #43603:
URL: https://github.com/apache/spark/pull/43603#discussion_r1400114822


##########
pom.xml:
##########
@@ -3000,6 +3000,7 @@
                 reduce the cost of migration in subsequent versions.
               -->
               <arg>-Wconf:cat=deprecation&amp;msg=it will become a keyword in Scala 3:e</arg>
+              <arg>-Wconf:cat=deprecation&amp;msg=Unicode escapes in triple quoted strings are deprecated:e</arg>

Review Comment:
   @panbingkun please revert the change of `SparkBuild.scala` and `pom.xml`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #43603:
URL: https://github.com/apache/spark/pull/43603#discussion_r1378423078


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala:
##########
@@ -1677,25 +1677,24 @@ class PlanParserSuite extends AnalysisTest {
         ScriptInputOutputSchema(List.empty, List.empty, None, None,
           List.empty, List.empty, None, None, false)))
 
-    // verify with ROW FORMAT DELIMETED
-    assertEqual(
-      """
-        |SELECT TRANSFORM(a, b, c)
-        |  ROW FORMAT DELIMITED
-        |  FIELDS TERMINATED BY '\t'
-        |  COLLECTION ITEMS TERMINATED BY '\u0002'
-        |  MAP KEYS TERMINATED BY '\u0003'
-        |  LINES TERMINATED BY '\n'
-        |  NULL DEFINED AS 'null'
-        |  USING 'cat' AS (a, b, c)
-        |  ROW FORMAT DELIMITED
-        |  FIELDS TERMINATED BY '\t'
-        |  COLLECTION ITEMS TERMINATED BY '\u0004'
-        |  MAP KEYS TERMINATED BY '\u0005'
-        |  LINES TERMINATED BY '\n'
-        |  NULL DEFINED AS 'NULL'
-        |FROM testData
-      """.stripMargin,
+    // verify with ROW FORMAT DELIMITED
+    val sqlWithRowFormatDelimited =
+        "SELECT TRANSFORM(a, b, c)" + "\n" +
+        "  ROW FORMAT DELIMITED" + "\n" +
+      """  FIELDS TERMINATED BY '\t'""" + "\n" +

Review Comment:
   friendly ping @AngersZhuuuu , do you have any suggestions about this `deprecated usage` issue
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on PR #43603:
URL: https://github.com/apache/spark/pull/43603#issuecomment-1793996377

   <img width="409" alt="image" src="https://github.com/apache/spark/assets/15246973/209d7924-7867-4cf0-b3fc-8cf4f4019faf">
   <img width="420" alt="image" src="https://github.com/apache/spark/assets/15246973/24aef12e-dd13-414f-b324-77e0466b8fb9">
   <img width="442" alt="image" src="https://github.com/apache/spark/assets/15246973/fc804246-ef7c-419d-a9b7-400935b4f729">
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on code in PR #43603:
URL: https://github.com/apache/spark/pull/43603#discussion_r1378454613


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala:
##########
@@ -1677,25 +1677,24 @@ class PlanParserSuite extends AnalysisTest {
         ScriptInputOutputSchema(List.empty, List.empty, None, None,
           List.empty, List.empty, None, None, false)))
 
-    // verify with ROW FORMAT DELIMETED
-    assertEqual(
-      """
-        |SELECT TRANSFORM(a, b, c)
-        |  ROW FORMAT DELIMITED
-        |  FIELDS TERMINATED BY '\t'
-        |  COLLECTION ITEMS TERMINATED BY '\u0002'
-        |  MAP KEYS TERMINATED BY '\u0003'
-        |  LINES TERMINATED BY '\n'
-        |  NULL DEFINED AS 'null'
-        |  USING 'cat' AS (a, b, c)
-        |  ROW FORMAT DELIMITED
-        |  FIELDS TERMINATED BY '\t'
-        |  COLLECTION ITEMS TERMINATED BY '\u0004'
-        |  MAP KEYS TERMINATED BY '\u0005'
-        |  LINES TERMINATED BY '\n'
-        |  NULL DEFINED AS 'NULL'
-        |FROM testData
-      """.stripMargin,
+    // verify with ROW FORMAT DELIMITED
+    val sqlWithRowFormatDelimited =

Review Comment:
   <img width="652" alt="image" src="https://github.com/apache/spark/assets/15246973/bdedd8cc-d41a-46bf-ac83-38a2ba4a27d1">
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on PR #43603:
URL: https://github.com/apache/spark/pull/43603#issuecomment-1815724730

   I am more inclined to understand the original intent of the test. If the goal is just to test different separators, we can use the literals of visible separators to eliminate this compile warning. If the original intent was to test invisible separators, then I think we can simply suppress the warning in this test.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #43603:
URL: https://github.com/apache/spark/pull/43603#discussion_r1377813437


##########
project/SparkBuild.scala:
##########
@@ -269,7 +269,8 @@ object SparkBuild extends PomBuild {
         // SPARK-45627 `enum`, `export` and `given` will become keywords in Scala 3,
         // so they are prohibited from being used as variable names in Scala 2.13 to
         // reduce the cost of migration in subsequent versions.
-        "-Wconf:cat=deprecation&msg=it will become a keyword in Scala 3:e"
+        "-Wconf:cat=deprecation&msg=it will become a keyword in Scala 3:e",
+        "-Wconf:cat=deprecation&msg=Unicode escapes in triple quoted strings are deprecated:e"

Review Comment:
   this will be compile error in Scala 3?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on code in PR #43603:
URL: https://github.com/apache/spark/pull/43603#discussion_r1378349936


##########
project/SparkBuild.scala:
##########
@@ -269,7 +269,8 @@ object SparkBuild extends PomBuild {
         // SPARK-45627 `enum`, `export` and `given` will become keywords in Scala 3,
         // so they are prohibited from being used as variable names in Scala 2.13 to
         // reduce the cost of migration in subsequent versions.
-        "-Wconf:cat=deprecation&msg=it will become a keyword in Scala 3:e"
+        "-Wconf:cat=deprecation&msg=it will become a keyword in Scala 3:e",
+        "-Wconf:cat=deprecation&msg=Unicode escapes in triple quoted strings are deprecated:e"

Review Comment:
   this will nothing in Scala 3.3.1, neither a warning nor an error.
   - scala 3.3.1
      <img width="443" alt="image" src="https://github.com/apache/spark/assets/15246973/d843e23e-6e06-4121-9e91-19807bf7bc1f">
   
   - scala 2.13.8
      <img width="774" alt="image" src="https://github.com/apache/spark/assets/15246973/286fc68a-84c1-48cf-9716-7edafa2faa06">
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #43603:
URL: https://github.com/apache/spark/pull/43603#discussion_r1398632990


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala:
##########
@@ -1678,7 +1680,8 @@ class PlanParserSuite extends AnalysisTest {
           List.empty, List.empty, None, None, false)))
 
     // verify with ROW FORMAT DELIMETED
-    assertEqual(
+    @nowarn("cat=deprecation")

Review Comment:
   fine-grained suppressing this case with `@nowarn` if fine to me



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on code in PR #43603:
URL: https://github.com/apache/spark/pull/43603#discussion_r1382737960


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala:
##########
@@ -1677,25 +1677,23 @@ class PlanParserSuite extends AnalysisTest {
         ScriptInputOutputSchema(List.empty, List.empty, None, None,
           List.empty, List.empty, None, None, false)))
 
-    // verify with ROW FORMAT DELIMETED
+    // verify with ROW FORMAT DELIMITED
     assertEqual(
-      """
-        |SELECT TRANSFORM(a, b, c)
-        |  ROW FORMAT DELIMITED
-        |  FIELDS TERMINATED BY '\t'
-        |  COLLECTION ITEMS TERMINATED BY '\u0002'
-        |  MAP KEYS TERMINATED BY '\u0003'
-        |  LINES TERMINATED BY '\n'
-        |  NULL DEFINED AS 'null'
-        |  USING 'cat' AS (a, b, c)
-        |  ROW FORMAT DELIMITED
-        |  FIELDS TERMINATED BY '\t'
-        |  COLLECTION ITEMS TERMINATED BY '\u0004'
-        |  MAP KEYS TERMINATED BY '\u0005'
-        |  LINES TERMINATED BY '\n'
-        |  NULL DEFINED AS 'NULL'
-        |FROM testData
-      """.stripMargin,
+      "SELECT TRANSFORM(a, b, c)\n" +

Review Comment:
   Although this writing may seem ugly, it seems better than the previous version, haha



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #43603:
URL: https://github.com/apache/spark/pull/43603#discussion_r1377847272


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala:
##########
@@ -1678,23 +1678,27 @@ class PlanParserSuite extends AnalysisTest {
           List.empty, List.empty, None, None, false)))
 
     // verify with ROW FORMAT DELIMETED
+    val collectionItemsTerminated1 = '\u0002'

Review Comment:
   - '\u0002' corresponds to "STX" (Start of Text), a non-printing character used to mark the beginning of text.
   - '\u0003' corresponds to "ETX" (End of Text), a non-printing character used to mark the end of text.
   - '\u0004' corresponds to "EOT" (End of Transmission), a non-printing character used to mark the end of transmission.
   - '\u0005' corresponds to "ENQ" (Enquiry), a non-printing character used to inquire whether the receiving device is ready to receive data.
   
   I'm not sure if this cases really need to test using `non-printing characters` as delimiters, can we use printable characters for testing, so we can replace them with the actual corresponding characters. 
   
   If indeed we need to test using `non-printing characters` as delimiters, I think we can add a `@nowarn` annotation to suppress warnings at a fine granularity like:
   
   ```
       // verify with ROW FORMAT DELIMITED
       @nowarn("cat=deprecation")
       val sqlWithRowFormatDelimiters: String =
         """
           |SELECT TRANSFORM(a, b, c)
           |  ROW FORMAT DELIMITED
           |  FIELDS TERMINATED BY '\t'
           |  COLLECTION ITEMS TERMINATED BY '\u0002'
           |  MAP KEYS TERMINATED BY '\u0003'
           |  LINES TERMINATED BY '\n'
           |  NULL DEFINED AS 'null'
           |  USING 'cat' AS (a, b, c)
           |  ROW FORMAT DELIMITED
           |  FIELDS TERMINATED BY '\t'
           |  COLLECTION ITEMS TERMINATED BY '\u0004'
           |  MAP KEYS TERMINATED BY '\u0005'
           |  LINES TERMINATED BY '\n'
           |  NULL DEFINED AS 'NULL'
           |FROM testData
       """.stripMargin
   
       assertEqual(
         sqlWithRowFormatDelimiters,
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on PR #43603:
URL: https://github.com/apache/spark/pull/43603#issuecomment-1815962947

   > To clarify: I am more inclined to understand the original intent of the test. If the goal is just to test different separators, we can use the literals of visible separators to eliminate this compile warning. If the original intent was to test invisible separators, then I think we can simply suppress the warning in this test.
   
   Okay, let's `suppress the warning` with annotations first.
   I tested it locally and it was okay.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on PR #43603:
URL: https://github.com/apache/spark/pull/43603#issuecomment-1822059435

   Merged into master for Spark 4.0. Thanks @panbingkun 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on code in PR #43603:
URL: https://github.com/apache/spark/pull/43603#discussion_r1400127824


##########
pom.xml:
##########
@@ -3000,6 +3000,7 @@
                 reduce the cost of migration in subsequent versions.
               -->
               <arg>-Wconf:cat=deprecation&amp;msg=it will become a keyword in Scala 3:e</arg>
+              <arg>-Wconf:cat=deprecation&amp;msg=Unicode escapes in triple quoted strings are deprecated:e</arg>

Review Comment:
   Okay



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang closed pull request #43603: [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated`
URL: https://github.com/apache/spark/pull/43603


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on PR #43603:
URL: https://github.com/apache/spark/pull/43603#issuecomment-1788304312

   <img width="451" alt="image" src="https://github.com/apache/spark/assets/15246973/76a23c09-cd87-4eeb-bca4-ec01fda4000d">
   <img width="670" alt="image" src="https://github.com/apache/spark/assets/15246973/1ebfb343-a1ae-46bc-a690-b02c93612b34">
   <img width="389" alt="image" src="https://github.com/apache/spark/assets/15246973/4b0f4b60-7f35-4d8d-a670-f0e13eb6a39a">
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #43603:
URL: https://github.com/apache/spark/pull/43603#discussion_r1377847272


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala:
##########
@@ -1678,23 +1678,27 @@ class PlanParserSuite extends AnalysisTest {
           List.empty, List.empty, None, None, false)))
 
     // verify with ROW FORMAT DELIMETED
+    val collectionItemsTerminated1 = '\u0002'

Review Comment:
   - '\u0002' corresponds to "STX" (Start of Text), a non-printing character used to mark the beginning of text.
   - '\u0003' corresponds to "ETX" (End of Text), a non-printing character used to mark the end of text.
   - '\u0004' corresponds to "EOT" (End of Transmission), a non-printing character used to mark the end of transmission.
   - '\u0005' corresponds to "ENQ" (Enquiry), a non-printing character used to inquire whether the receiving device is ready to receive data.
   
   I'm not sure if this cases really need to test using `non-printing characters` as delimiters, can we use printable characters for testing, so we can replace them with the actual corresponding characters. 
   
   If indeed we need to test using `non-printing characters` as delimiters, I think we can add a `@nowarn` annotation to suppress warnings at a fine granularity like:
   
   ```
       // verify with ROW FORMAT DELIMITED
       @nowarn("cat=deprecation")
       val sqlWithRowFormatDelimited: String =
         """
           |SELECT TRANSFORM(a, b, c)
           |  ROW FORMAT DELIMITED
           |  FIELDS TERMINATED BY '\t'
           |  COLLECTION ITEMS TERMINATED BY '\u0002'
           |  MAP KEYS TERMINATED BY '\u0003'
           |  LINES TERMINATED BY '\n'
           |  NULL DEFINED AS 'null'
           |  USING 'cat' AS (a, b, c)
           |  ROW FORMAT DELIMITED
           |  FIELDS TERMINATED BY '\t'
           |  COLLECTION ITEMS TERMINATED BY '\u0004'
           |  MAP KEYS TERMINATED BY '\u0005'
           |  LINES TERMINATED BY '\n'
           |  NULL DEFINED AS 'NULL'
           |FROM testData
       """.stripMargin
   
       assertEqual(
         sqlWithRowFormatDelimited,
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #43603:
URL: https://github.com/apache/spark/pull/43603#discussion_r1377847272


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala:
##########
@@ -1678,23 +1678,27 @@ class PlanParserSuite extends AnalysisTest {
           List.empty, List.empty, None, None, false)))
 
     // verify with ROW FORMAT DELIMETED
+    val collectionItemsTerminated1 = '\u0002'

Review Comment:
   - '\u0002' corresponds to "STX" (Start of Text), a non-printing character used to mark the beginning of text.
   - '\u0003' corresponds to "ETX" (End of Text), a non-printing character used to mark the end of text.
   - '\u0004' corresponds to "EOT" (End of Transmission), a non-printing character used to mark the end of transmission.
   - '\u0005' corresponds to "ENQ" (Enquiry), a non-printing character used to inquire whether the receiving device is ready to receive data.
   
   I'm not sure if this cases really need to test using `non-printing characters` as delimiters, can we use printable characters for testing, so we can replace them with the actual corresponding characters. 
   
   If indeed we need to test using `non-printing characters` as delimiters, I think we can add a `@nowarn` annotation to suppress warnings at a fine granularity like:
   
   ```scala
       // verify with ROW FORMAT DELIMITED
       @nowarn("cat=deprecation")
       val sqlWithRowFormatDelimiters: String =
         """
           |SELECT TRANSFORM(a, b, c)
           |  ROW FORMAT DELIMITED
           |  FIELDS TERMINATED BY '\t'
           |  COLLECTION ITEMS TERMINATED BY '\u0002'
           |  MAP KEYS TERMINATED BY '\u0003'
           |  LINES TERMINATED BY '\n'
           |  NULL DEFINED AS 'null'
           |  USING 'cat' AS (a, b, c)
           |  ROW FORMAT DELIMITED
           |  FIELDS TERMINATED BY '\t'
           |  COLLECTION ITEMS TERMINATED BY '\u0004'
           |  MAP KEYS TERMINATED BY '\u0005'
           |  LINES TERMINATED BY '\n'
           |  NULL DEFINED AS 'NULL'
           |FROM testData
       """.stripMargin
   
       assertEqual(
         sqlWithRowFormatDelimiters,
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on code in PR #43603:
URL: https://github.com/apache/spark/pull/43603#discussion_r1378302898


##########
project/SparkBuild.scala:
##########
@@ -269,7 +269,8 @@ object SparkBuild extends PomBuild {
         // SPARK-45627 `enum`, `export` and `given` will become keywords in Scala 3,
         // so they are prohibited from being used as variable names in Scala 2.13 to
         // reduce the cost of migration in subsequent versions.
-        "-Wconf:cat=deprecation&msg=it will become a keyword in Scala 3:e"
+        "-Wconf:cat=deprecation&msg=it will become a keyword in Scala 3:e",
+        "-Wconf:cat=deprecation&msg=Unicode escapes in triple quoted strings are deprecated:e"

Review Comment:
   Only for test in GA.
   I want it to run through GA first and see if there are any other similar issues besides the above modifications.
   I will delete it later.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #43603:
URL: https://github.com/apache/spark/pull/43603#discussion_r1378421435


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala:
##########
@@ -1677,25 +1677,24 @@ class PlanParserSuite extends AnalysisTest {
         ScriptInputOutputSchema(List.empty, List.empty, None, None,
           List.empty, List.empty, None, None, false)))
 
-    // verify with ROW FORMAT DELIMETED
-    assertEqual(
-      """
-        |SELECT TRANSFORM(a, b, c)
-        |  ROW FORMAT DELIMITED
-        |  FIELDS TERMINATED BY '\t'
-        |  COLLECTION ITEMS TERMINATED BY '\u0002'
-        |  MAP KEYS TERMINATED BY '\u0003'
-        |  LINES TERMINATED BY '\n'
-        |  NULL DEFINED AS 'null'
-        |  USING 'cat' AS (a, b, c)
-        |  ROW FORMAT DELIMITED
-        |  FIELDS TERMINATED BY '\t'
-        |  COLLECTION ITEMS TERMINATED BY '\u0004'
-        |  MAP KEYS TERMINATED BY '\u0005'
-        |  LINES TERMINATED BY '\n'
-        |  NULL DEFINED AS 'NULL'
-        |FROM testData
-      """.stripMargin,
+    // verify with ROW FORMAT DELIMITED
+    val sqlWithRowFormatDelimited =

Review Comment:
   looks a bit ugly ...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on PR #43603:
URL: https://github.com/apache/spark/pull/43603#issuecomment-1815468435

   > May I ask the status of this PR, @panbingkun ?
   @dongjoon-hyun 
   My modification principle is to `eliminate the warning` and `maintain the original string semantics`, but it is indeed `ugly`;while @LuciferYang 's suggestion is to `eliminate the warning`, but bypass it with `a special annotation.`
   Which `modification method` do you think is more appropriate? We would also like to hear your `suggestions`, many thanks!
   (PS: I actually think both are acceptable, but a judgment and decision are needed. 😄)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45697][BUILD] Fix `Unicode escapes in triple quoted strings are deprecated` [spark]

Posted by "panbingkun (via GitHub)" <gi...@apache.org>.
panbingkun commented on code in PR #43603:
URL: https://github.com/apache/spark/pull/43603#discussion_r1400127824


##########
pom.xml:
##########
@@ -3000,6 +3000,7 @@
                 reduce the cost of migration in subsequent versions.
               -->
               <arg>-Wconf:cat=deprecation&amp;msg=it will become a keyword in Scala 3:e</arg>
+              <arg>-Wconf:cat=deprecation&amp;msg=Unicode escapes in triple quoted strings are deprecated:e</arg>

Review Comment:
   Okay, done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org