Posted to reviews@spark.apache.org by "MaxGekk (via GitHub)" <gi...@apache.org> on 2023/09/04 12:16:04 UTC

[GitHub] [spark] MaxGekk opened a new pull request, #42801: [WIP][SQL][DOCS] Describe the binary and datetime formats of `to_char`/`to_varchar`

MaxGekk opened a new pull request, #42801:
URL: https://github.com/apache/spark/pull/42801

   ### What changes were proposed in this pull request?
   In the PR, I propose to document the recent changes related to the `format` of the `to_char`/`to_varchar` functions:
   1. binary formats added by https://github.com/apache/spark/pull/42632
   2. datetime formats introduced by https://github.com/apache/spark/pull/42534
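
   As a rough illustration of what the three binary formats named in this PR (`base64`, `hex`, `utf-8`) produce, here is a plain-Scala sketch that uses only the JDK. It is not Spark's implementation: the object `BinaryFormats` and the helper `binaryToString` are hypothetical names, the literal matching of format names is an assumption of the sketch, and the uppercase hex digits are chosen to mirror Spark's `hex` function.

   ```scala
   import java.nio.charset.StandardCharsets
   import java.util.Base64

   // Hypothetical helper for illustration only; not part of Spark.
   object BinaryFormats {
     // Converts a byte array to a string in one of the three documented formats.
     def binaryToString(bytes: Array[Byte], format: String): String =
       format match {
         // Standard (RFC 4648) base 64 encoding of the raw bytes.
         case "base64" => Base64.getEncoder.encodeToString(bytes)
         // Two uppercase hexadecimal digits per byte (uppercase is an assumption).
         case "hex" => bytes.map(b => "%02X".format(b & 0xff)).mkString
         // Interpret the bytes as UTF-8 encoded text.
         case "utf-8" => new String(bytes, StandardCharsets.UTF_8)
         case other =>
           throw new IllegalArgumentException(s"Unsupported binary format: $other")
       }
   }
   ```

   For the UTF-8 bytes of `Spark SQL`, the sketch yields `U3BhcmsgU1FM` for `base64`, `537061726B2053514C` for `hex`, and `Spark SQL` for `utf-8`.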
   
   ### Why are the changes needed?
   To inform users about recent changes.
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   By CI.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a diff in pull request #42801: [SPARK-45070][SQL][DOCS] Describe the binary and datetime formats of `to_char`/`to_varchar`

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #42801:
URL: https://github.com/apache/spark/pull/42801#discussion_r1315041850


##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala:
##########
@@ -4311,11 +4320,20 @@ object functions {
    *   prints '+' for positive values but 'MI' prints a space.</li> <li>'PR': Only allowed at the
    *   end of the format string; specifies that the result string will be wrapped by angle
    *   brackets if the input value is negative.</li> </ul>
+   *   If `e` is a datetime, `format` shall be a valid datetime pattern, see
+   *   <a href="https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html">Datetime Patterns</a>.
+   *   If `e` is a binary, it is converted to a string in one of the formats:
+   *   <ul>
+   *     <li>'base64': a base 64 string.</li>
+   *     <li>'hex': a string in the hexadecimal format.</li>
+   *     <li>'utf-8': the input binary is decoded to UTF-8 string.</li>
+   *   </ul>
    *
    * @group string_funcs
    * @since 3.5.0
    */
   def to_varchar(e: Column, format: Column): Column = Column.fn("to_varchar", e, format)
+  // scalastyle:on line.size.limit

Review Comment:
   Shall we put it before the `def to_varchar`? It's better to limit the code block that skips the style check.



##########
sql/core/src/main/scala/org/apache/spark/sql/functions.scala:
##########
@@ -4446,10 +4454,18 @@ object functions {
    *   'PR': Only allowed at the end of the format string; specifies that the result string will be
    *     wrapped by angle brackets if the input value is negative.
    *
+   *  If `e` is a datetime, `format` shall be a valid datetime pattern, see
+   *  <a href="https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html">Datetime Patterns</a>.
+   *  If `e` is a binary, it is converted to a string in one of the formats:
+   *     'base64': a base 64 string.
+   *     'hex': a string in the hexadecimal format.
+   *     'utf-8': the input binary is decoded to UTF-8 string.
+   *
    * @group string_funcs
    * @since 3.5.0
    */
   def to_varchar(e: Column, format: Column): Column = call_function("to_varchar", e, format)
+  // scalastyle:on line.size.limit

Review Comment:
   ditto



[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #42801: [SPARK-45070][SQL][DOCS] Describe the binary and datetime formats of `to_char`/`to_varchar`

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #42801:
URL: https://github.com/apache/spark/pull/42801#discussion_r1315213044


##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala:
##########
@@ -4284,12 +4285,22 @@ object functions {
    *   prints '+' for positive values but 'MI' prints a space.</li> <li>'PR': Only allowed at the
    *   end of the format string; specifies that the result string will be wrapped by angle
    *   brackets if the input value is negative.</li> </ul>
+   *   If `e` is a datetime, `format` shall be a valid datetime pattern, see
+   *   <a href="https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html">Datetime Patterns</a>.
+   *   If `e` is a binary, it is converted to a string in one of the formats:
+   *   <ul>
+   *     <li>'base64': a base 64 string.</li>
+   *     <li>'hex': a string in the hexadecimal format.</li>
+   *     <li>'utf-8': the input binary is decoded to UTF-8 string.</li>
+   *   </ul>

Review Comment:
   @MaxGekk, it seems our linter CI doesn't allow this kind of human-friendly formatting in the comment.
   
   - https://github.com/MaxGekk/spark/actions/runs/6076537960/job/16484751098
   
   ```
   The scalafmt check failed on connector/connect at following occurrences:
   
   Requires formatting: functions.scala
   
   Before submitting your change, please make sure to format your code using the following command:
   ./build/mvn -Pscala-2.12 scalafmt:format -Dscalafmt.skip=false -Dscalafmt.validateOnly=false -Dscalafmt.changedOnly=false -pl connector/connect/common -pl connector/connect/server -pl connector/connect/client/jvm
   Error: Process completed with exit code 1.
   ```
   
   The recommended style is something like the following. Please fix the style by running the command above.
   ```
   -   *   brackets if the input value is negative.</li> </ul>
   -   *   If `e` is a datetime, `format` shall be a valid datetime pattern, see
   -   *   <a href="https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html">Datetime Patterns</a>.
   -   *   If `e` is a binary, it is converted to a string in one of the formats:
   -   *   <ul>
   -   *     <li>'base64': a base 64 string.</li>
   -   *     <li>'hex': a string in the hexadecimal format.</li>
   -   *     <li>'utf-8': the input binary is decoded to UTF-8 string.</li>
   -   *   </ul>
   +   *   brackets if the input value is negative.</li> </ul> If `e` is a datetime, `format` shall be
   +   *   a valid datetime pattern, see <a
   +   *   href="https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html">Datetime
   +   *   Patterns</a>. If `e` is a binary, it is converted to a string in one of the formats: <ul>
   +   *   <li>'base64': a base 64 string.</li> <li>'hex': a string in the hexadecimal format.</li>
   +   *   <li>'utf-8': the input binary is decoded to UTF-8 string.</li> </ul>
   ```



[GitHub] [spark] MaxGekk commented on pull request #42801: [SPARK-45070][SQL][DOCS] Describe the binary and datetime formats of `to_char`/`to_varchar`

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk commented on PR #42801:
URL: https://github.com/apache/spark/pull/42801#issuecomment-1707683221

   Merging to master. Thank you, @dongjoon-hyun and @cloud-fan, for the review.


[GitHub] [spark] MaxGekk closed pull request #42801: [SPARK-45070][SQL][DOCS] Describe the binary and datetime formats of `to_char`/`to_varchar`

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk closed pull request #42801: [SPARK-45070][SQL][DOCS] Describe the binary and datetime formats of `to_char`/`to_varchar`
URL: https://github.com/apache/spark/pull/42801


[GitHub] [spark] MaxGekk commented on a diff in pull request #42801: [SPARK-45070][SQL][DOCS] Describe the binary and datetime formats of `to_char`/`to_varchar`

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk commented on code in PR #42801:
URL: https://github.com/apache/spark/pull/42801#discussion_r1315744531


##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala:
##########
@@ -4284,12 +4285,22 @@ object functions {
    *   prints '+' for positive values but 'MI' prints a space.</li> <li>'PR': Only allowed at the
    *   end of the format string; specifies that the result string will be wrapped by angle
    *   brackets if the input value is negative.</li> </ul>
+   *   If `e` is a datetime, `format` shall be a valid datetime pattern, see
+   *   <a href="https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html">Datetime Patterns</a>.
+   *   If `e` is a binary, it is converted to a string in one of the formats:
+   *   <ul>
+   *     <li>'base64': a base 64 string.</li>
+   *     <li>'hex': a string in the hexadecimal format.</li>
+   *     <li>'utf-8': the input binary is decoded to UTF-8 string.</li>
+   *   </ul>

Review Comment:
   Thank you, @dongjoon-hyun 


