Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/12/06 08:35:46 UTC

[GitHub] [spark] yaooqinn opened a new pull request #26776: [SPARK-30147][SQL] Trim the string when cast string type to booleans

URL: https://github.com/apache/spark/pull/26776
 
 
   ### What changes were proposed in this pull request?
   
   Currently, Spark trims the string when casting a string value to the other `canCast` target types (e.g. int, double, decimal, interval, date, timestamp), but not when casting to boolean.
   This makes type casting and coercion inconsistent in Spark.
   It also deviates from the ANSI SQL standard, which specifies:
   ```
   If TD is boolean, then
   Case:
   a) If SD is character string, then SV is replaced by
       TRIM ( BOTH ' ' FROM VE )
       Case:
       i) If the rules for literal in Subclause 5.3, “literal”, can be applied to SV to determine a valid
   value of the data type TD, then let TV be that value.
      ii) Otherwise, an exception condition is raised: data exception — invalid character value for cast.
   b) If SD is boolean, then TV is SV
   ```
   In this pull request, we trim all whitespace from both ends of the string before converting it to a boolean value. This matches the behavior of the other casts, but differs slightly from the SQL standard, which trims only spaces.
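
As a rough illustration of the difference between the three trimming behaviors, here is a minimal Python sketch. It is a model only, not Spark's implementation: the function name `cast_to_boolean`, the `trim` parameter, and the accepted literal sets are all assumptions made for this example. `None` stands in for SQL `NULL`, whereas the standard would raise an exception for an invalid value. Python's `str.strip()` is used as a stand-in for "trim all whitespace", and its exact character set may differ from what Spark trims.

```python
from typing import Optional

# Assumed literal sets, for illustration only; the PR text does not list them.
TRUE_LITERALS = {"true", "t", "yes", "y", "1"}
FALSE_LITERALS = {"false", "f", "no", "n", "0"}


def cast_to_boolean(value: str, trim: Optional[str] = "whitespace") -> Optional[bool]:
    """Model casting a string to a boolean; None stands in for SQL NULL.

    trim="whitespace": strip all leading/trailing whitespace (this PR's behavior).
    trim="spaces":     strip only ' ' (the TRIM(BOTH ' ' FROM VE) rule in the standard).
    trim=None:         no trimming (the behavior before this PR).
    """
    if trim == "whitespace":
        value = value.strip()
    elif trim == "spaces":
        value = value.strip(" ")
    lowered = value.lower()
    if lowered in TRUE_LITERALS:
        return True
    if lowered in FALSE_LITERALS:
        return False
    return None
```

With this model, `cast_to_boolean("\t true")` yields `True`, while the spaces-only and no-trim variants both yield `None` for the same input because the leading tab is never removed.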
   
   ### Why are the changes needed?
   
   Type cast/coercion consistency
   
   
   ### Does this PR introduce any user-facing change?
   
   Yes. Strings with whitespace at either end are now trimmed before being converted to booleans.
   
   e.g. `select cast('\t true' as boolean)` now returns `true`; before this PR it returned `null`.
   
   ### How was this patch tested?
   
   Added unit tests.
