Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/07/06 10:03:51 UTC

[GitHub] [spark] GuoPhilipse opened a new pull request #29009: [SPARK-32193][SQL]update migrate guide docs on regexp function

GuoPhilipse opened a new pull request #29009:
URL: https://github.com/apache/spark/pull/29009


   ### What changes were proposed in this pull request?
   Update the migration guide docs for the `regexp` function to help people migrate their SQL queries.
   
   ### Why are the changes needed?
   Hive supports the `regexp` function, while Spark SQL uses `rlike` instead of `regexp`. We can update the docs to make this known to more users.
   
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   
   ### How was this patch tested?
   No tests.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] GuoPhilipse commented on pull request #29009: [SPARK-32193][SQL] Update migrate guide docs on regexp function

Posted by GitBox <gi...@apache.org>.
GuoPhilipse commented on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-654167274


   > Thanks @maropu. @GuoPhilipse, does MySQL also support `REGEXP` as a function?
   > 
   > From reading the Hive doc at https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF, it's the same as RLIKE, which Spark supports. Maybe we should alias it.
   > 
   > FWIW, there are many explicitly unsupported expressions:
   > https://github.com/apache/spark/blob/5d870ef0bc70527fd1bc99a4ad17e4941c923351/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala#L189-L192
   
   Yes, I posted the REGEXP usage in MySQL in my last comment.
   




[GitHub] [spark] AmplabJenkins commented on pull request #29009: [SPARK-32193][SQL]update migrate guide docs on regexp function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-654141473


   Can one of the admins verify this patch?




[GitHub] [spark] HyukjinKwon commented on pull request #29009: [SPARK-32193][DOCS] Update regexp usage in docs

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-655861241


   ok to test




[GitHub] [spark] SparkQA commented on pull request #29009: [SPARK-32193][DOCS] Update regexp usage in docs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-655866686


   **[Test build #125427 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125427/testReport)** for PR 29009 at commit [`06cfd5d`](https://github.com/apache/spark/commit/06cfd5dc7e14a4f096002029817852ed1bc5ba41).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] GuoPhilipse edited a comment on pull request #29009: [SPARK-32193][SQL] Update migrate guide docs on regexp function

Posted by GitBox <gi...@apache.org>.
GuoPhilipse edited a comment on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-654163788


   > I don't think you need to file a JIRA for this kind of minor doc fix. Btw, do any systems other than Hive support `REGEXP` for regular expressions? I think we might be able to add it as an alias if it is a commonly used name.
   
   As far as I know, MySQL and Hive support this keyword. But I think an alias seems like a good idea.
   MySQL example:
   `SELECT 'abc' REGEXP '([a-z]+)';`
   Result:
   `1`
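
   A minimal sketch of the same check in Spark SQL, assuming the parser accepts `REGEXP` as a synonym for the `RLIKE` operator, as the syntax diff in this pull request documents. The literals are the ones from the MySQL example above; nothing here is taken from the Spark test suite.

   ```sql
   -- Operator form with RLIKE: true when the regex matches somewhere in the string.
   SELECT 'abc' RLIKE '([a-z]+)';

   -- Assumed equivalent spelling, per the updated syntax docs in this PR.
   SELECT 'abc' REGEXP '([a-z]+)';

   -- Negated form, following the documented `[ NOT ]` prefix.
   SELECT 'abc' NOT REGEXP '^[0-9]+$';
   ```

   Unlike MySQL, which prints `1`, Spark SQL returns a BOOLEAN for these predicates.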




[GitHub] [spark] HyukjinKwon commented on pull request #29009: [SPARK-32193][SQL] Update migrate guide docs on regexp function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-654168528


   No, @GuoPhilipse, I meant whether it's supported in the function-call form, i.e. `SELECT REGEXP('abc', '([a-z]+)');`.
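
   To illustrate the distinction being asked about, here is a hedged sketch of the two forms in Spark SQL. It assumes only the `rlike` function name, which the `FunctionRegistry` diff in this thread shows as registered; whether `regexp(...)` is accepted as a function name depends on the Spark version and is not assumed here.

   ```sql
   -- Function-call form, using the registered function name `rlike`.
   SELECT rlike('abc', '([a-z]+)');

   -- Operator (infix) form of the same predicate.
   SELECT 'abc' RLIKE '([a-z]+)';
   ```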




[GitHub] [spark] maropu commented on pull request #29009: [SPARK-32193][SQL][DOCS] Update regexp usage in SQL docs

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-655948027


   Thanks! Merged to master/3.0.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29009: [SPARK-32193][DOCS] Update regexp usage in docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-655863495








[GitHub] [spark] AmplabJenkins commented on pull request #29009: [SPARK-32193][SQL]update migrate guide docs on regexp function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-654140900


   Can one of the admins verify this patch?




[GitHub] [spark] maropu commented on a change in pull request #29009: [SPARK-32193][SQL][DOCS] Update regexp usage in SQL docs

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29009:
URL: https://github.com/apache/spark/pull/29009#discussion_r452128712



##########
File path: docs/sql-ref-syntax-qry-select-like.md
##########
@@ -26,7 +26,7 @@ A LIKE predicate is used to search for a specific pattern.
 ### Syntax
 
 ```sql
-[ NOT ] { LIKE search_pattern [ ESCAPE esc_char ] | RLIKE regex_pattern }
+[ NOT ] { LIKE search_pattern [ ESCAPE esc_char ] | RLIKE regex_pattern | REGEXP regex_pattern}

Review comment:
       Thanks! That's very helpful.






[GitHub] [spark] GuoPhilipse removed a comment on pull request #29009: [SPARK-32193][SQL] Update migrate guide docs on regexp function

Posted by GitBox <gi...@apache.org>.
GuoPhilipse removed a comment on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-654168933


   > No, @GuoPhilipse, I meant whether it's supported in the function-call form, i.e. `SELECT REGEXP('abc', '([a-z]+)');`.
   
   No, it's used in a different way.
   
   




[GitHub] [spark] GuoPhilipse commented on a change in pull request #29009: [SPARK-32193][SQL] Add alias for function rlike

Posted by GitBox <gi...@apache.org>.
GuoPhilipse commented on a change in pull request #29009:
URL: https://github.com/apache/spark/pull/29009#discussion_r451924250



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
##########
@@ -359,6 +359,7 @@ object FunctionRegistry {
     expression[StringReplace]("replace"),
     expression[Overlay]("overlay"),
     expression[RLike]("rlike"),
+    expression[RLike]("regexp", true),

Review comment:
       Yes, I will remove the function usage.






[GitHub] [spark] HyukjinKwon commented on a change in pull request #29009: [SPARK-32193][SQL] Update migrate guide docs on regexp function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #29009:
URL: https://github.com/apache/spark/pull/29009#discussion_r450147882



##########
File path: docs/sql-migration-guide.md
##########
@@ -970,3 +970,4 @@ Below are the scenarios in which Hive and Spark generate different results:
 * `ACOS(n)` If n < -1 or n > 1, Hive returns null, Spark SQL returns NaN.
 * `ASIN(n)` If n < -1 or n > 1, Hive returns null, Spark SQL returns NaN.
 * `CAST(n AS TIMESTAMP)` If n is integral numbers, Hive treats n as milliseconds, Spark SQL treats n as seconds.
+* `REGEXP(str, patten)` Hive support this function, Spark SQL use RLIKE instead.

Review comment:
       Also, technically this isn't one of "the scenarios in which Hive and Spark generate different results".






[GitHub] [spark] SparkQA commented on pull request #29009: [SPARK-32193][DOCS] Update regexp usage in docs

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-655863107


   **[Test build #125427 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125427/testReport)** for PR 29009 at commit [`06cfd5d`](https://github.com/apache/spark/commit/06cfd5dc7e14a4f096002029817852ed1bc5ba41).




[GitHub] [spark] GuoPhilipse commented on a change in pull request #29009: [SPARK-32193][DOCS] Update regexp usage in docs

Posted by GitBox <gi...@apache.org>.
GuoPhilipse commented on a change in pull request #29009:
URL: https://github.com/apache/spark/pull/29009#discussion_r451931566



##########
File path: docs/sql-ref-syntax-qry-select-like.md
##########
@@ -26,7 +26,7 @@ A LIKE predicate is used to search for a specific pattern.
 ### Syntax
 
 ```sql
-[ NOT ] { LIKE search_pattern [ ESCAPE esc_char ] | RLIKE regex_pattern }
+[ NOT ] { LIKE search_pattern [ ESCAPE esc_char ] | RLIKE regex_pattern | REGEXP regex_pattern}

Review comment:
       I have updated it and added examples for this usage.






[GitHub] [spark] GuoPhilipse commented on pull request #29009: [SPARK-32193][SQL] Update migrate guide docs on regexp function

Posted by GitBox <gi...@apache.org>.
GuoPhilipse commented on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-654168933


   > No, @GuoPhilipse, I meant whether it's supported in the function-call form, i.e. `SELECT REGEXP('abc', '([a-z]+)');`.
   
   No, it's used in a different way.
   
   




[GitHub] [spark] GuoPhilipse edited a comment on pull request #29009: [SPARK-32193][SQL] Update migrate guide docs on regexp function

Posted by GitBox <gi...@apache.org>.
GuoPhilipse edited a comment on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-654214678








[GitHub] [spark] maropu commented on a change in pull request #29009: [SPARK-32193][SQL] Add alias for function rlike

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29009:
URL: https://github.com/apache/spark/pull/29009#discussion_r451905333



##########
File path: docs/sql-ref-syntax-qry-select-like.md
##########
@@ -26,7 +26,7 @@ A LIKE predicate is used to search for a specific pattern.
 ### Syntax
 
 ```sql
-[ NOT ] { LIKE search_pattern [ ESCAPE esc_char ] | RLIKE regex_pattern }
+[ NOT ] { LIKE search_pattern [ ESCAPE esc_char ] | RLIKE regex_pattern | REGEXP regex_pattern}

Review comment:
       `[ RLIKE | REGEXP ] regex_pattern` cc: @huaxingao 






[GitHub] [spark] GuoPhilipse commented on a change in pull request #29009: [SPARK-32193][SQL][DOCS] Update regexp usage in SQL docs

Posted by GitBox <gi...@apache.org>.
GuoPhilipse commented on a change in pull request #29009:
URL: https://github.com/apache/spark/pull/29009#discussion_r452096859



##########
File path: docs/sql-ref-syntax-qry-select-like.md
##########
@@ -26,7 +26,7 @@ A LIKE predicate is used to search for a specific pattern.
 ### Syntax
 
 ```sql
-[ NOT ] { LIKE search_pattern [ ESCAPE esc_char ] | RLIKE regex_pattern }
+[ NOT ] { LIKE search_pattern [ ESCAPE esc_char ] | RLIKE regex_pattern | REGEXP regex_pattern}

Review comment:
       I am glad to take this over; I will work on it later :)






[GitHub] [spark] SparkQA removed a comment on pull request #29009: [SPARK-32193][DOCS] Update regexp usage in docs

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-655863107


   **[Test build #125427 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125427/testReport)** for PR 29009 at commit [`06cfd5d`](https://github.com/apache/spark/commit/06cfd5dc7e14a4f096002029817852ed1bc5ba41).




[GitHub] [spark] AmplabJenkins commented on pull request #29009: [SPARK-32193][DOCS] Update regexp usage in docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-655861402








[GitHub] [spark] AmplabJenkins commented on pull request #29009: [SPARK-32193][DOCS] Update regexp usage in docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-655863495








[GitHub] [spark] maropu commented on pull request #29009: [SPARK-32193][SQL]update migrate guide docs on regexp function

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-654153428


   I don't think you need to file a JIRA for this kind of minor doc fix. Btw, do any systems other than Hive support `REGEXP` for regular expressions? I think we might be able to add it as an alias if it is a commonly used name.




[GitHub] [spark] maropu closed pull request #29009: [SPARK-32193][SQL][DOCS] Update regexp usage in SQL docs

Posted by GitBox <gi...@apache.org>.
maropu closed pull request #29009:
URL: https://github.com/apache/spark/pull/29009


   




[GitHub] [spark] maropu commented on pull request #29009: [SPARK-32193][SQL]update migrate guide docs on regexp function

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-654153898


   cc: @HyukjinKwon 




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29009: [SPARK-32193][DOCS] Update regexp usage in docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-655866811








[GitHub] [spark] maropu commented on a change in pull request #29009: [SPARK-32193][SQL]update migrate guide docs on regexp function

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29009:
URL: https://github.com/apache/spark/pull/29009#discussion_r450133421



##########
File path: docs/sql-migration-guide.md
##########
@@ -970,3 +970,4 @@ Below are the scenarios in which Hive and Spark generate different results:
 * `ACOS(n)` If n < -1 or n > 1, Hive returns null, Spark SQL returns NaN.
 * `ASIN(n)` If n < -1 or n > 1, Hive returns null, Spark SQL returns NaN.
 * `CAST(n AS TIMESTAMP)` If n is integral numbers, Hive treats n as milliseconds, Spark SQL treats n as seconds.
+* `REGEXP(str, patten)` Hive support this function, Spark SQL use RLIKE instead.

Review comment:
       nit: `support` -> `supports` and `use` -> `uses`






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29009: [SPARK-32193][SQL]update migrate guide docs on regexp function

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-654140900


   Can one of the admins verify this patch?




[GitHub] [spark] GuoPhilipse commented on pull request #29009: [SPARK-32193][SQL] Update migrate guide docs on regexp function

Posted by GitBox <gi...@apache.org>.
GuoPhilipse commented on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-654163788


   > I don't think you need to file a JIRA for this kind of minor doc fix. Btw, do any systems other than Hive support `REGEXP` for regular expressions? I think we might be able to add it as an alias if it is a commonly used name.
   
   As far as I know, MySQL and Hive support this keyword. But I think an alias seems like a good idea.
   Example:
   `SELECT 'abc' REGEXP '([a-z]+)';`
   Result:
   `1`




[GitHub] [spark] maropu commented on a change in pull request #29009: [SPARK-32193][SQL] Add alias for function rlike

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29009:
URL: https://github.com/apache/spark/pull/29009#discussion_r451905685



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
##########
@@ -359,6 +359,7 @@ object FunctionRegistry {
     expression[StringReplace]("replace"),
     expression[Overlay]("overlay"),
     expression[RLike]("rlike"),
+    expression[RLike]("regexp", true),

Review comment:
       Do we need this? It seems Hive and MySQL don't support this, though.






[GitHub] [spark] HyukjinKwon commented on pull request #29009: [SPARK-32193][SQL] Update migrate guide docs on regexp function

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-654166095


   Thanks @maropu. @GuoPhilipse, does MySQL also support `REGEXP` as a function?
   
   From reading the Hive doc at https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF, it's the same as `RLIKE`, which Spark supports. Maybe we should alias it.
   
   FWIW, there are many expressions that are explicitly unsupported:
   https://github.com/apache/spark/blob/5d870ef0bc70527fd1bc99a4ad17e4941c923351/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala#L189-L192
   




[GitHub] [spark] maropu commented on a change in pull request #29009: [SPARK-32193][SQL][DOCS] Update regexp usage in SQL docs

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29009:
URL: https://github.com/apache/spark/pull/29009#discussion_r452015664



##########
File path: docs/sql-ref-syntax-qry-select-like.md
##########
@@ -26,7 +26,7 @@ A LIKE predicate is used to search for a specific pattern.
 ### Syntax
 
 ```sql
-[ NOT ] { LIKE search_pattern [ ESCAPE esc_char ] | RLIKE regex_pattern }
+[ NOT ] { LIKE search_pattern [ ESCAPE esc_char ] | RLIKE regex_pattern | REGEXP regex_pattern}

Review comment:
       Btw, this reminded me that there are still missing keywords in the SQL docs: https://issues.apache.org/jira/browse/SPARK-31753
   It would be very helpful if you could take this over.






[GitHub] [spark] GuoPhilipse commented on pull request #29009: [SPARK-32193][SQL] Update migrate guide docs on regexp function

Posted by GitBox <gi...@apache.org>.
GuoPhilipse commented on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-654214678


   I tried this in Hive 1.1, and it works. It seems newer Hive is not compatible with the older version here.
   `hive>  select RLIKE('abc','([a-z]+)');`
   `true`
   
   `hive> select REGEXP('abc','([a-z]+)');`
   `true`




[GitHub] [spark] maropu commented on pull request #29009: [SPARK-32193][SQL] Update migrate guide docs on regexp function

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-654207971


   I'm a bit confused; does Hive really support `REGEXP` as a function?
   ```
   // REGEXP case in hive (3.1.1)
   hive> SELECT 'abc' REGEXP '([a-z]+)';
   OK
   true
   
   hive> select REGEXP('abc','([a-z]+)');
   NoViableAltException(251@[])
   	at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:962)
       ...
   FAILED: ParseException line 1:7 cannot recognize input near 'REGEXP' '(' ''abc'' in select clause
   
   // RLIKE in hive
   hive> SELECT 'abc' RLIKE '([a-z]+)';
   OK
   true
   
   hive> select RLIKE('abc','([a-z]+)');
   NoViableAltException(265@[])
   	at org.apache.hadoop.hive.ql.parse.HiveParser_SelectClauseParser.selectClause(HiveParser_SelectClauseParser.java:962)
   	...
   FAILED: ParseException line 1:7 cannot recognize input near 'RLIKE' '(' ''abc'' in select clause
   
   // REGEXP in mysql
   mysql> SELECT 'abc' REGEXP '([a-z]+)';
   +-------------------------+
   | 'abc' REGEXP '([a-z]+)' |
   +-------------------------+
   |                       1 |
   +-------------------------+
   
   mysql> select REGEXP('abc','([a-z]+)');
   ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'REGEXP('abc','([a-z]+)')' at line 1
   
   // RLIKE in mysql
   mysql> SELECT 'abc' RLIKE '([a-z]+)';
   +------------------------+
   | 'abc' RLIKE '([a-z]+)' |
   +------------------------+
   |                      1 |
   +------------------------+
   
   mysql> select RLIKE('abc','([a-z]+)');
   ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'RLIKE('abc','([a-z]+)')' at line 1
   ```
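
The transcript above runs one pattern through two call styles. As a rough illustration of the matching semantics only (not of how Hive, MySQL, or Spark parse these forms), the Python sketch below assumes that `RLIKE`/`REGEXP` return true when any substring of the input matches the pattern, and NULL when either argument is NULL, as the Hive manual describes for `A RLIKE B`. The helper names `rlike` and `regexp` are hypothetical, and Python's `re` dialect is not identical to Java's regex dialect.

```python
import re
from typing import Optional


def rlike(value: Optional[str], pattern: Optional[str]) -> Optional[bool]:
    """Hypothetical helper modelling RLIKE-style matching.

    Returns None when either argument is None (SQL NULL), otherwise
    True if any substring of `value` matches `pattern`.
    """
    if value is None or pattern is None:
        return None
    # re.search scans for a match anywhere in the string (substring match).
    return re.search(pattern, value) is not None


# An alias is simply a second name bound to the same implementation.
regexp = rlike

print(rlike("abc", "([a-z]+)"))
print(regexp("abc", "([a-z]+)"))
print(rlike("123", "([a-z]+)"))
```

Under these assumptions both names give the same answer for every input, which is the behaviour an alias is meant to guarantee.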




[GitHub] [spark] maropu commented on a change in pull request #29009: [SPARK-32193][SQL] Add alias for function rlike

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29009:
URL: https://github.com/apache/spark/pull/29009#discussion_r451906022



##########
File path: docs/sql-ref-syntax-qry-select-like.md
##########
@@ -26,7 +26,7 @@ A LIKE predicate is used to search for a specific pattern.
 ### Syntax
 
 ```sql
-[ NOT ] { LIKE search_pattern [ ESCAPE esc_char ] | RLIKE regex_pattern }
+[ NOT ] { LIKE search_pattern [ ESCAPE esc_char ] | RLIKE regex_pattern | REGEXP regex_pattern}

Review comment:
       I think it's okay for this PR to just update this doc.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29009: [SPARK-32193][DOCS] Update regexp usage in docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-654141473








[GitHub] [spark] maropu commented on pull request #29009: [SPARK-32193][SQL] Update migrate guide docs on regexp function

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-654218179


   Yea, I checked the doc @HyukjinKwon linked above, and it seems current Hive supports `REGEXP` only as SQL syntax, not as a function call.




[GitHub] [spark] maropu commented on a change in pull request #29009: [SPARK-32193][SQL] Add alias for function rlike

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29009:
URL: https://github.com/apache/spark/pull/29009#discussion_r451905426



##########
File path: docs/sql-migration-guide.md
##########
@@ -30,6 +30,8 @@ license: |
   
   - In Spark 3.1, `from_unixtime`, `unix_timestamp`,`to_unix_timestamp`, `to_timestamp` and `to_date` will fail if the specified datetime pattern is invalid. In Spark 3.0 or earlier, they result `NULL`.
 
+  - In Spark 3.1, we can use regexp function, it is the alias of rlike funtion, which functions the same with rike funtion.

Review comment:
       We don't need this.






[GitHub] [spark] AmplabJenkins commented on pull request #29009: [SPARK-32193][DOCS] Update regexp usage in docs

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29009:
URL: https://github.com/apache/spark/pull/29009#issuecomment-655866811





