You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by sunjincheng121 <gi...@git.apache.org> on 2017/06/15 05:30:31 UTC

[GitHub] flink pull request #4127: [FLINK-6892][table]Add L/RPAD supported in SQL

GitHub user sunjincheng121 opened a pull request:

    https://github.com/apache/flink/pull/4127

    [FLINK-6892][table]Add L/RPAD supported in SQL

    In this PR. have Add L/RPAD supported in SQL,For Example:
    ```
    LPAD('hi',4,'??') -> '??hi'
    LPAD('hi',1,'??') -> 'h'
    RPAD('hi',4,'') -> 'hi'
    RPAD('hi',1,'??') -> 'h'
    ```
    - [x] General
      - The pull request references the related JIRA issue ("[FLINK-XXX] Jira title text")
      - The pull request addresses only one issue
      - Each commit in the PR has a meaningful commit message (including the JIRA id)
    
    - [ ] Documentation
      - Documentation has been added for new functionality
      - Old documentation affected by the pull request has been updated
      - JavaDoc for public methods has been added
    
    - [x] Tests & Build
      - Functionality added by the pull request is covered by tests
      - `mvn clean verify` has been executed successfully locally or a Travis build has passed


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sunjincheng121/flink FLINK-6892-PR

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4127.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4127
    
----
commit ea7006b6ab5cb6ead8d3f2a2b4d5fc2686454a9a
Author: sunjincheng121 <su...@gmail.com>
Date:   2017-06-15T05:23:32Z

    [FLINK-6892][table]Add L/RPAD supported in SQL

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4127: [FLINK-6892][table]Add L/RPAD supported in SQL

Posted by sunjincheng121 <gi...@git.apache.org>.
Github user sunjincheng121 commented on the issue:

    https://github.com/apache/flink/pull/4127
  
    I had rebase the code and update the PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request #4127: [FLINK-6892][table]Add L/RPAD supported in SQL

Posted by twalthr <gi...@git.apache.org>.
Github user twalthr commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4127#discussion_r160628359
  
    --- Diff: docs/dev/table/sql.md ---
    @@ -1744,6 +1743,29 @@ CONCAT_WS(separator, string1, string2,...)
           </td>
           <td>
             <p>Returns the string that results from concatenating the arguments using a separator. The separator is added between the strings to be concatenated. Returns NULL If the separator is NULL. CONCAT_WS() does not skip empty strings. However, it does skip any NULL argument. E.g. <code>CONCAT_WS("~", "AA", "BB", "", "CC")</code> returns <code>AA~BB~~CC</code></p>
    +  </td>
    +    </tr>
    +
    +        <tr>
    +      <td>
    +        {% highlight text %}
    +LPAD(text string, len integer, pad string)
    +{% endhighlight %}
    +      </td>
    +      <td>
    +        <p>Returns the string text, left-padded with the string pad to a length of len characters. If text is longer than len, the return value is shortened to len characters.</p>
    --- End diff --
    
    Can you add two examples where the length is greater/smaller that the string.


---

[GitHub] flink pull request #4127: [FLINK-6892][table]Add L/RPAD supported in SQL

Posted by twalthr <gi...@git.apache.org>.
Github user twalthr commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4127#discussion_r127995237
  
    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/functions/ScalarFunctions.scala ---
    @@ -82,4 +82,102 @@ object ScalarFunctions {
         }
         sb.toString
       }
    +
    +  /**
    +    * Returns the string str, left-padded with the string pad to a length of len characters.
    +    * If str is longer than len, the return value is shortened to len characters.
    +    */
    +  def lpad(base: String, len: Integer, pad: String): String = {
    +    if (base == null || len == null || pad == null) {
    +      return null
    +    }
    +    var data = "".getBytes
    +    if (data.length < len) {
    +      data = new Array[Byte](len)
    --- End diff --
    
    Why do we work on bytes here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4127: [FLINK-6892][table]Add L/RPAD supported in SQL

Posted by sunjincheng121 <gi...@git.apache.org>.
Github user sunjincheng121 commented on the issue:

    https://github.com/apache/flink/pull/4127
  
     I'll Rebase code and add doc after FLINK-6960 & FLINK-6925.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4127: [FLINK-6892][table]Add L/RPAD supported in SQL

Posted by sunjincheng121 <gi...@git.apache.org>.
Github user sunjincheng121 commented on the issue:

    https://github.com/apache/flink/pull/4127
  
    @twalthr Thanks for the review! I have update the PR. I will be very grateful if you can review again.
    
    Thanks, Jincheng


---

[GitHub] flink pull request #4127: [FLINK-6892][table]Add L/RPAD supported in SQL

Posted by twalthr <gi...@git.apache.org>.
Github user twalthr commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4127#discussion_r127995434
  
    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/functions/ScalarFunctions.scala ---
    @@ -82,4 +82,102 @@ object ScalarFunctions {
         }
         sb.toString
       }
    +
    +  /**
    +    * Returns the string str, left-padded with the string pad to a length of len characters.
    +    * If str is longer than len, the return value is shortened to len characters.
    +    */
    +  def lpad(base: String, len: Integer, pad: String): String = {
    +    if (base == null || len == null || pad == null) {
    +      return null
    +    }
    +    var data = "".getBytes
    +    if (data.length < len) {
    +      data = new Array[Byte](len)
    +    }
    +    val baseBytes = base.getBytes
    +    val padBytes = pad.getBytes
    +
    +    // The length of the padding needed
    +    val pos = Math.max(len - base.length, 0)
    +
    +    // Copy the padding
    +    var i = 0
    +
    +    while (i < pos) {
    +      {
    --- End diff --
    
    Is this code auto-generated? It looks weird.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #4127: [FLINK-6892][table]Add L/RPAD supported in SQL

Posted by twalthr <gi...@git.apache.org>.
Github user twalthr commented on the issue:

    https://github.com/apache/flink/pull/4127
  
    Thank you @sunjincheng121. I will expose this to the Table API as well and merge this.


---

[GitHub] flink pull request #4127: [FLINK-6892][table]Add L/RPAD supported in SQL

Posted by twalthr <gi...@git.apache.org>.
Github user twalthr commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4127#discussion_r160630263
  
    --- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/table/expressions/validation/ScalarFunctionsValidationTest.scala ---
    @@ -78,10 +78,21 @@ class ScalarFunctionsValidationTest extends ScalarTypesTestBase {
       }
     
       @Test(expected = classOf[ValidationException])
    -  def testTimestampAddWithWrongQuantity(): Unit ={
    +  def testTimestampAddWithWrongQuantity(): Unit = {
         testSqlApi("TIMESTAMPADD(YEAR, 1.0, timestamp '2016-02-24 12:42:25')", "2016-06-16")
       }
     
    +  @Test(expected = classOf[ValidationException])
    +  def testLpadWithNull(): Unit = {
    +    // Must fail. Parameter of base string must not be null.
    +    testSqlApi("LPAD(null,1,'??')", "")
    --- End diff --
    
    I think we don't need to fail here. Returning null should be the SQL standard here. 


---

[GitHub] flink pull request #4127: [FLINK-6892][table]Add L/RPAD supported in SQL

Posted by twalthr <gi...@git.apache.org>.
Github user twalthr commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4127#discussion_r160629826
  
    --- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/table/expressions/ScalarFunctionsTest.scala ---
    @@ -352,6 +352,24 @@ class ScalarFunctionsTest extends ScalarTypesTestBase {
           "Flink~~~~xx")
       }
     
    +  @Test
    +  def testLpad(): Unit = {
    +    testSqlApi("LPAD('hi',4,'??')", "??hi")
    --- End diff --
    
    Add a test with unicode characters.


---

[GitHub] flink pull request #4127: [FLINK-6892][table]Add L/RPAD supported in SQL

Posted by twalthr <gi...@git.apache.org>.
Github user twalthr commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4127#discussion_r160629246
  
    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/functions/ScalarFunctions.scala ---
    @@ -104,9 +103,84 @@ object ScalarFunctions {
         }
         if (base <= 1.0) {
           throw new IllegalArgumentException(s"base of 'log(base, x)' must be > 1, but base = $base")
    -    }
    -    else {
    +    } else {
           Math.log(x) / Math.log(base)
         }
       }
    +
    +  /**
    +    * Returns the string str, left-padded with the string pad to a length of len characters.
    +    * If str is longer than len, the return value is shortened to len characters.
    +    */
    +  def lpad(base: String, len: Integer, pad: String): String = {
    +    if (base == null || len == null || pad == null) {
    --- End diff --
    
    `addSqlFunctionMethod` already checks for null in the arguments so we can use primitive integer here and remove this checks.


---

[GitHub] flink pull request #4127: [FLINK-6892][table]Add L/RPAD supported in SQL

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/4127


---

[GitHub] flink issue #4127: [FLINK-6892][table]Add L/RPAD supported in SQL

Posted by sunjincheng121 <gi...@git.apache.org>.
Github user sunjincheng121 commented on the issue:

    https://github.com/apache/flink/pull/4127
  
    Hi, @twalthr Thanks for your reviewing. 
    For performance reasons I have not using `String.format()`. I have a simple test using `String.format()` and PR approach. run `100000` times. The result as follows:
    `str.concat(String.format("%" + (length - str.length()) + "s", "").replace(" ", car));` : 1345ms
    `The PR lpad` : 145ms. So I think we should not using `String.format()`. Please tell me if there are any mistake.
    
    Thanks, Jincheng



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request #4127: [FLINK-6892][table]Add L/RPAD supported in SQL

Posted by twalthr <gi...@git.apache.org>.
Github user twalthr commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4127#discussion_r127995503
  
    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/functions/ScalarFunctions.scala ---
    @@ -82,4 +82,102 @@ object ScalarFunctions {
         }
         sb.toString
       }
    +
    +  /**
    +    * Returns the string str, left-padded with the string pad to a length of len characters.
    +    * If str is longer than len, the return value is shortened to len characters.
    +    */
    +  def lpad(base: String, len: Integer, pad: String): String = {
    +    if (base == null || len == null || pad == null) {
    +      return null
    +    }
    +    var data = "".getBytes
    +    if (data.length < len) {
    +      data = new Array[Byte](len)
    +    }
    +    val baseBytes = base.getBytes
    +    val padBytes = pad.getBytes
    +
    +    // The length of the padding needed
    +    val pos = Math.max(len - base.length, 0)
    +
    +    // Copy the padding
    +    var i = 0
    +
    +    while (i < pos) {
    +      {
    +        var j = 0
    +        while (j < pad.length && j < pos - i) {
    +          {
    +            data(i + j) = padBytes(j)
    +          }
    +          {
    +            j += 1;
    +            j - 1
    --- End diff --
    
    Isn't this expression useless?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request #4127: [FLINK-6892][table]Add L/RPAD supported in SQL

Posted by twalthr <gi...@git.apache.org>.
Github user twalthr commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4127#discussion_r160629614
  
    --- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/table/expressions/ScalarFunctionsTest.scala ---
    @@ -352,6 +352,24 @@ class ScalarFunctionsTest extends ScalarTypesTestBase {
           "Flink~~~~xx")
       }
     
    +  @Test
    +  def testLpad(): Unit = {
    +    testSqlApi("LPAD('hi',4,'??')", "??hi")
    +    testSqlApi("LPAD('hi',1,'??')", "h")
    +    testSqlApi("LPAD('',1,'??')", "?")
    +    testSqlApi("LPAD('',30,'??')", "??????????????????????????????")
    +    testSqlApi("LPAD('111',-2,'??')", "")
    --- End diff --
    
    Is `testSqlApi("LPAD('111',-2,'??')", "")` standard?
    I just tested this with MySQL and it returns NULL


---