You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@tajo.apache.org by hyunsik <gi...@git.apache.org> on 2014/10/05 02:11:33 UTC

[GitHub] tajo pull request: TAJO-1092: Improve the function system to allow...

GitHub user hyunsik opened a pull request:

    https://github.com/apache/tajo/pull/178

    TAJO-1092: Improve the function system to allow other function implementation types.

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hyunsik/tajo TAJO-1092

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tajo/pull/178.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #178
    
----
commit 149b92afe0d5a2903ddd55b4e788482e3344477b
Author: Hyunsik Choi <hy...@apache.org>
Date:   2014-10-04T22:23:10Z

    TAJO-1092: Improve the function system to allow other function implementation types.

commit 6df2f9f4d7c38f751b761f9dbdd224cc3c89b2ad
Author: Hyunsik Choi <hy...@apache.org>
Date:   2014-10-04T22:23:37Z

    Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/tajo into TAJO-1092

commit f6fe193152dc8a121def75b925bdde0e102a2cac
Author: Hyunsik Choi <hy...@apache.org>
Date:   2014-10-05T00:09:49Z

    Refactored code generation for function binding.
     * Moved some methods to TajoGeneratorAdaptor.
     * Refactored method signatures of isMatched and isCompatible.
     * Fixed unit test bugs of TestEvalCodeGenerator.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-1092: Improve the function system to allow...

Posted by hyunsik <gi...@git.apache.org>.
Github user hyunsik commented on the pull request:

    https://github.com/apache/tajo/pull/178#issuecomment-58768370
  
    rebased.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-1092: Improve the function system to allow...

Posted by hyunsik <gi...@git.apache.org>.
Github user hyunsik commented on the pull request:

    https://github.com/apache/tajo/pull/178#issuecomment-59063492
  
    The function support will be added in my next patch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-1092: Improve the function system to allow...

Posted by jinossy <gi...@git.apache.org>.
Github user jinossy commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/178#discussion_r18773411
  
    --- Diff: tajo-core/src/main/java/org/apache/tajo/engine/function/math/MathFunctions.java ---
    @@ -0,0 +1,46 @@
    +/***
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.tajo.engine.function.math;
    +
    +import org.apache.tajo.function.FunctionCollection;
    +import org.apache.tajo.function.ScalarFunction;
    +
    +import static org.apache.tajo.common.TajoDataTypes.Type.FLOAT8;
    +
    +@FunctionCollection
    +public class MathFunctions {
    +
    +  @ScalarFunction(name = "pi", returnType = FLOAT8)
    +  public static double pi() {
    +    return Math.PI;
    +  }
    +
    +  @ScalarFunction(name = "pow", returnType = FLOAT8, paramTypes = {FLOAT8, FLOAT8})
    +  public static Double pow(Double x, Double y) {
    +    if (x == null || y == null) {
    +      return null;
    +    }
    +    return Math.pow(x, y);
    +  }
    +
    +//  @ScalarFunction(name = "pow", returnType = FLOAT8, paramTypes = {FLOAT8, FLOAT8})
    --- End diff --
    
     Please remove the commented out line.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-1092: Improve the function system to allow...

Posted by jinossy <gi...@git.apache.org>.
Github user jinossy commented on the pull request:

    https://github.com/apache/tajo/pull/178#issuecomment-58915658
  
    Looks great to me!
    In my opinion, you should check following example:  
    public static bool myfunc(String x, int y); //nullable + primitive


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-1092: Improve the function system to allow...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/tajo/pull/178


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-1092: Improve the function system to allow...

Posted by hyunsik <gi...@git.apache.org>.
Github user hyunsik commented on the pull request:

    https://github.com/apache/tajo/pull/178#issuecomment-57923825
  
    After this patch is committed, I'll add a documentation about how making Tajo user-defined functions using the proposed design.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-1092: Improve the function system to allow...

Posted by hyunsik <gi...@git.apache.org>.
Github user hyunsik commented on the pull request:

    https://github.com/apache/tajo/pull/178#issuecomment-57923814
  
    An example of static method in Java class is https://github.com/apache/tajo/pull/178/files#diff-f6daa76b2459470a9f3412131c0f726bR34.
    
    I designed the function annotation system to point Function Collection, which is a class including multiple static functions. For user-defined functions and built-in functions, just add function as the example. It is very easy and it enables Tajo to reuse existing functions.
    
    Besides, as you can see, SQL is based on three-valued logic (http://en.wikipedia.org/wiki/Three-valued_logic). So, each value can be nullable. Despite of boolean type, one boolean type value can be three values: TRUE, FALSE, and UNKNOWN (NULL in SQL). In the current function system, each function must deal with NULL value explicitly. Most of functions usually return NULL if at least of one parameter is NULL. ```Substr``` function is an example (https://github.com/apache/tajo/blob/master/tajo-core/src/main/java/org/apache/tajo/engine/function/string/Substr.java#L63). It gives users burden, and it is easy for users to forget NULL handling when users implement user-defined functions.
    
    In order to mitigate such a problem and to make function invocation more efficiently, I designed new function binder and new function definition approach to keep hints how a function handles NULL value.
    
    The hints are described in function parameters in a function definition. You can specify the hints by using java primitive type or class primitive type as each parameter according to null handling way. 
    
    For example:
    
    This ```pow``` function does not allow NULL values as input parameter. In this case, if at least one parameter is null, this function binder will automatically return NULL value without invoking this function. So, this function itself does not need to handle NULL value explicitly.
    ````
    @ScalarFunction(name = "pow", returnType = FLOAT8, paramTypes = {FLOAT8, FLOAT8})
       public static double pow(double x, double y) {
         return Math.pow(x, y);
    }
    ```
    
    The following function definition allow NULL value as both input parameters. In this case, this function must handle NULL value explicitly.
    ```
    @ScalarFunction(name = "pow", returnType = FLOAT8, paramTypes = {FLOAT8, FLOAT8})
    public static Double pow(Double x, Double y) {
      if (x == null || y == null) {
        return null;
      }
      return Math.pow(x, y);
    }
    ```
    
    In addition, the function binder allows a mixed use of primitive types and class primitive types. When mixed definition is used, the function binder only allow class primitive types to handle NULL values explicitly.
    
    Finally, the function binder is generated on the fly by java byte code generation technique, and it does not have any overheads even though the logic is very complex. Also, I'm expecting that this idea will remove significantly the overhead of Datum uses in the existing function system.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-1092: Improve the function system to allow...

Posted by jinossy <gi...@git.apache.org>.
Github user jinossy commented on the pull request:

    https://github.com/apache/tajo/pull/178#issuecomment-59147926
  
    This patch provide backward compatibility, so there is no issue.
    Here is my +1.  Thank you for your contribution!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-1092: Improve the function system to allow...

Posted by hyunsik <gi...@git.apache.org>.
Github user hyunsik commented on the pull request:

    https://github.com/apache/tajo/pull/178#issuecomment-59236781
  
    Thank you for your review. I'll commit it shortly.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---