Posted to issues@tajo.apache.org by eminency <gi...@git.apache.org> on 2015/11/27 07:49:15 UTC

[GitHub] tajo pull request: TAJO-1997: Registering UDF, it needs to check d...

GitHub user eminency opened a pull request:

    https://github.com/apache/tajo/pull/883

    TAJO-1997: Registering UDF, it needs to check duplication

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/eminency/tajo ambiguous_excpetion

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tajo/pull/883.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #883
    
----
commit f2e2ff1b0399aded42275b7c02c7c46368820c38
Author: Jongyoung Park <em...@gmail.com>
Date:   2015-11-16T03:28:52Z

    AmbiguousFunctionException is thrown when duplicated function is found

commit 92b0a0f81b773aa220f0b9425aa5ddc6f1a4f5d0
Author: Jongyoung Park <em...@gmail.com>
Date:   2015-11-20T02:32:35Z

    AmbiguousException

commit 2826324f4053afc15c306c3fb3959c434fcb667f
Author: Jongyoung Park <em...@gmail.com>
Date:   2015-11-27T06:45:21Z

    fix bug

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-1997: Registering UDF, it needs to check d...

Posted by eminency <gi...@git.apache.org>.
Github user eminency commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/883#discussion_r47589038
  
    --- Diff: tajo-core/src/main/java/org/apache/tajo/engine/function/FunctionLoader.java ---
    @@ -299,4 +284,38 @@ private static StaticMethodInvocationDesc extractStaticMethodInvocation(Method m
     
         return sqlFuncs;
       }
    +
    +  public static Collection<FunctionDesc> loadFunctions(TajoConf conf) throws IOException, AmbiguousFunctionException {
    +    List<FunctionDesc> functionList = new ArrayList<>(loadBuiltinFunctions().values());
    +    List<FunctionDesc> udfs = loadUserDefinedFunctions(conf);
    +
    +    return mergeFunctionLists(functionList, udfs);
    +  }
    +
    +  @SafeVarargs
    +  static Collection<FunctionDesc> mergeFunctionLists(List<FunctionDesc> ... functionLists)
    +      throws AmbiguousFunctionException {
    +
    +    Map<Integer, FunctionDesc> funcMap = new HashMap<>();
    +    List<FunctionDesc> baseFuncList = functionLists[0];
    +
    +    // Build a map with a first list
    +    for (FunctionDesc desc: baseFuncList) {
    +      funcMap.put(desc.hashCodeWithoutType(), desc);
    +    }
    +
    +    // Check duplicates for other function lists(should be UDFs practically)
    +    for (int i=1; i<functionLists.length; i++) {
    +      for (FunctionDesc desc: functionLists[i]) {
    +        if (funcMap.containsKey(desc.hashCodeWithoutType())) {
    --- End diff --
    
    Good suggestion! I fixed it.



[GitHub] tajo pull request: TAJO-1997: Registering UDF, it needs to check d...

Posted by eminency <gi...@git.apache.org>.
Github user eminency commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/883#discussion_r47590084
  
    --- Diff: tajo-core/src/main/java/org/apache/tajo/engine/function/FunctionLoader.java ---
    @@ -299,4 +284,38 @@ private static StaticMethodInvocationDesc extractStaticMethodInvocation(Method m
     
         return sqlFuncs;
       }
    +
    +  public static Collection<FunctionDesc> loadFunctions(TajoConf conf) throws IOException, AmbiguousFunctionException {
    +    List<FunctionDesc> functionList = new ArrayList<>(loadBuiltinFunctions().values());
    +    List<FunctionDesc> udfs = loadUserDefinedFunctions(conf);
    +
    +    return mergeFunctionLists(functionList, udfs);
    +  }
    +
    +  @SafeVarargs
    +  static Collection<FunctionDesc> mergeFunctionLists(List<FunctionDesc> ... functionLists)
    +      throws AmbiguousFunctionException {
    +
    +    Map<Integer, FunctionDesc> funcMap = new HashMap<>();
    +    List<FunctionDesc> baseFuncList = functionLists[0];
    +
    +    // Build a map with a first list
    +    for (FunctionDesc desc: baseFuncList) {
    +      funcMap.put(desc.hashCodeWithoutType(), desc);
    +    }
    +
    +    // Check duplicates for other function lists(should be UDFs practically)
    +    for (int i=1; i<functionLists.length; i++) {
    --- End diff --
    
    I considered that,
    but there are two reasons why I decided not to do it.
    
    First, the built-in functions are defined statically, that is, they do not change frequently. So I thought checking them at every startup could be a needless burden (there are more than 200 built-in functions, so a pairwise check would take more than 20K comparisons).
    
    Second, as you already know, the check routine does not consider the function type. But the built-in functions already include functions that are duplicates except for their function type. For example, there is sum() with and without the 'distinct' feature.
    Since the check logic would have to be different, it should be done separately, before the current code path.
    Thus, I thought that task, if it should be done at all, is outside the scope of this issue.
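
A rough sketch of the two points in this comment, using hypothetical names (`BuiltinDuplicateSketch`, `Func`, `keyWithoutType`) rather than Tajo's actual classes. It assumes that a built-in check would compare every pair of functions, and that a function is identified by its name and parameter types once the function type is ignored. The type labels are illustrative only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BuiltinDuplicateSketch {
  // Hypothetical stand-in for a function descriptor; this is not Tajo's FunctionDesc.
  static final class Func {
    final String name;
    final String paramTypes;
    final String funcType; // illustrative label, e.g. "AGGREGATION"

    Func(String name, String paramTypes, String funcType) {
      this.name = name;
      this.paramTypes = paramTypes;
      this.funcType = funcType;
    }

    // Identity of the function when the function type is deliberately ignored.
    String keyWithoutType() {
      return name + "(" + paramTypes + ")";
    }
  }

  // Number of unordered pairs among n items: n * (n - 1) / 2.
  static long unorderedPairs(int n) {
    return (long) n * (n - 1) / 2;
  }

  // Returns each type-blind key that occurs more than once, in first-repeat order.
  static List<String> typeBlindDuplicates(List<Func> funcs) {
    Map<String, Integer> counts = new HashMap<>();
    List<String> duplicates = new ArrayList<>();
    for (Func f : funcs) {
      int seen = counts.merge(f.keyWithoutType(), 1, Integer::sum);
      if (seen == 2) {
        duplicates.add(f.keyWithoutType());
      }
    }
    return duplicates;
  }

  public static void main(String[] args) {
    System.out.println(unorderedPairs(200)); // prints 19900

    List<Func> builtins = new ArrayList<>();
    builtins.add(new Func("sum", "INT8", "AGGREGATION"));
    builtins.add(new Func("sum", "INT8", "DISTINCT_AGGREGATION"));
    builtins.add(new Func("avg", "INT8", "AGGREGATION"));
    System.out.println(typeBlindDuplicates(builtins)); // prints [sum(INT8)]
  }
}
```

With exactly 200 functions the all-pairs count is 19,900, which is in line with the 20K estimate above. The type-blind key reports the two sum variants as duplicates even though both are legitimate, which is why a built-in check would need different logic.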



[GitHub] tajo pull request: TAJO-1997: Registering UDF, it needs to check d...

Posted by jihoonson <gi...@git.apache.org>.
Github user jihoonson commented on the pull request:

    https://github.com/apache/tajo/pull/883#issuecomment-166259834
  
    +1. The latest patch looks good to me.
    I'll commit shortly.



[GitHub] tajo pull request: TAJO-1997: Registering UDF, it needs to check d...

Posted by jihoonson <gi...@git.apache.org>.
Github user jihoonson commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/883#discussion_r47456682
  
    --- Diff: tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/catalog/FunctionDesc.java ---
    @@ -167,6 +171,13 @@ public boolean equals(Object obj) {
         }
         return false;
       }
    +
    +  public boolean equalsSignature(Object obj) {
    --- End diff --
    
    This method seems to be meant to check real equality between functions. If so, please fix the code below: it compares a function signature with a function desc.
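
The method body under review is not quoted in this message, so the following is only a guess at the kind of mismatch being described. `Signature`, `Desc`, and both method variants are hypothetical and do not reproduce Tajo's code. The sketch shows why comparing a signature object with a descriptor object can never succeed, and what a like-for-like comparison looks like.

```java
import java.util.Objects;

final class SignatureEqualitySketch {
  // Hypothetical signature type: function name plus parameter types.
  static final class Signature {
    final String name;
    final String paramTypes;

    Signature(String name, String paramTypes) {
      this.name = name;
      this.paramTypes = paramTypes;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Signature)) {
        return false;
      }
      Signature other = (Signature) o;
      return name.equals(other.name) && paramTypes.equals(other.paramTypes);
    }

    @Override
    public int hashCode() {
      return Objects.hash(name, paramTypes);
    }
  }

  // Hypothetical descriptor that wraps a signature and an implementation name.
  static final class Desc {
    final Signature signature;
    final String implClass;

    Desc(Signature signature, String implClass) {
      this.signature = signature;
      this.implClass = implClass;
    }

    // Mismatched variant: passes a Desc to Signature.equals, so it is always false.
    boolean equalsSignatureMismatched(Object obj) {
      return obj instanceof Desc && signature.equals(obj);
    }

    // Like-for-like variant: compares this signature with the other desc's signature.
    boolean equalsSignature(Object obj) {
      return obj instanceof Desc && signature.equals(((Desc) obj).signature);
    }
  }

  public static void main(String[] args) {
    Desc builtin = new Desc(new Signature("f", "INT4"), "BuiltinF");
    Desc udf = new Desc(new Signature("f", "INT4"), "UserF");
    System.out.println(builtin.equalsSignatureMismatched(udf)); // prints false
    System.out.println(builtin.equalsSignature(udf));           // prints true
  }
}
```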



[GitHub] tajo pull request: TAJO-1997: Registering UDF, it needs to check d...

Posted by eminency <gi...@git.apache.org>.
Github user eminency commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/883#discussion_r47589327
  
    --- Diff: tajo-core/src/main/java/org/apache/tajo/engine/function/FunctionLoader.java ---
    @@ -299,4 +284,38 @@ private static StaticMethodInvocationDesc extractStaticMethodInvocation(Method m
     
         return sqlFuncs;
       }
    +
    +  public static Collection<FunctionDesc> loadFunctions(TajoConf conf) throws IOException, AmbiguousFunctionException {
    +    List<FunctionDesc> functionList = new ArrayList<>(loadBuiltinFunctions().values());
    +    List<FunctionDesc> udfs = loadUserDefinedFunctions(conf);
    +
    +    return mergeFunctionLists(functionList, udfs);
    +  }
    +
    +  @SafeVarargs
    +  static Collection<FunctionDesc> mergeFunctionLists(List<FunctionDesc> ... functionLists)
    +      throws AmbiguousFunctionException {
    +
    +    Map<Integer, FunctionDesc> funcMap = new HashMap<>();
    +    List<FunctionDesc> baseFuncList = functionLists[0];
    +
    +    // Build a map with a first list
    +    for (FunctionDesc desc: baseFuncList) {
    +      funcMap.put(desc.hashCodeWithoutType(), desc);
    +    }
    +
    +    // Check duplicates for other function lists(should be UDFs practically)
    +    for (int i=1; i<functionLists.length; i++) {
    +      for (FunctionDesc desc: functionLists[i]) {
    +        if (funcMap.containsKey(desc.hashCodeWithoutType())) {
    +          throw new AmbiguousFunctionException(String.format("UDF %s", desc.toString()));
    --- End diff --
    
    It is based on the existing error message template:
    
    https://github.com/apache/tajo/blob/ef94bb38d225deaff2d1eeb3d916fbe765412c3a/tajo-common/src/main/java/org/apache/tajo/exception/ErrorMessages.java#L87
    
    So to fix it as you advised, several other places would need to be changed as well.
    
    If you need this, I think it would be better to create a separate issue.



[GitHub] tajo pull request: TAJO-1997: Registering UDF, it needs to check d...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/tajo/pull/883



[GitHub] tajo pull request: TAJO-1997: Registering UDF, it needs to check d...

Posted by eminency <gi...@git.apache.org>.
Github user eminency commented on the pull request:

    https://github.com/apache/tajo/pull/883#issuecomment-166204080
  
    Hi, @jihoonson .
    I applied what you suggested and rebased.
    Could you verify them?



[GitHub] tajo pull request: TAJO-1997: Registering UDF, it needs to check d...

Posted by jihoonson <gi...@git.apache.org>.
Github user jihoonson commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/883#discussion_r47456624
  
    --- Diff: tajo-core/src/main/java/org/apache/tajo/engine/function/FunctionLoader.java ---
    @@ -299,4 +284,38 @@ private static StaticMethodInvocationDesc extractStaticMethodInvocation(Method m
     
         return sqlFuncs;
       }
    +
    +  public static Collection<FunctionDesc> loadFunctions(TajoConf conf) throws IOException, AmbiguousFunctionException {
    +    List<FunctionDesc> functionList = new ArrayList<>(loadBuiltinFunctions().values());
    +    List<FunctionDesc> udfs = loadUserDefinedFunctions(conf);
    +
    +    return mergeFunctionLists(functionList, udfs);
    +  }
    +
    +  @SafeVarargs
    +  static Collection<FunctionDesc> mergeFunctionLists(List<FunctionDesc> ... functionLists)
    +      throws AmbiguousFunctionException {
    +
    +    Map<Integer, FunctionDesc> funcMap = new HashMap<>();
    +    List<FunctionDesc> baseFuncList = functionLists[0];
    +
    +    // Build a map with a first list
    +    for (FunctionDesc desc: baseFuncList) {
    +      funcMap.put(desc.hashCodeWithoutType(), desc);
    +    }
    +
    +    // Check duplicates for other function lists(should be UDFs practically)
    +    for (int i=1; i<functionLists.length; i++) {
    --- End diff --
    
    You seem to assume that there is no conflict between built-in functions. But, to catch our own mistakes, how about checking the built-in functions too?



[GitHub] tajo pull request: TAJO-1997: Registering UDF, it needs to check d...

Posted by jihoonson <gi...@git.apache.org>.
Github user jihoonson commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/883#discussion_r47456204
  
    --- Diff: tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/catalog/FunctionDesc.java ---
    @@ -167,6 +171,13 @@ public boolean equals(Object obj) {
         }
         return false;
       }
    +
    +  public boolean equalsSignature(Object obj) {
    --- End diff --
    
    Please remove the unused method.



[GitHub] tajo pull request: TAJO-1997: Registering UDF, it needs to check d...

Posted by jihoonson <gi...@git.apache.org>.
Github user jihoonson commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/883#discussion_r47882803
  
    --- Diff: tajo-core/src/main/java/org/apache/tajo/engine/function/FunctionLoader.java ---
    @@ -299,4 +284,38 @@ private static StaticMethodInvocationDesc extractStaticMethodInvocation(Method m
     
         return sqlFuncs;
       }
    +
    +  public static Collection<FunctionDesc> loadFunctions(TajoConf conf) throws IOException, AmbiguousFunctionException {
    +    List<FunctionDesc> functionList = new ArrayList<>(loadBuiltinFunctions().values());
    +    List<FunctionDesc> udfs = loadUserDefinedFunctions(conf);
    +
    +    return mergeFunctionLists(functionList, udfs);
    +  }
    +
    +  @SafeVarargs
    +  static Collection<FunctionDesc> mergeFunctionLists(List<FunctionDesc> ... functionLists)
    +      throws AmbiguousFunctionException {
    +
    +    Map<Integer, FunctionDesc> funcMap = new HashMap<>();
    +    List<FunctionDesc> baseFuncList = functionLists[0];
    +
    +    // Build a map with a first list
    +    for (FunctionDesc desc: baseFuncList) {
    +      funcMap.put(desc.hashCodeWithoutType(), desc);
    +    }
    +
    +    // Check duplicates for other function lists(should be UDFs practically)
    +    for (int i=1; i<functionLists.length; i++) {
    --- End diff --
    
    I understand. Would you please leave some comments about this discussion?



[GitHub] tajo pull request: TAJO-1997: Registering UDF, it needs to check d...

Posted by eminency <gi...@git.apache.org>.
Github user eminency commented on the pull request:

    https://github.com/apache/tajo/pull/883#issuecomment-164664621
  
    @jihoonson 
    Thanks for the review.
    I applied what you advised and left some answers.
    Please check them.



[GitHub] tajo pull request: TAJO-1997: Registering UDF, it needs to check d...

Posted by jihoonson <gi...@git.apache.org>.
Github user jihoonson commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/883#discussion_r47456442
  
    --- Diff: tajo-core/src/main/java/org/apache/tajo/engine/function/FunctionLoader.java ---
    @@ -299,4 +284,38 @@ private static StaticMethodInvocationDesc extractStaticMethodInvocation(Method m
     
         return sqlFuncs;
       }
    +
    +  public static Collection<FunctionDesc> loadFunctions(TajoConf conf) throws IOException, AmbiguousFunctionException {
    +    List<FunctionDesc> functionList = new ArrayList<>(loadBuiltinFunctions().values());
    +    List<FunctionDesc> udfs = loadUserDefinedFunctions(conf);
    +
    +    return mergeFunctionLists(functionList, udfs);
    +  }
    +
    +  @SafeVarargs
    +  static Collection<FunctionDesc> mergeFunctionLists(List<FunctionDesc> ... functionLists)
    +      throws AmbiguousFunctionException {
    +
    +    Map<Integer, FunctionDesc> funcMap = new HashMap<>();
    +    List<FunctionDesc> baseFuncList = functionLists[0];
    +
    +    // Build a map with a first list
    +    for (FunctionDesc desc: baseFuncList) {
    +      funcMap.put(desc.hashCodeWithoutType(), desc);
    +    }
    +
    +    // Check duplicates for other function lists(should be UDFs practically)
    +    for (int i=1; i<functionLists.length; i++) {
    +      for (FunctionDesc desc: functionLists[i]) {
    +        if (funcMap.containsKey(desc.hashCodeWithoutType())) {
    +          throw new AmbiguousFunctionException(String.format("UDF %s", desc.toString()));
    --- End diff --
    
    It would be better to print which existing function the newly found one is ambiguous with.
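
One way to do what this comment asks, sketched with hypothetical names: `AmbiguousException` is a local stand-in and not Tajo's `AmbiguousFunctionException`, and the registry is a plain map from a signature string to a description. The wording of the message is also an assumption. The idea is to keep the existing entry when a conflict is found, so that both sides can be named.

```java
import java.util.HashMap;
import java.util.Map;

final class AmbiguityMessageSketch {
  // Local stand-in exception; not the exception class used in the patch.
  static final class AmbiguousException extends Exception {
    AmbiguousException(String message) {
      super(message);
    }
  }

  // Registers a function under its signature, or reports both conflicting entries.
  static void register(Map<String, String> registry, String signature, String description)
      throws AmbiguousException {
    // putIfAbsent returns the value already mapped to the key, or null if there was none.
    String existing = registry.putIfAbsent(signature, description);
    if (existing != null) {
      throw new AmbiguousException(
          String.format("UDF %s is ambiguous with existing function %s", description, existing));
    }
  }

  public static void main(String[] args) {
    Map<String, String> registry = new HashMap<>();
    try {
      register(registry, "sum(INT8)", "builtin sum(INT8)");
      register(registry, "sum(INT8)", "user sum(INT8)");
    } catch (AmbiguousException e) {
      System.out.println(e.getMessage());
    }
  }
}
```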



[GitHub] tajo pull request: TAJO-1997: Registering UDF, it needs to check d...

Posted by jihoonson <gi...@git.apache.org>.
Github user jihoonson commented on the pull request:

    https://github.com/apache/tajo/pull/883#issuecomment-165389548
  
    Thanks for the update. I left a trivial comment.
    The latest patch looks good to me. Would you rebase it?



[GitHub] tajo pull request: TAJO-1997: Registering UDF, it needs to check d...

Posted by eminency <gi...@git.apache.org>.
Github user eminency commented on the pull request:

    https://github.com/apache/tajo/pull/883#issuecomment-160060793
  
    Ready for review



[GitHub] tajo pull request: TAJO-1997: Registering UDF, it needs to check d...

Posted by jihoonson <gi...@git.apache.org>.
Github user jihoonson commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/883#discussion_r47456570
  
    --- Diff: tajo-core/src/main/java/org/apache/tajo/engine/function/FunctionLoader.java ---
    @@ -299,4 +284,38 @@ private static StaticMethodInvocationDesc extractStaticMethodInvocation(Method m
     
         return sqlFuncs;
       }
    +
    +  public static Collection<FunctionDesc> loadFunctions(TajoConf conf) throws IOException, AmbiguousFunctionException {
    +    List<FunctionDesc> functionList = new ArrayList<>(loadBuiltinFunctions().values());
    +    List<FunctionDesc> udfs = loadUserDefinedFunctions(conf);
    +
    +    return mergeFunctionLists(functionList, udfs);
    +  }
    +
    +  @SafeVarargs
    +  static Collection<FunctionDesc> mergeFunctionLists(List<FunctionDesc> ... functionLists)
    +      throws AmbiguousFunctionException {
    +
    +    Map<Integer, FunctionDesc> funcMap = new HashMap<>();
    +    List<FunctionDesc> baseFuncList = functionLists[0];
    +
    +    // Build a map with a first list
    +    for (FunctionDesc desc: baseFuncList) {
    +      funcMap.put(desc.hashCodeWithoutType(), desc);
    +    }
    +
    +    // Check duplicates for other function lists(should be UDFs practically)
    +    for (int i=1; i<functionLists.length; i++) {
    +      for (FunctionDesc desc: functionLists[i]) {
    +        if (funcMap.containsKey(desc.hashCodeWithoutType())) {
    --- End diff --
    
    As you know, an equality check based on hash codes can produce false positives. I know they are very rare, but we need to check real equality for correct operation.
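
A minimal illustration of the false positive described here. It uses plain strings as stand-ins for function signatures (an assumption; the patch keys its map by `hashCodeWithoutType()`), and relies on the well-known fact that `"Aa"` and `"BB"` have the same `String.hashCode()`. The helper names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class HashCollisionSketch {
  // Duplicate check that trusts the hash code alone.
  static boolean duplicateByHashOnly(Map<Integer, String> seen, String signature) {
    return seen.containsKey(signature.hashCode());
  }

  // Uses the hash code only to find a candidate, then confirms with real equality.
  static boolean duplicateByEquality(Map<Integer, String> seen, String signature) {
    String candidate = seen.get(signature.hashCode());
    return candidate != null && candidate.equals(signature);
  }

  public static void main(String[] args) {
    Map<Integer, String> seen = new HashMap<>();
    seen.put("Aa".hashCode(), "Aa");

    // "Aa" and "BB" are different strings with the same hash code (2112).
    System.out.println(duplicateByHashOnly(seen, "BB")); // prints true (false positive)
    System.out.println(duplicateByEquality(seen, "BB")); // prints false
    System.out.println(duplicateByEquality(seen, "Aa")); // prints true
  }
}
```

A map keyed by hash code can also hold only one entry per hash value, so keying the map by an object with consistent `equals` and `hashCode` methods is another option worth considering.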



[GitHub] tajo pull request: TAJO-1997: Registering UDF, it needs to check d...

Posted by eminency <gi...@git.apache.org>.
Github user eminency commented on a diff in the pull request:

    https://github.com/apache/tajo/pull/883#discussion_r47589003
  
    --- Diff: tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/catalog/FunctionDesc.java ---
    @@ -167,6 +171,13 @@ public boolean equals(Object obj) {
         }
         return false;
       }
    +
    +  public boolean equalsSignature(Object obj) {
    --- End diff --
    
    It's used now.

