You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by paul-rogers <gi...@git.apache.org> on 2017/03/10 19:55:39 UTC

[GitHub] drill pull request #777: DRILL-5330: NPE in FunctionImplementationRegistry

GitHub user paul-rogers opened a pull request:

    https://github.com/apache/drill/pull/777

    DRILL-5330: NPE in FunctionImplementationRegistry

    Fixes:
    
    * DRILL-5330: NPE in
    FunctionImplementationRegistry.functionReplacement()
    * DRILL-5331:
    NPE in FunctionImplementationRegistry.findDrillFunction() if dynamic
    UDFs disabled
    
    For DRILL-5331, we leverage an existing session option to determine if
    DUDFs are enabled. If not, we skip the DUDF registry check.
    
    For DRILL-5330, we use an existing option validator rather than
    accessing the raw option directly.
    
    Then, both options cached on setup rather than repeatedly resolved in
    each function lookup.
    
    Also includes a bit of code cleanup in the class in question.
    
    The result is that the code now works when used in a sub-operator unit
    test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/paul-rogers/drill DRILL-5330

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/777.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #777
    
----
commit 8d2d9093dd26582fa7b13fe6fb57428f9a90d170
Author: Paul Rogers <pr...@maprtech.com>
Date:   2017-03-10T19:55:13Z

    DRILL-5330: NPE in FunctionImplementationRegistry
    
    Fixes:
    
    * DRILL-5330: NPE in
    FunctionImplementationRegistry.functionReplacement()
    * DRILL-5331:
    NPE in FunctionImplementationRegistry.findDrillFunction() if dynamic
    UDFs disabled
    
    For DRILL-5331, we leverage an existing session option to determine if
    DUDFs are enabled. If not, we skip the DUDF registry check.
    
    For DRILL-5330, we use an existing option validator rather than
    accessing the raw option directly.
    
    Then, both options cached on setup rather than repeatedly resolved in
    each function lookup.
    
    Also includes a bit of code cleanup in the class in question.
    
    The result is that the code now works when used in a sub-operator unit
    test.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #777: DRILL-5330: NPE in FunctionImplementationRegistry

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/777#discussion_r105539404
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java ---
    @@ -160,7 +168,7 @@ public DrillFuncHolder findDrillFunction(FunctionResolver functionResolver, Func
         FunctionResolver exactResolver = FunctionResolverFactory.getExactResolver(functionCall);
         DrillFuncHolder holder = exactResolver.getBestMatch(functions, functionCall);
     
    -    if (holder == null) {
    +    if (holder == null && useDynamicUdfs) {
    --- End diff --
    
    Taking a step back, here is what we're trying to accomplish. When running unit tests, the DUDF mechanism is not available -- or needed. So, we want to disable the code path that uses DUDFs.
    
    The original version of this code causes an NPE when calling the code inside the if. Now, I don't fully understand what how all this works. So, if the non-DUDF check is to early, where should I move the check so we do the necessary local lookups but bypass the DUDF code?
    
    I agree that having three distinct options to disable DUDFs is excessive. (This new boot option and two system options.) As Padma explained, the two system options disable adding DUDFs but do not disable DUDF lookup. The stated reason is that the user may have DUDFs on the system, so disabling the DUDF function can't result in those functions becoming unavailable. And, because of the race conditions we've discussed, that lazy init check still has to be done, even with DUDFs are off. So, we need yet another option, only for testing, that "really" turns DUDFs off.
    
    Now, this is a murky state of affairs, so it would be better to have a single option, maybe with three values: OFF, READ_ONLY and ON to handle the various cases.
    
    The boot option is also an optimization: it says that the option can be set only at boot time, so there is no need to do an option lookup on every function resolution.
    
    Finally, it turns out that, in the constructor, the option manager is not yet initialized and so can't be accessed. As a result, we can't cache the value of a system option from the constructor.
    
    All in all, I'm open to any revision of this change which simply disables DUDFs during unit testing (except, of course, when we are testing DUDFs...)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #777: DRILL-5330: NPE in FunctionImplementationRegistry

Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/777#discussion_r105534413
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestSimpleFunctions.java ---
    @@ -133,7 +134,7 @@ public void resolveHash(DrillConfig config, LogicalExpression arg, TypeProtos.Ma
                                         FunctionImplementationRegistry registry) throws JClassAlreadyExistsException, IOException {
         final List<LogicalExpression> args = new ArrayList<>();
         args.add(arg);
    -    final String[] registeredNames = { "hash" };
    +//    final String[] registeredNames = { "hash" };
    --- End diff --
    
    Why we need to comment this? If it's not need, can we just remove it?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #777: DRILL-5330: NPE in FunctionImplementationRegistry

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/drill/pull/777


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #777: DRILL-5330: NPE in FunctionImplementationRegistry

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/777#discussion_r105539420
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestSimpleFunctions.java ---
    @@ -62,10 +62,11 @@
     
     public class TestSimpleFunctions extends ExecTest {
       //private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(TestSimpleFunctions.class);
    -  private final DrillConfig c = DrillConfig.create();
    +//  private final DrillConfig c = DrillConfig.create();
    --- End diff --
    
    I can remove it. I initially commented it just to see if things would still work.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #777: DRILL-5330: NPE in FunctionImplementationRegistry

Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/777#discussion_r105534412
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestSimpleFunctions.java ---
    @@ -62,10 +62,11 @@
     
     public class TestSimpleFunctions extends ExecTest {
       //private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(TestSimpleFunctions.class);
    -  private final DrillConfig c = DrillConfig.create();
    +//  private final DrillConfig c = DrillConfig.create();
    --- End diff --
    
    Why we need to comment this? If it's not need, can we just remove it?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #777: DRILL-5330: NPE in FunctionImplementationRegistry

Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/777#discussion_r105534393
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java ---
    @@ -160,7 +168,7 @@ public DrillFuncHolder findDrillFunction(FunctionResolver functionResolver, Func
         FunctionResolver exactResolver = FunctionResolverFactory.getExactResolver(functionCall);
         DrillFuncHolder holder = exactResolver.getBestMatch(functions, functionCall);
     
    -    if (holder == null) {
    +    if (holder == null && useDynamicUdfs) {
    --- End diff --
    
    1. When dynamic udfs are enabled, we first try to find matching function using exact function resolver `exactResolver.getBestMatch` and if match if not found check if remote and local function registries are in sync and then try to find matching function using provided function resolver `functionResolver.getBestMatch`. 
    So when dynamic udfs are disabled, we should not try to find matching function  using exact function resolver, we need to use provided one and skip function registry sync check,
    
    2. Regarding new bootstap option, since we already have system / session option `ExecConstants.USE_DYNAMIC_UDFS`, I suggest we use it during this check, first we won't have two similar options, second if we re-write method as suggested above (point 1), users who disabled dynamic udfs using system / session option will have performance benefit since we won't double-check function existence. Probably we should have done this when `ExecConstants.USE_DYNAMIC_UDFS` was introduced in first place.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #777: DRILL-5330: NPE in FunctionImplementationRegistry

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/777#discussion_r105792412
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java ---
    @@ -160,7 +168,7 @@ public DrillFuncHolder findDrillFunction(FunctionResolver functionResolver, Func
         FunctionResolver exactResolver = FunctionResolverFactory.getExactResolver(functionCall);
         DrillFuncHolder holder = exactResolver.getBestMatch(functions, functionCall);
     
    -    if (holder == null) {
    +    if (holder == null && useDynamicUdfs) {
    --- End diff --
    
    Ah, now I see what's happening (I hope...) I pushed another commit that makes the suggested changes. 
    
    I wonder, do we have any unit tests for the ambiguous-function case? The unit tests passed with both the original and this new version, so I wonder if we have a hole in our test coverage?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #777: DRILL-5330: NPE in FunctionImplementationRegistry

Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/777#discussion_r105628751
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java ---
    @@ -160,7 +168,7 @@ public DrillFuncHolder findDrillFunction(FunctionResolver functionResolver, Func
         FunctionResolver exactResolver = FunctionResolverFactory.getExactResolver(functionCall);
         DrillFuncHolder holder = exactResolver.getBestMatch(functions, functionCall);
     
    -    if (holder == null) {
    +    if (holder == null && useDynamicUdfs) {
    --- End diff --
    
    1. Since you have mentioned I remembered one more issue with FunctionImplementationRegistry, it can access only system options, so using `ExecConstants.USE_DYNAMIC_UDFS` won't work properly since it can be set at session level as well. I guess using bootsrap option you introduced is OK for now. Regarding your suggestion to have single option OFF, READ_ONLY and ON to handle the various cases (I love this idea!), we can try to implement this the scope of MVCC (I'll add this point to the document).
    
    2. Even with boostrap option we need to update `findDrillFunction` to use provided function resolver when dynamic udfs are turned off (more details in my first comment). For example, `findDrillFunction` should can be re-written the following way (please optimize if needed):
    ```java
    public DrillFuncHolder findDrillFunction(FunctionResolver functionResolver, FunctionCall functionCall) {
        AtomicLong version = new AtomicLong();
        String newFunctionName = functionReplacement(functionCall);
        List<DrillFuncHolder> functions = localFunctionRegistry.getMethods(newFunctionName, version);
        if (!useDynamicUdfs) {
           return functionResolver.getBestMatch(functions, functionCall);
        }
        FunctionResolver exactResolver = FunctionResolverFactory.getExactResolver(functionCall);
        DrillFuncHolder holder = exactResolver.getBestMatch(functions, functionCall);
    
        if (holder == null) {
          syncWithRemoteRegistry(version.get());
          List<DrillFuncHolder> updatedFunctions = localFunctionRegistry.getMethods(newFunctionName, version);
          holder = functionResolver.getBestMatch(updatedFunctions, functionCall);
        }
    
        return holder;
      }
    ```
    3. Also changes should be done in `findExactMatchingDrillFunction` method to take into account boostrap option as well. For example (please optimize if needed):
    ```java
      public DrillFuncHolder findExactMatchingDrillFunction(String name, List<MajorType> argTypes, MajorType returnType) {
        if (useDynamicUdfs) {
           return findExactMatchingDrillFunction(name, argTypes, returnType, true);
        }
        return findExactMatchingDrillFunction(name, argTypes, returnType, false);
      }
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill issue #777: DRILL-5330: NPE in FunctionImplementationRegistry

Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on the issue:

    https://github.com/apache/drill/pull/777
  
    +1, looks good. Please squash the commits and put read-to-commit label in Jira when done. Thank you!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] drill pull request #777: DRILL-5330: NPE in FunctionImplementationRegistry

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/777#discussion_r105540304
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestSimpleFunctions.java ---
    @@ -133,7 +134,7 @@ public void resolveHash(DrillConfig config, LogicalExpression arg, TypeProtos.Ma
                                         FunctionImplementationRegistry registry) throws JClassAlreadyExistsException, IOException {
         final List<LogicalExpression> args = new ArrayList<>();
         args.add(arg);
    -    final String[] registeredNames = { "hash" };
    +//    final String[] registeredNames = { "hash" };
    --- End diff --
    
    Same reason. Removed this and the previous commented-out code.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---