You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by paul-rogers <gi...@git.apache.org> on 2017/10/06 22:09:18 UTC

[GitHub] drill pull request #978: DRILL-5842: Refactor and simplify the fragment, ope...

GitHub user paul-rogers opened a pull request:

    https://github.com/apache/drill/pull/978

    DRILL-5842: Refactor and simplify the fragment, operator contexts for testing

    Drill's execution engine has a "fragment context" that provides state for a fragment as a whole, and an "operator context" which provides state for a single operator. Historically, these have both been concrete classes that make generous references to the Drillbit context, and hence need a full Drill server in order to operate.
    
    Drill has historically made extensive use of system-level testing: build the entire server and fire queries at it to test each component. Over time, we are augmenting that approach with unit tests: the ability to test each operator (or parts of an operator) in isolation.
    
    Since each operator requires access to both the operator and fragment context, the fact that the contexts depend on the overall server creates a large barrier to unit testing. An earlier checkin started down the path of defining the contexts as interfaces that can have different run-time and test-time implementations to enable testing.
    
    One solution has been to use JMockit or Mockito to mock up the operator and fragment contexts. This works, but is fragile. (Indeed, those tests failed spectacularly during the work for this PR until the mocks were updated with the new and changed methods.)
    
    This PR refactors those interfaces: simplifying the operator context and introducing an interface for the fragment context. New code will use these new interfaces, while older code continues to use the concrete implementations. Over time, as operators are enhanced, they can be modified to allow unit-level testing.
    
    ### Context Revisions
    
    The following changes were made to contexts:
    
    * Rename `AbstractOperatorExecContext` to `BaseOperatorContext`.
    * Introduce `BaseFragmentContext` as a common base class for test and production fragment contexts.
    * Introduce `FragmentContextInterface` (sorry for the name) as the new abstraction to be (eventually) used by all operators so that they can use both test and production versions.
    * Move methods from the concrete implementations to the base classes where they do not depend on Drillbit or network services.
    * Retire earlier versions: `OperExecContext` and `OperExecContextImpl`.
    * Add more services to `OperatorContext`, the interface for the operator context.
    
    A result of these changes is that `OperatorContext` is now a one-stop-shop for services needed by operators. It now provides:
    
    * Access to the fragment context: `getFragmentContext()`
    * Access to the physical operator definition (AKA “physical operator”): `getOperatorDefn()`.
    
    Because of these changes, we need no longer pass around the triple of fragment context, operator definition and operator context — something needed by the “managed” sort and upcoming PRs.
    
    The above caused changes to consumers of the above classes. The funky `FragmentContextInterface` name was chosen to minimize such changes: code that uses the concrete `FragmentContext` did not change. (Renaming `FragmentContext` to `FragmentContextImpl` would have caused hundreds of changes…)
    
    ### Operator Fixture
    
    The `OperatorFixture` class provides test-time implementations of the fragment and operator contexts. These classes were modified to extend the new base classes described above, and to include the new method on the context interfaces. This is not a final design, but it is a step toward the final design.
    
    ### Mocks
    
    Added selected new methods to the JMockit mocks set up in `PhysicalOpUnitTestBase`. This is the crux of the problem that this change works to solve: mocks are not very stable and are very hard to debug. Having a test-time implementation is a better long-term solution. However, we are not yet in a a position to replace the mocks with the test-time alternatives; doing so will cause a cascade of other changes that would be too big for this one PR.
    
    ### Other Clean-up
    
    Continued to shift code to use the write-only operator stats interface to allow easier operator context mapping. Changes here focused on the Drill file system to allow mocking up tests with scan batch.
    
    Various minor code cleanups are also included.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/paul-rogers/drill DRILL-5842

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/978.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #978
    
----
commit 5a6d51a2027bdf4af3d771e884a4defe60023a86
Author: Paul Rogers <pr...@maprtech.com>
Date:   2017-10-05T05:43:44Z

    DRILL-5842: Refactor fragment, operator contexts

commit 9fe0314b1c8925ef847d0b4c618b1556b59b1f00
Author: Paul Rogers <pr...@maprtech.com>
Date:   2017-10-05T05:44:30Z

    DRILL-5842: Secondary changes

commit 03d650f8729828d1d60e58a1dc50091edbe9aba3
Author: Paul Rogers <pr...@maprtech.com>
Date:   2017-10-06T06:24:56Z

    Fixes for tests which mock contexts

----


---

[GitHub] drill issue #978: DRILL-5842: Refactor and simplify the fragment, operator c...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on the issue:

    https://github.com/apache/drill/pull/978
  
    I understand the motivation of this PR is to refactor the various of "context" used in drill, so that we could get rid of JMock in test. I'm a bit confused after reading the statement that "we are not yet in a a position to replace the mocks with the test-time alternatives". I thought that's the purpose of this PR, yet it does not demonstrate the benefit, after going through big chunk of code change ?
    



---

[GitHub] drill issue #978: DRILL-5842: Refactor and simplify the fragment, operator c...

Posted by priteshm <gi...@git.apache.org>.
Github user priteshm commented on the issue:

    https://github.com/apache/drill/pull/978
  
    @sohami  can you please review this?


---

[GitHub] drill pull request #978: DRILL-5842: Refactor and simplify the fragment, ope...

Posted by jinfengni <gi...@git.apache.org>.
Github user jinfengni commented on a diff in the pull request:

    https://github.com/apache/drill/pull/978#discussion_r147546965
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/unit/MiniPlanUnitTestBase.java ---
    @@ -360,7 +366,7 @@ public T columnsToRead(String ... columnsToRead) {
        */
       public class JsonScanBuilder extends ScanPopBuider<JsonScanBuilder> {
         List<String> jsonBatches = null;
    -    List<String> inputPaths = Collections.EMPTY_LIST;
    +    List<String> inputPaths = Collections.emptyList();
    --- End diff --
    
    why replace a field with a function call, which eventually return that static field? 


---

[GitHub] drill pull request #978: DRILL-5842: Refactor and simplify the fragment, ope...

Posted by sohami <gi...@git.apache.org>.
Github user sohami commented on a diff in the pull request:

    https://github.com/apache/drill/pull/978#discussion_r148146597
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorContextImpl.java ---
    @@ -17,33 +17,23 @@
      */
     package org.apache.drill.exec.ops;
     
    -import java.io.IOException;
     import java.security.PrivilegedExceptionAction;
     import java.util.concurrent.Callable;
    -import java.util.concurrent.ExecutorService;
     
    -import org.apache.drill.common.exceptions.DrillRuntimeException;
     import org.apache.drill.exec.exception.OutOfMemoryException;
     import org.apache.drill.exec.physical.base.PhysicalOperator;
    -import org.apache.drill.exec.store.dfs.DrillFileSystem;
     import org.apache.drill.exec.work.WorkManager;
    -import org.apache.hadoop.conf.Configuration;
     import org.apache.hadoop.security.UserGroupInformation;
     
    -import com.google.common.base.Preconditions;
     import com.google.common.util.concurrent.ListenableFuture;
     import com.google.common.util.concurrent.ListeningExecutorService;
     import com.google.common.util.concurrent.MoreExecutors;
     
    -class OperatorContextImpl extends AbstractOperatorExecContext implements OperatorContext, AutoCloseable {
    +class OperatorContextImpl extends BaseOperatorContext implements OperatorContext, AutoCloseable {
       static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(OperatorContextImpl.class);
    --- End diff --
    
    Why we need to have `OperatorContextImpl implements OperatorContext` when `BaseOperatorContex`t already implements that interface ?


---

[GitHub] drill pull request #978: DRILL-5842: Refactor and simplify the fragment, ope...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/drill/pull/978


---

[GitHub] drill pull request #978: DRILL-5842: Refactor and simplify the fragment, ope...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/978#discussion_r147842797
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/unit/MiniPlanUnitTestBase.java ---
    @@ -360,7 +366,7 @@ public T columnsToRead(String ... columnsToRead) {
        */
       public class JsonScanBuilder extends ScanPopBuider<JsonScanBuilder> {
         List<String> jsonBatches = null;
    -    List<String> inputPaths = Collections.EMPTY_LIST;
    +    List<String> inputPaths = Collections.emptyList();
    --- End diff --
    
    This is a subtle point. Using the constant creates the expression:
    
    ```
    public static final List EMPTY_LIST = new EmptyList<>(); // Definition
    List<String> inputPaths = EMPTY_LIST; // Original code
    ```
    
    The above is not type-friendly: we are setting a typed list (`inputPaths`) to an untyped constant (`EMPTY_LIST`).
    
    The revised code uses Java's parameterized methods to work around the type ambiguity:
    
    ```
    public static final <T> List<T> emptyList() ... // Definition
    List<String> inputPaths = Collections.emptyList(); // Type-safe assignment
    ```
    
    Functionally, the two expressions are identical. But, the original was type-unsafe and generated a compiler warning. The new one is type-safe and resolves the warning.


---

[GitHub] drill issue #978: DRILL-5842: Refactor and simplify the fragment, operator c...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/978
  
    @sohami, pushed a commit that addresses your PR comments.
    
    Regarding the "one stop shop" comment. This PR removes the `OperExecContextImpl` class. That class was an earlier attempt to combine the three items into one. Additional experience showed that the existing operator context could do that task instead.
    
    This PR did not change existing operators to avoid passing around multiple items. Instead, it allows new code (for the batch size limitation project) to pass just the operator context, and use that to obtain the other two items. The external sort had to be changed because it used the old `OperExecContextImpl` for that purpose, and so the code had to be updated when `OperExecContextImpl` was removed.


---

[GitHub] drill issue #978: DRILL-5842: Refactor and simplify the fragment, operator c...

Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on the issue:

    https://github.com/apache/drill/pull/978
  
    +1, LGTM.


---

[GitHub] drill issue #978: DRILL-5842: Refactor and simplify the fragment, operator c...

Posted by sohami <gi...@git.apache.org>.
Github user sohami commented on the issue:

    https://github.com/apache/drill/pull/978
  
    Looking into the PR and based on description it says below:
    
    _A result of these changes is that OperatorContext is now a one-stop-shop for services needed by operators. It now provides:_
    
    - Access to the fragment context: getFragmentContext()
    - Access to the physical operator definition (AKA “physical operator”): getOperatorDefn().
    
    _Because of these changes, we need no longer pass around the triple of fragment context, operator definition and operator context — something needed by the “managed” sort and upcoming PRs._
    
    Based on my understanding it looks like existing interface and implementation `OperExecContextImpl` do provide access to `FragmentContext` and `PopConfig` (operator Defn). The instance of this context is created inside `ExternalSortBatch` which has access to FragmentContext and PopConfig already. Also `SortImpl` only takes in `OperExecContext`. I am little confused how the change avoided the need to pass around the triplet ?


---

[GitHub] drill pull request #978: DRILL-5842: Refactor and simplify the fragment, ope...

Posted by sohami <gi...@git.apache.org>.
Github user sohami commented on a diff in the pull request:

    https://github.com/apache/drill/pull/978#discussion_r148095543
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/ops/BaseOperatorContext.java ---
    @@ -0,0 +1,201 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.ops;
    +
    +import java.io.IOException;
    +import java.util.concurrent.ExecutorService;
    +
    +import org.apache.drill.common.exceptions.UserException;
    +import org.apache.drill.exec.memory.BufferAllocator;
    +import org.apache.drill.exec.physical.base.PhysicalOperator;
    +import org.apache.drill.exec.store.dfs.DrillFileSystem;
    +import org.apache.drill.exec.testing.ControlsInjector;
    +import org.apache.drill.exec.testing.ExecutionControls;
    +import org.apache.hadoop.conf.Configuration;
    +
    +import com.google.common.base.Preconditions;
    +
    +import io.netty.buffer.DrillBuf;
    +
    +/**
    + * Implementation of {@link OperatorExecContext} that provides services
    + * needed by most run-time operators. Excludes services that need the
    + * entire Drillbit. This class provides services common to the test-time
    + * version of the operator context and the full production-time context
    + * that includes network services.
    + */
    +
    +public abstract class BaseOperatorContext implements OperatorContext {
    +  static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(BaseOperatorContext.class);
    +
    +  protected final FragmentContextInterface context;
    +  protected final BufferAllocator allocator;
    +  protected final PhysicalOperator popConfig;
    +  protected final BufferManager manager;
    +  protected OperatorStatReceiver statsWriter;
    +  private DrillFileSystem fs;
    +  private ControlsInjector injector;
    +
    +  public BaseOperatorContext(FragmentContextInterface context, BufferAllocator allocator,
    +               PhysicalOperator popConfig,
    +               OperatorStatReceiver stats) {
    +    this.context = context;
    +    this.allocator = allocator;
    +    this.popConfig = popConfig;
    +    this.manager = new BufferManagerImpl(allocator);
    +    statsWriter = stats;
    +  }
    +
    +  @Override
    +  public FragmentContextInterface getFragmentContext() {
    +    return context;
    +  }
    +
    +  @SuppressWarnings("unchecked")
    +  @Override
    +  public <T extends PhysicalOperator> T getOperatorDefn() {
    +    return (T) popConfig;
    +  }
    +
    +  public String getName() {
    +    return popConfig.getClass().getName();
    +  }
    +
    +  @Override
    +  public DrillBuf replace(DrillBuf old, int newSize) {
    +    return manager.replace(old, newSize);
    +  }
    +
    +  @Override
    +  public DrillBuf getManagedBuffer() {
    +    return manager.getManagedBuffer();
    +  }
    +
    +  @Override
    +  public DrillBuf getManagedBuffer(int size) {
    +    return manager.getManagedBuffer(size);
    +  }
    +
    +  @Override
    +  public ExecutionControls getExecutionControls() {
    +    return context.getExecutionControls();
    +  }
    +
    +  @Override
    +  public BufferAllocator getAllocator() {
    +    if (allocator == null) {
    +      throw new UnsupportedOperationException("Operator context does not have an allocator");
    +    }
    +    return allocator;
    +  }
    +
    +  // Allow an operator to use the thread pool
    +  @Override
    +  public ExecutorService getExecutor() {
    +    return context.getDrillbitContext().getExecutor();
    +  }
    +
    +  @Override
    +  public ExecutorService getScanExecutor() {
    +    return context.getDrillbitContext().getScanExecutor();
    +  }
    +
    +  @Override
    +  public ExecutorService getScanDecodeExecutor() {
    +    return context.getDrillbitContext().getScanDecodeExecutor();
    +  }
    +
    +  @Override
    +  public void setInjector(ControlsInjector injector) {
    +    this.injector = injector;
    +  }
    +
    +  @Override
    +  public ControlsInjector getInjector() {
    +    return injector;
    +  }
    +
    +  @Override
    +  public void injectUnchecked(String desc) {
    +    ExecutionControls executionControls = context.getExecutionControls();
    +    if (injector != null  &&  executionControls != null) {
    +      injector.injectUnchecked(executionControls, desc);
    +    }
    +  }
    +
    +  @Override
    +  public <T extends Throwable> void injectChecked(String desc, Class<T> exceptionClass)
    +      throws T {
    +    ExecutionControls executionControls = context.getExecutionControls();
    +    if (injector != null  &&  executionControls != null) {
    +      injector.injectChecked(executionControls, desc, exceptionClass);
    +    }
    +  }
    +
    +  @Override
    +  public void close() {
    +    RuntimeException ex = null;
    +    try {
    +      manager.close();
    +    } catch (RuntimeException e) {
    +      ex = e;
    +    }
    +    try {
    +      if (allocator != null) {
    +        allocator.close();
    +      }
    +    } catch (RuntimeException e) {
    +      ex = ex == null ? e : ex;
    +    }
    +    try {
    +      if (fs != null) {
    +          fs.close();
    +          fs = null;
    +      }
    +    } catch (IOException e) {
    +      throw UserException.resourceError(e)
    +        .addContext("Failed to close the Drill file system for " + getName())
    --- End diff --
    
    how about adding context for the `exception ex` as well if it's not null


---

[GitHub] drill issue #978: DRILL-5842: Refactor and simplify the fragment, operator c...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/978
  
    @jinfengni, sorry my description was a bit ambiguous. This PR has two goals:
    
    1. Clean up the contexts based on what has been learned recently.
    2. Evolve the contexts to allow "sub-operator" unit testing without mocks.
    
    Drill has many tests that use mocks and these are unaffected by the change. The PR simply allows new tests to do sub-operator testing without mocks, when convenient. 


---

[GitHub] drill pull request #978: DRILL-5842: Refactor and simplify the fragment, ope...

Posted by sohami <gi...@git.apache.org>.
Github user sohami commented on a diff in the pull request:

    https://github.com/apache/drill/pull/978#discussion_r148153093
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/ops/OperatorContextImpl.java ---
    @@ -98,37 +70,25 @@ public boolean isClosed() {
       @Override
       public void close() {
         if (closed) {
    -      logger.debug("Attempted to close Operator context for {}, but context is already closed", popConfig != null ? popConfig.getClass().getName() : null);
    +      logger.debug("Attempted to close Operator context for {}, but context is already closed", popConfig != null ? getName() : null);
           return;
         }
    -    logger.debug("Closing context for {}", popConfig != null ? popConfig.getClass().getName() : null);
    +    logger.debug("Closing context for {}", popConfig != null ? getName() : null);
     
         closed = true;
    --- End diff --
    
    `closed = true` should happen after the call to `super.close()` since base class close can throw exception.


---

[GitHub] drill pull request #978: DRILL-5842: Refactor and simplify the fragment, ope...

Posted by sohami <gi...@git.apache.org>.
Github user sohami commented on a diff in the pull request:

    https://github.com/apache/drill/pull/978#discussion_r148150382
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/ops/BaseOperatorContext.java ---
    @@ -0,0 +1,201 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.ops;
    +
    +import java.io.IOException;
    +import java.util.concurrent.ExecutorService;
    +
    +import org.apache.drill.common.exceptions.UserException;
    +import org.apache.drill.exec.memory.BufferAllocator;
    +import org.apache.drill.exec.physical.base.PhysicalOperator;
    +import org.apache.drill.exec.store.dfs.DrillFileSystem;
    +import org.apache.drill.exec.testing.ControlsInjector;
    +import org.apache.drill.exec.testing.ExecutionControls;
    +import org.apache.hadoop.conf.Configuration;
    +
    +import com.google.common.base.Preconditions;
    +
    +import io.netty.buffer.DrillBuf;
    +
    +/**
    + * Implementation of {@link OperatorExecContext} that provides services
    --- End diff --
    
    `OperatorContext` instead of _OperatorExecContext_


---

[GitHub] drill issue #978: DRILL-5842: Refactor and simplify the fragment, operator c...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/978
  
    Rebased and squashed commits.


---

[GitHub] drill issue #978: DRILL-5842: Refactor and simplify the fragment, operator c...

Posted by sohami <gi...@git.apache.org>.
Github user sohami commented on the issue:

    https://github.com/apache/drill/pull/978
  
    +1 LGTM. 
    Thanks for clarification.


---