You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by vdiravka <gi...@git.apache.org> on 2017/10/04 16:17:54 UTC

[GitHub] drill pull request #973: DRILL-5775: Select * query on a maprdb binary table...

GitHub user vdiravka opened a pull request:

    https://github.com/apache/drill/pull/973

     DRILL-5775: Select * query on a maprdb binary table fails

    - Back out HBase changes introduced in DRILL-5546;
    - The common HBase/MapR-DB_binary verifyColumns() method;
    - MapRDBBinaryTable is introduced for the purpose of expanding the wildcard on the planning stage;
    - AbstractHBaseDrillTable class for MapRDBBinaryTable and DrillHBaseTable.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vdiravka/drill DRILL-5775

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/973.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #973
    
----
commit 2c38109607eec4b99f29dc9b2863156907b17500
Author: Paul Rogers <pr...@maprtech.com>
Date:   2017-09-28T16:49:38Z

    Back out HBase changes

commit 632f403d9f0711c4f4405349c7f3700af148a30b
Author: Vitalii Diravka <vi...@gmail.com>
Date:   2017-10-02T14:14:54Z

    DRILL-5775: Select * query on a maprdb binary table fails
    
    - The common HBase/MapR-DB_binary verifyColumns() method;
    - MapRDBBinaryTable is introduced for the purpose of expanding the wildcard on the planning stage;
    - AbstractHBaseDrillTable class for MapRDBBinaryTable and DrillHBaseTable.

----


---

[GitHub] drill pull request #973: DRILL-5775: Select * query on a maprdb binary table...

Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/973#discussion_r143770360
  
    --- Diff: contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/binary/MapRDBBinaryTable.java ---
    @@ -0,0 +1,48 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.store.mapr.db.binary;
    +
    +import java.io.IOException;
    +
    +import org.apache.drill.common.exceptions.UserException;
    +import org.apache.drill.exec.store.dfs.FileSystemPlugin;
    +import org.apache.drill.exec.store.dfs.FormatSelection;
    +import org.apache.drill.exec.store.hbase.AbstractHBaseDrillTable;
    +import org.apache.drill.exec.store.mapr.TableFormatPlugin;
    +import org.apache.drill.exec.store.mapr.db.MapRDBFormatPlugin;
    +import org.apache.hadoop.hbase.TableName;
    +import org.apache.hadoop.hbase.client.Admin;
    +
    +public class MapRDBBinaryTable extends AbstractHBaseDrillTable {
    +  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(MapRDBBinaryTable.class);
    +
    +  public MapRDBBinaryTable(String storageEngineName, FileSystemPlugin storagePlugin, TableFormatPlugin formatPlugin,
    +      FormatSelection selection) {
    +    super(storageEngineName, storagePlugin, selection);
    +    String tableName = selection.getSelection().getFiles().get(0);
    --- End diff --
    
    1. What if we factor out `tableName` logic into MapRDBFormatPlugin class into `getTableName` method (don't forget to add `@JsonIgnore`) since it's common here and in `MapRDBFormatPlugin` class.
    
    2. Maybe we can factor out common logic for getting `tableDesc ` as well and place it, for example, into `AbstractHBaseDrillTable` class in `protected void setTableDesc(Connection connection, String tableName)`?


---

[GitHub] drill pull request #973: DRILL-5775: Select * query on a maprdb binary table...

Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on a diff in the pull request:

    https://github.com/apache/drill/pull/973#discussion_r143980497
  
    --- Diff: contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/binary/MapRDBBinaryTable.java ---
    @@ -0,0 +1,48 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.store.mapr.db.binary;
    +
    +import java.io.IOException;
    +
    +import org.apache.drill.common.exceptions.UserException;
    +import org.apache.drill.exec.store.dfs.FileSystemPlugin;
    +import org.apache.drill.exec.store.dfs.FormatSelection;
    +import org.apache.drill.exec.store.hbase.AbstractHBaseDrillTable;
    +import org.apache.drill.exec.store.mapr.TableFormatPlugin;
    +import org.apache.drill.exec.store.mapr.db.MapRDBFormatPlugin;
    +import org.apache.hadoop.hbase.TableName;
    +import org.apache.hadoop.hbase.client.Admin;
    +
    +public class MapRDBBinaryTable extends AbstractHBaseDrillTable {
    +  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(MapRDBBinaryTable.class);
    +
    +  public MapRDBBinaryTable(String storageEngineName, FileSystemPlugin storagePlugin, TableFormatPlugin formatPlugin,
    +      FormatSelection selection) {
    +    super(storageEngineName, storagePlugin, selection);
    +    String tableName = selection.getSelection().getFiles().get(0);
    --- End diff --
    
    Very good suggestions! Done. Thanks.


---

[GitHub] drill pull request #973: DRILL-5775: Select * query on a maprdb binary table...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/973#discussion_r144124764
  
    --- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseUtils.java ---
    @@ -139,4 +142,26 @@ public static Filter orFilterAtIndex(Filter currentFilter, int index, Filter new
         return Bytes.compareTo(left, right) < 0 ? left : right;
       }
     
    +
    +  /**
    +   * Verify the presence of a column family in the schema path of the hbase table or whether the schema path is
    +   * the row key column.
    +   *
    +   * @param columns List of the selected columns
    +   * @param hTableDesc HTableDescriptor of HBase/MapR-DB_binary table (consists the details about that table)
    +   * @throws DrillRuntimeException if column family does not exist, or is not row_key column.
    +   */
    +  public static void verifyColumns(List<SchemaPath> columns, HTableDescriptor hTableDesc) {
    --- End diff --
    
    As it turns out, this PR is introducing the very ambiguity that DRILL-5830 tried to remove from DRILL-5546. That is, we now have to separate ways to expand the wildcard: this way and the project push down way. We need this commit, so we'll accept this for now. But, moving forward, we should clear up this ambiguity.


---

[GitHub] drill issue #973: DRILL-5775: Select * query on a maprdb binary table fails

Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on the issue:

    https://github.com/apache/drill/pull/973
  
    @paul-rogers That changes connected to MapR-DB JSON tables, but this PR connected to MapR-DB binary tables. And looks like the functionality of these PRs are not crossed.
    
    The branch is rebased into last master  version.


---

[GitHub] drill issue #973: DRILL-5775: Select * query on a maprdb binary table fails

Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on the issue:

    https://github.com/apache/drill/pull/973
  
    @jinfengni Could you please review this PR?


---

[GitHub] drill issue #973: DRILL-5775: Select * query on a maprdb binary table fails

Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on the issue:

    https://github.com/apache/drill/pull/973
  
    +1, LGTM.


---

[GitHub] drill pull request #973: DRILL-5775: Select * query on a maprdb binary table...

Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on a diff in the pull request:

    https://github.com/apache/drill/pull/973#discussion_r143971466
  
    --- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseGroupScan.java ---
    @@ -143,9 +143,9 @@ private HBaseGroupScan(HBaseGroupScan that) {
     
       @Override
       public GroupScan clone(List<SchemaPath> columns) {
    +    HBaseUtils.verifyColumns(columns, hTableDesc);
    --- End diff --
    
    Agree. I've added the same to `BinaryTableGroupScan`.


---

[GitHub] drill pull request #973: DRILL-5775: Select * query on a maprdb binary table...

Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on a diff in the pull request:

    https://github.com/apache/drill/pull/973#discussion_r144158971
  
    --- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseUtils.java ---
    @@ -139,4 +142,26 @@ public static Filter orFilterAtIndex(Filter currentFilter, int index, Filter new
         return Bytes.compareTo(left, right) < 0 ? left : right;
       }
     
    +
    +  /**
    +   * Verify the presence of a column family in the schema path of the hbase table or whether the schema path is
    +   * the row key column.
    +   *
    +   * @param columns List of the selected columns
    +   * @param hTableDesc HTableDescriptor of HBase/MapR-DB_binary table (consists the details about that table)
    +   * @throws DrillRuntimeException if column family does not exist, or is not row_key column.
    +   */
    +  public static void verifyColumns(List<SchemaPath> columns, HTableDescriptor hTableDesc) {
    --- End diff --
    
    DRILL-5546 introduced expanding wildcard in the `verifyColumns()` method with renaming to `verifyColumnsAndConvertStar()`. But this PR keeps the original `verifyColumns()` method and the same in DRILL-5830 (the star isn't expanded here). Just the method have started to be common to HBase and MapR-DB. 
    And MapR-DB plugin for now leverages the same way of project push down as HBase due to the `MapRDBBinaryTable` introducing.
    Or did I miss something?


---

[GitHub] drill issue #973: DRILL-5775: Select * query on a maprdb binary table fails

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/973
  
    Does this PR duplicates some work in #972? Can we review and commit that one first before we redo the work here?


---

[GitHub] drill issue #973: DRILL-5775: Select * query on a maprdb binary table fails

Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on the issue:

    https://github.com/apache/drill/pull/973
  
    Once this PR will have +1 the [PR-941](https://github.com/apache/drill/pull/941) should be closed or vice versa.



---

[GitHub] drill pull request #973: DRILL-5775: Select * query on a maprdb binary table...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/drill/pull/973


---

[GitHub] drill pull request #973: DRILL-5775: Select * query on a maprdb binary table...

Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on a diff in the pull request:

    https://github.com/apache/drill/pull/973#discussion_r143753438
  
    --- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseGroupScan.java ---
    @@ -143,9 +143,9 @@ private HBaseGroupScan(HBaseGroupScan that) {
     
       @Override
       public GroupScan clone(List<SchemaPath> columns) {
    +    HBaseUtils.verifyColumns(columns, hTableDesc);
    --- End diff --
    
    We should replace null columns with all_columns before verification or we will get exception. You have changed initial order of statements.


---

[GitHub] drill issue #973: DRILL-5775: Select * query on a maprdb binary table fails

Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on the issue:

    https://github.com/apache/drill/pull/973
  
    The branch is rebased onto the master version and commits are squashed into one (except one Paul's commit from [PR-968](https://github.com/apache/drill/pull/968).


---