You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by vdiravka <gi...@git.apache.org> on 2017/10/04 16:17:54 UTC
[GitHub] drill pull request #973: DRILL-5775: Select * query on a maprdb binary table...
GitHub user vdiravka opened a pull request:
https://github.com/apache/drill/pull/973
DRILL-5775: Select * query on a maprdb binary table fails
- Back out HBase changes introduced in DRILL-5546;
- The common HBase/MapR-DB_binary verifyColumns() method;
- MapRDBBinaryTable is introduced for the purpose of expanding the wildcard on the planning stage;
- AbstractHBaseDrillTable class for MapRDBBinaryTable and DrillHBaseTable.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/vdiravka/drill DRILL-5775
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/drill/pull/973.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #973
----
commit 2c38109607eec4b99f29dc9b2863156907b17500
Author: Paul Rogers <pr...@maprtech.com>
Date: 2017-09-28T16:49:38Z
Back out HBase changes
commit 632f403d9f0711c4f4405349c7f3700af148a30b
Author: Vitalii Diravka <vi...@gmail.com>
Date: 2017-10-02T14:14:54Z
DRILL-5775: Select * query on a maprdb binary table fails
- The common HBase/MapR-DB_binary verifyColumns() method;
- MapRDBBinaryTable is introduced for the purpose of expanding the wildcard on the planning stage;
- AbstractHBaseDrillTable class for MapRDBBinaryTable and DrillHBaseTable.
----
---
[GitHub] drill pull request #973: DRILL-5775: Select * query on a maprdb binary table...
Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on a diff in the pull request:
https://github.com/apache/drill/pull/973#discussion_r143770360
--- Diff: contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/binary/MapRDBBinaryTable.java ---
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.mapr.db.binary;
+
+import java.io.IOException;
+
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.store.dfs.FileSystemPlugin;
+import org.apache.drill.exec.store.dfs.FormatSelection;
+import org.apache.drill.exec.store.hbase.AbstractHBaseDrillTable;
+import org.apache.drill.exec.store.mapr.TableFormatPlugin;
+import org.apache.drill.exec.store.mapr.db.MapRDBFormatPlugin;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Admin;
+
+public class MapRDBBinaryTable extends AbstractHBaseDrillTable {
+ private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(MapRDBBinaryTable.class);
+
+ public MapRDBBinaryTable(String storageEngineName, FileSystemPlugin storagePlugin, TableFormatPlugin formatPlugin,
+ FormatSelection selection) {
+ super(storageEngineName, storagePlugin, selection);
+ String tableName = selection.getSelection().getFiles().get(0);
--- End diff --
1. What if we factor out `tableName` logic into MapRDBFormatPlugin class into `getTableName` method (don't forget to add `@JsonIgnore`) since it's common here and in `MapRDBFormatPlugin` class.
2. Maybe we can factor out common logic for getting `tableDesc ` as well and place it, for example, into `AbstractHBaseDrillTable` class in `protected void setTableDesc(Connection connection, String tableName)`?
---
[GitHub] drill pull request #973: DRILL-5775: Select * query on a maprdb binary table...
Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on a diff in the pull request:
https://github.com/apache/drill/pull/973#discussion_r143980497
--- Diff: contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/binary/MapRDBBinaryTable.java ---
@@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.mapr.db.binary;
+
+import java.io.IOException;
+
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.store.dfs.FileSystemPlugin;
+import org.apache.drill.exec.store.dfs.FormatSelection;
+import org.apache.drill.exec.store.hbase.AbstractHBaseDrillTable;
+import org.apache.drill.exec.store.mapr.TableFormatPlugin;
+import org.apache.drill.exec.store.mapr.db.MapRDBFormatPlugin;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.client.Admin;
+
+public class MapRDBBinaryTable extends AbstractHBaseDrillTable {
+ private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(MapRDBBinaryTable.class);
+
+ public MapRDBBinaryTable(String storageEngineName, FileSystemPlugin storagePlugin, TableFormatPlugin formatPlugin,
+ FormatSelection selection) {
+ super(storageEngineName, storagePlugin, selection);
+ String tableName = selection.getSelection().getFiles().get(0);
--- End diff --
Very good suggestions! Done. Thanks.
---
[GitHub] drill pull request #973: DRILL-5775: Select * query on a maprdb binary table...
Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:
https://github.com/apache/drill/pull/973#discussion_r144124764
--- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseUtils.java ---
@@ -139,4 +142,26 @@ public static Filter orFilterAtIndex(Filter currentFilter, int index, Filter new
return Bytes.compareTo(left, right) < 0 ? left : right;
}
+
+ /**
+ * Verify the presence of a column family in the schema path of the hbase table or whether the schema path is
+ * the row key column.
+ *
+ * @param columns List of the selected columns
+ * @param hTableDesc HTableDescriptor of HBase/MapR-DB_binary table (consists the details about that table)
+ * @throws DrillRuntimeException if column family does not exist, or is not row_key column.
+ */
+ public static void verifyColumns(List<SchemaPath> columns, HTableDescriptor hTableDesc) {
--- End diff --
As it turns out, this PR is introducing the very ambiguity that DRILL-5830 tried to remove from DRILL-5546. That is, we now have to separate ways to expand the wildcard: this way and the project push down way. We need this commit, so we'll accept this for now. But, moving forward, we should clear up this ambiguity.
---
[GitHub] drill issue #973: DRILL-5775: Select * query on a maprdb binary table fails
Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on the issue:
https://github.com/apache/drill/pull/973
@paul-rogers That changes connected to MapR-DB JSON tables, but this PR connected to MapR-DB binary tables. And looks like the functionality of these PRs are not crossed.
The branch is rebased into last master version.
---
[GitHub] drill issue #973: DRILL-5775: Select * query on a maprdb binary table fails
Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on the issue:
https://github.com/apache/drill/pull/973
@jinfengni Could you please review this PR?
---
[GitHub] drill issue #973: DRILL-5775: Select * query on a maprdb binary table fails
Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on the issue:
https://github.com/apache/drill/pull/973
+1, LGTM.
---
[GitHub] drill pull request #973: DRILL-5775: Select * query on a maprdb binary table...
Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on a diff in the pull request:
https://github.com/apache/drill/pull/973#discussion_r143971466
--- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseGroupScan.java ---
@@ -143,9 +143,9 @@ private HBaseGroupScan(HBaseGroupScan that) {
@Override
public GroupScan clone(List<SchemaPath> columns) {
+ HBaseUtils.verifyColumns(columns, hTableDesc);
--- End diff --
Agree. I've added the same to `BinaryTableGroupScan`.
---
[GitHub] drill pull request #973: DRILL-5775: Select * query on a maprdb binary table...
Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on a diff in the pull request:
https://github.com/apache/drill/pull/973#discussion_r144158971
--- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseUtils.java ---
@@ -139,4 +142,26 @@ public static Filter orFilterAtIndex(Filter currentFilter, int index, Filter new
return Bytes.compareTo(left, right) < 0 ? left : right;
}
+
+ /**
+ * Verify the presence of a column family in the schema path of the hbase table or whether the schema path is
+ * the row key column.
+ *
+ * @param columns List of the selected columns
+ * @param hTableDesc HTableDescriptor of HBase/MapR-DB_binary table (consists the details about that table)
+ * @throws DrillRuntimeException if column family does not exist, or is not row_key column.
+ */
+ public static void verifyColumns(List<SchemaPath> columns, HTableDescriptor hTableDesc) {
--- End diff --
DRILL-5546 introduced expanding wildcard in the `verifyColumns()` method with renaming to `verifyColumnsAndConvertStar()`. But this PR keeps the original `verifyColumns()` method and the same in DRILL-5830 (the star isn't expanded here). Just the method have started to be common to HBase and MapR-DB.
And MapR-DB plugin for now leverages the same way of project push down as HBase due to the `MapRDBBinaryTable` introducing.
Or did I miss something?
---
[GitHub] drill issue #973: DRILL-5775: Select * query on a maprdb binary table fails
Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on the issue:
https://github.com/apache/drill/pull/973
Does this PR duplicates some work in #972? Can we review and commit that one first before we redo the work here?
---
[GitHub] drill issue #973: DRILL-5775: Select * query on a maprdb binary table fails
Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on the issue:
https://github.com/apache/drill/pull/973
Once this PR will have +1 the [PR-941](https://github.com/apache/drill/pull/941) should be closed or vice versa.
---
[GitHub] drill pull request #973: DRILL-5775: Select * query on a maprdb binary table...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/drill/pull/973
---
[GitHub] drill pull request #973: DRILL-5775: Select * query on a maprdb binary table...
Posted by arina-ielchiieva <gi...@git.apache.org>.
Github user arina-ielchiieva commented on a diff in the pull request:
https://github.com/apache/drill/pull/973#discussion_r143753438
--- Diff: contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseGroupScan.java ---
@@ -143,9 +143,9 @@ private HBaseGroupScan(HBaseGroupScan that) {
@Override
public GroupScan clone(List<SchemaPath> columns) {
+ HBaseUtils.verifyColumns(columns, hTableDesc);
--- End diff --
We should replace null columns with all_columns before verification or we will get exception. You have changed initial order of statements.
---
[GitHub] drill issue #973: DRILL-5775: Select * query on a maprdb binary table fails
Posted by vdiravka <gi...@git.apache.org>.
Github user vdiravka commented on the issue:
https://github.com/apache/drill/pull/973
The branch is rebased onto the master version and commits are squashed into one (except one Paul's commit from [PR-968](https://github.com/apache/drill/pull/968).
---