Posted to dev@drill.apache.org by sachouche <gi...@git.apache.org> on 2018/03/15 22:32:50 UTC

[GitHub] drill pull request #1170: DRILL-6223: Fixed several Drillbit failures due to...

GitHub user sachouche opened a pull request:

    https://github.com/apache/drill/pull/1170

    DRILL-6223: Fixed several Drillbit failures due to schema changes

    Fixed several issues due to schema changes:
    1) Changes in complex data types
    Drill queries fail when selecting all columns from a complex nested Parquet data set whose files differ in schema:
    
    - The Parquet files exhibit differences both at the first level and within nested data types
    - A select * alone will not cause an exception, but adding a limit clause will
    - Note also that this issue seems to happen only when multiple Drillbit minor fragments are involved (concurrency higher than one)
    
    2) Dangling columns (both simple and complex)
    This situation can be easily reproduced when:
    - Select STAR queries involve input data with different schemas
    - LIMIT and/or PROJECT operators are used
    - The data is read by more than one minor fragment
    This happens because individual readers have logic to handle such use cases, but downstream operators do not:
    - If reader-1 sends one batch with F1, F2, and F3
    - and reader-2 sends a batch with F2 and F3,
    - then the LIMIT and PROJECT operators will fail to clean up the dangling column F1, which causes failures when a downstream operator's copy logic attempts to copy the stale column F1
    - This pull request adds logic to detect and eliminate such dangling columns (see the sketch below)
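    
    A minimal sketch of the detection idea, assuming we only compare column names between the previously seen schema and the incoming batch (the class and method names here are illustrative, not the actual patch, which works on Drill's VectorContainer):
    
        import java.util.ArrayList;
        import java.util.HashSet;
        import java.util.List;
        import java.util.Set;
    
        class DanglingColumnSketch {
          /** Returns the columns present in the previous schema but absent from the incoming one. */
          static List<String> findDangling(Set<String> previousSchema, Set<String> incomingSchema) {
            List<String> dangling = new ArrayList<>();
            for (String column : previousSchema) {
              if (!incomingSchema.contains(column)) {
                dangling.add(column); // e.g. F1 when reader-2 sends only (F2, F3)
              }
            }
            return dangling;
          }
    
          public static void main(String[] args) {
            Set<String> previous = new HashSet<>(List.of("F1", "F2", "F3")); // from reader-1
            Set<String> incoming = new HashSet<>(List.of("F2", "F3"));       // from reader-2
            System.out.println(findDangling(previous, incoming));            // prints [F1]
          }
        }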

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sachouche/drill DRILL-6223

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1170.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1170
    
----
commit d986b6c7588c107bb7e49d2fc8eb3f25a60e1214
Author: Salim Achouche <sa...@...>
Date:   2018-02-21T02:17:14Z

    DRILL-6223: Fixed several Drillbit failures due to schema changes

----


---

[GitHub] drill issue #1170: DRILL-6223: Fixed several Drillbit failures due to schema...

Posted by sachouche <gi...@git.apache.org>.
Github user sachouche commented on the issue:

    https://github.com/apache/drill/pull/1170
  
    @parthchandra and @paul-rogers, I have added a comment within the Jira [DRILL-6223](https://issues.apache.org/jira/browse/DRILL-6223); please let me know what you think.
    
    Thanks!


---

[GitHub] drill issue #1170: DRILL-6223: Fixed several Drillbit failures due to schema...

Posted by sachouche <gi...@git.apache.org>.
Github user sachouche commented on the issue:

    https://github.com/apache/drill/pull/1170
  
    @amansinha100 can you please review this pull request?
    
    Thanks!


---

[GitHub] drill issue #1170: DRILL-6223: Fixed several Drillbit failures due to schema...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/1170
  
    The waters here run deep. Please see a detailed comment in [DRILL-6223](https://issues.apache.org/jira/browse/DRILL-6223).


---

[GitHub] drill pull request #1170: DRILL-6223: Fixed several Drillbit failures due to...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1170#discussion_r178225725
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java ---
    @@ -136,14 +138,28 @@ public void transferOut(VectorContainer containerOut) {
       public <T extends ValueVector> T addOrGet(final MaterializedField field, final SchemaChangeCallBack callBack) {
         final TypedFieldId id = getValueVectorId(SchemaPath.getSimplePath(field.getName()));
         final ValueVector vector;
    -    final Class<?> clazz = TypeHelper.getValueVectorClass(field.getType().getMinorType(), field.getType().getMode());
    +
         if (id != null) {
    -      vector = getValueAccessorById(id.getFieldIds()).getValueVector();
    +      vector                = getValueAccessorById(id.getFieldIds()).getValueVector();
    +      final Class<?> clazz  = TypeHelper.getValueVectorClass(field.getType().getMinorType(), field.getType().getMode());
    +
    +      // Check whether incoming field and the current one are compatible; if not then replace previous one with the new one
           if (id.getFieldIds().length == 1 && clazz != null && !clazz.isAssignableFrom(vector.getClass())) {
             final ValueVector newVector = TypeHelper.getNewVector(field, this.getAllocator(), callBack);
             replace(vector, newVector);
             return (T) newVector;
           }
    +
    +      // At this point, we know incoming and current fields are compatible. Maps can have children,
    +      // we need to ensure they have the same structure.
    +      if (MinorType.MAP.equals(field.getType().getMinorType())
    --- End diff --
    
    This is tricky and probably broken elsewhere. If the incoming type is a union or a list, then it can contain a nested map.
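    
    A hedged sketch of what a recursive check might look like (the helper name containsMap is hypothetical; it relies on MaterializedField.getChildren() and getType().getMinorType(), both visible in the diff above):
    
        import org.apache.drill.common.types.TypeProtos.MinorType;
        import org.apache.drill.exec.record.MaterializedField;
    
        class NestedMapCheckSketch {
          // A MAP may hide inside a UNION or LIST, so a top-level
          // MinorType.MAP test alone would miss it.
          static boolean containsMap(MaterializedField field) {
            MinorType type = field.getType().getMinorType();
            if (type == MinorType.MAP) {
              return true;
            }
            if (type == MinorType.UNION || type == MinorType.LIST) {
              for (MaterializedField child : field.getChildren()) {
                if (containsMap(child)) { // recurse: children may themselves nest further
                  return true;
                }
              }
            }
            return false;
          }
        }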


---

[GitHub] drill issue #1170: DRILL-6223: Fixed several Drillbit failures due to schema...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/1170
  
    Sorry to say, I still disagree with this statement: "This pull request adds logic to detect and eliminate dangling columns".
    
    There was a prior discussion that `SELECT *` means "return all columns", not "return only those columns which happen to be in common." We discussed how removing columns midway through a DAG can produce inconsistent results.
    
    But, let's take this particular case: the Project operator.
    
    What should happen (to be consistent with other parts of Drill) is that the operator correctly fills in values for the "dangling" F3 so that the output is (F1, F2, F3).
    
    Note that this becomes VERY ambiguous. Suppose Projection sees the following from Files A (F1, F2, F3) and B (F1, F2):
    
    * Batch A.1 (with F1, F2, F3)
    * Batch B.1 (with F1, F2)
    
    Clearly, the project can remember that F3 was previously seen and fill in the missing column. (This is exactly what the new projection logic in the new scan framework does, by the way.) This works, however, only if F3 is nullable. If not... what (non-null) value can we fill in for F3?
    
    Had we known that F3 would turn up dangling, we could have converted F3 in the first batch to become nullable, but Drill can't predict the future.
    
    Let's consider the proposal: we drop dangling columns. But, since the dangling column (F3) appeared in the first batch, we didn't know it was dangling. Only when we see the second batch (B.1) do we realize that F3 was dangling and we should have removed it. Again, this algorithm requires effective time travel.
    
    Now, suppose that the order is reversed:
    
    * Batch B.1 (with F1, F2)
    * Batch A.1 (with F1, F2, F3)
    
    Here, we can identify F3 as dangling and could remove it, so the proposal is sound.
    
    On the other hand, the "fill in F3" trick does not work here because Project sends B.1 downstream. Later, it notices that A.1 adds a column. Project can't somehow retroactively add the missing column; all it can do is trigger a schema change downstream. Again, Drill can't predict the future to know that it has to fill in F3 in the first B.1 batch.
    
    We've not yet discussed the case in which F2, which exists in both files, has distinct types (INT in A, say, and VARCHAR in B). The dangling column trick won't work. The same logic as above applies to the type mismatch.
    
    Perhaps we use either the "remove" or "fill in" depending on whether the column appears in the first batch. So, for the following:
    
    * Batch A.1 (with F1, F2, F3)
    * Batch B.1 (with F1, F2)
    
    The result would be (F1, F2, F3)
    
    But if the input was:
    
    * Batch B.1 (with F1, F2)
    * Batch A.1 (with F1, F2, F3)
    
    The result would be (F1, F2)
    
    Since the user has no control over the order that files are read, the result would be random: half the time the user gets one schema, the other half the other. It is unlikely that the user will perceive that as a feature.
    
    The general conclusion is that there is no way that Project can "smooth" the schema in the general case: it would have to predict the future to do so.
    
    Now, let's think about other operators, Sort, say. Suppose we do `SELECT * FROM foo ORDER BY x`. In this case, there is no project. The Sort operator will see batches with differing schemas, but must sort/merge them together. The schemas must match. The Sort tries to do this by using the union type if columns have conflicting types (this actually works; there is a unit test for it somewhere). I suspect it does not work if a column is missing from some batches. (Would need to test.)
    
    And, of course, the situation is worse if the dangling column is the sort key!
    
    Overall, while it is very appealing to think that dangling columns are a "bug" for which there is a fix, the reality is not so simple. This is an inherent ambiguity in the Drill model for which there is no fix that both works and is consistent with SQL semantics.
    
    What would work? A schema! Suppose we are told that the schema is (F1: INT, F2: VARCHAR, F3: nullable DOUBLE). Now, when Project (or even the scanner) notices that F3 is missing, it knows to add in the required column of the correct type and, voila! no schema change.
    
    Suppose that B defines F2 as INT. We know we want a VARCHAR, and so can do an implicit conversion. Again, voila! no schema change.
    
    In summary, I completely agree that the scenario described is a problem. But, I don't believe that removing columns is the fix; instead the only valid fix is to allow the user to provide a schema.
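    
    A hedged sketch of the schema-driven fill-in (the names and the two-argument TypeHelper.getNewVector overload are assumptions; the key point is that a freshly allocated nullable vector reads back null for every unset position):
    
        import org.apache.drill.common.types.TypeProtos.MinorType;
        import org.apache.drill.common.types.Types;
        import org.apache.drill.exec.expr.TypeHelper;
        import org.apache.drill.exec.memory.BufferAllocator;
        import org.apache.drill.exec.record.MaterializedField;
        import org.apache.drill.exec.vector.ValueVector;
    
        class FillInSketch {
          // Synthesize an all-null F3 for a batch that arrived without it,
          // using the declared schema entry F3: nullable DOUBLE.
          static ValueVector fillMissingColumn(BufferAllocator allocator, int recordCount) {
            MaterializedField f3 = MaterializedField.create("F3", Types.optional(MinorType.FLOAT8));
            ValueVector vector = TypeHelper.getNewVector(f3, allocator);
            vector.allocateNew();
            // For a nullable vector, positions never written read back as null,
            // so setting the value count yields recordCount null entries.
            vector.getMutator().setValueCount(recordCount);
            return vector;
          }
        }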



---

[GitHub] drill pull request #1170: DRILL-6223: Fixed several Drillbit failures due to...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1170#discussion_r178222960
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/limit/LimitRecordBatch.java ---
    @@ -60,13 +60,7 @@ public LimitRecordBatch(Limit popConfig, FragmentContext context, RecordBatch in
       protected boolean setupNewSchema() throws SchemaChangeException {
         container.zeroVectors();
         transfers.clear();
    -
    -
    -    for(final VectorWrapper<?> v : incoming) {
    -      final TransferPair pair = v.getValueVector().makeTransferPair(
    -          container.addOrGet(v.getField(), callBack));
    -      transfers.add(pair);
    -    }
    +    container.onSchemaChange(incoming, callBack, transfers);
    --- End diff --
    
    `onSchemaChange()` is perhaps the wrong name: it describes why this functionality is called in this case, but the actual functionality is closer to `setupTransfers()` (assuming the removed code was simply moved into the container class...)
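    
    For reference, a sketch of what a setupTransfers()-style method on VectorContainer might look like; the signature is guessed from the call site above, and the body simply mirrors the loop removed from LimitRecordBatch.setupNewSchema():
    
        // Inside VectorContainer; re-uses or creates each output vector, then
        // wires a transfer from the corresponding incoming vector to it.
        public void setupTransfers(RecordBatch incoming, SchemaChangeCallBack callBack,
                                   List<TransferPair> transfers) {
          for (final VectorWrapper<?> v : incoming) {
            final TransferPair pair = v.getValueVector().makeTransferPair(
                this.addOrGet(v.getField(), callBack));
            transfers.add(pair);
          }
        }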


---

[GitHub] drill issue #1170: DRILL-6223: Fixed several Drillbit failures due to schema...

Posted by sachouche <gi...@git.apache.org>.
Github user sachouche commented on the issue:

    https://github.com/apache/drill/pull/1170
  
    @parthchandra can you also please review this PR? 
    Thanks!



---

[GitHub] drill issue #1170: DRILL-6223: Fixed several Drillbit failures due to schema...

Posted by parthchandra <gi...@git.apache.org>.
Github user parthchandra commented on the issue:

    https://github.com/apache/drill/pull/1170
  
    I added a comment in the JIRA - [DRILL-6223](https://issues.apache.org/jira/browse/DRILL-6223?focusedCommentId=16402223&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16402223)



---

[GitHub] drill issue #1170: DRILL-6223: Fixed several Drillbit failures due to schema...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/1170
  
    Over the last year, we've tended to favor including unit tests with each PR. There don't seem to be any with this one, yet we are proposing to make a fairly complex change. Perhaps tests can be added.
    
    Further, by having good tests, we don't have to debate how Drill will handle the scenarios discussed in an earlier comment: we just code 'em up and try 'em out, letting Drill speak for itself. We can then decide whether or not we like the results, rather than discussing hypotheticals.


---

[GitHub] drill issue #1170: DRILL-6223: Fixed several Drillbit failures due to schema...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on the issue:

    https://github.com/apache/drill/pull/1170
  
    BTW: thanks for tackling such a difficult, core issue in Drill. Drill claims to be a) schema free and b) SQL compliant. SQL is based on operations over relations with a fixed number of columns of fixed types. Reconciling these two ideas is very difficult. Even the original Drill developers, who built a huge amount of code very quickly and had intimate knowledge of the Drill internals, did not find a good solution, which is why the problem is still open.
    
    There are two obvious approaches: 1) redefine SQL to operate over lists of maps (with arbitrary name/value pairs that differ across rows), or 2) define translation rules from schema-free input into the schema-full relations that SQL requires.
    
    This PR attempts to go down the first route: redefine SQL. To be successful, we'd want to rely on research papers, if any, that show how to reformulate relational theory on top of lists of maps rather than on relations and domains.
    
    The other approach is to define conversion rules: something much more on the order of a straight-forward implementation project. Can the user provide conversion rules (in the form of a schema) when the conversion is ambiguous? Would users rather encounter schema change exceptions or provide the conversion rules? These are interesting open questions.


---

[GitHub] drill pull request #1170: DRILL-6223: Fixed several Drillbit failures due to...

Posted by paul-rogers <gi...@git.apache.org>.
Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1170#discussion_r178225930
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java ---
    @@ -136,14 +138,28 @@ public void transferOut(VectorContainer containerOut) {
       public <T extends ValueVector> T addOrGet(final MaterializedField field, final SchemaChangeCallBack callBack) {
         final TypedFieldId id = getValueVectorId(SchemaPath.getSimplePath(field.getName()));
         final ValueVector vector;
    -    final Class<?> clazz = TypeHelper.getValueVectorClass(field.getType().getMinorType(), field.getType().getMode());
    +
         if (id != null) {
    -      vector = getValueAccessorById(id.getFieldIds()).getValueVector();
    +      vector                = getValueAccessorById(id.getFieldIds()).getValueVector();
    +      final Class<?> clazz  = TypeHelper.getValueVectorClass(field.getType().getMinorType(), field.getType().getMode());
    +
    +      // Check whether incoming field and the current one are compatible; if not then replace previous one with the new one
           if (id.getFieldIds().length == 1 && clazz != null && !clazz.isAssignableFrom(vector.getClass())) {
             final ValueVector newVector = TypeHelper.getNewVector(field, this.getAllocator(), callBack);
             replace(vector, newVector);
             return (T) newVector;
           }
    +
    +      // At this point, we know incoming and current fields are compatible. Maps can have children,
    +      // we need to ensure they have the same structure.
    +      if (MinorType.MAP.equals(field.getType().getMinorType())
    +       && vector != null
    +       && !SchemaUtil.isSameSchemaIncludingOrder(vector.getField().getChildren(), field.getChildren())) {
    +
    +        final ValueVector newVector = TypeHelper.getNewVector(field, this.getAllocator(), callBack);
    +        replace(vector, newVector);
    +        return (T) newVector;
    --- End diff --
    
    We have two vectors, both maps. We found out their schemas differ. We are throwing  away the old map, replacing it with a new one with no members. Is that what we want to do? Or, do we want to recursively merge the maps? Is this done elsewhere? If so, how do we remember to do that merge?
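    
    To make the alternative concrete, a hedged sketch of a recursive merge at the MaterializedField level (mergeMapSchemas is hypothetical and computes only the merged schema, ignoring how the underlying vector data would be migrated):
    
        import java.util.LinkedHashMap;
        import java.util.Map;
        import org.apache.drill.common.types.TypeProtos.MinorType;
        import org.apache.drill.exec.record.MaterializedField;
    
        class MapMergeSketch {
          // Union the children of two MAP fields, recursing into nested maps,
          // instead of discarding the old map wholesale.
          static MaterializedField mergeMapSchemas(MaterializedField oldMap, MaterializedField newMap) {
            Map<String, MaterializedField> merged = new LinkedHashMap<>();
            for (MaterializedField child : oldMap.getChildren()) {
              merged.put(child.getName(), child);
            }
            for (MaterializedField child : newMap.getChildren()) {
              MaterializedField existing = merged.get(child.getName());
              if (existing != null
                  && existing.getType().getMinorType() == MinorType.MAP
                  && child.getType().getMinorType() == MinorType.MAP) {
                merged.put(child.getName(), mergeMapSchemas(existing, child)); // recurse
              } else {
                merged.put(child.getName(), child); // new or conflicting child wins
              }
            }
            MaterializedField result = MaterializedField.create(oldMap.getName(), oldMap.getType());
            merged.values().forEach(result::addChild);
            return result;
          }
        }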


---