Posted to dev@drill.apache.org by "Paul Rogers (Jira)" <ji...@apache.org> on 2019/10/30 02:56:00 UTC

[jira] [Created] (DRILL-7428) Drill incorrectly allows a repeated map field to be projected to top level

Paul Rogers created DRILL-7428:
----------------------------------

             Summary: Drill incorrectly allows a repeated map field to be projected to top level
                 Key: DRILL-7428
                 URL: https://issues.apache.org/jira/browse/DRILL-7428
             Project: Apache Drill
          Issue Type: Bug
            Reporter: Paul Rogers


Consider the following query from the [MongoDB tests|https://github.com/apache/drill/blob/master/contrib/storage-mongo/src/test/java/org/apache/drill/exec/store/mongo/MongoTestConstants.java#L80]:

{noformat}
select t.name as name, t.topping.type as type 
  from mongo.%s.`%s` t where t.sales >= 150
{noformat}


The query is used in [{{TestMongoQueries.testUnShardedDBInShardedClusterWithProjectionAndFilter()}}|https://github.com/apache/drill/blob/master/contrib/storage-mongo/src/test/java/org/apache/drill/exec/store/mongo/TestMongoQueries.java#L89].
 
Here it turns out that {{topping}} is a repeated map. The query projects a member of that map, {{type}}, to the top level. The query returns five rows, but the repeated map holds 24 values in total across those rows. The Project operator allows the projection, resulting in an output batch in which most vectors have 5 values, but the column projected from {{topping}}, now at the top level and no longer in the map, has 24 values.

As a result, the first five values, all formerly associated with the first record (record 0), are now associated with the first five top-level records, while the values formerly associated with records 1-4 are lost.
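
The misalignment can be illustrated with a small simulation. This is only a sketch in Python, not Drill code: the row names, the value names, and the per-row counts (5, 5, 5, 5, 4) are made up for illustration, and the offsets-plus-values layout is a simplified stand-in for a repeated vector. The function {{project_per_row}} is not a proposed fix; it only shows which row each value originally belonged to.

```python
# Hypothetical data: 5 rows whose repeated map `topping` holds 24 entries total.
names = ["row0", "row1", "row2", "row3", "row4"]

# Simplified repeated-vector layout (an assumption, not Drill's actual classes):
# row i owns child_values[offsets[i]:offsets[i + 1]].
offsets = [0, 5, 10, 15, 20, 24]
types = [f"type{i}" for i in range(24)]  # flattened topping.type values


def project_naively(names, child_values):
    """Mimic the faulty projection: treat the child vector as a top-level column.

    The batch has len(names) rows, so only the first len(names) child values
    are ever paired with a row; everything after that is unreachable.
    """
    return [(name, child_values[i]) for i, name in enumerate(names)]


def project_per_row(names, offsets, child_values):
    """Show the original ownership: each row keeps its own slice of values."""
    return [
        (name, child_values[offsets[i]:offsets[i + 1]])
        for i, name in enumerate(names)
    ]


naive = project_naively(names, types)
grouped = project_per_row(names, offsets, types)

# Number of child values that no longer map to any output row.
lost = len(types) - len(naive)

print(naive)
print(grouped[0])
print(lost)
```

In this sketch every value in {{naive}} comes from the slice that belonged to row0, which matches the behavior described above: rows 1-4 receive values that were never theirs, and their own values are dropped.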

Thus, this is a data corruption bug.


