Posted to issues@drill.apache.org by "Padma Penumarthy (JIRA)" <ji...@apache.org> on 2017/03/29 04:32:41 UTC

[jira] [Commented] (DRILL-5395) Query on MapR-DB table fails with NPE due to an issue with assignment logic

    [ https://issues.apache.org/jira/browse/DRILL-5395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15946511#comment-15946511 ] 

Padma Penumarthy commented on DRILL-5395:
-----------------------------------------

Based on the number of fragments (decided by parallelization) and the number of HBase regions to assign to those fragments, we calculate the minimum and maximum number of regions to assign per fragment.
Each region is assigned to a fragment on the node where it is located, with fragments picked in round-robin fashion. At the end, we make adjustments so that each fragment is assigned at least the minimum and no more than the maximum number of regions.
We can have a case where every fragment has been assigned the minimum and there are still regions left to assign. In that case, minHeap will be null. We try to assign the unassigned regions to fragments with fewer than the minimum by accessing minHeap, which is null, and this causes the NPE.

In this case, we have 5 regions with ~300K rows. We apply a selectivity of 0.5 and determine the cost as ~150,000. With a slice target of 100K, we create 2 fragments. On a 3-node cluster, the 2 fragments are assigned to 2 nodes and none is assigned to the 3rd node. With 5 regions to assign, the minimum number of regions per fragment is calculated as 5/2, i.e. 2, and the maximum is 3.
In the first round, each region is allocated to a fragment on the node where it is located. 4 regions are assigned to the 2 fragments, i.e. each fragment gets 2. One region is located on the node that has no fragments, so it is not assigned to any fragment in the first round.
Once the initial assignment based on region location is done, we try to adjust the assignments so that each fragment gets at least the minimum and at most the maximum. For this adjustment phase, we build a minHeap that is supposed to contain the fragments with fewer than the minimum assigned. In this case, since each of the 2 fragments already has 2 regions, minHeap is null. To assign the 5th region, we try to get the fragment with the fewest regions assigned by accessing minHeap, which is null. This causes the NullPointerException.
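
The first round of this scenario can be sketched roughly as follows. This is an illustrative Python model, not Drill's actual Java assignment code: the function name first_round, the node names, and the region names are all hypothetical, and the placement of regions on nodes (2 on each node that runs a fragment, 1 on the third node) is an assumption consistent with the description above.

```python
import math
from collections import defaultdict

def first_round(region_hosts, fragment_hosts):
    """Locality-only pass: put each region on a fragment running on its host.

    region_hosts:   dict of region name -> host (hypothetical names)
    fragment_hosts: list where index is the fragment id and value is its host
    """
    num_regions, num_fragments = len(region_hosts), len(fragment_hosts)
    min_per = num_regions // num_fragments             # 5 // 2
    max_per = math.ceil(num_regions / num_fragments)   # ceil(5 / 2)

    # Group fragment ids by host so they can be picked round robin per host.
    by_host = defaultdict(list)
    for frag, host in enumerate(fragment_hosts):
        by_host[host].append(frag)
    next_idx = defaultdict(int)

    assigned = {frag: [] for frag in range(num_fragments)}
    leftover = []
    for region, host in region_hosts.items():
        frags = by_host.get(host, [])
        for _ in range(len(frags)):
            frag = frags[next_idx[host] % len(frags)]
            next_idx[host] += 1
            if len(assigned[frag]) < max_per:
                assigned[frag].append(region)
                break
        else:
            # No fragment on this host (or all of them are full).
            leftover.append(region)
    return min_per, max_per, assigned, leftover

# Assumed layout: fragments on node1 and node2 only, region r5 lives on node3.
regions = {"r1": "node1", "r2": "node1", "r3": "node2", "r4": "node2", "r5": "node3"}
print(first_round(regions, ["node1", "node2"]))
```

With this layout both fragments end the first round holding exactly the minimum of 2 regions, and r5 is left over for the adjustment phase.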

The fix is to include in minHeap the fragments that have exactly the minimum assigned, not just those with fewer than the minimum.
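
A minimal sketch of the difference between the two heap-building rules, again as a simplified Python model rather than the actual Drill patch. The function assign_leftovers and its include_equal flag are hypothetical names, and the LookupError stands in for the NullPointerException seen in Drill. Rebalancing against the maximum is left out.

```python
import heapq

def assign_leftovers(counts, leftover, min_per, include_equal):
    """Hand leftover regions to the least-loaded eligible fragments.

    counts:        regions per fragment after the locality round
    include_equal: False models the old rule (count < min),
                   True models the fix (count <= min)
    """
    def eligible(count):
        return count <= min_per if include_equal else count < min_per

    counts = list(counts)
    # Min-heap of (region count, fragment id) for eligible fragments only.
    heap = [(c, f) for f, c in enumerate(counts) if eligible(c)]
    heapq.heapify(heap)

    placed = {}
    for region in leftover:
        if not heap:
            # Stand-in for the NPE: there is no fragment to take the region.
            raise LookupError(f"no fragment available for {region}")
        count, frag = heapq.heappop(heap)
        counts[frag] = count + 1
        placed[region] = frag
        if eligible(counts[frag]):
            heapq.heappush(heap, (counts[frag], frag))
    return counts, placed

# Scenario from this issue: both fragments hold 2 regions, one region is left.
print(assign_leftovers([2, 2], ["r5"], 2, include_equal=True))
```

Calling the same function with include_equal=False on this input raises the error, because no fragment has fewer than 2 regions and the heap has no candidates.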




> Query on MapR-DB table fails with NPE due to an issue with assignment logic
> ---------------------------------------------------------------------------
>
>                 Key: DRILL-5395
>                 URL: https://issues.apache.org/jira/browse/DRILL-5395
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization, Storage - MapRDB
>    Affects Versions: 1.9.0, 1.10.0
>            Reporter: Abhishek Girish
>            Assignee: Padma Penumarthy
>              Labels: MapR-DB-Binary
>             Fix For: 1.11.0
>
>
> We uncovered this issue when working on DRILL-5394. 
> The MapR-DB table in question had 5 tablets with skewed data distribution (~6 million rows). A partial WIP fix for DRILL-5394 caused the number of rows to be reported incorrectly (~300,000). 2 minor fragments were created (due to filter selectivity) for scanning the 5 tablets. And this resulted in an NPE, possibly related to an issue with assignment logic, that was now exposed. 
> Representative query:
> {code}
> SELECT Convert_from(avail.customer, 'UTF8') AS ABC, 
>        Convert_from(prop.customer, 'UTF8')  AS PQR 
> FROM   (SELECT Convert_from(a.row_key, 'UTF8') 
>                AS customer, 
>                Cast(Convert_from(a.data.`l_discount`, 'double_be') AS FLOAT) 
>                AS availability 
>         FROM   db.tpch_maprdb.lineitem_1 a 
>         WHERE  Convert_from(a.row_key, 'UTF8') = '%004%') AS avail 
>        join 
>               (SELECT Convert_from(b.row_key, 'UTF8') 
>                       AS customer, 
>                Cast( 
>        Convert_from(b.data.`l_discount`, 'double_be') AS FLOAT) AS 
>                availability 
>         FROM   db.tpch_maprdb.lineitem_1 b 
>         WHERE  Convert_from(b.row_key, 'UTF8') LIKE '%003%') AS prop 
>          ON avail.customer = prop.customer; 
> {code}
> Error:
> {code}
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: NullPointerException
> {code}
> Log attached. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)