Posted to dev@drill.apache.org by "Nitin Pawar (JIRA)" <ji...@apache.org> on 2018/01/24 18:04:00 UTC

[jira] [Created] (DRILL-6105) SYSTEM ERROR: NullPointerException

Nitin Pawar created DRILL-6105:
----------------------------------

             Summary: SYSTEM ERROR: NullPointerException
                 Key: DRILL-6105
                 URL: https://issues.apache.org/jira/browse/DRILL-6105
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Flow
    Affects Versions: 1.12.0
            Reporter: Nitin Pawar


We just upgraded Drill from 1.8.0 to 1.12.0.

One of the issues that had held us back was that whenever we upgraded Drill, we were hit with a null pointer exception in a scenario that we could not spend time debugging. We then rolled back to 1.8.0.

This time we spent some time on it and understood the problem.

 

data set: 3 million records

5 columns, 2 of which are date columns (let us call these date columns A and B)

In column B, the majority of the values are null (roughly 99.5%)

Our query is: select * from table where A = B

The above query fails with a null pointer exception when there is more than one underlying parquet file. If I merge the underlying parquet files into a single large file, we do not see this issue.

We were also able to avoid the error by changing the where clause to coalesce(A,null) = coalesce(B,null)

This suggests that the error occurs while reading the parquet files with the given data types.
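
For reference, under standard SQL NULL semantics the coalesce form of the predicate should return exactly the same rows as plain A = B, so the workaround should not change the query result. The sketch below illustrates this using Python's built-in sqlite3 module as a stand-in. It is not Drill, it does not read parquet, and it does not reproduce the NullPointerException. The table name t, the id column, and the sample dates are hypothetical.

```python
import sqlite3

# SQLite is only a stand-in to illustrate standard SQL NULL semantics.
# It is not Drill and does not reproduce the NullPointerException.
conn = sqlite3.connect(":memory:")

# Hypothetical table: a and b play the role of date columns A and B,
# stored here as ISO-8601 strings.
conn.execute("CREATE TABLE t (id INTEGER, a TEXT, b TEXT)")
rows = [
    (1, "2018-01-01", "2018-01-01"),  # equal dates: matches
    (2, "2018-01-02", None),          # B is null: comparison is unknown, no match
    (3, "2018-01-03", "2018-01-04"),  # different dates: no match
    (4, None, None),                  # both null: NULL = NULL is not true, no match
]
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

# Original predicate from the report.
plain = conn.execute(
    "SELECT id FROM t WHERE a = b ORDER BY id"
).fetchall()

# Workaround predicate from the report.
workaround = conn.execute(
    "SELECT id FROM t WHERE coalesce(a, NULL) = coalesce(b, NULL) ORDER BY id"
).fetchall()

print(plain)       # [(1,)]
print(workaround)  # [(1,)]
```

Both queries return only the row where A and B hold the same non-null date, which is consistent with the workaround changing how the query is executed rather than what it means.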

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)