Posted to issues@drill.apache.org by "Volodymyr Vysotskyi (JIRA)" <ji...@apache.org> on 2017/07/10 14:14:00 UTC

[jira] [Comment Edited] (DRILL-4139) Fix parquet partition pruning for BIT, INTERVAL and DECIMAL types

    [ https://issues.apache.org/jira/browse/DRILL-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080025#comment-16080025 ] 

Volodymyr Vysotskyi edited comment on DRILL-4139 at 7/10/17 2:13 PM:
---------------------------------------------------------------------

Drill serializes the values of binary fields to the parquet metadata cache file using the code {{new String(((Binary) bytes).getBytes())}}.
However, when the bytes are not text in the default encoding, for example when they hold a binary value in little-endian byte order, {{new String(((Binary) bytes).getBytes()).getBytes()}}
returns a byte array that differs from {{bytes}}.
According to [Parquet Logical Type Definitions|https://github.com/Parquet/parquet-format/blob/master/LogicalTypes.md], DECIMAL values stored in a fixed_len_byte_array or binary field must use big-endian byte order, while the INTERVAL type stores its value in a fixed_len_byte_array field in little-endian byte order.
Drill stores only the values of binary fields correctly in the parquet metadata cache file; values of fixed_len_byte_array fields are stored as Binary objects:
{noformat}
      {
        "name" : [ "col_intrvl_yr" ],
        "minValue" : {
          "bytesUnsafe" : "sQAAAAAAAAAAAAAA",
          "bytes" : "sQAAAAAAAAAAAAAA",
          "backingBytesReused" : true
        },
        "maxValue" : {
          "bytesUnsafe" : "OgEAAAAAAAAAAAAA",
          "bytes" : "OgEAAAAAAAAAAAAA",
          "backingBytesReused" : true
        },
        "nulls" : 0
      }
{noformat}
Since Drill may store some types in either binary or fixed_len_byte_array fields, both field types must be serialized and deserialized in the same way. For example, according to [Parquet Logical Type Definitions|https://github.com/Parquet/parquet-format/blob/master/LogicalTypes.md], a DECIMAL field may be stored as a binary or a fixed_len_byte_array field.
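
To illustrate why both field types can share one code path, here is a minimal sketch (not Drill's actual code) that interprets DECIMAL bytes as a big-endian two's-complement unscaled value. The class name {{DecimalBytesSketch}}, the method {{decode}} and the sample byte arrays are hypothetical and chosen only for this example.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper, not part of Drill: shows that the same decoding
// works for a BINARY and a FIXED_LEN_BYTE_ARRAY representation of DECIMAL.
final class DecimalBytesSketch {

    // BigInteger(byte[]) reads big-endian two's-complement bytes,
    // which is the layout Parquet specifies for DECIMAL.
    static BigDecimal decode(byte[] bigEndianUnscaled, int scale) {
        return new BigDecimal(new BigInteger(bigEndianUnscaled), scale);
    }

    public static void main(String[] args) {
        byte[] binaryField = {0x30, 0x39};                // 12345 in two bytes
        byte[] fixedLenField = {0x00, 0x00, 0x30, 0x39};  // same value padded to four bytes
        System.out.println(decode(binaryField, 2));       // prints 123.45
        System.out.println(decode(fixedLenField, 2));     // prints 123.45
    }
}
```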

The proposal is to serialize byte arrays directly by calling {{((Binary) value.minValue).getBytes()}} and to deserialize them by calling {{Base64.decodeBase64(((String) source).getBytes())}}.
This removes any dependence on the byte order.
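
The difference between the two approaches can be sketched as follows. This is an illustration only, under two assumptions that are not in the comment: it uses {{java.util.Base64}} from the JDK rather than the {{Base64.decodeBase64}} call quoted above, and it fixes the charset to UTF-8 instead of relying on the platform default so that the result is reproducible. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical sketch, not Drill code: compares a String round trip
// with a Base64 round trip for raw min/max bytes.
final class MinMaxRoundTripSketch {

    // Mirrors new String(bytes).getBytes(), with UTF-8 assumed as the charset.
    static boolean survivesStringRoundTrip(byte[] raw) {
        byte[] back = new String(raw, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(raw, back);
    }

    // Base64 maps every byte sequence to ASCII text and back without loss.
    static boolean survivesBase64RoundTrip(byte[] raw) {
        String encoded = Base64.getEncoder().encodeToString(raw);
        return Arrays.equals(raw, Base64.getDecoder().decode(encoded));
    }

    public static void main(String[] args) {
        byte[] interval = new byte[12];   // 12-byte INTERVAL value
        interval[0] = (byte) 0xB1;        // 0xB1 alone is not valid UTF-8
        byte[] text = "UT".getBytes(StandardCharsets.UTF_8);

        System.out.println(survivesStringRoundTrip(text));      // true
        System.out.println(survivesStringRoundTrip(interval));  // false
        System.out.println(survivesBase64RoundTrip(interval));  // true
        System.out.println(Base64.getEncoder().encodeToString(interval)); // sQAAAAAAAAAAAAAA
    }
}
```

Plain text such as {{"UT"}} survives both paths, which is consistent with the observation that only binary (string) fields were stored correctly before the change.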

Another problem is backward compatibility. When a metadata file created by a Drill version with these changes is read by an older Drill version, it may lead to errors or wrong results. Updating the metadata version does not help, since old Drill versions simply throw an exception when trying to read the new metadata cache files:
{noformat}
Error: SYSTEM ERROR: JsonMappingException: Could not resolve type id 'v4' into a subtype of [simple type, class org.apache.drill.exec.store.parquet.Metadata$ParquetTableMetadataBase]: known type ids = [Metadata$ParquetTableMetadataBase, v1, v2, v3]
 at [Source: org.apache.hadoop.fs.ChecksumFileSystem$FSDataBoundedInputStream@7b609ce0; line: 2, column: 24]
{noformat}

Metadata files without (left) and with (right) the changes for this Jira:
{noformat}
+-------------------------------------------------------+-------------------------------------------------------+
|	{						|	{						|
|	"metadata_version" : "v3",			|	"metadata_version" : "v4",			|
|	"columnTypeInfo" : {				|	"columnTypeInfo" : {				|
|	"col_intrvl_yr" : {				|	"col_intrvl_yr" : {				|
|	"name" : [ "col_intrvl_yr" ],			|	"name" : [ "col_intrvl_yr" ],			|
|	"primitiveType" : "FIXED_LEN_BYTE_ARRAY",	|	"primitiveType" : "FIXED_LEN_BYTE_ARRAY",	|
|	"originalType" : "INTERVAL",			|	"originalType" : "INTERVAL",			|
|	"precision" : 0,				|	"precision" : 0,				|
|	"scale" : 0,					|	"scale" : 0,					|
|	"repetitionLevel" : 0,				|	"repetitionLevel" : 0,				|
|	"definitionLevel" : 0				|	"definitionLevel" : 0				|
|	},						|	},						|
|	"col_int" : {					|	"col_int" : {					|
|	"name" : [ "col_int" ],				|	"name" : [ "col_int" ],				|
|	"primitiveType" : "INT32",			|	"primitiveType" : "INT32",			|
|	"originalType" : null,				|	"originalType" : null,				|
|	"precision" : 0,				|	"precision" : 0,				|
|	"scale" : 0,					|	"scale" : 0,					|
|	"repetitionLevel" : 0,				|	"repetitionLevel" : 0,				|
|	"definitionLevel" : 0				|	"definitionLevel" : 0				|
|	},						|	},						|
|	"col_vrchr" : {					|	"col_vrchr" : {					|
|	"name" : [ "col_vrchr" ],			|	"name" : [ "col_vrchr" ],			|
|	"primitiveType" : "BINARY",			|	"primitiveType" : "BINARY",			|
|	"originalType" : "UTF8",			|	"originalType" : "UTF8",			|
|	"precision" : 0,				|	"precision" : 0,				|
|	"scale" : 0,					|	"scale" : 0,					|
|	"repetitionLevel" : 0,				|	"repetitionLevel" : 0,				|
|	"definitionLevel" : 0				|	"definitionLevel" : 0				|
|	},						|	},						|
|	"col_tmstmp" : {				|	"col_tmstmp" : {				|
|	"name" : [ "col_tmstmp" ],			|	"name" : [ "col_tmstmp" ],			|
|	"primitiveType" : "INT64",			|	"primitiveType" : "INT64",			|
|	"originalType" : "TIMESTAMP_MILLIS",		|	"originalType" : "TIMESTAMP_MILLIS",		|
|	"precision" : 0,				|	"precision" : 0,				|
|	"scale" : 0,					|	"scale" : 0,					|
|	"repetitionLevel" : 0,				|	"repetitionLevel" : 0,				|
|	"definitionLevel" : 0				|	"definitionLevel" : 0				|
|	},						|	},						|
|	"col_dt" : {					|	"col_dt" : {					|
|	"name" : [ "col_dt" ],				|	"name" : [ "col_dt" ],				|
|	"primitiveType" : "INT32",			|	"primitiveType" : "INT32",			|
|	"originalType" : "DATE",			|	"originalType" : "DATE",			|
|	"precision" : 0,				|	"precision" : 0,				|
|	"scale" : 0,					|	"scale" : 0,					|
|	"repetitionLevel" : 0,				|	"repetitionLevel" : 0,				|
|	"definitionLevel" : 0				|	"definitionLevel" : 0				|
|	},						|	},						|
|	"col_intrvl_day" : {				|	"col_intrvl_day" : {				|
|	"name" : [ "col_intrvl_day" ],			|	"name" : [ "col_intrvl_day" ],			|
|	"primitiveType" : "FIXED_LEN_BYTE_ARRAY",	|	"primitiveType" : "FIXED_LEN_BYTE_ARRAY",	|
|	"originalType" : "INTERVAL",			|	"originalType" : "INTERVAL",			|
|	"precision" : 0,				|	"precision" : 0,				|
|	"scale" : 0,					|	"scale" : 0,					|
|	"repetitionLevel" : 0,				|	"repetitionLevel" : 0,				|
|	"definitionLevel" : 0				|	"definitionLevel" : 0				|
|	},						|	},						|
|	"col_flt" : {					|	"col_flt" : {					|
|	"name" : [ "col_flt" ],				|	"name" : [ "col_flt" ],				|
|	"primitiveType" : "FLOAT",			|	"primitiveType" : "FLOAT",			|
|	"originalType" : null,				|	"originalType" : null,				|
|	"precision" : 0,				|	"precision" : 0,				|
|	"scale" : 0,					|	"scale" : 0,					|
|	"repetitionLevel" : 0,				|	"repetitionLevel" : 0,				|
|	"definitionLevel" : 0				|	"definitionLevel" : 0				|
|	},						|	},						|
|	"col_tim" : {					|	"col_tim" : {					|
|	"name" : [ "col_tim" ],				|	"name" : [ "col_tim" ],				|
|	"primitiveType" : "INT32",			|	"primitiveType" : "INT32",			|
|	"originalType" : "TIME_MILLIS",			|	"originalType" : "TIME_MILLIS",			|
|	"precision" : 0,				|	"precision" : 0,				|
|	"scale" : 0,					|	"scale" : 0,					|
|	"repetitionLevel" : 0,				|	"repetitionLevel" : 0,				|
|	"definitionLevel" : 0				|	"definitionLevel" : 0				|
|	},						|	},						|
|	"col_bln" : {					|	"col_bln" : {					|
|	"name" : [ "col_bln" ],				|	"name" : [ "col_bln" ],				|
|	"primitiveType" : "BOOLEAN",			|	"primitiveType" : "BOOLEAN",			|
|	"originalType" : null,				|	"originalType" : null,				|
|	"precision" : 0,				|	"precision" : 0,				|
|	"scale" : 0,					|	"scale" : 0,					|
|	"repetitionLevel" : 0,				|	"repetitionLevel" : 0,				|
|	"definitionLevel" : 0				|	"definitionLevel" : 0				|
|	},						|	},						|
|	"col_chr" : {					|	"col_chr" : {					|
|	"name" : [ "col_chr" ],				|	"name" : [ "col_chr" ],				|
|	"primitiveType" : "BINARY",			|	"primitiveType" : "BINARY",			|
|	"originalType" : "UTF8",			|	"originalType" : "UTF8",			|
|	"precision" : 0,				|	"precision" : 0,				|
|	"scale" : 0,					|	"scale" : 0,					|
|	"repetitionLevel" : 0,				|	"repetitionLevel" : 0,				|
|	"definitionLevel" : 0				|	"definitionLevel" : 0				|
|	}						|	}						|
|	},						|	},						|
|	"files" : [ {					|	"files" : [ {					|
|	"path" : "0_0_1.parquet",			|	"path" : "0_0_1.parquet",			|
|	"length" : 1456,				|	"length" : 1456,				|
|	"rowGroups" : [ {				|	"rowGroups" : [ {				|
|	"start" : 4,					|	"start" : 4,					|
|	"length" : 539,					|	"length" : 539,					|
|	"rowCount" : 2,					|	"rowCount" : 2,					|
|	"hostAffinity" : {				|	"hostAffinity" : {				|
|	"localhost" : 1.0				|	"localhost" : 1.0				|
|	},						|	},						|
|	"columns" : [ {					|	"columns" : [ {					|
|	"name" : [ "col_int" ],				|	"name" : [ "col_int" ],				|
|	"minValue" : 13075,				|	"minValue" : 13075,				|
|	"maxValue" : 45436,				|	"maxValue" : 45436,				|
|	"nulls" : 0					|	"nulls" : 0					|
|	}, {						|	}, {						|
|	"name" : [ "col_chr" ],				|	"name" : [ "col_chr" ],				|
|	"minValue" : "UT",				|	"minValue" : "VVQ=",				|
|	"maxValue" : "WV",				|	"maxValue" : "V1Y=",				|
|	"nulls" : 0					|	"nulls" : 0					|
|	}, {						|	}, {						|
|	"name" : [ "col_vrchr" ],			|	"name" : [ "col_vrchr" ],			|
|	"minValue" : "John Mcginity",			|	"minValue" : "Sm9obiBNY2dpbml0eQ==",		|
|	"maxValue" : "Timothy Griffin",			|	"maxValue" : "VGltb3RoeSBHcmlmZmlu",		|
|	"nulls" : 0					|	"nulls" : 0					|
|	}, {						|	}, {						|
|	"name" : [ "col_dt" ],				|	"name" : [ "col_dt" ],				|
|	"minValue" : 6138,				|	"minValue" : 6138,				|
|	"maxValue" : 15282,				|	"maxValue" : 15282,				|
|	"nulls" : 0					|	"nulls" : 0					|
|	}, {						|	}, {						|
|	"name" : [ "col_tim" ],				|	"name" : [ "col_tim" ],				|
|	"minValue" : 64946000,				|	"minValue" : 64946000,				|
|	"maxValue" : 76337000,				|	"maxValue" : 76337000,				|
|	"nulls" : 0					|	"nulls" : 0					|
|	}, {						|	}, {						|
|	"name" : [ "col_tmstmp" ],			|	"name" : [ "col_tmstmp" ],			|
|	"minValue" : 591037122000,			|	"minValue" : 591037122000,			|
|	"maxValue" : 1259272872000,			|	"maxValue" : 1259272872000,			|
|	"nulls" : 0					|	"nulls" : 0					|
|	}, {						|	}, {						|
|	"name" : [ "col_flt" ],				|	"name" : [ "col_flt" ],				|
|	"minValue" : 10.193293,				|	"minValue" : 10.193293,				|
|	"maxValue" : 51.523853,				|	"maxValue" : 51.523853,				|
|	"nulls" : 0					|	"nulls" : 0					|
|	}, {						|	}, {						|
|	"name" : [ "col_intrvl_yr" ],			|	"name" : [ "col_intrvl_yr" ],			|
|	"minValue" : {					|	"minValue" : "sQAAAAAAAAAAAAAA",		|
|	"bytesUnsafe" : "sQAAAAAAAAAAAAAA",		|	"maxValue" : "OgEAAAAAAAAAAAAA",		|
|	"bytes" : "sQAAAAAAAAAAAAAA",			|	"nulls" : 0					|
|	"backingBytesReused" : true			|	}, {						|
|	},						|	"name" : [ "col_intrvl_day" ],			|
|	"maxValue" : {					|	"minValue" : "AAAAAAQAAADYx0EA",		|
|	"bytesUnsafe" : "OgEAAAAAAAAAAAAA",		|	"maxValue" : "AAAAABoAAACQ4KEB",		|
|	"bytes" : "OgEAAAAAAAAAAAAA",			|	"nulls" : 0					|
|	"backingBytesReused" : true			|	}, {						|
|	},						|	"name" : [ "col_bln" ],				|
|	"nulls" : 0					|	"minValue" : false,				|
|	}, {						|	"maxValue" : false,				|
|	"name" : [ "col_intrvl_day" ],			|	"nulls" : 0					|
|	"minValue" : {					|	} ]						|
|	"bytesUnsafe" : "AAAAAAQAAADYx0EA",		|	} ]						|
|	"bytes" : "AAAAAAQAAADYx0EA",			|	}, {						|
|	"backingBytesReused" : true			|	"path" : "0_0_2.parquet",			|
|	},						|	"length" : 1458,				|
|	"maxValue" : {					|	"rowGroups" : [ {				|
|	"bytesUnsafe" : "AAAAABoAAACQ4KEB",		|	"start" : 4,					|
|	"bytes" : "AAAAABoAAACQ4KEB",			|	"length" : 540,					|
|	"backingBytesReused" : true			|	"rowCount" : 2,					|
|	},						|	"hostAffinity" : {				|
|	"nulls" : 0					|	"localhost" : 1.0				|
|	}, {						|	},						|
|	"name" : [ "col_bln" ],				|	"columns" : [ {					|
|	"minValue" : false,				|	"name" : [ "col_int" ],				|
|	"maxValue" : false,				|	"minValue" : 7272,				|
|	"nulls" : 0					|	"maxValue" : 63069,				|
|	} ]						|	"nulls" : 0					|
|	} ]						|	}, {						|
|	}, {						|	"name" : [ "col_chr" ],				|
|	"path" : "0_0_2.parquet",			|	"minValue" : "TVQ=",				|
|	"length" : 1458,				|	"maxValue" : "T1I=",				|
|	"rowGroups" : [ {				|	"nulls" : 0					|
|	"start" : 4,					|	}, {						|
|	"length" : 540,					|	"name" : [ "col_vrchr" ],			|
|	"rowCount" : 2,					|	"minValue" : "RGF2aWQgQmFybmVz",		|
|	"hostAffinity" : {				|	"maxValue" : "SmVmZmVyeSBSb2JlcnRzb24=",	|
|	"localhost" : 1.0				|	"nulls" : 0					|
|	},						|	}, {						|
|	"columns" : [ {					|	"name" : [ "col_dt" ],				|
|	"name" : [ "col_int" ],				|	"minValue" : 8288,				|
|	"minValue" : 7272,				|	"maxValue" : 17020,				|
|	"maxValue" : 63069,				|	"nulls" : 0					|
|	"nulls" : 0					|	}, {						|
|	}, {						|	"name" : [ "col_tim" ],				|
|	"name" : [ "col_chr" ],				|	"minValue" : 37000000,				|
|	"minValue" : "MT",				|	"maxValue" : 73718000,				|
|	"maxValue" : "OR",				|	"nulls" : 0					|
|	"nulls" : 0					|	}, {						|
|	}, {						|	"name" : [ "col_tmstmp" ],			|
|	"name" : [ "col_vrchr" ],			|	"minValue" : 36743174000,			|
|	"minValue" : "David Barnes",			|	"maxValue" : 241908803000,			|
|	"maxValue" : "Jeffery Robertson",		|	"nulls" : 0					|
|	"nulls" : 0					|	}, {						|
|	}, {						|	"name" : [ "col_flt" ],				|
|	"name" : [ "col_dt" ],				|	"minValue" : 4.737761,				|
|	"minValue" : 8288,				|	"maxValue" : 20.417383,				|
|	"maxValue" : 17020,				|	"nulls" : 0					|
|	"nulls" : 0					|	}, {						|
|	}, {						|	"name" : [ "col_intrvl_yr" ],			|
|	"name" : [ "col_tim" ],				|	"minValue" : "nwAAAAAAAAAAAAAA",		|
|	"minValue" : 37000000,				|	"maxValue" : "XAAAAAAAAAAAAAAA",		|
|	"maxValue" : 73718000,				|	"nulls" : 0					|
|	"nulls" : 0					|	}, {						|
|	}, {						|	"name" : [ "col_intrvl_day" ],			|
|	"name" : [ "col_tmstmp" ],			|	"minValue" : "AAAAABIAAABY2f4B",		|
|	"minValue" : 36743174000,			|	"maxValue" : "AAAAABcAAACAr3kB",		|
|	"maxValue" : 241908803000,			|	"nulls" : 0					|
|	"nulls" : 0					|	}, {						|
|	}, {						|	"name" : [ "col_bln" ],				|
|	"name" : [ "col_flt" ],				|	"minValue" : true,				|
|	"minValue" : 4.737761,				|	"maxValue" : true,				|
|	"maxValue" : 20.417383,				|	"nulls" : 0					|
|	"nulls" : 0					|	} ]						|
|	}, {						|	} ]						|
|	"name" : [ "col_intrvl_yr" ],			|	} ],						|
|	"minValue" : {					|	"directories" : [ ],				|
|	"bytesUnsafe" : "nwAAAAAAAAAAAAAA",		|	"drillVersion" : "1.11.0-SNAPSHOT"		|
|	"bytes" : "nwAAAAAAAAAAAAAA",			|	}						|
|	"backingBytesReused" : true			|							|
|	},						|							|
|	"maxValue" : {					|							|
|	"bytesUnsafe" : "XAAAAAAAAAAAAAAA",		|							|
|	"bytes" : "XAAAAAAAAAAAAAAA",			|							|
|	"backingBytesReused" : true			|							|
|	},						|							|
|	"nulls" : 0					|							|
|	}, {						|							|
|	"name" : [ "col_intrvl_day" ],			|							|
|	"minValue" : {					|							|
|	"bytesUnsafe" : "AAAAABIAAABY2f4B",		|							|
|	"bytes" : "AAAAABIAAABY2f4B",			|							|
|	"backingBytesReused" : true			|							|
|	},						|							|
|	"maxValue" : {					|							|
|	"bytesUnsafe" : "AAAAABcAAACAr3kB",		|							|
|	"bytes" : "AAAAABcAAACAr3kB",			|							|
|	"backingBytesReused" : true			|							|
|	},						|							|
|	"nulls" : 0					|							|
|	}, {						|							|
|	"name" : [ "col_bln" ],				|							|
|	"minValue" : true,				|							|
|	"maxValue" : true,				|							|
|	"nulls" : 0					|							|
|	} ]						|							|
|	} ]						|							|
|	} ],						|							|
|	"directories" : [ ],				|							|
|	"drillVersion" : "1.11.0-SNAPSHOT"		|							|
|	}						|							|
+-------------------------------------------------------+-------------------------------------------------------+
{noformat}
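
The Base64 strings in the v4 column can be read back as follows. This is a minimal sketch, assuming the Parquet INTERVAL layout of three little-endian 32-bit integers (months, days, milliseconds) and using {{java.util.Base64}}; the class name {{IntervalDecodeSketch}} and the method {{decode}} are hypothetical and not part of Drill.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical sketch, not Drill code: decodes a Base64-encoded
// 12-byte Parquet INTERVAL min/max value from the metadata cache.
final class IntervalDecodeSketch {

    // Returns {months, days, millis}. The fields are unsigned in Parquet;
    // signed ints are used here, which is enough for small sample values.
    static int[] decode(String base64) {
        ByteBuffer buf = ByteBuffer.wrap(Base64.getDecoder().decode(base64))
                .order(ByteOrder.LITTLE_ENDIAN);
        return new int[] {buf.getInt(), buf.getInt(), buf.getInt()};
    }

    public static void main(String[] args) {
        // col_intrvl_yr minValue from 0_0_1.parquet
        System.out.println(Arrays.toString(decode("sQAAAAAAAAAAAAAA"))); // [177, 0, 0]
        // col_intrvl_day minValue from 0_0_1.parquet
        System.out.println(Arrays.toString(decode("AAAAAAQAAADYx0EA"))); // [0, 4, 4311000]
    }
}
```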
A Drill version with the changes for this Jira can read parquet table metadata cache files of version v3 and older.
Drill 1.10.0 throws an exception when it tries to read a parquet table metadata cache file of version v4 or greater.

There are no failures in functional tests connected with these changes.


was (Author: vvysotskyi):
Drill serializes the values of binary fields to the parquet metadata cache file using the code {{new String(((Binary) bytes).getBytes())}}.
However, when the bytes are not text in the default encoding, for example when they hold a binary value in little-endian byte order, {{new String(((Binary) bytes).getBytes()).getBytes()}}
returns a byte array that differs from {{bytes}}.
According to [Parquet Logical Type Definitions|https://github.com/Parquet/parquet-format/blob/master/LogicalTypes.md], DECIMAL values stored in a fixed_len_byte_array or binary field must use big-endian byte order, while the INTERVAL type stores its value in a fixed_len_byte_array field in little-endian byte order.
Drill stores only the values of binary fields correctly in the parquet metadata cache file; values of fixed_len_byte_array fields are stored as Binary objects:
{noformat}
      {
        "name" : [ "col_intrvl_yr" ],
        "minValue" : {
          "bytesUnsafe" : "sQAAAAAAAAAAAAAA",
          "bytes" : "sQAAAAAAAAAAAAAA",
          "backingBytesReused" : true
        },
        "maxValue" : {
          "bytesUnsafe" : "OgEAAAAAAAAAAAAA",
          "bytes" : "OgEAAAAAAAAAAAAA",
          "backingBytesReused" : true
        },
        "nulls" : 0
      }
{noformat}
Since Drill may store some types in either binary or fixed_len_byte_array fields, both field types must be serialized and deserialized in the same way. For example, according to [Parquet Logical Type Definitions|https://github.com/Parquet/parquet-format/blob/master/LogicalTypes.md], a DECIMAL field may be stored as a binary or a fixed_len_byte_array field.

The proposal is to serialize byte arrays directly by calling {{((Binary) value.minValue).getBytes()}} and to deserialize them by calling {{Base64.decodeBase64(((String) source).getBytes())}}.
This removes any dependence on the byte order.

Another problem is backward compatibility. When a metadata file created by a Drill version with these changes is read by an older Drill version, it may lead to errors or wrong results. Updating the metadata version does not help, since old Drill versions simply throw an exception when trying to read the new metadata cache files:
{noformat}
Error: SYSTEM ERROR: JsonMappingException: Could not resolve type id 'v4' into a subtype of [simple type, class org.apache.drill.exec.store.parquet.Metadata$ParquetTableMetadataBase]: known type ids = [Metadata$ParquetTableMetadataBase, v1, v2, v3]
 at [Source: org.apache.hadoop.fs.ChecksumFileSystem$FSDataBoundedInputStream@7b609ce0; line: 2, column: 24]
{noformat}

Metadata cache files without and with the changes for DRILL-4139 are attached to the Jira.

A Drill version with the changes for this Jira can read parquet table metadata cache files of version v3 and older.
Drill 1.10.0 throws an exception when it tries to read a parquet table metadata cache file of version v4 or greater.


> Fix parquet partition pruning for BIT, INTERVAL and DECIMAL types
> -----------------------------------------------------------------
>
>                 Key: DRILL-4139
>                 URL: https://issues.apache.org/jira/browse/DRILL-4139
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 1.3.0
>         Environment: 4 node cluster on CentOS
>            Reporter: Khurram Faraaz
>            Assignee: Volodymyr Vysotskyi
>         Attachments: metadata file v3, metadata file with changes
>
>
> Exception while trying to prune partition.
> java.lang.UnsupportedOperationException: Unsupported type: BIT
> is seen in drillbit.log after Functional run on 4 node cluster.
> Drill 1.3.0 sys.version => d61bb83a8
> {code}
> 2015-11-27 03:12:19,809 [29a835ec-3c02-0fb6-d3c1-bae276ef7385:foreman] INFO  o.a.d.e.p.l.partition.PruneScanRule - Beginning partition pruning, pruning class: org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2
> 2015-11-27 03:12:19,809 [29a835ec-3c02-0fb6-d3c1-bae276ef7385:foreman] INFO  o.a.d.e.p.l.partition.PruneScanRule - Total elapsed time to build and analyze filter tree: 0 ms
> 2015-11-27 03:12:19,810 [29a835ec-3c02-0fb6-d3c1-bae276ef7385:foreman] WARN  o.a.d.e.p.l.partition.PruneScanRule - Exception while trying to prune partition.
> java.lang.UnsupportedOperationException: Unsupported type: BIT
>         at org.apache.drill.exec.store.parquet.ParquetGroupScan.populatePruningVector(ParquetGroupScan.java:479) ~[drill-java-exec-1.3.0.jar:1.3.0]
>         at org.apache.drill.exec.planner.ParquetPartitionDescriptor.populatePartitionVectors(ParquetPartitionDescriptor.java:96) ~[drill-java-exec-1.3.0.jar:1.3.0]
>         at org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:235) ~[drill-java-exec-1.3.0.jar:1.3.0]
>         at org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87) [drill-java-exec-1.3.0.jar:1.3.0]
>         at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228) [calcite-core-1.4.0-drill-r8.jar:1.4.0-drill-r8]
>         at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808) [calcite-core-1.4.0-drill-r8.jar:1.4.0-drill-r8]
>         at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) [calcite-core-1.4.0-drill-r8.jar:1.4.0-drill-r8]
>         at org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:303) [calcite-core-1.4.0-drill-r8.jar:1.4.0-drill-r8]
>         at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.logicalPlanningVolcanoAndLopt(DefaultSqlHandler.java:545) [drill-java-exec-1.3.0.jar:1.3.0]
>         at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:213) [drill-java-exec-1.3.0.jar:1.3.0]
>         at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:248) [drill-java-exec-1.3.0.jar:1.3.0]
>         at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:164) [drill-java-exec-1.3.0.jar:1.3.0]
>         at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:184) [drill-java-exec-1.3.0.jar:1.3.0]
>         at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:905) [drill-java-exec-1.3.0.jar:1.3.0]
>         at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:244) [drill-java-exec-1.3.0.jar:1.3.0]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
>         at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> {code}


