Posted to issues@drill.apache.org by "Pritesh Maker (JIRA)" <ji...@apache.org> on 2017/10/13 19:33:02 UTC
[jira] [Assigned] (DRILL-3118) Improve error messaging if table has name that conflicts with partition label
[ https://issues.apache.org/jira/browse/DRILL-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pritesh Maker reassigned DRILL-3118:
------------------------------------
Assignee: Padma Penumarthy (was: Pritesh Maker)
> Improve error messaging if table has name that conflicts with partition label
> -----------------------------------------------------------------------------
>
> Key: DRILL-3118
> URL: https://issues.apache.org/jira/browse/DRILL-3118
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow
> Affects Versions: 1.0.0
> Reporter: Hao Zhu
> Assignee: Padma Penumarthy
> Fix For: Future
>
>
> Tested on 1.0 with commit id:
> {code}
> select commit_id from sys.version;
> +-------------------------------------------+
> | commit_id                                |
> +-------------------------------------------+
> | d8b19759657698581cc0d01d7038797952888123 |
> +-------------------------------------------+
> 1 row selected (0.097 seconds)
> {code}
> When the source data has column names like "dir0", "dir1", ..., the query may fail with "java.lang.IndexOutOfBoundsException".
> For example:
> {code}
> > select `dir999` from dfs.root.`user/hive/warehouse/testdir999/3d49fc1fd0bc7e81-e6c5bb9affac8684_358897896_data.parquet`;
> Error: SYSTEM ERROR: java.lang.IndexOutOfBoundsException: index: 0, length: 4 (expected: range(0, 0))
> Fragment 0:0
> [Error Id: d289b3d7-1172-4ed7-b679-7af80d9aca7c on h1.poc.com:31010]
> (org.apache.drill.common.exceptions.DrillRuntimeException) Error in parquet record reader.
> Message:
> Hadoop path: /user/hive/warehouse/testdir999/3d49fc1fd0bc7e81-e6c5bb9affac8684_358897896_data.parquet
> Total records read: 0
> Mock records read: 0
> Records to read: 32768
> Row group index: 0
> Records in row group: 1
> Parquet Metadata: ParquetMetaData{FileMetaData{schema: message schema {
> optional int32 id;
> optional binary dir999;
> }
> , metadata: {}}, blocks: [BlockMetaData{1, 98 [ColumnMetaData{SNAPPY [id] INT32 [PLAIN, RLE, PLAIN_DICTIONARY], 23}, ColumnMetaData{SNAPPY [dir999] BINARY [PLAIN, RLE, PLAIN_DICTIONARY], 103}]}]}
> org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.handleAndRaise():339
> org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next():441
> org.apache.drill.exec.physical.impl.ScanBatch.next():175
> org.apache.drill.exec.physical.impl.BaseRootExec.next():83
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():80
> org.apache.drill.exec.physical.impl.BaseRootExec.next():73
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():259
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():253
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():422
> org.apache.hadoop.security.UserGroupInformation.doAs():1469
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():253
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1142
> java.util.concurrent.ThreadPoolExecutor$Worker.run():617
> java.lang.Thread.run():745
> Caused By (java.lang.IndexOutOfBoundsException) index: 0, length: 4 (expected: range(0, 0))
> io.netty.buffer.DrillBuf.checkIndexD():189
> io.netty.buffer.DrillBuf.chk():211
> io.netty.buffer.DrillBuf.getInt():491
> org.apache.drill.exec.vector.UInt4Vector$Accessor.get():321
> org.apache.drill.exec.vector.VarBinaryVector$Mutator.setSafe():481
> org.apache.drill.exec.vector.NullableVarBinaryVector$Mutator.fillEmpties():408
> org.apache.drill.exec.vector.NullableVarBinaryVector$Mutator.setValueCount():513
> org.apache.drill.exec.store.parquet.columnreaders.VarLenBinaryReader.readFields():78
> org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next():425
> org.apache.drill.exec.physical.impl.ScanBatch.next():175
> org.apache.drill.exec.physical.impl.BaseRootExec.next():83
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():80
> org.apache.drill.exec.physical.impl.BaseRootExec.next():73
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():259
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():253
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():422
> org.apache.hadoop.security.UserGroupInformation.doAs():1469
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():253
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1142
> java.util.concurrent.ThreadPoolExecutor$Worker.run():617
> java.lang.Thread.run():745 (state=,code=0)
> {code}
> My thoughts:
> We could fix this by:
> 1. Emitting a readable error message stating that "dirN" is a reserved column name, and suggesting that the user change drill.exec.storage.file.partition.column.label to something else;
> 2. And/or, if the source data contains dirN columns, letting them override the reserved "dirN" labels.
> 3. We should also document "drill.exec.storage.file.partition.column.label" at http://drill.apache.org/docs/querying-directories/
> 4. drill.exec.storage.file.partition.column.label is a system-level option; using it as a workaround would impact the whole system. Can we make it settable at the session level?
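> As a sketch of the workaround implied by item 1 (hedged: the option name is taken from this report, and on 1.0 it appears to be system-wide only, so changing it affects every query on the cluster):
> {code}
> -- Rename the partition column label so it no longer collides with the
> -- table's own "dirN" columns (system-wide on Drill 1.0).
> ALTER SYSTEM SET `drill.exec.storage.file.partition.column.label` = 'part';
> {code}
> Directory partitions would then surface as part0, part1, ... instead of dir0, dir1, ....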
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)