Posted to user@hive.apache.org by Sumit Khanna <su...@askme.in> on 2016/08/06 05:03:18 UTC

parquet decoding exceptions - hue sample data view works fine though

Hey,

I have a Parquet directory with a table mounted on it. The table shows its
sample view fine via Hue, but a simple query like select * from tablename
gives this error:


   - Bad status for request TFetchResultsReq(fetchType=0,
   operationHandle=TOperationHandle(hasResultSet=True, modifiedRowCount=None,
   operationType=0,
   operationId=THandleIdentifier(secret='\xf7\xe7\x90\x0e\x85\x91E{\x99\xd1\xdf>v\xf7\x8c`',
   guid='\xcc\xd6$^\xac{M\xaf\x9c{\xc2\xcf\xf3\xc6\xe7/')), orientation=4,
   maxRows=100): TFetchResultsResp(status=TStatus(errorCode=0,
   errorMessage='java.io.IOException: parquet.io.ParquetDecodingException: Can
   not read value at 0 in block -1 in file
   hdfs://askmehadoop/parquet1_mpdm_mpdm_store/partitioned_on_seller_mailer_flag=1/part-r-00000-a77c308f-c088-4f41-ab07-0c8e0557dbe1.gz.parquet',
   sqlState=None,
   infoMessages=['*org.apache.hive.service.cli.HiveSQLException:java.io.IOException:
   parquet.io.ParquetDecodingException: Can not read value at 0 in block -1 in
   file
   hdfs://askmehadoop/parquet1_mpdm_mpdm_store/partitioned_on_seller_mailer_flag=1/part-r-00000-a77c308f-c088-4f41-ab07-0c8e0557dbe1.gz.parquet:25:24',
   'org.apache.hive.service.cli.operation.SQLOperation:getNextRowSet:SQLOperation.java:352',
   'org.apache.hive.service.cli.operation.OperationManager:getOperationNextRowSet:OperationManager.java:220',
   'org.apache.hive.service.cli.session.HiveSessionImpl:fetchResults:HiveSessionImpl.java:685',
   'sun.reflect.GeneratedMethodAccessor63:invoke::-1',
   'sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43',
   'java.lang.reflect.Method:invoke:Method.java:498',
   'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78',
   'org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36',
   'org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63',
   'java.security.AccessController:doPrivileged:AccessController.java:-2',
   'javax.security.auth.Subject:doAs:Subject.java:422',
   'org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1657',
   'org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59',
   'com.sun.proxy.$Proxy22:fetchResults::-1',
   'org.apache.hive.service.cli.CLIService:fetchResults:CLIService.java:454',
   'org.apache.hive.service.cli.thrift.ThriftCLIService:FetchResults:ThriftCLIService.java:672',
   'org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1553',
   'org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1538',
   'org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39',
   'org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39',
   'org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56',
   'org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:285',
   'java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1142',
   'java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:617',
   'java.lang.Thread:run:Thread.java:745',
   '*java.io.IOException:parquet.io.ParquetDecodingException: Can not read
   value at 0 in block -1 in file
   hdfs://askmehadoop/parquet1_mpdm_mpdm_store/partitioned_on_seller_mailer_flag=1/part-r-00000-a77c308f-c088-4f41-ab07-0c8e0557dbe1.gz.parquet:29:4',
   'org.apache.hadoop.hive.ql.exec.FetchOperator:getNextRow:FetchOperator.java:507',
   'org.apache.hadoop.hive.ql.exec.FetchOperator:pushRow:FetchOperator.java:414',
   'org.apache.hadoop.hive.ql.exec.FetchTask:fetch:FetchTask.java:140',
   'org.apache.hadoop.hive.ql.Driver:getResults:Driver.java:1670',
   'org.apache.hive.service.cli.operation.SQLOperation:getNextRowSet:SQLOperation.java:347',
   '*parquet.io.ParquetDecodingException:Can not read value at 0 in block -1
   in file
   hdfs://askmehadoop/parquet1_mpdm_mpdm_store/partitioned_on_seller_mailer_flag=1/part-r-00000-a77c308f-c088-4f41-ab07-0c8e0557dbe1.gz.parquet:36:7',
   'parquet.hadoop.InternalParquetRecordReader:nextKeyValue:InternalParquetRecordReader.java:228',
   'parquet.hadoop.ParquetRecordReader:nextKeyValue:ParquetRecordReader.java:201',
   'org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper:<init>:ParquetRecordReaderWrapper.java:122',
   'org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper:<init>:ParquetRecordReaderWrapper.java:85',
   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat:getRecordReader:MapredParquetInputFormat.java:72',
   'org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit:getRecordReader:FetchOperator.java:673',
   'org.apache.hadoop.hive.ql.exec.FetchOperator:getRecordReader:FetchOperator.java:323',
   'org.apache.hadoop.hive.ql.exec.FetchOperator:getNextRow:FetchOperator.java:445',
   '*java.lang.UnsupportedOperationException:parquet.column.values.dictionary.PlainValuesDictionary$PlainLongDictionary:47:11',
   'parquet.column.Dictionary:decodeToBinary:Dictionary.java:44',
   'org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter$BinaryConverter:setDictionary:ETypeConverter.java:227',
   'parquet.column.impl.ColumnReaderImpl:<init>:ColumnReaderImpl.java:339',
   'parquet.column.impl.ColumnReadStoreImpl:newMemColumnReader:ColumnReadStoreImpl.java:66',
   'parquet.column.impl.ColumnReadStoreImpl:getColumnReader:ColumnReadStoreImpl.java:61',
   'parquet.io.RecordReaderImplementation:<init>:RecordReaderImplementation.java:270',
   'parquet.io.MessageColumnIO$1:visit:MessageColumnIO.java:134',
   'parquet.io.MessageColumnIO$1:visit:MessageColumnIO.java:99',
   'parquet.filter2.compat.FilterCompat$NoOpFilter:accept:FilterCompat.java:154',
   'parquet.io.MessageColumnIO:getRecordReader:MessageColumnIO.java:99',
   'parquet.hadoop.InternalParquetRecordReader:checkRead:InternalParquetRecordReader.java:137',
   'parquet.hadoop.InternalParquetRecordReader:nextKeyValue:InternalParquetRecordReader.java:208'],
   statusCode=3), results=None, hasMoreRows=None)

Is this something to do with Hive, or is it a Parquet error as such? I have
posted it in both groups, but I suspect it isn't Parquet, because the data
displays fine when viewed in Hue.

Has anyone experienced similar errors before? Kindly let me know.
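For context on what the innermost frames mean: in parquet-mr, the base
parquet.column.Dictionary class throws UnsupportedOperationException from any
decode method that a concrete dictionary does not override, and a
PlainLongDictionary only implements long decoding. So when Hive's
BinaryConverter calls decodeToBinary (which happens when the table column is
declared as a string/binary type), a file whose column is physically INT64
fails exactly like the trace above. A rough, purely illustrative Python
analogue of that dispatch (class and method names mirror the Java ones but are
not the real API):

```python
class UnsupportedOperationError(Exception):
    """Stand-in for java.lang.UnsupportedOperationException."""
    pass

class Dictionary:
    # Base class: every decode method fails unless a subclass overrides it,
    # mirroring parquet-mr's parquet.column.Dictionary behaviour.
    def decode_to_long(self, dict_id):
        raise UnsupportedOperationError(type(self).__name__)

    def decode_to_binary(self, dict_id):
        raise UnsupportedOperationError(type(self).__name__)

class PlainLongDictionary(Dictionary):
    # Holds INT64 values only, so it overrides just decode_to_long.
    def __init__(self, values):
        self._values = values

    def decode_to_long(self, dict_id):
        return self._values[dict_id]

d = PlainLongDictionary([42, 7])
print(d.decode_to_long(0))  # fine: the file really does hold longs

try:
    # What Hive's BinaryConverter effectively asks for when the table
    # column is declared STRING but the file stores INT64.
    d.decode_to_binary(0)
except UnsupportedOperationError as e:
    print("UnsupportedOperationError:", e)
```

If that is the cause here, the usual next step would be to compare the Hive
column types (DESCRIBE FORMATTED tablename) against the file's physical schema
(e.g. parquet-tools schema on the file in the trace) and align the DDL with
the file, rather than the other way around.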

Awaiting Your Reply,

Thanks
Sumit

Re: parquet decoding exceptions - hue sample data view works fine though

Posted by Sumit Khanna <su...@askme.in>.
Well, anyway: even from Hue, if I try loading the data partition-wise, it
throws the same error. I am really perplexed as to what this bug is.
If I view the data in general, Hue displays column names / values /
analysis etc., but not partition-wise.

Thanks,
Sumit

On Sat, Aug 6, 2016 at 10:33 AM, Sumit Khanna <su...@askme.in> wrote: