Posted to dev@parquet.apache.org by "Reuben Kuhnert (JIRA)" <ji...@apache.org> on 2015/12/14 22:07:46 UTC
[jira] [Created] (PARQUET-406) Counter Initialization causes NPE
Reuben Kuhnert created PARQUET-406:
--------------------------------------
Summary: Counter Initialization causes NPE
Key: PARQUET-406
URL: https://issues.apache.org/jira/browse/PARQUET-406
Project: Parquet
Issue Type: Bug
Reporter: Reuben Kuhnert
{code}
CREATE EXTERNAL TABLE api_hit_parquet_test
ROW FORMAT SERDE 'com.foursquare.hadoop.hive.serde.RecordV2SerDe'
WITH SERDEPROPERTIES ('serialization.class' = 'com.foursquare.logs.gen.ApiHit')
STORED AS
  INPUTFORMAT 'com.foursquare.hadoop.hive.io.HiveThriftParquetInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/user/bly/api_hit_parquet'
TBLPROPERTIES ('thrift.parquetfile.input.format.thrift.class' = 'com.foursquare.logs.gen.ApiHit')
{code}
The table is successfully created, and I can verify the schema is correct by running DESCRIBE FORMATTED on it. However, when I try to do a simple SELECT * on the table, I get the following stack trace:
{code}
java.io.IOException: java.lang.RuntimeException: Could not read first record (and it was not an EOF)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:138)
    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1657)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:227)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:159)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:756)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:615)
Caused by: java.lang.RuntimeException: Could not read first record (and it was not an EOF)
    at com.twitter.elephantbird.mapred.input.DeprecatedInputFormatWrapper$RecordReaderWrapper.initKeyValueObjects(DeprecatedInputFormatWrapper.java:280)
    at com.twitter.elephantbird.mapred.input.DeprecatedInputFormatWrapper$RecordReaderWrapper.createValue(DeprecatedInputFormatWrapper.java:297)
    at com.foursquare.hadoop.hive.io.HiveThriftParquetInputFormat$$anon$1.<init>(HiveThriftParquetInputFormat.scala:47)
    at com.foursquare.hadoop.hive.io.HiveThriftParquetInputFormat.getRecordReader(HiveThriftParquetInputFormat.scala:46)
    at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:667)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:323)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:445)
    ... 9 more
Caused by: org.apache.parquet.io.ParquetDecodingException: Can not read value at 0 in block -1 in file hdfs://hadoop-alidoro-nn-vip/user/bly/api_hit_parquet/part-m-00000.parquet
    at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:243)
    at org.apache.parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:227)
    at com.twitter.elephantbird.mapred.input.DeprecatedInputFormatWrapper$RecordReaderWrapper.initKeyValueObjects(DeprecatedInputFormatWrapper.java:271)
    ... 15 more
Caused by: java.lang.NullPointerException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.parquet.hadoop.util.ContextUtil.invoke(ContextUtil.java:264)
    at org.apache.parquet.hadoop.util.ContextUtil.incrementCounter(ContextUtil.java:273)
    at org.apache.parquet.hadoop.util.counters.mapreduce.MapReduceCounterAdapter.increment(MapReduceCounterAdapter.java:38)
    at org.apache.parquet.hadoop.util.counters.BenchmarkCounter.incrementTotalBytes(BenchmarkCounter.java:78)
    at org.apache.parquet.hadoop.ParquetFileReader.readNextRowGroup(ParquetFileReader.java:497)
    at org.apache.parquet.hadoop.InternalParquetRecordReader.checkRead(InternalParquetRecordReader.java:130)
    at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:214)
    ... 17 more
{code}
I have spent some time following this stack trace, and the error appears to lie in the counter code, which is odd because I don't use counters at all. Is there some way I need to initialize counters?
To be specific, I have found that MapReduceCounterAdapter is being created with a null parameter. Here is the constructor:
{code}
public MapReduceCounterAdapter(Counter adaptee) {
    this.adaptee = adaptee;
}
{code}
So adaptee is being passed in as null, and is then dereferenced later, causing the NullPointerException.
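To illustrate the failure mode, here is a minimal self-contained sketch (these are stand-in classes, not the actual Parquet or Hadoop sources): the constructor silently accepts null, so the NPE only surfaces later, at the first increment, far from where the bad value was introduced.

```java
// Stand-in for Hadoop's Counter interface, just enough for the sketch.
interface Counter {
    void increment(long delta);
}

// Hypothetical mirror of MapReduceCounterAdapter's structure: the
// constructor stores the adaptee without any null check.
class CounterAdapter {
    private final Counter adaptee;

    CounterAdapter(Counter adaptee) {
        this.adaptee = adaptee;   // a null slips through silently here
    }

    void increment(long delta) {
        adaptee.increment(delta); // NPE surfaces here if adaptee == null
    }
}
```

This is why the stack trace points at MapReduceCounterAdapter.increment rather than at the code that produced the null in the first place.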
The adaptee value comes from this method:
{code}
public static Counter getCounter(TaskInputOutputContext context,
                                 String groupName, String counterName) {
    return (Counter) invoke(GET_COUNTER_METHOD, context, groupName, counterName);
}
{code}
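If that reflective lookup can legitimately return null (for example, when there is no live task context, as in a plain Hive fetch), one defensive option would be to substitute a no-op counter instead of handing null to the adapter. A minimal sketch, using hypothetical names rather than the real Parquet API:

```java
// Stand-in for Hadoop's Counter interface.
interface Counter {
    void increment(long delta);
}

// A counter that swallows increments, for contexts with no real counters.
final class NoOpCounter implements Counter {
    public void increment(long delta) {
        // intentionally does nothing
    }
}

class Counters {
    // Stand-in for the reflective lookup, which may yield null.
    static Counter lookup(String group, String name) {
        return null; // simulate the failing context.getCounter(...) path
    }

    // Never returns null: falls back to a no-op counter.
    static Counter getCounterOrNoOp(String group, String name) {
        Counter c = lookup(group, name);
        return (c != null) ? c : new NoOpCounter();
    }
}
```

With a fallback like this, incrementTotalBytes would become harmless in contexts without counters instead of throwing an NPE; whether that is the right fix for Parquet itself is for the maintainers to judge.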
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)