Posted to issues@ozone.apache.org by "Bharat Viswanadham (Jira)" <ji...@apache.org> on 2020/02/10 23:45:00 UTC

[jira] [Resolved] (HDDS-2936) Hive queries fail at readFully

     [ https://issues.apache.org/jira/browse/HDDS-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bharat Viswanadham resolved HDDS-2936.
--------------------------------------
    Fix Version/s: 0.5.0
       Resolution: Fixed

> Hive queries fail at readFully
> ------------------------------
>
>                 Key: HDDS-2936
>                 URL: https://issues.apache.org/jira/browse/HDDS-2936
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>    Affects Versions: 0.5.0
>            Reporter: Istvan Fajth
>            Assignee: Shashikant Banerjee
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 0.5.0
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> When running Hive queries on a 1TB dataset for TPC-DS tests, we started to see an exception coming out from FSInputStream.readFully.
> This does not happen with a smaller 100GB dataset, so files spanning multiple blocks are possibly the cause of the trouble. The issue was also not seen with a build from early December, so a recent change is most likely to blame. The build I am running is from hash 929f2f85d0379aab5aabeded8a4d3a5056777706 of the master branch, but with HDDS-2188 reverted.
> The exception I see:
> {code}
> Error while running task ( failure ) : attempt_1579615091731_0060_9_05_000029_3:java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: java.io.EOFException: End of file reached before reading fully.
>         at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
>         at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>         at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>         at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>         at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>         at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>         at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>         at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>         at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
>         at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
>         at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.RuntimeException: java.io.IOException: java.io.EOFException: End of file reached before reading fully.
>         at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:206)
>         at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.<init>(TezGroupedSplitsInputFormat.java:145)
>         at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:111)
>         at org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:157)
>         at org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:83)
>         at org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:703)
>         at org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:662)
>         at org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:150)
>         at org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:114)
>         at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:532)
>         at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:178)
>         at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266)
>         ... 16 more
> Caused by: java.io.IOException: java.io.EOFException: End of file reached before reading fully.
>         at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>         at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>         at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:422)
>         at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
>         ... 27 more
> Caused by: java.io.EOFException: End of file reached before reading fully.
>         at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:126)
>         at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
>         at org.apache.orc.impl.RecordReaderUtils$DefaultDataReader.readStripeFooter(RecordReaderUtils.java:269)
>         at org.apache.orc.impl.RecordReaderImpl.readStripeFooter(RecordReaderImpl.java:308)
>         at org.apache.orc.impl.RecordReaderImpl.beginReadStripe(RecordReaderImpl.java:1089)
>         at org.apache.orc.impl.RecordReaderImpl.readStripe(RecordReaderImpl.java:1051)
>         at org.apache.orc.impl.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:1219)
>         at org.apache.orc.impl.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1254)
>         at org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:284)
>         at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:67)
>         at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:83)
>         at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader.<init>(VectorizedOrcAcidRowBatchReader.java:145)
>         at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader.<init>(VectorizedOrcAcidRowBatchReader.java:135)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:2046)
>         at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:419)
>         ... 28 more
> {code}
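
For readers unfamiliar with the readFully contract behind the innermost exception in the trace above, here is a minimal sketch using only plain java.io. It is not the Hadoop FSInputStream code and does not reproduce the Ozone bug; the class name ReadFullySketch and its helper are hypothetical, introduced only for illustration. It shows the general idea: keep reading until the requested number of bytes has arrived, and raise EOFException if the underlying stream reports end of stream first.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration class; not part of Hadoop, Hive, ORC or Ozone.
class ReadFullySketch {

    // Reads exactly `length` bytes into `buffer` starting at `offset`.
    // A single read() may return fewer bytes than requested, so loop
    // until everything has arrived or the stream signals end of data.
    static void readFully(InputStream in, byte[] buffer, int offset, int length)
            throws IOException {
        int done = 0;
        while (done < length) {
            int n = in.read(buffer, offset + done, length - done);
            if (n < 0) {
                // The stream ended before the caller's request was satisfied.
                throw new EOFException("End of file reached before reading fully.");
            }
            done += n;
        }
    }

    public static void main(String[] args) throws IOException {
        // The source holds only 10 bytes, but the caller asks for 16.
        InputStream in = new ByteArrayInputStream(new byte[10]);
        try {
            readFully(in, new byte[16], 0, 16);
        } catch (EOFException e) {
            System.out.println("EOF: " + e.getMessage());
        }
    }
}
```

Under this contract, any stream that reports end of stream earlier than the file length known to the caller (here, the ORC reader requesting a stripe footer) would surface as this exception. Whether that is what happened in this issue is not stated in the report.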



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
