Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/07/15 00:33:00 UTC

[jira] [Work logged] (HIVE-23477) LLAP : mmap allocation interruptions fails to notify other threads

     [ https://issues.apache.org/jira/browse/HIVE-23477?focusedWorklogId=459035&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-459035 ]

ASF GitHub Bot logged work on HIVE-23477:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Jul/20 00:32
            Start Date: 15/Jul/20 00:32
    Worklog Time Spent: 10m 
      Work Description: github-actions[bot] commented on pull request #1020:
URL: https://github.com/apache/hive/pull/1020#issuecomment-658479205


   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the dev@hive.apache.org list if the patch is in need of reviews.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 459035)
    Time Spent: 20m  (was: 10m)

> LLAP : mmap allocation interruptions fails to notify other threads
> ------------------------------------------------------------------
>
>                 Key: HIVE-23477
>                 URL: https://issues.apache.org/jira/browse/HIVE-23477
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-23477.1.patch, HIVE-23477.2.patch, HIVE-23477.3.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> BuddyAllocator always uses lazy allocation if mmap is enabled. If a query fragment is interrupted during arena allocation, ClosedByInterruptException is thrown. The allocator surfaces this exception as an artificial OutOfMemoryError, and the other threads waiting to allocate arenas are never notified.
> {code:java}
> 2020-05-15 00:03:23.254  WARN [TezTR-128417_1_3_1_1_0] LlapIoImpl: Failed trying to allocate memory mapped arena
> java.nio.channels.ClosedByInterruptException
>         at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>         at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:970)
>         at org.apache.hadoop.hive.llap.cache.BuddyAllocator.preallocateArenaBuffer(BuddyAllocator.java:867)
>         at org.apache.hadoop.hive.llap.cache.BuddyAllocator.access$1100(BuddyAllocator.java:69)
>         at org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.init(BuddyAllocator.java:900)
>         at org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.allocateWithExpand(BuddyAllocator.java:1458)
>         at org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.access$800(BuddyAllocator.java:884)
>         at org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateWithExpand(BuddyAllocator.java:740)
>         at org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateMultiple(BuddyAllocator.java:330)
>         at org.apache.hadoop.hive.llap.io.metadata.MetadataCache.wrapBbForFile(MetadataCache.java:257)
>         at org.apache.hadoop.hive.llap.io.metadata.MetadataCache.putFileMetadata(MetadataCache.java:216)
>         at org.apache.hadoop.hive.llap.io.metadata.MetadataCache.putFileMetadata(MetadataCache.java:49)
>         at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.readSplitFooter(VectorizedParquetRecordReader.java:343)
>         at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.initialize(VectorizedParquetRecordReader.java:238)
>         at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.<init>(VectorizedParquetRecordReader.java:160)
>         at org.apache.hadoop.hive.ql.io.parquet.VectorizedParquetInputFormat.getRecordReader(VectorizedParquetInputFormat.java:50)
>         at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:87)
>         at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:427)
>         at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
>         at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.<init>(TezGroupedSplitsInputFormat.java:145)
>         at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:111)
>         at org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:156)
>         at org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:82)
>         at org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:703)
>         at org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:662)
>         at org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:150)
>         at org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:114)
>         at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:532)
>         at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:178)
>         at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266)
>         at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>         at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>         at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
>         at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>         at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62)
>         at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38)
>         at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>         at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> 2020-05-15 00:03:23.254 ERROR [TezTR-128417_1_3_1_1_0] vector.VectorizedParquetRecordReader: Failed to create the vectorized reader due to exception java.lang.OutOfMemoryError: Cannot allocate 1073741824 bytes: Failed trying to allocate memory mapped arena: null; make sure your xmx and process size are set correctly. {code}
>  
> {code:java}
> "TezTR-128417_1_3_1_18_0" #319 daemon prio=5 os_prio=0 tid=0x00007f5880004000 nid=0x3c8 runnable [0x00007f57a1846000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>         at java.lang.Throwable.fillInStackTrace(Native Method)
>         at java.lang.Throwable.fillInStackTrace(Throwable.java:784)
>         - locked <0x00007f5c93915e98> (a java.lang.InterruptedException)
>         at java.lang.Throwable.<init>(Throwable.java:251)
>         at java.lang.Exception.<init>(Exception.java:54)
>         at java.lang.InterruptedException.<init>(InterruptedException.java:57)
>         at java.lang.Object.wait(Native Method)
>         at org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.allocateWithExpand(BuddyAllocator.java:1443)
>         - locked <0x00007f598859f188> (a org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena)
>         at org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.access$800(BuddyAllocator.java:884)
>         at org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateWithExpand(BuddyAllocator.java:740)
>         at org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateMultiple(BuddyAllocator.java:330)
>         at org.apache.hadoop.hive.llap.io.metadata.MetadataCache.wrapBbForFile(MetadataCache.java:257)
>         at org.apache.hadoop.hive.llap.io.metadata.MetadataCache.putFileMetadata(MetadataCache.java:216)
>         at org.apache.hadoop.hive.llap.io.metadata.MetadataCache.putFileMetadata(MetadataCache.java:49)
>         at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.readSplitFooter(VectorizedParquetRecordReader.java:343) {code}
>  
> {code:java}
> "TezTR-128417_1_4_1_18_0" #588 daemon prio=5 os_prio=0 tid=0x00007f57d0004000 nid=0x43a3 in Object.wait() [0x00007f56f8681000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.allocateWithExpand(BuddyAllocator.java:1443)
>         - locked <0x00007f598859f188> (a org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena) {code}
>  
> TezTR-128417_1_3_1_18_0 was interrupted and threw an OOM error, but it failed to notify the other threads. As a result, the TezTR-128417_1_4_1_18_0 thread is stuck forever, waiting to allocate.
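
The failure mode described above can be illustrated with a minimal sketch. This is not the BuddyAllocator code and not necessarily the fix applied in the attached patches. The names LazyArena, Initializer, ensureInitialized and LazyArenaDemo are hypothetical and exist only for this example. The assumption is that waiters block on the arena monitor while another thread performs the lazy initialization. Under that assumption, the initializing thread has to reset the shared state and call notifyAll() in a finally block, so that a failure such as ClosedByInterruptException still releases the waiters.

```java
import java.nio.channels.ClosedByInterruptException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a lazily initialized arena (not the Hive class). */
final class LazyArena {
    private boolean initInProgress; // guarded by this
    private boolean initialized;    // guarded by this

    /** Hypothetical hook for the step that may fail, e.g. an mmap call. */
    interface Initializer {
        void run() throws Exception;
    }

    synchronized boolean isInitialized() {
        return initialized;
    }

    void ensureInitialized(Initializer init) throws Exception {
        synchronized (this) {
            while (initInProgress) {
                wait(100); // timed wait, re-checks the flag after each wake-up
            }
            if (initialized) {
                return;
            }
            initInProgress = true; // this thread now owns the initialization
        }
        boolean ok = false;
        try {
            init.run();
            ok = true;
        } finally {
            // Runs on success AND on failure: without this block a failed
            // initializer would leave initInProgress set and waiters stuck.
            synchronized (this) {
                initialized = ok;
                initInProgress = false;
                notifyAll();
            }
        }
    }
}

final class LazyArenaDemo {
    /** Returns true if a waiting thread makes progress after another thread's init fails. */
    static boolean demo() {
        LazyArena arena = new LazyArena();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        AtomicBoolean waiterDone = new AtomicBoolean(false);

        Thread failing = new Thread(() -> {
            try {
                arena.ensureInitialized(() -> {
                    started.countDown();
                    release.await();
                    throw new ClosedByInterruptException(); // simulated failure
                });
            } catch (Exception expected) {
                // the failing thread reports its own error; waiters must not hang
            }
        });
        Thread waiter = new Thread(() -> {
            try {
                arena.ensureInitialized(() -> { }); // retries the init successfully
                waiterDone.set(true);
            } catch (Exception e) {
                // leaves waiterDone false
            }
        });
        failing.setDaemon(true);
        waiter.setDaemon(true);
        try {
            failing.start();
            started.await();      // the failing thread is now inside its initializer
            waiter.start();
            release.countDown();  // let the simulated failure happen
            failing.join(5000);
            waiter.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return waiterDone.get() && arena.isInitialized();
    }

    public static void main(String[] args) {
        System.out.println("waiter released after failed init: " + demo());
    }
}
```

If the finally block is removed and the cleanup is kept only on the success path, the flag stays set after the simulated failure. The waiter then loops in wait(100) indefinitely, which matches the TIMED_WAITING state in the thread dumps above.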



--
This message was sent by Atlassian Jira
(v8.3.4#803005)