Posted to dev@drill.apache.org by "Khurram Faraaz (JIRA)" <ji...@apache.org> on 2015/05/28 23:03:20 UTC

[jira] [Resolved] (DRILL-3207) Memory leak in window functions

     [ https://issues.apache.org/jira/browse/DRILL-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Khurram Faraaz resolved DRILL-3207.
-----------------------------------
    Resolution: Duplicate

Duplicate of DRILL-3206.

> Memory leak in window functions
> -------------------------------
>
>                 Key: DRILL-3207
>                 URL: https://issues.apache.org/jira/browse/DRILL-3207
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.0.0
>         Environment: 21cc578b6b8c8f3ca1ebffd3dbb92e35d68bc726
>            Reporter: Khurram Faraaz
>            Assignee: Chris Westin
>
> Test was run on a 4-node cluster on CentOS.
> Size in bytes of the JSON data file:
> {code}
> [root@centos-01 ~]# hadoop fs -ls /tmp/twoKeyJsn.json
> -rwxr-xr-x   3 root root  888409136 2015-04-20 18:32 /tmp/twoKeyJsn.json
> {code}
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(key1) over(partition by key2 order by key1) from `twoKeyJsn.json`;
> java.lang.RuntimeException: java.sql.SQLException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
> Fragment 1:7
> [Error Id: 8ffc94b9-1318-4841-9247-259155e97202 on centos-02.qa.lab:31010]
> 	at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
> 	at sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:85)
> 	at sqlline.TableOutputFormat.print(TableOutputFormat.java:116)
> 	at sqlline.SqlLine.print(SqlLine.java:1583)
> 	at sqlline.Commands.execute(Commands.java:852)
> 	at sqlline.Commands.sql(Commands.java:751)
> 	at sqlline.SqlLine.dispatch(SqlLine.java:738)
> 	at sqlline.SqlLine.begin(SqlLine.java:612)
> 	at sqlline.SqlLine.start(SqlLine.java:366)
> 	at sqlline.SqlLine.main(SqlLine.java:259)
> {code}
> Memory usage after the above query was executed:
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select * from sys.memory;
> +-------------------+------------+---------------+-------------+-----------------+---------------------+-------------+
> |     hostname      | user_port  | heap_current  |  heap_max   | direct_current  | jvm_direct_current  | direct_max  |
> +-------------------+------------+---------------+-------------+-----------------+---------------------+-------------+
> | centos-01.qa.lab  | 31010      | 1304067160    | 4294967296  | 110019091       | 520095827           | 8589934592  |
> | centos-03.qa.lab  | 31010      | 2020130800    | 4294967296  | 301360965       | 738199649           | 8589934592  |
> | centos-02.qa.lab  | 31010      | 1253034864    | 4294967296  | 156397935       | 553649232           | 8589934592  |
> | centos-04.qa.lab  | 31010      | 385872528     | 4294967296  | 203721765       | 553649246           | 8589934592  |
> +-------------------+------------+---------------+-------------+-----------------+---------------------+-------------+
> 4 rows selected (0.134 seconds)
> {code}
> Memory details after rerunning the query; we are leaking memory:
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(key1) over(partition by key2 order by key1) from `twoKeyJsn.json`;
> java.lang.RuntimeException: java.sql.SQLException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
> Fragment 1:7
> [Error Id: fe56b1ff-02b6-4ded-a317-d753ab211f5b on centos-03.qa.lab:31010]
> 	at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
> 	at sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:85)
> 	at sqlline.TableOutputFormat.print(TableOutputFormat.java:116)
> 	at sqlline.SqlLine.print(SqlLine.java:1583)
> 	at sqlline.Commands.execute(Commands.java:852)
> 	at sqlline.Commands.sql(Commands.java:751)
> 	at sqlline.SqlLine.dispatch(SqlLine.java:738)
> 	at sqlline.SqlLine.begin(SqlLine.java:612)
> 	at sqlline.SqlLine.start(SqlLine.java:366)
> 	at sqlline.SqlLine.main(SqlLine.java:259)
> 0: jdbc:drill:schema=dfs.tmp> select * from sys.memory;
> +-------------------+------------+---------------+-------------+-----------------+---------------------+-------------+
> |     hostname      | user_port  | heap_current  |  heap_max   | direct_current  | jvm_direct_current  | direct_max  |
> +-------------------+------------+---------------+-------------+-----------------+---------------------+-------------+
> | centos-01.qa.lab  | 31010      | 2414546008    | 4294967296  | 438149911       | 905971795           | 8589934592  |
> | centos-02.qa.lab  | 31010      | 1953483632    | 4294967296  | 901110416       | 1442841680          | 8589934592  |
> | centos-03.qa.lab  | 31010      | 297329544     | 4294967296  | 560852951       | 1308624993          | 8589934592  |
> | centos-04.qa.lab  | 31010      | 458157528     | 4294967296  | 740156752       | 1207960670          | 8589934592  |
> +-------------------+------------+---------------+-------------+-----------------+---------------------+-------------+
> 4 rows selected (0.118 seconds)
> {code}
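> Tabulating the direct_current column from the two sys.memory snapshots above shows how much direct memory each node retained between the runs. This is a minimal sketch, not part of the original report; all figures are copied verbatim from the query output:

```python
# Direct memory (direct_current, in bytes) per drillbit, copied from the
# two sys.memory snapshots in this report: first after the initial failed
# query, then after the rerun.
before = {"centos-01": 110019091, "centos-02": 156397935,
          "centos-03": 301360965, "centos-04": 203721765}
after = {"centos-01": 438149911, "centos-02": 901110416,
         "centos-03": 560852951, "centos-04": 740156752}

# Growth per node; every node retains more direct memory after the rerun,
# which is consistent with a leak rather than transient allocation.
growth = {host: after[host] - before[host] for host in before}
for host, delta in sorted(growth.items()):
    print(f"{host}: +{delta / 1024**2:.1f} MiB direct memory retained")
```

> centos-02 alone retains roughly 710 MiB more direct memory after the second failed query.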
> There are 16 distinct partitions (PARTITION BY key2):
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select distinct key2 from `twoKeyJsn.json`;
> +-------+
> | key2  |
> +-------+
> | d     |
> | c     |
> | b     |
> | 1     |
> | a     |
> | 0     |
> | k     |
> | m     |
> | j     |
> | h     |
> | e     |
> | n     |
> | g     |
> | f     |
> | l     |
> | i     |
> +-------+
> 16 rows selected (28.967 seconds)
> {code}
> Details from drillbit.log:
> {code}
> error_type: SYSTEM
>     message: "SYSTEM ERROR: java.lang.IllegalStateException: Failure while closing accountor.  Expected private and shared pools to be set to initial values.  However, one or more were not.  Stats are\n\tzone\tinit\tallocated\tdelta \n\tprivate\t1000000\t0\t1000000 \n\tshared\t9999000000\t9928320966\t70679034.\n\nFragment 1:8\n\n[Error Id: b7b41c03-1122-4fa4-b441-9aa10544a91e on centos-02.qa.lab:31010]"
>     exception {
>       exception_class: "java.lang.IllegalStateException"
>       message: "Failure while closing accountor.  Expected private and shared pools to be set to initial values.  However, one or more were not.  Stats are\n\tzone\tinit\tallocated\tdelta \n\tprivate\t1000000\t0\t1000000 \n\tshared\t9999000000\t9928320966\t70679034."
>       stack_trace {
>         class_name: "org.apache.drill.exec.memory.AtomicRemainder"
>         file_name: "AtomicRemainder.java"
>         line_number: 200
>         method_name: "close"
>         is_native_method: false
>       }
> {code}
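> For reference, the "delta" column in the accountor message is the gap between the pool's initial size and what it reports as allocated, so the shared pool shows about 70 MB of direct memory still outstanding when fragment 1:8 closed. A quick arithmetic check (my own, using the figures from the log above):

```python
# Shared-pool stats (bytes) from the accountor failure message above.
shared_init = 9999000000
shared_allocated = 9928320966

# delta = init - allocated: the amount still outstanding at close.
shared_delta = shared_init - shared_allocated
print(shared_delta)  # 70679034 bytes (~67 MiB), matching the reported delta
```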
> Stack trace:
> {code}
> org.apache.drill.common.exceptions.UserRemoteException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
> Fragment 1:7
> [Error Id: fe56b1ff-02b6-4ded-a317-d753ab211f5b on centos-03.qa.lab:31010]
>         at org.apache.drill.exec.work.foreman.QueryManager$1.statusUpdate(QueryManager.java:458) [drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
>         at org.apache.drill.exec.rpc.control.WorkEventBus.statusUpdate(WorkEventBus.java:71) [drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
>         at org.apache.drill.exec.work.batch.ControlMessageHandler.handle(ControlMessageHandler.java:79) [drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
>         at org.apache.drill.exec.rpc.control.ControlServer.handle(ControlServer.java:61) [drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
>         at org.apache.drill.exec.rpc.control.ControlServer.handle(ControlServer.java:38) [drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
>         at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:61) [drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
>         at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:233) [drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
>         at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:205) [drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:1.0.0-mapr-r1]
>         at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89) [netty-codec-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.handler.timeout.ReadTimeoutHandler.channelRead(ReadTimeoutHandler.java:150) [netty-handler-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) [netty-codec-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242) [netty-codec-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:618) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:329) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [netty-common-4.0.27.Final.jar:4.0.27.Final]
>         at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)