Posted to issues@drill.apache.org by "Abhishek Girish (JIRA)" <ji...@apache.org> on 2015/04/25 00:44:38 UTC

[jira] [Updated] (DRILL-1793) Drill statements fail with CuratorConnectionLossException / FragmentSetupException

     [ https://issues.apache.org/jira/browse/DRILL-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhishek Girish updated DRILL-1793:
-----------------------------------
    Labels: hard_to_verify  (was: )

> Drill statements fail with CuratorConnectionLossException / FragmentSetupException
> ----------------------------------------------------------------------------------
>
>                 Key: DRILL-1793
>                 URL: https://issues.apache.org/jira/browse/DRILL-1793
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Writer
>    Affects Versions: 0.7.0
>            Reporter: Abhishek Girish
>            Assignee: Steven Phillips
>              Labels: hard_to_verify
>             Fix For: 0.8.0
>
>         Attachments: drillbit.log, drillbit.out.log
>
>
> CTAS statements on about 10TB of data (TPCH) fail towards the end, after the majority of the data conversion is done.
> The following error is seen in Sqlline / Drillbit.out:
> 10:32:07.286 [CuratorFramework-0] ERROR org.apache.curator.ConnectionState - Connection timed out for connection string (10.10.104.101:5181,10.10.104.102:5181,10.10.104.103:5181) and timeout (5000) / elapsed (5001)
> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
> 	at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:198) [curator-client-2.5.0.jar:na]
> 	at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:88) [curator-client-2.5.0.jar:na]
> 	at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115) [curator-client-2.5.0.jar:na]
> 	at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:807) [curator-framework-2.5.0.jar:na]
> 	at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:793) [curator-framework-2.5.0.jar:na]
> 	at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$400(CuratorFrameworkImpl.java:57) [curator-framework-2.5.0.jar:na]
> 	at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:275) [curator-framework-2.5.0.jar:na]
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_71]
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_71]
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_71]
> 	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> The following error was seen in Drillbit.log a couple of minutes before this error was seen on sqlline/drillbit.out:
> 2014-12-01 13:34:06,447 [BitServer-8] ERROR o.a.drill.exec.rpc.data.DataServer - Failure while getting fragment manager. 4c80fb01-8d5e-4a6d-8f0f-5943d6ceb74f:7:564
> org.apache.drill.exec.exception.FragmentSetupException: Failed to receive plan fragment that was required for id: 4c80fb01-8d5e-4a6d-8f0f-5943d6ceb74f:7:564
> 	at org.apache.drill.exec.rpc.control.WorkEventBus.getFragmentManager(WorkEventBus.java:107) ~[drill-java-exec-0.6.0.r2-incubating-SNAPSHOT-rebuffed.jar:0.6.0.r2-incubating-SNAPSHOT]
> 	at org.apache.drill.exec.rpc.data.DataServer.handle(DataServer.java:111) [drill-java-exec-0.6.0.r2-incubating-SNAPSHOT-rebuffed.jar:0.6.0.r2-incubating-SNAPSHOT]
> 	at org.apache.drill.exec.rpc.data.DataServer.handle(DataServer.java:48) [drill-java-exec-0.6.0.r2-incubating-SNAPSHOT-rebuffed.jar:0.6.0.r2-incubating-SNAPSHOT]
> 	at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:194) [drill-java-exec-0.6.0.r2-incubating-SNAPSHOT-rebuffed.jar:0.6.0.r2-incubating-SNAPSHOT]
> 	at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:173) [drill-java-exec-0.6.0.r2-incubating-SNAPSHOT-rebuffed.jar:0.6.0.r2-incubating-SNAPSHOT]
> 	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89) [netty-codec-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-transport-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) [netty-transport-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) [netty-codec-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-transport-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) [netty-transport-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:161) [netty-codec-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-transport-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) [netty-transport-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-transport-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319) [netty-transport-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787) [netty-transport-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130) [netty-transport-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) [netty-transport-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) [netty-transport-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) [netty-transport-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) [netty-transport-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) [netty-common-4.0.24.Final.jar:4.0.24.Final]
> 	at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> 2014-12-01 13:34:06,519 [BitServer-6] ERROR o.a.d.exec.rpc.control.WorkEventBus - A fragment message arrived but there was no registered listener for that message for handle query_id {
>   part1: 5512681927986793069
>   part2: -8138187853733644465
> }
> major_fragment_id: 2
> minor_fragment_id: 424
> .
> Log files have been attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)