You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user-zh@flink.apache.org by Kurt Young <yk...@gmail.com> on 2019/05/28 06:10:07 UTC
Re: Blink在Hive表没有统计信息的情况下如何优化
没有统计信息确实很难生成比较靠谱的执行计划,这也是之前很多DBA的工作 ;-)
你可以试试以下以下操作:
1. 如果是join顺序不合理,可以手动调整sql中的join顺序,并且关闭join reorder
2.
看看fail的具体原因,如果是个别比较激进的算子表现不好,比如HashAggregate、HashJoin,你可以手动禁止掉这些算子,选择性能稍差但可能执行起来更稳健的算子,比如SortMergeJoin
这是我拍脑袋想的,具体的建议你先分析一下SQL为什么会fail,然后贴出具体的问题来。
另外,我们正在开发SQL hint功能,可以有效缓解类似问题。
Best,
Kurt
On Tue, May 28, 2019 at 12:32 PM bigdatayunzhongyan <
bigdatayunzhongyan@aliyun.com> wrote:
> @Kurt Young @Jark Wu @Bowen Li
> 先描述下我这边的情况,一些简单的SQL无论是否有统计的信息的情况下都能执行成功,
> 但是一些复杂的SQL在没有统计信息的情况下一直执行失败,尝试开启各种参数,都失败了,很痛苦,
> 不过在设置统计信息及开启相关参数后很轻松就能执行成功(点赞)。
> 我们线上的情况是很多表信息都没有统计信息,请问有哪些优化的办法。
>
> 启动命令:./bin/yarn-session.sh -n 50 -s 20 -jm 3072 -tm 6144 -d
> 配置见附件
> 谢谢!
>
>
Re: Blink在Hive表没有统计信息的情况下如何优化
Posted by Kurt Young <yk...@gmail.com>.
你先试试把HashJoin这个算子禁用看看,TableConfig里添加这个配置
sql.exec.disabled-operators: HashJoin
Best,
Kurt
On Tue, May 28, 2019 at 3:23 PM bigdatayunzhongyan <
bigdatayunzhongyan@aliyun.com> wrote:
> 感谢 @Kurt Young 大神的回复,报错信息在附件。谢谢!
>
>
> 在2019年05月28日 14:10,Kurt Young<yk...@gmail.com> <yk...@gmail.com> 写道:
>
> 没有统计信息确实很难生成比较靠谱的执行计划,这也是之前很多DBA的工作 ;-)
>
> 你可以试试以下以下操作:
> 1. 如果是join顺序不合理,可以手动调整sql中的join顺序,并且关闭join reorder
> 2.
> 看看fail的具体原因,如果是个别比较激进的算子表现不好,比如HashAggregate、HashJoin,你可以手动禁止掉这些算子,选择性能稍差但可能执行起来更稳健的算子,比如SortMergeJoin
>
> 这是我拍脑袋想的,具体的建议你先分析一下SQL为什么会fail,然后贴出具体的问题来。
>
> 另外,我们正在开发SQL hint功能,可以有效缓解类似问题。
>
> Best,
> Kurt
>
>
> On Tue, May 28, 2019 at 12:32 PM bigdatayunzhongyan <
> bigdatayunzhongyan@aliyun.com> wrote:
>
>> @Kurt Young @Jark Wu @Bowen Li
>> 先描述下我这边的情况,一些简单的SQL无论是否有统计的信息的情况下都能执行成功,
>> 但是一些复杂的SQL在没有统计信息的情况下一直执行失败,尝试开启各种参数,都失败了,很痛苦,
>> 不过在设置统计信息及开启相关参数后很轻松就能执行成功(点赞)。
>> 我们线上的情况是很多表信息都没有统计信息,请问有哪些优化的办法。
>>
>> 启动命令:./bin/yarn-session.sh -n 50 -s 20 -jm 3072 -tm 6144 -d
>> 配置见附件
>> 谢谢!
>>
>>
回复: Blink在Hive表没有统计信息的情况下如何优化
Posted by bigdatayunzhongyan <bi...@aliyun.com.INVALID>.
org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: Fatal error at remote task manager '/xxxxxx:14941'.
at org.apache.flink.runtime.io.network.netty.CreditBasedPartitionRequestClientHandler.decodeMsg(CreditBasedPartitionRequestClientHandler.java:276)
at org.apache.flink.runtime.io.network.netty.CreditBasedPartitionRequestClientHandler.channelRead(CreditBasedPartitionRequestClientHandler.java:182)
at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at org.apache.flink.runtime.io.network.netty.ZeroCopyNettyMessageDecoder.channelRead(ZeroCopyNettyMessageDecoder.java:128)
at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: java.nio.channels.ClosedChannelException
at org.apache.flink.runtime.io.network.netty.PartitionRequestQueue.writeAndFlushNextMessageIfPossible(PartitionRequestQueue.java:297)
at org.apache.flink.runtime.io.network.netty.PartitionRequestQueue.enqueueAvailableReader(PartitionRequestQueue.java:119)
at org.apache.flink.runtime.io.network.netty.PartitionRequestQueue.addCredit(PartitionRequestQueue.java:181)
at org.apache.flink.runtime.io.network.netty.PartitionRequestServerHandler.channelRead0(PartitionRequestServerHandler.java:140)
at org.apache.flink.runtime.io.network.netty.PartitionRequestServerHandler.channelRead0(PartitionRequestServerHandler.java:43)
at org.apache.flink.shaded.netty4.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
... 13 more
Caused by: org.apache.flink.runtime.io.network.partition.DataConsumptionException: java.nio.channels.ClosedChannelException
at org.apache.flink.runtime.io.network.partition.SpilledSubpartitionView.getNextBuffer(SpilledSubpartitionView.java:150)
at org.apache.flink.runtime.io.network.netty.CreditBasedSequenceNumberingViewReader.getNextBuffer(CreditBasedSequenceNumberingViewReader.java:162)
ERROR org.apache.flink.runtime.io.network.netty.PartitionRequestQueue - Encountered error while consuming partitions
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:192)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at org.apache.flink.shaded.netty4.io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:311)
at org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)
at org.apache.flink.shaded.netty4.io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:241)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
.......
org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException: Could not forward element to next operator
at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingWithOneInputStreamOperatorOutput.pushToOperator(OperatorChain.java:638)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingWithOneInputStreamOperatorOutput.collect(OperatorChain.java:612)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingWithOneInputStreamOperatorOutput.collect(OperatorChain.java:575)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:763)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:741)
at org.apache.flink.table.runtime.util.StreamRecordCollector.collect(StreamRecordCollector.java:44)
at org.apache.flink.table.runtime.join.batch.HashJoinOperator.collect(HashJoinOperator.java:202)
at org.apache.flink.table.runtime.join.batch.HashJoinOperator.innerJoin(HashJoinOperator.java:187)
at org.apache.flink.table.runtime.join.batch.HashJoinOperator$InnerHashJoinOperator.join(HashJoinOperator.java:310)
at org.apache.flink.table.runtime.join.batch.HashJoinOperator.joinWithNextKey(HashJoinOperator.java:181)
at org.apache.flink.table.runtime.join.batch.HashJoinOperator.processElement2(HashJoinOperator.java:159)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$TwoInputStreamOperatorProxy.processElement2(OperatorChain.java:1389)
at org.apache.flink.streaming.runtime.io.SecondOfTwoInputProcessor.processRecord(SecondOfTwoInputProcessor.java:91)
at org.apache.flink.streaming.runtime.io.InputGateFetcher.fetchAndProcess(InputGateFetcher.java:159)
at org.apache.flink.streaming.runtime.io.StreamArbitraryInputProcessor.process(StreamArbitraryInputProcessor.java:134)
at org.apache.flink.streaming.runtime.tasks.ArbitraryInputStreamTask.run(ArbitraryInputStreamTask.java:183)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:324)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:727)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException: Could not forward element to next operator
at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingWithSecondInputOfTwoInputStreamOperatorOutput.pushToOperator(OperatorChain.java:850)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingWithSecondInputOfTwoInputStreamOperatorOutput.collect(OperatorChain.java:824)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingWithSecondInputOfTwoInputStreamOperatorOutput.collect(OperatorChain.java:787)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:763)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:741)
at BatchExecCalcRule$64.processElement(Unknown Source)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingWithOneInputStreamOperatorOutput.pushToOperator(OperatorChain.java:635)
... 18 more
Caused by: org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException: Could not forward element to next operator
at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingWithOneInputStreamOperatorOutput.pushToOperator(OperatorChain.java:638)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingWithOneInputStreamOperatorOutput.collect(OperatorChain.java:612)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingWithOneInputStreamOperatorOutput.collect(OperatorChain.java:575)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:763)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:741)
at org.apache.flink.table.runtime.util.StreamRecordCollector.collect(StreamRecordCollector.java:44)
at org.apache.flink.table.runtime.join.batch.HashJoinOperator.collect(HashJoinOperator.java:202)
at org.apache.flink.table.runtime.join.batch.HashJoinOperator.innerJoin(HashJoinOperator.java:187)
at org.apache.flink.table.runtime.join.batch.HashJoinOperator$InnerHashJoinOperator.join(HashJoinOperator.java:310)
at org.apache.flink.table.runtime.join.batch.HashJoinOperator.joinWithNextKey(HashJoinOperator.java:181)
at org.apache.flink.table.runtime.join.batch.HashJoinOperator.processElement2(HashJoinOperator.java:159)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$TwoInputStreamOperatorProxy.processElement2(OperatorChain.java:1389)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingWithSecondInputOfTwoInputStreamOperatorOutput.pushToOperator(OperatorChain.java:847)
... 24 more
Caused by: java.lang.RuntimeException
at org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:96)
at org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:76)
at org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:42)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:763)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:741)
at BatchExecCalcRule$82.processElement(Unknown Source)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$ChainingWithOneInputStreamOperatorOutput.pushToOperator(OperatorChain.java:635)
... 36 more
Caused by: java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestMemorySegment(LocalBufferPool.java:250)
at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBuilderBlocking(LocalBufferPool.java:207)
at org.apache.flink.runtime.io.network.partition.InternalResultPartition.requestNewBufferBuilder(InternalResultPartition.java:448)
at org.apache.flink.runtime.io.network.partition.InternalResultPartition.copyFromSerializerToTargetChannel(InternalResultPartition.java:526)
at org.apache.flink.runtime.io.network.partition.InternalResultPartition.emitRecord(InternalResultPartition.java:231)
at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:89)
at org.apache.flink.streaming.runtime.io.StreamRecordWriter.emit(StreamRecordWriter.java:81)
at org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:93)
... 42 more