Posted to user-zh@flink.apache.org by Jark Wu <im...@gmail.com> on 2020/11/24 16:21:43 UTC

Re: Flink CDC: how to resolve the stall caused by packets not being sent

See the docs:
https://github.com/ververica/flink-cdc-connectors/wiki/MySQL-CDC-Connector#setting-up-mysql-session-timeouts
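
In short, that section is about making the MySQL session timeouts outlast the snapshot phase. A rough sketch of what it comes down to on the MySQL side (the 24-hour value is only illustrative; pick something that comfortably covers your longest snapshot):

-- Raise both session timeouts so an idle connection survives the full
-- snapshot of a large table. 86400 seconds (24h) is just an example value.
SET GLOBAL interactive_timeout = 86400;
SET GLOBAL wait_timeout = 86400;

-- Verify what newly opened sessions will actually get.
SHOW GLOBAL VARIABLES LIKE '%timeout%';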

On Tue, 24 Nov 2020 at 23:54, yujianbo <15...@163.com> wrote:

> 1. Environment:
> 1) Version: 1.11.2
> 2) Flink CDC, using the Stream API to sync from MySQL to Kudu
>
> 2. Observed problem:
> 1) Several MySQL tables, each on the order of 30 million rows, have already been synced to Kudu in production.
>  However, one MySQL table has hit the same problem on every sync attempt; as far as we can tell it fails during the full-snapshot phase, before reaching the incremental phase.
>
>
> The error log is below. We are thinking of trying "autoReconnect=true" to work around it, but are not sure where that property should be added, and judging from the log it would only treat the symptom anyway. The real question is: why are no packets being sent, causing the stall?
>
>  Here is the exact error:
> ======================================================
> 2020-11-24 20:00:37,547 *ERROR io.debezium.connector.mysql.SnapshotReader
> *
> [] - Failed due to error: Aborting snapshot due to error when last running
> 'SELECT * FROM `uchome`.`forums_post_12`': *The last packet successfully
> received from the server was 39 milliseconds ago.  The last packet sent
> successfully to the server was 6,772,615 milliseconds ago. is longer than
> the server configured value of 'wait_timeout'. You should consider either
> expiring and/or testing connection validity before use in your application,
> increasing the server configured values for client timeouts, or using the
> Connector/J connection property 'autoReconnect=true' to avoid this
> problem.*
> org.apache.kafka.connect.errors.ConnectException: The last packet
> successfully received from the server was 39 milliseconds ago.  The last
> packet sent successfully to the server was 6,772,615 milliseconds ago. is
> longer than the server configured value of 'wait_timeout'. You should
> consider either expiring and/or testing connection validity before use in
> your application, increasing the server configured values for client
> timeouts, or using the Connector/J connection property 'autoReconnect=true'
> to avoid this problem. Error code: 0; SQLSTATE: 08S01.
>         at
> io.debezium.connector.mysql.AbstractReader.wrap(AbstractReader.java:230)
>
> ~[blob_p-b339a2f89b058d1dab7e01f8c235b6bcc0c26d10-90c2b905e5c1a69c13cf6a9259bd7be8:?]
>         at
> io.debezium.connector.mysql.AbstractReader.failed(AbstractReader.java:207)
>
> ~[blob_p-b339a2f89b058d1dab7e01f8c235b6bcc0c26d10-90c2b905e5c1a69c13cf6a9259bd7be8:?]
>         at
> io.debezium.connector.mysql.SnapshotReader.execute(SnapshotReader.java:831)
>
> ~[blob_p-b339a2f89b058d1dab7e01f8c235b6bcc0c26d10-90c2b905e5c1a69c13cf6a9259bd7be8:?]
>         at
>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [?:1.8.0_231]
>         at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [?:1.8.0_231]
>         at java.lang.Thread.run(Thread.java:748) [?:1.8.0_231]
> *Caused by: com.mysql.cj.jdbc.exceptions.CommunicationsException: The last
> packet successfully received from the server was 39 milliseconds ago.  The
> last packet sent successfully to the server was 6,772,615 milliseconds ago.
> is longer than the server configured value of 'wait_timeout'. *You should
> consider either expiring and/or testing connection validity before use in
> your application, increasing the server configured values for client
> timeouts, or using the Connector/J connection property 'autoReconnect=true'
> to avoid this problem.
>         at
>
> com.mysql.cj.jdbc.exceptions.SQLError.createCommunicationsException(SQLError.java:174)
>
> ~[blob_p-b339a2f89b058d1dab7e01f8c235b6bcc0c26d10-90c2b905e5c1a69c13cf6a9259bd7be8:?]
>         at
>
> com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:64)
>
> ~[blob_p-b339a2f89b058d1dab7e01f8c235b6bcc0c26d10-90c2b905e5c1a69c13cf6a9259bd7be8:?]
> ===============================================
>
>
>
>

Re: Flink CDC: how to resolve the stall caused by packets not being sent

Posted by yujianbo <15...@163.com>.
Thanks, Jark!
Last time, adjusting the MySQL connection parameters fixed the timeout. But when syncing this particular table, it still gets stuck near the end of the snapshot phase and reports a connection error. Where should I look to troubleshoot this?

1. Environment:
1) Version: 1.11.2
2) Flink CDC, using the Stream API to sync from MySQL to Kudu
3) *This table has 34 million rows and always gets stuck at around 33.4 million. I have already replaced the Kudu sink with a plain print sink, and exactly the same error still occurs.*


The log is below:
==================================
2020-11-26 14:00:15,293 ERROR *io.debezium.connector.mysql.SnapshotReader*                  
[] - Failed due to error: Aborting snapshot due to error when last running
'SELECT * FROM `uchome`.`forums_post_12`': *Communications link failure*

The last packet successfully received from the server was 16 milliseconds
ago.  The last packet sent successfully to the server was 335,794
milliseconds ago.
*org.apache.kafka.connect.errors.ConnectException: Communications link
failure*

The last packet successfully received from the server was 16 milliseconds
ago.  The last packet sent successfully to the server was 335,794
milliseconds ago. Error code: 0; SQLSTATE: 08S01.
	at io.debezium.connector.mysql.AbstractReader.wrap(AbstractReader.java:230)
~[flinkcdc4mysql2kudu.jar:?]
	at
io.debezium.connector.mysql.AbstractReader.failed(AbstractReader.java:207)
~[flinkcdc4mysql2kudu.jar:?]
	at
io.debezium.connector.mysql.SnapshotReader.execute(SnapshotReader.java:831)
~[flinkcdc4mysql2kudu.jar:?]
	at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[?:1.8.0_231]
	at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[?:1.8.0_231]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_231]
*Caused by: com.mysql.cj.jdbc.exceptions.CommunicationsException:
Communications link failure*

The last packet successfully received from the server was 16 milliseconds
ago.  The last packet sent successfully to the server was 335,794
milliseconds ago.
	at
com.mysql.cj.jdbc.exceptions.SQLError.createCommunicationsException(SQLError.java:174)
~[flinkcdc4mysql2kudu.jar:?]
	at
com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:64)
~[flinkcdc4mysql2kudu.jar:?]
	at com.mysql.cj.jdbc.ConnectionImpl.commit(ConnectionImpl.java:813)
~[flinkcdc4mysql2kudu.jar:?]
	at
io.debezium.connector.mysql.SnapshotReader.execute(SnapshotReader.java:747)
~[flinkcdc4mysql2kudu.jar:?]
	... 3 more
Caused by: com.mysql.cj.exceptions.CJCommunicationsException: Communications
link failure

The last packet successfully received from the server was 16 milliseconds
ago.  The last packet sent successfully to the server was 335,794
milliseconds ago.
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
~[?:1.8.0_231]
	at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
~[?:1.8.0_231]
	at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
~[?:1.8.0_231]
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
~[?:1.8.0_231]
	at
com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:61)
~[flinkcdc4mysql2kudu.jar:?]
	at
com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:105)
~[flinkcdc4mysql2kudu.jar:?]
	at
com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:151)
~[flinkcdc4mysql2kudu.jar:?]
	at
com.mysql.cj.exceptions.ExceptionFactory.createCommunicationsException(ExceptionFactory.java:167)
~[flinkcdc4mysql2kudu.jar:?]
	at
com.mysql.cj.protocol.a.NativeProtocol.clearInputStream(NativeProtocol.java:837)
~[flinkcdc4mysql2kudu.jar:?]
	at
com.mysql.cj.protocol.a.NativeProtocol.sendCommand(NativeProtocol.java:652)
~[flinkcdc4mysql2kudu.jar:?]
	at
com.mysql.cj.protocol.a.NativeProtocol.sendQueryPacket(NativeProtocol.java:986)
~[flinkcdc4mysql2kudu.jar:?]
	at
com.mysql.cj.protocol.a.NativeProtocol.sendQueryString(NativeProtocol.java:921)
~[flinkcdc4mysql2kudu.jar:?]
	at com.mysql.cj.NativeSession.execSQL(NativeSession.java:1165)
~[flinkcdc4mysql2kudu.jar:?]
	at com.mysql.cj.jdbc.ConnectionImpl.commit(ConnectionImpl.java:801)
~[flinkcdc4mysql2kudu.jar:?]
	at
io.debezium.connector.mysql.SnapshotReader.execute(SnapshotReader.java:747)
~[flinkcdc4mysql2kudu.jar:?]
	... 3 more
Caused by: java.io.IOException: Socket is closed
	at
com.mysql.cj.protocol.AbstractSocketConnection.getMysqlInput(AbstractSocketConnection.java:72)
~[flinkcdc4mysql2kudu.jar:?]
	at
com.mysql.cj.protocol.a.NativeProtocol.clearInputStream(NativeProtocol.java:833)
~[flinkcdc4mysql2kudu.jar:?]
	at
com.mysql.cj.protocol.a.NativeProtocol.sendCommand(NativeProtocol.java:652)
~[flinkcdc4mysql2kudu.jar:?]
	at
com.mysql.cj.protocol.a.NativeProtocol.sendQueryPacket(NativeProtocol.java:986)
~[flinkcdc4mysql2kudu.jar:?]
	at
com.mysql.cj.protocol.a.NativeProtocol.sendQueryString(NativeProtocol.java:921)
~[flinkcdc4mysql2kudu.jar:?]
	at com.mysql.cj.NativeSession.execSQL(NativeSession.java:1165)
~[flinkcdc4mysql2kudu.jar:?]
	at com.mysql.cj.jdbc.ConnectionImpl.commit(ConnectionImpl.java:801)
~[flinkcdc4mysql2kudu.jar:?]
	at
io.debezium.connector.mysql.SnapshotReader.execute(SnapshotReader.java:747)
~[flinkcdc4mysql2kudu.jar:?]
	... 3 more
2020-11-26 14:00:15,699 INFO  io.debezium.connector.common.BaseSourceTask                 
[] - Stopping down connector
2020-11-26 14:00:15,700 INFO  io.debezium.connector.mysql.MySqlConnectorTask              
[] - Stopping MySQL connector task
2020-11-26 14:00:15,700 INFO  io.debezium.connector.mysql.ChainedReader                   
[] - ChainedReader: Stopping the snapshot reader
2020-11-26 14:00:15,700 INFO  io.debezium.connector.mysql.SnapshotReader                  
[] - Discarding 1411 unsent record(s) due to the connector shutting down
2020-11-26 14:00:15,700 INFO  io.debezium.connector.mysql.SnapshotReader                  
[] - Discarding 0 unsent record(s) due to the connector shutting down
2020-11-26 14:00:15,701 INFO  io.debezium.connector.mysql.MySqlConnectorTask              
[] - Connector task finished all work and is now shutdown

==================================
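
One thing I still want to rule out on my side: whether any timeout on the MySQL server (or an idle limit on a proxy/firewall in between) is shorter than the roughly 336 seconds the connection sat idle before the final COMMIT. A quick check of the usual suspects (not an exhaustive list):

-- Effective timeouts for the current session; any value below ~336 seconds
-- could explain the connection being dropped right before the final COMMIT.
SHOW SESSION VARIABLES WHERE Variable_name IN
  ('wait_timeout', 'interactive_timeout', 'net_read_timeout', 'net_write_timeout');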




Re: Flink CDC: how to resolve the stall caused by packets not being sent

Posted by Jark Wu <im...@gmail.com>.
That should be fine; there will not be that many jobs after all, and MySQL can handle that much load.

The wording in the docs was not quite accurate: not configuring a server id does not cause a surge of connections to MySQL. I have updated the docs:
https://github.com/ververica/flink-cdc-connectors/wiki/MySQL-CDC-Connector#set-a-differnet-server-id-for-each-job



On Thu, 26 Nov 2020 at 09:53, yujianbo <15...@163.com> wrote:

> Thanks for the answer, Jark. One more question: the community's mysql-cdc
> wiki says that connecting to the MySQL server with many different server ids can cause
> CPU and connection spikes on MySQL. If our CDC jobs use SQL to specify a different server id for pulling each table, does that mean we also should not run too many such CDC jobs?
>
>
>

Re: Flink CDC: how to resolve the stall caused by packets not being sent

Posted by yujianbo <15...@163.com>.
Thanks for the answer, Jark. One more question: the community's mysql-cdc wiki says that connecting to the MySQL server with many different server ids can cause
CPU and connection spikes on MySQL. If our CDC jobs use SQL to specify a different server id for pulling each table, does that mean we also should not run too many such CDC jobs?




Re: Flink CDC: how to resolve the stall caused by packets not being sent

Posted by Jark Wu <im...@gmail.com>.
1. By default the server id is random, so collisions are possible.
2, 3. Both are problematic: the server id is at the database level, not per table.
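
For the SQL route, the shape is roughly the sketch below; the column definitions and connection values are placeholders made up for illustration, and the key part is the explicit 'server-id', set to a different value for each job that reads the same database:

-- Hypothetical mysql-cdc source table; only 'server-id' matters here.
CREATE TABLE forums_post_12_src (
  id BIGINT,
  tid BIGINT,
  message STRING,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = '<mysql-host>',
  'port' = '3306',
  'username' = '<user>',
  'password' = '<password>',
  'database-name' = 'uchome',
  'table-name' = 'forums_post_12',
  'server-id' = '5401'  -- give every job on this MySQL instance its own id
);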

On Wed, 25 Nov 2020 at 17:41, yujianbo <15...@163.com> wrote:

> Mainly to implement custom schema parsing, so the sink side can easily emit to downstream systems.
> I would like to ask a question:
>
> https://github.com/ververica/flink-cdc-connectors/wiki/MySQL-CDC-Connector#set-a-differnet-server-id-for-each-job
> I read the link above about setting a different server id for each job. I see that SQL can specify a different server id, so I have the following three questions:
> 1. If there are different Stream API jobs, do they end up with the same server id?
> 2. Is it safe for different Stream API jobs to sync different tables of the same database?
> 3. Is it a problem for different Stream API jobs to sync the same table of the same database?
>
>
>

Re: Flink CDC: how to resolve the stall caused by packets not being sent

Posted by yujianbo <15...@163.com>.
Mainly to implement custom schema parsing, so the sink side can easily emit to downstream systems.
I would like to ask a question:
https://github.com/ververica/flink-cdc-connectors/wiki/MySQL-CDC-Connector#set-a-differnet-server-id-for-each-job
I read the link above about setting a different server id for each job. I see that SQL can specify a different server id, so I have the following three questions:
1. If there are different Stream API jobs, do they end up with the same server id?
2. Is it safe for different Stream API jobs to sync different tables of the same database?
3. Is it a problem for different Stream API jobs to sync the same table of the same database?




Re: Flink CDC: how to resolve the stall caused by packets not being sent

Posted by Jark Wu <im...@gmail.com>.
Btw, may I ask why you are using the Stream API instead of Flink SQL directly?
