Posted to user@flink.apache.org by Alexandr Dmitrov <ca...@gmail.com> on 2023/03/03 12:11:52 UTC
debezium/flink-oracle-cdc-connector large memory consumption for transactions
Hello! Maybe you could help with the Debezium/Ververica connectors in optimizing
memory consumption for large change sets in a single transaction when using the
Ververica flink-oracle-cdc-connector.
Environment:
flink - 1.15.1
oracle-cdc - 2.3
oracle - 19.3
When updating a large number of rows in a single transaction I get the
exception:
"ERROR io.debezium.connector.oracle.logminer.LogMinerHelper [] -
Mining session stopped due to the java.lang.OutOfMemoryError: Java heap
space".
For a table with 20 fields - the types (int, float, timestamp, date, string)
repeated four times - I got the following results:
With TaskManager heap space = 540 MiB, 250k rows went through, but 300k failed.
With TaskManager heap space = 960 MiB, 350k rows went through, but 400k failed.
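A rough back-of-envelope estimate from these two data points (my own arithmetic, not a measurement of the connector) suggests each buffered row costs on the order of a few KiB of heap:

```java
public class HeapEstimate {
    // Heap grew by 960 - 540 = 420 MiB while the row limit grew by roughly
    // 100k rows, suggesting about 4.4 KiB of heap per buffered row.
    public static long bytesPerRow() {
        long extraHeapBytes = (960L - 540L) * 1024 * 1024; // extra heap in bytes
        long extraRows = 350_000L - 250_000L;              // extra rows that fit
        return extraHeapBytes / extraRows;
    }

    public static void main(String[] args) {
        System.out.println(bytesPerRow() + " bytes per buffered row (rough)");
    }
}
```

If that ratio holds, a transaction touching a few million rows would need tens of GiB of heap, so raising TaskManager memory alone does not scale.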
I use a SourceFunction created from OracleSource.builder(), with startup-mode
set to latest-offset and a self-defined deserializer, where I added log.info
calls to check whether the problem starts after the debezium/cdc-connector has
done its work; however, the Java heap space error occurs before the flow
reaches deserialization.
I've tried setting the following Debezium properties, with no luck:
dbzProps.setProperty("log.mining.batch.size.max", "10000");
dbzProps.setProperty("log.mining.batch.size.default", "2000");
dbzProps.setProperty("log.mining.batch.size.min", "100");
dbzProps.setProperty("log.mining.view.fetch.size", "1000");
dbzProps.setProperty("max.batch.size", "64");
dbzProps.setProperty("max.queue.size", "256");
After some digging, I found the places in the code where Debezium consumes
changes from Oracle:
Loop that fetches records:
https://github.com/debezium/debezium/blob/1.6/debezium-connector-oracle/src/main/java/io/debezium/connector/oracle/logminer/LogMinerStreamingChangeEventSource.java#L154
Insert/update/delete records are registered in the transaction buffer:
https://github.com/debezium/debezium/blob/1.6/debezium-connector-oracle/src/main/java/io/debezium/connector/oracle/logminer/LogMinerQueryResultProcessor.java#L267
A commit record causes all events in the transaction buffer to be committed -
sent forward to the dispatcher, ending in the batch handler:
https://github.com/debezium/debezium/blob/1.6/debezium-connector-oracle/src/main/java/io/debezium/connector/oracle/logminer/LogMinerQueryResultProcessor.java#L144
As far as I can see, the cause of the problem is that Debezium stores all
changes locally until the transaction commits.
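The buffering behaviour described above can be sketched as a toy model (my own illustration, not Debezium's actual classes): DML events accumulate on the heap per transaction and are only released when the COMMIT record is mined, so heap usage grows with the size of the largest open transaction.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of an in-memory transaction buffer: every mined DML event is held
// per transaction id until COMMIT, then the whole transaction is emitted at once.
public class TransactionBuffer {
    private final Map<String, List<String>> buffered = new HashMap<>();

    public void registerDml(String txId, String event) {
        // Uncommitted events accumulate here: N updated rows => N buffered events.
        buffered.computeIfAbsent(txId, k -> new ArrayList<>()).add(event);
    }

    public List<String> commit(String txId) {
        // Only now is the memory released and the events sent downstream.
        List<String> events = buffered.remove(txId);
        return events == null ? List.of() : events;
    }

    public int bufferedEvents(String txId) {
        return buffered.getOrDefault(txId, List.of()).size();
    }
}
```

This is why the batch/queue size properties above do not help: they throttle how records flow downstream, not how many uncommitted events sit in this buffer.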
Is there a way to make the connector split the processing of a large change
set within a single transaction? Or is there any other way to avoid this
large memory consumption?
Re: debezium/flink-oracle-cdc-connector large memory consumption for transactions
Posted by Антон <an...@yandex.ru>.
Hello,
Try Infinispan -
https://debezium.io/documentation/reference/stable/connectors/oracle.html#oracle-event-buffering
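For reference, switching to the Infinispan buffer is done through Debezium properties. A minimal sketch, assuming a Debezium version that supports it - the property names below follow the current Debezium docs; the Debezium bundled with oracle-cdc 2.3 may use earlier, incubating names, so check the docs matching your bundled version:

```java
import java.util.Properties;

public class InfinispanBufferProps {
    // Sketch: move the LogMiner transaction buffer into embedded Infinispan
    // instead of the default in-heap buffer.
    public static Properties build() {
        Properties p = new Properties();
        // "memory" is the default; "infinispan_embedded" keeps buffered
        // transaction events outside the JVM heap.
        p.setProperty("log.mining.buffer.type", "infinispan_embedded");
        // Each Infinispan cache additionally needs an XML cache configuration
        // via log.mining.buffer.infinispan.cache.* properties; the exact XML
        // depends on your Infinispan version, so it is omitted here.
        return p;
    }
}
```

These properties would go into the same dbzProps object that is passed to the connector.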