You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@nifi.apache.org by "Matt Burgess (Jira)" <ji...@apache.org> on 2020/04/03 15:06:00 UTC

[jira] [Commented] (NIFI-7240) wrong table mapping when using the nifi CaptureChangeMySQL modules combinded with mysql triggers:

    [ https://issues.apache.org/jira/browse/NIFI-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074640#comment-17074640 ] 

Matt Burgess commented on NIFI-7240:
------------------------------------

I believe the issue here is that CaptureChangeMySQL is counting on a Table Map event followed by any events that occur on the table (such as Write Rows). To accomplish this, it saves off the table information (database, schema, table name, id, etc.) as this information is not available in the Write Rows event (except for table ID).  For triggers however, we get two Table Map events in a row, followed by two Write Rows events. In this case we end up replacing the first table's information with the second table's information and use that for both Write Rows events.

If table IDs are unique (which I believe they are), then we should be able to cache them in a way that they can be retrieved no matter how many Table Map events have preceded a Write Rows event. The trouble is that the current cache key is a fully-qualified table object including all the table information mentioned above. That means that during Write Rows processing, we can't get the table entry from the table ID alone. Changing the cache key would break existing functionality. We could implement a "fallback" behavior where we start using new cache keys but fall back to using the old cache keys if the entry can't be found, but that adds extra processing that most of the time is unnecessary. We could add another set of entries to the cache, but that could double the amount of cache space used.

Presuming that this issue happens within a single transaction and there won't be too many triggers, the processor could keep a local transient cache of table IDs to cache keys. Then during Write Rows we could fetch the key from the local cache by table ID, then fetch the table information from the dstributed cache using the retrieved key. This would fix the issue of needing a Table Map to be followed by its operations without another Table Map event interfering.


> wrong table mapping when using the nifi CaptureChangeMySQL modules combinded with mysql triggers:
> -------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-7240
>                 URL: https://issues.apache.org/jira/browse/NIFI-7240
>             Project: Apache NiFi
>          Issue Type: Bug
>            Reporter: Michel Elias
>            Priority: Major
>
> wrong table mapping when using the nifi CaptureChangeMySQL modules combinded with mysql triggers:
> when using the nifi CaptureChangeMySQL module (https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-cdc/nifi-cdc-mysql-bundle/nifi-cdc-mysql-processors/src/main/java/org/apache/nifi/cdc/mysql/processors/CaptureChangeMySQL.java) combinded with mysql triggers the table mapping does not work correctly.
> when using triggers the order of the events is slightly different than when using regular inserts.
> ======================================================================================================================================
> regular insert:
> BEGIN;
> PREPARE stmt1 FROM 'insert into changes set Header_RefID=?';
> PREPARE stmt2 FROM 'insert into tab_tracedata set ZE140_RT_Aufnahme=?';
> SET @a = 3;
> SET @b = 4;
> EXECUTE stmt1 USING @a;
> EXECUTE stmt2 USING @b;
> COMMIT;
> events in binlog:
> | mysql-bin.000029 | 349004 | Gtid | 2 | 349042 | BEGIN GTID 0-2-6431 |
> | mysql-bin.000029 | 349042 | Table_map | 2 | 349091 | table_id: 19 (asic.changes) |
> | mysql-bin.000029 | 349091 | Write_rows_v1 | 2 | 349127 | table_id: 19 flags: STMT_END_F |
> | mysql-bin.000029 | 349127 | Table_map | 2 | 349911 | table_id: 24 (asic.tab_tracedata) |
> | mysql-bin.000029 | 349911 | Write_rows_v1 | 2 | 350041 | table_id: 24 flags: STMT_END_F |
> | mysql-bin.000029 | 350041 | Xid | 2 | 350068 | COMMIT /* xid=366 */ |
> ======================================================================================================================================
> triggered insert:
> CREATE DEFINER=`root`@`%` TRIGGER `insertChanges` AFTER INSERT ON `tab_tracedata` FOR EACH ROW BEGIN
> INSERT INTO changes (changes.Header_RefID) SELECT NEW.Header_RefID;
> END
> events in eventlog:
> | mysql-bin.000029 | 343 | Gtid | 1 | 381 | BEGIN GTID 0-1-6289 |
> | mysql-bin.000029 | 381 | Table_map | 1 | 1165 | table_id: 21 (asic.tab_tracedata) |
> | mysql-bin.000029 | 1165 | Table_map | 1 | 1214 | table_id: 19 (asic.changes) |
> | mysql-bin.000029 | 1214 | Write_rows_v1 | 1 | 3020 | table_id: 21 |
> | mysql-bin.000029 | 3020 | Write_rows_v1 | 1 | 3071 | table_id: 19 flags: STMT_END_F |
> | mysql-bin.000029 | 3071 | Xid | 1 | 3098 | COMMIT /* xid=18 */ |
>  
> showcase:
> CREATE TABLE test (id int auto_increment, data varchar(255),PRIMARY KEY (id));
> CREATE TABLE triggered (id int);
> create trigger testtrigger AFTER insert ON test FOR EACH ROW INSERT INTO triggered (triggered.id) SELECT NEW.id;
> insert into test set data="lala";
> subscribe to eventlog via CaptureChangeMySQL module. There are 2 events generated. Both inserts are mapped to the "triggered" table. thats wrong. one should be mapped to the "test" table - the other one to the "triggered" table.
> {"type":"insert","timestamp":1580393721000,"binlog_filename":"mysql-bin.000029","binlog_position":5035809,"database":"nifi_test_case","table_name":"triggered","table_id":31,"columns":[\{"id":1,"name":"id","column_type":4,"value":1},\{"id":2,"value":null}]}
> {"type":"insert","timestamp":1580393721000,"binlog_filename":"mysql-bin.000029","binlog_position":5035849,"database":"nifi_test_case","table_name":"triggered","table_id":31,"columns":[\{"id":1,"name":"id","column_type":4,"value":1}]}--=ixed 004F59E8C12584FF_=



--
This message was sent by Atlassian Jira
(v8.3.4#803005)