You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ignite.apache.org by "Vedran Ljubovic (Jira)" <ji...@apache.org> on 2023/05/11 13:06:00 UTC

[jira] [Updated] (IGNITE-19458) Kafka Connect IgniteSinkConnector multiple topics & caches do not work

     [ https://issues.apache.org/jira/browse/IGNITE-19458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vedran Ljubovic updated IGNITE-19458:
-------------------------------------
    Description: 
We are using Kafka Connect to stream 4 topics into 4 separate caches. We are presently forced to use 4 separate instances of Kafka Connect because when 4 connectors are passed to one instance all messages will be streamed into the same (first) Ignite cache. Multiple Connect instances work, but in the future we plan to add more topics/caches and this will affect memory usage as well as our ability to distribute workload in a cluster.

I think that the reason for this is in IgniteSinkTask.java where IgniteDataStreamer is declared as private static final attribute of inner class, which means that it will be initialized with the first passed cache and cannot be changed later.

I'm not sure which is the best approach performance wise, but possible solutions would be:
 * It should be non-static attribute of IgniteSinkTask (different streamer for each task)
 * It should be created in IgniteSinkConnector and passed to IgniteSinkTask (one streamer per connector)
 * Have a Map of cacheNames to Streamers which ensures that only as many Streamers will be created as there are different caches used.

Also I notice that cacheName, SingleTupleExtractor, stopped boolean, are all declared as static which cause various problems when using multiple connectors (messages being processed with wrong SimpleTupleExtractor etc.)

  was:
We are using Kafka Connect to stream 4 topics into 4 separate caches. We are presently forced to use 4 separate instances of Kafka Connect because when 4 connectors are passed to one instance all messages will be streamed into the same (first) Ignite cache. This works, but in the future we plan to add more topics/caches and this will affect memory usage as well as our ability to distribute workload in a cluster.

I think that the reason for this is in IgniteSinkTask.java where IgniteDataStreamer is declared as private static final attribute of inner class, which means that it will be initialized with the first passed cache and cannot be changed later.

I'm not sure which is the best approach performance wise, but possible solutions would be:
 * It should be non-static attribute of IgniteSinkTask (different streamer for each task)
 * It should be created in IgniteSinkConnector and passed to IgniteSinkTask (one streamer per connector)
 * Have a Map of cacheNames to Streamers which ensures that only as many Streamers will be created as there are different caches used.

Also I notice that cacheName, SingleTupleExtractor, stopped boolean, are all declared as static which cause various problems when using multiple connectors (messages being processed with wrong SimpleTupleExtractor etc.)


> Kafka Connect IgniteSinkConnector multiple topics & caches do not work
> ----------------------------------------------------------------------
>
>                 Key: IGNITE-19458
>                 URL: https://issues.apache.org/jira/browse/IGNITE-19458
>             Project: Ignite
>          Issue Type: Bug
>          Components: extensions
>    Affects Versions: 2.15
>            Reporter: Vedran Ljubovic
>            Priority: Major
>
> We are using Kafka Connect to stream 4 topics into 4 separate caches. We are presently forced to use 4 separate instances of Kafka Connect because when 4 connectors are passed to one instance all messages will be streamed into the same (first) Ignite cache. Multiple Connect instances work, but in the future we plan to add more topics/caches and this will affect memory usage as well as our ability to distribute workload in a cluster.
> I think that the reason for this is in IgniteSinkTask.java where IgniteDataStreamer is declared as private static final attribute of inner class, which means that it will be initialized with the first passed cache and cannot be changed later.
> I'm not sure which is the best approach performance wise, but possible solutions would be:
>  * It should be non-static attribute of IgniteSinkTask (different streamer for each task)
>  * It should be created in IgniteSinkConnector and passed to IgniteSinkTask (one streamer per connector)
>  * Have a Map of cacheNames to Streamers which ensures that only as many Streamers will be created as there are different caches used.
> Also I notice that cacheName, SingleTupleExtractor, stopped boolean, are all declared as static which cause various problems when using multiple connectors (messages being processed with wrong SimpleTupleExtractor etc.)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)