Posted to jira@kafka.apache.org by "kaikai.hou (Jira)" <ji...@apache.org> on 2020/01/08 10:43:00 UTC
[jira] [Created] (KAFKA-9385) Connect cluster: connector task repeat like a splitbrain cluster problem

kaikai.hou created KAFKA-9385:
---------------------------------

             Summary: Connect cluster: connector task repeat like a splitbrain cluster problem 
                 Key: KAFKA-9385
                 URL: https://issues.apache.org/jira/browse/KAFKA-9385
             Project: Kafka
          Issue Type: Bug
          Components: KafkaConnect
            Reporter: kaikai.hou
         Attachments: 12_31_d8c7j_1.jpg

I am using Debezium and have found a problem where connector tasks run more than once. See the related Debezium issue: [Jump|https://issues.redhat.com/browse/DBZ-1573?jql=key%20in%20watchedIssues()]

 

1. I push the Debezium image to our private image repository.

2. Deploy the Connect cluster with the following *Deployment Config*:
{code:java}
apiVersion: apps.openshift.io/v1
kind: DeploymentConfig
metadata:
  annotations:
    openshift.io/generated-by: OpenShiftWebConsole
  creationTimestamp: '2019-10-14T07:45:41Z'
  generation: 29
  labels:
    app: debezium-test-cloud
  name: debezium-test-cloud
  namespace: test
  resourceVersion: '168496156'
  selfLink: >-
    /apis/apps.openshift.io/v1/namespaces/test/deploymentconfigs/debezium-test-cloud
  uid: 9f4f8f4d-ee56-11e9-a5a1-00163e0e008f
spec:
  replicas: 2
  selector:
    app: debezium-test-cloud
    deploymentconfig: debezium-test-cloud
  strategy:
    activeDeadlineSeconds: 21600
    resources: {}
    rollingParams:
      intervalSeconds: 1
      maxSurge: 25%
      maxUnavailable: 25%
      timeoutSeconds: 600
      updatePeriodSeconds: 1
    type: Rolling
  template:
    metadata:
      annotations:
        openshift.io/generated-by: OpenShiftWebConsole
      creationTimestamp: null
      labels:
        app: debezium-test-cloud
        deploymentconfig: debezium-test-cloud
    spec:
      containers:
        - env:
            - name: BOOTSTRAP_SERVERS
              value: '192.168.100.228:9092'
            - name: GROUP_ID
              value: test-cloud
            - name: CONFIG_STORAGE_TOPIC
              value: base.test-cloud.config
            - name: OFFSET_STORAGE_TOPIC
              value: base.test-cloud.offset
            - name: STATUS_STORAGE_TOPIC
              value: base.test-cloud.status
            - name: CONNECT_KEY_CONVERTER_SCHEMAS_ENABLE
              value: 'true'
            - name: CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE
              value: 'true'
            - name: CONNECT_PRODUCER_MAX_REQUEST_SIZE
              value: '20971520'
            - name: CONNECT_DATABASE_HISTORY_KAFKA_RECOVERY_POLL_INTERVAL_MS
              value: '1000'
            - name: HEAP_OPTS
              value: '-XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0'
          image: 'registry.cn-hangzhou.aliyuncs.com/eshine/debeziumconnect:1.0.0.Beta2'
          imagePullPolicy: IfNotPresent
          name: debezium-test-cloud
          ports:
            - containerPort: 8083
              protocol: TCP
            - containerPort: 8778
              protocol: TCP
            - containerPort: 9092
              protocol: TCP
            - containerPort: 9779
              protocol: TCP
          resources:
            limits:
              cpu: 400m
              memory: 1Gi
            requests:
              cpu: 200m
              memory: 1Gi
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /kafka/config
              name: debezium-test-cloud-1
            - mountPath: /kafka/data
              name: debezium-test-cloud-2
            - mountPath: /kafka/logs
              name: debezium-test-cloud-3
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
        - emptyDir: {}
          name: debezium-test-cloud-1
        - emptyDir: {}
          name: debezium-test-cloud-2
        - emptyDir: {}
          name: debezium-test-cloud-3
  test: false
  triggers:
    - type: ConfigChange
status:
  availableReplicas: 2
  conditions:
    - lastTransitionTime: '2019-11-25T06:44:30Z'
      lastUpdateTime: '2019-11-25T06:44:44Z'
      message: replication controller "debezium-test-cloud-15" successfully rolled out
      reason: NewReplicationControllerAvailable
      status: 'True'
      type: Progressing
    - lastTransitionTime: '2019-12-31T10:06:23Z'
      lastUpdateTime: '2019-12-31T10:06:23Z'
      message: Deployment config has minimum availability.
      status: 'True'
      type: Available
  details:
    causes:
      - type: Manual
    message: manual change
  latestVersion: 15
  observedGeneration: 29
  readyReplicas: 2
  replicas: 2
  unavailableReplicas: 0
  updatedReplicas: 2
{code}
3. The Connect cluster runs in OpenShift as one service with two pods.

4. Sequence of events:

     a) task_connector_1_0 and task_connector_3_0 were running in PodA; task_connector_2_0 was running in PodB.

     b) Then the PodA console logged an error; see the attachment "12_31_d8c7j_1.jpg".

     c) Then a rebalance started.

     d) However, after the rebalance, all three tasks (task_connector_1_0, task_connector_2_0, task_connector_3_0) were running in PodB, while PodA was still running task_connector_1_0 and task_connector_3_0.

     e) So duplicate tasks appeared: task_connector_1_0 and task_connector_3_0 were running in both pods at the same time.
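
To make the duplication in step d) explicit, here is a minimal Python sketch that takes the set of tasks each pod reports as running and lists every task that appears on more than one pod. The function name find_duplicate_tasks and the dictionary layout are hypothetical, chosen only for illustration; how the per-pod task lists are collected (for example, from each pod's logs) is assumed and not shown.

```python
from collections import defaultdict


def find_duplicate_tasks(running_by_pod):
    """Return {task: sorted list of pods} for every task running on more than one pod.

    running_by_pod is a hypothetical mapping of pod name -> list of task names
    that the pod reports as running.
    """
    pods_by_task = defaultdict(set)
    for pod, tasks in running_by_pod.items():
        for task in tasks:
            pods_by_task[task].add(pod)
    return {
        task: sorted(pods)
        for task, pods in sorted(pods_by_task.items())
        if len(pods) > 1
    }


# State described in step d), after the rebalance.
observed = {
    "PodA": ["task_connector_1_0", "task_connector_3_0"],
    "PodB": ["task_connector_1_0", "task_connector_2_0", "task_connector_3_0"],
}

for task, pods in find_duplicate_tasks(observed).items():
    print(f"{task} is running on {', '.join(pods)}")
```

With the state from step d), this reports task_connector_1_0 and task_connector_3_0 as running on both PodA and PodB, while task_connector_2_0 runs only on PodB.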

 

    


