Posted to jira@kafka.apache.org by "kaikai.hou (Jira)" <ji...@apache.org> on 2020/01/08 10:43:00 UTC
[jira] [Created] (KAFKA-9385) Connect cluster: connector task repeat like a splitbrain cluster problem
kaikai.hou created KAFKA-9385:
---------------------------------
Summary: Connect cluster: connector task repeat like a splitbrain cluster problem
Key: KAFKA-9385
URL: https://issues.apache.org/jira/browse/KAFKA-9385
Project: Kafka
Issue Type: Bug
Components: KafkaConnect
Reporter: kaikai.hou
Attachments: 12_31_d8c7j_1.jpg
I am using Debezium and have found a problem where connector tasks run in duplicate. See also [DBZ-1573|https://issues.redhat.com/browse/DBZ-1573?jql=key%20in%20watchedIssues()].
1. I pushed the Debezium image to our private image repository.
2. I deployed the Connect cluster with the following *DeploymentConfig*:
{code:java}
apiVersion: apps.openshift.io/v1
kind: DeploymentConfig
metadata:
  annotations:
    openshift.io/generated-by: OpenShiftWebConsole
  creationTimestamp: '2019-10-14T07:45:41Z'
  generation: 29
  labels:
    app: debezium-test-cloud
  name: debezium-test-cloud
  namespace: test
  resourceVersion: '168496156'
  selfLink: >-
    /apis/apps.openshift.io/v1/namespaces/test/deploymentconfigs/debezium-test-cloud
  uid: 9f4f8f4d-ee56-11e9-a5a1-00163e0e008f
spec:
  replicas: 2
  selector:
    app: debezium-test-cloud
    deploymentconfig: debezium-test-cloud
  strategy:
    activeDeadlineSeconds: 21600
    resources: {}
    rollingParams:
      intervalSeconds: 1
      maxSurge: 25%
      maxUnavailable: 25%
      timeoutSeconds: 600
      updatePeriodSeconds: 1
    type: Rolling
  template:
    metadata:
      annotations:
        openshift.io/generated-by: OpenShiftWebConsole
      creationTimestamp: null
      labels:
        app: debezium-test-cloud
        deploymentconfig: debezium-test-cloud
    spec:
      containers:
        - env:
            - name: BOOTSTRAP_SERVERS
              value: '192.168.100.228:9092'
            - name: GROUP_ID
              value: test-cloud
            - name: CONFIG_STORAGE_TOPIC
              value: base.test-cloud.config
            - name: OFFSET_STORAGE_TOPIC
              value: base.test-cloud.offset
            - name: STATUS_STORAGE_TOPIC
              value: base.test-cloud.status
            - name: CONNECT_KEY_CONVERTER_SCHEMAS_ENABLE
              value: 'true'
            - name: CONNECT_VALUE_CONVERTER_SCHEMAS_ENABLE
              value: 'true'
            - name: CONNECT_PRODUCER_MAX_REQUEST_SIZE
              value: '20971520'
            - name: CONNECT_DATABASE_HISTORY_KAFKA_RECOVERY_POLL_INTERVAL_MS
              value: '1000'
            - name: HEAP_OPTS
              value: '-XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0'
          image: 'registry.cn-hangzhou.aliyuncs.com/eshine/debeziumconnect:1.0.0.Beta2'
          imagePullPolicy: IfNotPresent
          name: debezium-test-cloud
          ports:
            - containerPort: 8083
              protocol: TCP
            - containerPort: 8778
              protocol: TCP
            - containerPort: 9092
              protocol: TCP
            - containerPort: 9779
              protocol: TCP
          resources:
            limits:
              cpu: 400m
              memory: 1Gi
            requests:
              cpu: 200m
              memory: 1Gi
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
            - mountPath: /kafka/config
              name: debezium-test-cloud-1
            - mountPath: /kafka/data
              name: debezium-test-cloud-2
            - mountPath: /kafka/logs
              name: debezium-test-cloud-3
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
        - emptyDir: {}
          name: debezium-test-cloud-1
        - emptyDir: {}
          name: debezium-test-cloud-2
        - emptyDir: {}
          name: debezium-test-cloud-3
  test: false
  triggers:
    - type: ConfigChange
status:
  availableReplicas: 2
  conditions:
    - lastTransitionTime: '2019-11-25T06:44:30Z'
      lastUpdateTime: '2019-11-25T06:44:44Z'
      message: replication controller "debezium-test-cloud-15" successfully rolled out
      reason: NewReplicationControllerAvailable
      status: 'True'
      type: Progressing
    - lastTransitionTime: '2019-12-31T10:06:23Z'
      lastUpdateTime: '2019-12-31T10:06:23Z'
      message: Deployment config has minimum availability.
      status: 'True'
      type: Available
  details:
    causes:
      - type: Manual
    message: manual change
  latestVersion: 15
  observedGeneration: 29
  readyReplicas: 2
  replicas: 2
  unavailableReplicas: 0
  updatedReplicas: 2
{code}
3. The Connect cluster runs in OpenShift as one service with two pods.
4. Sequence of events:
a) task_connector_1_0 and task_connector_3_0 were running in PodA; task_connector_2_0 was running in PodB.
b) PodA's console then logged the error shown in the attachment "12_31_d8c7j_1.jpg".
c) A rebalance then started.
d) However, PodB then ran all three tasks (task_connector_1_0, task_connector_2_0, task_connector_3_0), while PodA was still running task_connector_1_0 and task_connector_3_0.
e) As a result, task_connector_1_0 and task_connector_3_0 were each running on both pods at the same time.
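To make the overlap in step 4 explicit, here is a minimal Python sketch that lists tasks reported as running on more than one pod. The function name find_duplicate_tasks and the hard-coded observed mapping are only illustrative (they are not part of Kafka Connect or Debezium); the task placement is copied by hand from the description above, not read from the cluster.

```python
from collections import defaultdict


def find_duplicate_tasks(assignments):
    """Return {task: sorted list of pods} for each task seen on more than one pod."""
    owners = defaultdict(set)
    for pod, tasks in assignments.items():
        for task in tasks:
            owners[task].add(pod)
    return {task: sorted(pods) for task, pods in owners.items() if len(pods) > 1}


# Task placement after the rebalance, transcribed from step 4 d) of this report.
observed = {
    "PodA": ["task_connector_1_0", "task_connector_3_0"],
    "PodB": ["task_connector_1_0", "task_connector_2_0", "task_connector_3_0"],
}

duplicates = find_duplicate_tasks(observed)
for task, pods in sorted(duplicates.items()):
    print(f"{task} is running on {', '.join(pods)}")
```

With the placement above, this reports task_connector_1_0 and task_connector_3_0 on both pods, while task_connector_2_0 runs only on PodB.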
--
This message was sent by Atlassian Jira
(v8.3.4#803005)