Posted to commits@inlong.apache.org by GitBox <gi...@apache.org> on 2022/02/25 09:54:49 UTC

[GitHub] [incubator-inlong] zk1510 opened a new pull request #2725: [INLONG-2666][Agent] agent support kafka collection

zk1510 opened a new pull request #2725:
URL: https://github.com/apache/incubator-inlong/pull/2725


   ### Title Name: [INLONG-XYZ][component] Title of the pull request
   
   where *XYZ* should be replaced by the actual issue number.
   
   Fixes #<xyz>
   
   ### Motivation
   
   *Explain here the context, and why you're making that change. What is the problem you're trying to solve.*
   
   ### Modifications
   
   *Describe the modifications you've done.*
   
   ### Verifying this change
   
   - [ ] Make sure that the change passes the CI checks.
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
     - *Added integration tests for end-to-end deployment with large payloads (10MB)*
     - *Extended integration test for recovery after broker failure*
   
   ### Documentation
   
     - Does this pull request introduce a new feature? (yes / no)
     - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
     - If a feature is not applicable for documentation, explain why?
     - If a feature is not documented yet in this PR, please create a followup issue for adding the documentation
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@inlong.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-inlong] EMsnap commented on a change in pull request #2725: [INLONG-2666][Agent] Agent supports collecting the data from Kafka

Posted by GitBox <gi...@apache.org>.
EMsnap commented on a change in pull request #2725:
URL: https://github.com/apache/incubator-inlong/pull/2725#discussion_r815291870



##########
File path: inlong-agent/agent-common/src/main/java/org/apache/inlong/agent/constant/JobConstants.java
##########
@@ -51,27 +51,27 @@
     public static final String JOB_DIR_FILTER_PATH = "job.filejob.dir.path";
 
     //Binlog job
-    private static final String JOB_DATABASE_USER = "job.binlogjob.user";
-    private static final String JOB_DATABASE_PASSWORD = "job.binlogjob.password";
-    private static final String JOB_DATABASE_HOSTNAME = "job.binlogjob.hostname";
-    private static final String JOB_DATABASE_WHITELIST = "job.binlogjob.tableWhiteList";
-    private static final String JOB_DATABASE_SERVER_TIME_ZONE = "job.binlogjob.database.serverTimezone";
-    private static final String JOB_DATABASE_STORE_OFFSET_INTERVAL_MS = "offset.binlogjob.offset.flush.interval.ms";
-    private static final String JOB_DATABASE_STORE_HISTORY_FILENAME = "job.binlogjob.database.history.file.filename";
-    private static final String JOB_DATABASE_SNAPSHOT_MODE = "job.binlogjob.database.snapshot.mode";
-    private static final  String JOB_DATABASE_OFFSET = "job.binlogjob.database.offset";
+    public static final String JOB_DATABASE_USER = "job.binlogjob.user";
+    public static final String JOB_DATABASE_PASSWORD = "job.binlogjob.password";
+    public static final String JOB_DATABASE_HOSTNAME = "job.binlogjob.hostname";
+    public static final String JOB_DATABASE_WHITELIST = "job.binlogjob.tableWhiteList";
+    public static final String JOB_DATABASE_SERVER_TIME_ZONE = "job.binlogjob.database.serverTimezone";
+    public static final String JOB_DATABASE_STORE_OFFSET_INTERVAL_MS = "offset.binlogjob.offset.flush.interval.ms";
+    public static final String JOB_DATABASE_STORE_HISTORY_FILENAME = "job.binlogjob.database.history.file.filename";
+    public static final String JOB_DATABASE_SNAPSHOT_MODE = "job.binlogjob.database.snapshot.mode";
+    public static final  String JOB_DATABASE_OFFSET = "job.binlogjob.database.offset";
 
     //Kafka job
-    private static final  String SOURCE_KAFKA_TOPIC = "job.kafkajob.topic";
-    private static final  String SOURCE_KAFKA_KEY_DESERIALIZER = "job.kafkajob.key.deserializer";
-    private static final  String SOURCE_KAFKA_VALUE_DESERIALIZER = "job.kafkajob.value.Deserializer";
-    private static final  String SOURCE_KAFKA_BOOTSTRAP_SERVERS = "job.kafkajob.bootstrap.servers";
-    private static final  String SOURCE_KAFKA_GROUP_ID = "job.kafkajob.group.Id";
-    private static final  String SOURCE_KAFKA_RECORD_SPEED = "job.kafkajob.record.speed";
-    private static final  String SOURCE_KAFKA_BYTE_SPEED_LIMIT = "job.kafkajob.byte.speed.limit";
-    private static final  String SOURCE_KAFKA_MIN_INTERVAL = "job.kafkajob.min.interval";
-    private static final  String SOURCE_KAFKA_OFFSET = "job.kafkajob.offset";
-    private static final  String SOURCE_KAFKA_READ_TIMEOUT = "job.kafkajob.read.timeout";
+    public static final  String JOB_KAFKA_TOPIC = "job.kafkajob.topic";
+    public static final  String JOB_KAFKA_BOOTSTRAP_SERVERS = "job.kafkajob.bootstrap.servers";
+    public static final  String JOB_KAFKA_GROUP_ID = "job.kafkajob.group.id";
+    public static final  String JOB_KAFKA_RECORD_SPEED_LIMIT = "job.kafkajob.recordspeed.limit";
+    public static final  String JOB_KAFKA_BYTE_SPEED_LIMIT = "job.kafkajob.bytespeed.limit";

Review comment:
       It is recommended to use `private` when there is no need for `public` access.

##########
File path: inlong-agent/agent-plugins/src/main/java/org/apache/inlong/agent/plugin/sources/KafkaSource.java
##########
@@ -0,0 +1,134 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.inlong.agent.plugin.sources;
+
+import com.alibaba.fastjson.JSON;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.inlong.agent.conf.JobProfile;
+import org.apache.inlong.agent.plugin.Reader;
+import org.apache.inlong.agent.plugin.Source;
+import org.apache.inlong.agent.plugin.metrics.SourceJmxMetric;
+import org.apache.inlong.agent.plugin.metrics.SourceMetrics;
+import org.apache.inlong.agent.plugin.metrics.SourcePrometheusMetrics;
+import org.apache.inlong.agent.plugin.sources.reader.KafkaReader;
+import org.apache.inlong.agent.utils.ConfigUtil;
+import org.apache.kafka.clients.consumer.KafkaConsumer;
+import org.apache.kafka.common.PartitionInfo;
+import org.apache.kafka.common.TopicPartition;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Properties;
+import static org.apache.inlong.agent.constant.JobConstants.DEFAULT_JOB_LINE_FILTER;
+import static org.apache.inlong.agent.constant.JobConstants.JOB_KAFKA_OFFSET;
+import static org.apache.inlong.agent.constant.JobConstants.JOB_KAFKA_PARTITION_OFFSET_DELIMITER;
+import static org.apache.inlong.agent.constant.JobConstants.JOB_LINE_FILTER_PATTERN;
+import static org.apache.inlong.agent.constant.JobConstants.JOB_OFFSET_DELIMITER;
+
+public class KafkaSource implements Source {
+
+    private static final Logger LOGGER = LoggerFactory.getLogger(KafkaSource.class);
+
+    private static final String KAFKA_SOURCE_TAG_NAME = "AgentKafkaSourceMetric";
+    private static final String JOB_KAFKAJOB_PARAM_PREFIX = "job.kafkajob.";
+    private static final String JOB_KAFKAJOB_TOPIC = "job.kafkajob.topic";
+    private static final String JOB_KAFKAJOB_BOOTSTRAP_SERVERS = "job.kafkajob.bootstrap.servers";
+    private static final String JOB_KAFKAJOB_GROUP_ID = "job.kafkajob.group.id";
+    private static final String JOB_KAFKAJOB_WAIT_TIMEOUT = "job.kafkajob.wait.timeout";
+    //private static final String JOB_KAFKAJOB_PARTITION_OFFSET = "job.kafkajob.topic.partition.offset";
+    private static final String KAFKA_COMMIT_AUTO = "enable.auto.commit";
+    private static final String KAFKA_DESERIALIZER_METHOD = "org.apache.kafka.common.serialization.StringDeserializer";
+    private static final String KAFKA_KEY_DESERIALIZER = "key.deserializer";
+    private static final String KAFKA_VALUE_DESERIALIZER = "value.deserializer";
+
+    private final SourceMetrics sourceMetrics;
+
+    public KafkaSource() {
+        if (ConfigUtil.isPrometheusEnabled()) {
+            this.sourceMetrics = new SourcePrometheusMetrics(KAFKA_SOURCE_TAG_NAME);

Review comment:
       Add an atomic integer to distinguish the different sources.
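The suggestion above could be sketched roughly as follows. This is a hypothetical `MetricTagDemo`, not the PR's actual code: a shared `AtomicInteger` counter appended to the tag name gives each source instance a distinct metric tag, even when sources are created concurrently.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class MetricTagDemo {

    // Shared counter; incrementAndGet() is atomic, so concurrent
    // source creation still yields unique suffixes.
    private static final AtomicInteger SOURCE_INDEX = new AtomicInteger(0);

    static String nextTag(String baseTagName) {
        return baseTagName + "_" + SOURCE_INDEX.incrementAndGet();
    }

    public static void main(String[] args) {
        System.out.println(nextTag("AgentKafkaSourceMetric")); // AgentKafkaSourceMetric_1
        System.out.println(nextTag("AgentKafkaSourceMetric")); // AgentKafkaSourceMetric_2
    }
}
```

Each `KafkaSource` constructor call would then pass `nextTag(KAFKA_SOURCE_TAG_NAME)` (or an equivalent) to the metrics implementation instead of the bare constant.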







[GitHub] [incubator-inlong] healchow merged pull request #2725: [INLONG-2666][Agent] Agent supports collecting the data from Kafka

Posted by GitBox <gi...@apache.org>.
healchow merged pull request #2725:
URL: https://github.com/apache/incubator-inlong/pull/2725


   





[GitHub] [incubator-inlong] EMsnap commented on a change in pull request #2725: [INLONG-2666][Agent] Agent supports collecting the data from Kafka

Posted by GitBox <gi...@apache.org>.
EMsnap commented on a change in pull request #2725:
URL: https://github.com/apache/incubator-inlong/pull/2725#discussion_r815255492



##########
File path: inlong-agent/agent-plugins/src/main/java/org/apache/inlong/agent/plugin/sources/reader/KafkaReader.java
##########
@@ -0,0 +1,296 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.inlong.agent.plugin.sources.reader;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.inlong.agent.conf.JobProfile;
+import org.apache.inlong.agent.message.DefaultMessage;
+import org.apache.inlong.agent.metrics.audit.AuditUtils;
+import org.apache.inlong.agent.plugin.Message;
+import org.apache.inlong.agent.plugin.Reader;
+import org.apache.inlong.agent.plugin.Validator;
+import org.apache.inlong.agent.plugin.metrics.PluginJmxMetric;
+import org.apache.inlong.agent.plugin.metrics.PluginMetric;
+import org.apache.inlong.agent.plugin.metrics.PluginPrometheusMetric;
+import org.apache.inlong.agent.plugin.validator.PatternValidator;
+import org.apache.inlong.agent.utils.AgentUtils;
+import org.apache.inlong.agent.utils.ConfigUtil;
+import org.apache.kafka.clients.consumer.ConsumerRecord;
+import org.apache.kafka.clients.consumer.ConsumerRecords;
+import org.apache.kafka.clients.consumer.KafkaConsumer;
+import org.apache.kafka.common.TopicPartition;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.nio.charset.StandardCharsets;
+import java.time.Duration;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicLong;
+import static org.apache.inlong.agent.constant.CommonConstants.DEFAULT_PROXY_INLONG_GROUP_ID;
+import static org.apache.inlong.agent.constant.CommonConstants.DEFAULT_PROXY_INLONG_STREAM_ID;
+import static org.apache.inlong.agent.constant.CommonConstants.PROXY_INLONG_GROUP_ID;
+import static org.apache.inlong.agent.constant.CommonConstants.PROXY_INLONG_STREAM_ID;
+import static org.apache.inlong.agent.constant.JobConstants.JOB_KAFKA_OFFSET;
+
+public class KafkaReader<K, V> implements Reader {
+    private static final Logger LOGGER = LoggerFactory.getLogger(KafkaReader.class);
+
+    KafkaConsumer<K, V> consumer;
+    private Iterator<ConsumerRecord<K, V>> iterator;
+    private List<Validator> validators = new ArrayList<>();
+    public static final int NEVER_STOP_SIGN = -1;
+    private long timeout;
+    private long waitTimeout = 1000;
+    private long lastTime = 0;
+    // metric
+    private static final String KAFKA_READER_TAG_NAME = "AgentKafkaMetric";
+    private final PluginMetric kafkaMetric;
+    //total readRecords
+    private static AtomicLong currentTotalReadRecords = new AtomicLong(0);
+
+    private static AtomicLong lastTotalReadRecords = new AtomicLong(0);
+    // total readBytes
+    private static AtomicLong currentTotalReadBytes = new AtomicLong(0);
+    private static AtomicLong lastTotalReadBytes = new AtomicLong(0);
+    long lastTimestamp;
+    // bps: records/s
+    long recordSpeed;
+    // tps: bytes/s
+    long byteSpeed;
+    // sleepTime
+    long flowControlInterval;
+    private String inlongGroupId;
+    private String inlongStreamId;
+    private String snapshot;
+    private static final String KAFKA_SOURCE_READ_RECORD_SPEED = "job.kafkajob.record.speed.limit";
+    private static final String KAFKA_SOURCE_READ_BYTE_SPEED = "job.kafkajob.byte.speed.limit";
+    private static final String KAFKA_SOURCE_READ_MIN_INTERVAL = "kafka.min.interval.limit";
+    private static final String JOB_KAFKAJOB_READ_TIMEOUT = "job.kafkajob.read.timeout";
+
+    /**
+     * init attribute
+     * @param consumer
+     * @param paraMap
+     */
+    public KafkaReader(KafkaConsumer<K, V> consumer,Map<String,String> paraMap) {
+        this.consumer = consumer;
+        // metrics total readRecords
+        if (ConfigUtil.isPrometheusEnabled()) {
+            kafkaMetric = new PluginPrometheusMetric(AgentUtils.getUniqId(
+                    KAFKA_READER_TAG_NAME, currentTotalReadRecords.incrementAndGet()));
+        } else {
+            kafkaMetric = new PluginJmxMetric(AgentUtils.getUniqId(
+                    KAFKA_READER_TAG_NAME, currentTotalReadRecords.incrementAndGet()));
+        }
+
+        this.recordSpeed = Long.valueOf(paraMap.getOrDefault(KAFKA_SOURCE_READ_RECORD_SPEED,"10000"));
+        this.byteSpeed = Long.valueOf(paraMap.getOrDefault(KAFKA_SOURCE_READ_BYTE_SPEED,String.valueOf(1024 * 1024)));
+        this.flowControlInterval = Long.valueOf(paraMap.getOrDefault(KAFKA_SOURCE_READ_MIN_INTERVAL,"1000"));
+        this.lastTimestamp = System.currentTimeMillis();
+
+        LOGGER.info("KAFKA_SOURCE_READ_RECORD_SPEED = {}", this.recordSpeed);
+        LOGGER.info("KAFKA_SOURCE_READ_BYTE_SPEED = {}", this.byteSpeed);
+    }
+
+    @Override
+    public Message read() {
+
+        if (iterator != null && iterator.hasNext()) {
+            ConsumerRecord<K, V> record = iterator.next();
+            // body
+            String recordValue = record.value().toString();
+            if (validateMessage(recordValue)) {
+                AuditUtils.add(AuditUtils.AUDIT_ID_AGENT_READ_SUCCESS,
+                        inlongGroupId, inlongStreamId, System.currentTimeMillis());
+                // header
+                Map<String,String> headerMap = new HashMap<>();
+                headerMap.put("record.offset", String.valueOf(record.offset()));
+                headerMap.put("record.key", String.valueOf(record.key()));
+                // control speed
+                kafkaMetric.incReadNum();
+                //commit offset
+                consumer.commitAsync();
+                //commit succeed,then record current offset
+                snapshot = String.valueOf(record.offset());
+                DefaultMessage message = new DefaultMessage(recordValue.getBytes(StandardCharsets.UTF_8), headerMap);
+                recordReadLimit(1L, message.getBody().length);
+                return message;
+            }
+        }
+        AgentUtils.silenceSleepInMs(waitTimeout);
+
+        return null;
+    }
+
+    @Override
+    public boolean isFinished() {
+        if (iterator == null) {
+            //fetch data
+            fetchData(5000);
+            return false;
+        }
+        if (iterator.hasNext()) {
+            lastTime = 0;
+            return false;
+        }
+        //fetch data
+        boolean fetchDataSuccess = fetchData(5000);
+        if (fetchDataSuccess && iterator.hasNext()) {
+            lastTime = 0;
+            return false;
+        } else {
+            if (lastTime == 0) {
+                lastTime = System.currentTimeMillis();
+            }
+            if (timeout == NEVER_STOP_SIGN) {
+                return false;
+            }
+            return System.currentTimeMillis() - lastTime > timeout;
+        }
+    }
+
+    @Override
+    public String getReadSource() {
+        Set<TopicPartition> assignment = consumer.assignment();
+        //consumer.
+        Iterator<TopicPartition> iterator = assignment.iterator();
+        while (iterator.hasNext()) {
+            TopicPartition topicPartition = iterator.next();
+            return topicPartition.topic() + "_" + topicPartition.partition();

Review comment:
       Why return here?
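The concern is that `return` inside the loop exits after the first assigned partition, so the read source never reflects the full assignment. One way it could be written instead is to join all partitions; the sketch below uses a stand-in `TopicPartition` record rather than the real Kafka class, purely for illustration:

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class ReadSourceDemo {

    // Stand-in for org.apache.kafka.common.TopicPartition, for a self-contained example.
    record TopicPartition(String topic, int partition) {}

    // Joins every assigned partition instead of returning on the first one.
    static String getReadSource(Set<TopicPartition> assignment) {
        return assignment.stream()
                .map(tp -> tp.topic() + "_" + tp.partition())
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        Set<TopicPartition> assigned = new LinkedHashSet<>(List.of(
                new TopicPartition("test2", 0),
                new TopicPartition("test2", 1)));
        System.out.println(getReadSource(assigned)); // test2_0,test2_1
    }
}
```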

##########
File path: inlong-agent/agent-plugins/src/main/java/org/apache/inlong/agent/plugin/sources/reader/KafkaReader.java
##########
@@ -0,0 +1,296 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.inlong.agent.plugin.sources.reader;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.inlong.agent.conf.JobProfile;
+import org.apache.inlong.agent.message.DefaultMessage;
+import org.apache.inlong.agent.metrics.audit.AuditUtils;
+import org.apache.inlong.agent.plugin.Message;
+import org.apache.inlong.agent.plugin.Reader;
+import org.apache.inlong.agent.plugin.Validator;
+import org.apache.inlong.agent.plugin.metrics.PluginJmxMetric;
+import org.apache.inlong.agent.plugin.metrics.PluginMetric;
+import org.apache.inlong.agent.plugin.metrics.PluginPrometheusMetric;
+import org.apache.inlong.agent.plugin.validator.PatternValidator;
+import org.apache.inlong.agent.utils.AgentUtils;
+import org.apache.inlong.agent.utils.ConfigUtil;
+import org.apache.kafka.clients.consumer.ConsumerRecord;
+import org.apache.kafka.clients.consumer.ConsumerRecords;
+import org.apache.kafka.clients.consumer.KafkaConsumer;
+import org.apache.kafka.common.TopicPartition;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.nio.charset.StandardCharsets;
+import java.time.Duration;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicLong;
+import static org.apache.inlong.agent.constant.CommonConstants.DEFAULT_PROXY_INLONG_GROUP_ID;
+import static org.apache.inlong.agent.constant.CommonConstants.DEFAULT_PROXY_INLONG_STREAM_ID;
+import static org.apache.inlong.agent.constant.CommonConstants.PROXY_INLONG_GROUP_ID;
+import static org.apache.inlong.agent.constant.CommonConstants.PROXY_INLONG_STREAM_ID;
+import static org.apache.inlong.agent.constant.JobConstants.JOB_KAFKA_OFFSET;
+
+public class KafkaReader<K, V> implements Reader {
+    private static final Logger LOGGER = LoggerFactory.getLogger(KafkaReader.class);
+
+    KafkaConsumer<K, V> consumer;
+    private Iterator<ConsumerRecord<K, V>> iterator;
+    private List<Validator> validators = new ArrayList<>();
+    public static final int NEVER_STOP_SIGN = -1;
+    private long timeout;
+    private long waitTimeout = 1000;
+    private long lastTime = 0;
+    // metric
+    private static final String KAFKA_READER_TAG_NAME = "AgentKafkaMetric";
+    private final PluginMetric kafkaMetric;
+    //total readRecords

Review comment:
       Use `/* */` (block-style) comments like this.







[GitHub] [incubator-inlong] healchow commented on a change in pull request #2725: [INLONG-2666][Agent] Agent supports collecting the data from Kafka

Posted by GitBox <gi...@apache.org>.
healchow commented on a change in pull request #2725:
URL: https://github.com/apache/incubator-inlong/pull/2725#discussion_r815589552



##########
File path: inlong-agent/agent-plugins/src/test/java/org/apache/inlong/agent/plugin/sources/TestKafkaReader.java
##########
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.inlong.agent.plugin.sources;
+
+import org.apache.inlong.agent.conf.JobProfile;
+import org.apache.inlong.agent.plugin.Message;
+import org.apache.inlong.agent.plugin.Reader;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.List;
+
+public class TestKafkaReader {
+    private static final Logger LOGGER = LoggerFactory.getLogger(TestKafkaReader.class);
+
+    @Test
+    public void testKafkaReader() {
+        KafkaSource kafkaSource = new KafkaSource();
+        JobProfile conf = JobProfile.parseJsonStr("{}");
+        conf.set("job.kafkajob.topic","test2");
+        conf.set("job.kafkajob.bootstrap.servers","127.0.0.1:9092");
+        conf.set("job.kafkajob.group.id","test_group1");
+        conf.set("job.kafkajob.recordspeed.limit","1");
+        conf.set("job.kafkajob.bytespeed.limit","1");
+        conf.set("job.kafkajob.partition.offset", "0#5");
+        conf.set("job.kafkajob.auto.offsetReset", "earliest");
+        conf.set("proxy.inlongGroupId", "");
+        conf.set("proxy.inlongStreamId", "");
+
+        try {
+            List<Reader> readers = kafkaSource.split(conf);
+            System.out.println(readers.size());

Review comment:
       Remove the `System.out.println` call.

##########
File path: inlong-agent/agent-plugins/src/test/java/org/apache/inlong/agent/plugin/TestFileAgent.java
##########
@@ -170,18 +181,25 @@ public void testCycleUnit() throws Exception {
     public void testGroupIdFilter() throws Exception {
 
         String nowDate = AgentUtils.formatCurrentTimeWithoutOffset("yyyyMMdd");
+        InputStream stream = null;
+        try {
+            stream = LOADER.getResourceAsStream("fileAgentJob.json");
 
-        try (InputStream stream = LOADER.getResourceAsStream("fileAgentJob.json")) {

Review comment:
       Why not use `try-with-resources`? Otherwise, it's better to write a `finally` statement to close the stream.
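A minimal sketch of the `try-with-resources` pattern the reviewer refers to, using a hypothetical `readAll` helper over plain JDK I/O. The resource declared in the `try (...)` header is closed automatically when the block exits, even if an exception is thrown, so no explicit `finally` is needed:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class TryWithResourcesDemo {

    static String readAll(InputStream in) {
        // The reader (and the wrapped stream) is closed automatically on exit.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            StringBuilder sb = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InputStream stream = new ByteArrayInputStream(
                "{\"job\":{}}".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(stream)); // {"job":{}}
    }
}
```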

##########
File path: inlong-agent/agent-plugins/src/test/java/org/apache/inlong/agent/plugin/sources/TestKafkaReader.java
##########
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.inlong.agent.plugin.sources;
+
+import org.apache.inlong.agent.conf.JobProfile;
+import org.apache.inlong.agent.plugin.Message;
+import org.apache.inlong.agent.plugin.Reader;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.List;
+
+public class TestKafkaReader {
+    private static final Logger LOGGER = LoggerFactory.getLogger(TestKafkaReader.class);
+
+    @Test
+    public void testKafkaReader() {
+        KafkaSource kafkaSource = new KafkaSource();
+        JobProfile conf = JobProfile.parseJsonStr("{}");
+        conf.set("job.kafkajob.topic","test2");
+        conf.set("job.kafkajob.bootstrap.servers","127.0.0.1:9092");
+        conf.set("job.kafkajob.group.id","test_group1");
+        conf.set("job.kafkajob.recordspeed.limit","1");
+        conf.set("job.kafkajob.bytespeed.limit","1");
+        conf.set("job.kafkajob.partition.offset", "0#5");
+        conf.set("job.kafkajob.auto.offsetReset", "earliest");
+        conf.set("proxy.inlongGroupId", "");
+        conf.set("proxy.inlongStreamId", "");
+
+        try {
+            List<Reader> readers = kafkaSource.split(conf);
+            System.out.println(readers.size());
+            LOGGER.info("total readers by split after:{}",readers.size());
+            readers.forEach(reader -> {
+                reader.init(conf);
+                Runnable runnable = () -> {
+                    while (!reader.isFinished()) {
+                        Message msg = reader.read();
+                        if (msg != null) {
+                            LOGGER.info(new String(msg.getBody()));
+                        }
+                    }
+                    LOGGER.info("reader is finished!");
+                };
+
+                Thread readerThread = new Thread(runnable);

Review comment:
       Just use `new Thread(runnable).start()`.







[GitHub] [incubator-inlong] healchow commented on a change in pull request #2725: [INLONG-2666][Agent] Agent supports collecting the data from Kafka

Posted by GitBox <gi...@apache.org>.
healchow commented on a change in pull request #2725:
URL: https://github.com/apache/incubator-inlong/pull/2725#discussion_r814677031



##########
File path: inlong-agent/agent-plugins/pom.xml
##########
@@ -126,5 +126,17 @@
             <artifactId>agent-common</artifactId>
             <version>${project.version}</version>
         </dependency>
+
+        <dependency>
+            <groupId>org.apache.kafka</groupId>
+            <artifactId>kafka_${flink.scala.binary.version}</artifactId>
+            <version>${kafka.version}</version>
+        </dependency>
+
+        <dependency>

Review comment:
       Could you use `gson` or `jackson` instead of `fastjson`?

##########
File path: inlong-agent/agent-plugins/src/test/java/org/apache/inlong/agent/plugin/sources/TestKafkaReader.java
##########
@@ -0,0 +1,73 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.inlong.agent.plugin.sources;
+
+import org.apache.inlong.agent.conf.JobProfile;
+import org.apache.inlong.agent.plugin.Message;
+import org.apache.inlong.agent.plugin.Reader;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.List;
+
+public class TestKafkaReader {
+    private static final Logger LOGGER = LoggerFactory.getLogger(TestKafkaReader.class);
+
+    @Test
+    public void testKafkaReader() {
+        KafkaSource kafkaSource = new KafkaSource();
+        JobProfile conf = JobProfile.parseJsonStr("{}");
+        conf.set("job.kafkajob.topic","test2");
+        conf.set("job.kafkajob.bootstrap.servers","10.91.78.107:9092");
+        conf.set("job.kafkajob.group.id","test_group1");
+//        conf.set("job.kafkajob.record.speed.limit","1");

Review comment:
       Please remove the unused code.

##########
File path: inlong-agent/agent-common/src/main/java/org/apache/inlong/agent/constant/JobConstants.java
##########
@@ -51,27 +51,27 @@
     public static final String JOB_DIR_FILTER_PATH = "job.filejob.dir.path";
 
     //Binlog job
-    private static final String JOB_DATABASE_USER = "job.binlogjob.user";
-    private static final String JOB_DATABASE_PASSWORD = "job.binlogjob.password";
-    private static final String JOB_DATABASE_HOSTNAME = "job.binlogjob.hostname";
-    private static final String JOB_DATABASE_WHITELIST = "job.binlogjob.tableWhiteList";
-    private static final String JOB_DATABASE_SERVER_TIME_ZONE = "job.binlogjob.database.serverTimezone";
-    private static final String JOB_DATABASE_STORE_OFFSET_INTERVAL_MS = "offset.binlogjob.offset.flush.interval.ms";
-    private static final String JOB_DATABASE_STORE_HISTORY_FILENAME = "job.binlogjob.database.history.file.filename";
-    private static final String JOB_DATABASE_SNAPSHOT_MODE = "job.binlogjob.database.snapshot.mode";
-    private static final  String JOB_DATABASE_OFFSET = "job.binlogjob.database.offset";
+    public static final String JOB_DATABASE_USER = "job.binlogjob.user";
+    public static final String JOB_DATABASE_PASSWORD = "job.binlogjob.password";
+    public static final String JOB_DATABASE_HOSTNAME = "job.binlogjob.hostname";
+    public static final String JOB_DATABASE_WHITELIST = "job.binlogjob.tableWhiteList";
+    public static final String JOB_DATABASE_SERVER_TIME_ZONE = "job.binlogjob.database.serverTimezone";
+    public static final String JOB_DATABASE_STORE_OFFSET_INTERVAL_MS = "offset.binlogjob.offset.flush.interval.ms";
+    public static final String JOB_DATABASE_STORE_HISTORY_FILENAME = "job.binlogjob.database.history.file.filename";
+    public static final String JOB_DATABASE_SNAPSHOT_MODE = "job.binlogjob.database.snapshot.mode";
+    public static final  String JOB_DATABASE_OFFSET = "job.binlogjob.database.offset";
 
     //Kafka job
-    private static final  String SOURCE_KAFKA_TOPIC = "job.kafkajob.topic";
-    private static final  String SOURCE_KAFKA_KEY_DESERIALIZER = "job.kafkajob.key.deserializer";
-    private static final  String SOURCE_KAFKA_VALUE_DESERIALIZER = "job.kafkajob.value.Deserializer";
-    private static final  String SOURCE_KAFKA_BOOTSTRAP_SERVERS = "job.kafkajob.bootstrap.servers";
-    private static final  String SOURCE_KAFKA_GROUP_ID = "job.kafkajob.group.Id";
-    private static final  String SOURCE_KAFKA_RECORD_SPEED = "job.kafkajob.record.speed";
-    private static final  String SOURCE_KAFKA_BYTE_SPEED_LIMIT = "job.kafkajob.byte.speed.limit";
-    private static final  String SOURCE_KAFKA_MIN_INTERVAL = "job.kafkajob.min.interval";
-    private static final  String SOURCE_KAFKA_OFFSET = "job.kafkajob.offset";
-    private static final  String SOURCE_KAFKA_READ_TIMEOUT = "job.kafkajob.read.timeout";
+    public static final  String JOB_KAFKA_TOPIC = "job.kafkajob.topic";

Review comment:
       It is recommended to use all lowercase or all uppercase consistently; keys like `job.kafkajob.group.Id` and `job.kafkajob.value.Deserializer` mix cases.
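For illustration, a sketch of what consistently lowercase property keys might look like. The renamed constants here are assumptions for the example, not the final names in the PR.

```java
public class JobConstantsSketch {
    // all-lowercase property keys; the draft mixed cases in
    // "job.kafkajob.group.Id" and "job.kafkajob.value.Deserializer"
    public static final String JOB_KAFKA_GROUP_ID = "job.kafkajob.group.id";
    public static final String JOB_KAFKA_VALUE_DESERIALIZER = "job.kafkajob.value.deserializer";

    public static void main(String[] args) {
        // every key should survive a lowercase round-trip unchanged
        System.out.println(JOB_KAFKA_GROUP_ID.equals(JOB_KAFKA_GROUP_ID.toLowerCase())); // prints true
    }
}
```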

##########
File path: inlong-agent/agent-plugins/src/main/java/org/apache/inlong/agent/plugin/sources/KafkaSource.java
##########
@@ -0,0 +1,134 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.inlong.agent.plugin.sources;
+
+import com.alibaba.fastjson.JSON;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.inlong.agent.conf.JobProfile;
+import org.apache.inlong.agent.plugin.Reader;
+import org.apache.inlong.agent.plugin.Source;
+import org.apache.inlong.agent.plugin.metrics.SourceJmxMetric;
+import org.apache.inlong.agent.plugin.metrics.SourceMetrics;
+import org.apache.inlong.agent.plugin.metrics.SourcePrometheusMetrics;
+import org.apache.inlong.agent.plugin.sources.reader.KafkaReader;
+import org.apache.inlong.agent.utils.ConfigUtil;
+import org.apache.kafka.clients.consumer.KafkaConsumer;
+import org.apache.kafka.common.PartitionInfo;
+import org.apache.kafka.common.TopicPartition;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Properties;
+import static org.apache.inlong.agent.constant.JobConstants.DEFAULT_JOB_LINE_FILTER;
+import static org.apache.inlong.agent.constant.JobConstants.JOB_KAFKA_OFFSET;
+import static org.apache.inlong.agent.constant.JobConstants.JOB_KAFKA_PARTITION_OFFSET_DELIMITER;
+import static org.apache.inlong.agent.constant.JobConstants.JOB_LINE_FILTER_PATTERN;
+import static org.apache.inlong.agent.constant.JobConstants.JOB_OFFSET_DELIMITER;
+
+public class KafkaSource implements Source {
+
+    private static final Logger LOGGER = LoggerFactory.getLogger(KafkaSource.class);
+
+    private static final String KAFKA_SOURCE_TAG_NAME = "AgentKafkaSourceMetric";
+    private static final String JOB_KAFKAJOB_PARAM_PREFIX = "job.kafkajob.";
+    private static final String JOB_KAFKAJOB_TOPIC = "job.kafkajob.topic";
+    private static final String JOB_KAFKAJOB_BOOTSTRAP_SERVERS = "job.kafkajob.bootstrap.servers";
+    private static final String JOB_KAFKAJOB_GROUP_ID = "job.kafkajob.group.id";
+    private static final String JOB_KAFKAJOB_WAIT_TIMEOUT = "job.kafkajob.wait.timeout";
+    //private static final String JOB_KAFKAJOB_PARTITION_OFFSET = "job.kafkajob.topic.partition.offset";
+    private static final String KAFKA_COMMIT_AUTO = "enable.auto.commit";
+    private static final String KAFKA_DESERIALIZER_METHOD = "org.apache.kafka.common.serialization.StringDeserializer";
+    private static final String KAFKA_KEY_DESERIALIZER = "key.deserializer";
+    private static final String KAFKA_VALUE_DESERIALIZER = "value.deserializer";
+
+    private final SourceMetrics sourceMetrics;
+
+    public KafkaSource() {
+        if (ConfigUtil.isPrometheusEnabled()) {
+            this.sourceMetrics = new SourcePrometheusMetrics(KAFKA_SOURCE_TAG_NAME);
+        } else {
+            this.sourceMetrics = new SourceJmxMetric(KAFKA_SOURCE_TAG_NAME);
+        }
+
+    }
+
+    @Override
+    public List<Reader> split(JobProfile conf) {
+        List<Reader> result = new ArrayList<>();
+        String filterPattern = conf.get(JOB_LINE_FILTER_PATTERN, DEFAULT_JOB_LINE_FILTER);
+
+        Properties props = new Properties();
+        Map<String,String> map = (Map)JSON.parse(conf.toJsonStr());
+        Iterator<Map.Entry<String,String>> iterator = map.entrySet().iterator();
+        //begin build kafkaConsumer

Review comment:
       It's suggested to add one blank space at the beginning of each comment, e.g. `// begin build kafkaConsumer`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@inlong.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-inlong] zk1510 commented on a change in pull request #2725: [INLONG-2666][Agent] Agent supports collecting the data from Kafka

Posted by GitBox <gi...@apache.org>.
zk1510 commented on a change in pull request #2725:
URL: https://github.com/apache/incubator-inlong/pull/2725#discussion_r815261339



##########
File path: inlong-agent/agent-plugins/src/main/java/org/apache/inlong/agent/plugin/sources/reader/KafkaReader.java
##########
@@ -0,0 +1,296 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.inlong.agent.plugin.sources.reader;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.inlong.agent.conf.JobProfile;
+import org.apache.inlong.agent.message.DefaultMessage;
+import org.apache.inlong.agent.metrics.audit.AuditUtils;
+import org.apache.inlong.agent.plugin.Message;
+import org.apache.inlong.agent.plugin.Reader;
+import org.apache.inlong.agent.plugin.Validator;
+import org.apache.inlong.agent.plugin.metrics.PluginJmxMetric;
+import org.apache.inlong.agent.plugin.metrics.PluginMetric;
+import org.apache.inlong.agent.plugin.metrics.PluginPrometheusMetric;
+import org.apache.inlong.agent.plugin.validator.PatternValidator;
+import org.apache.inlong.agent.utils.AgentUtils;
+import org.apache.inlong.agent.utils.ConfigUtil;
+import org.apache.kafka.clients.consumer.ConsumerRecord;
+import org.apache.kafka.clients.consumer.ConsumerRecords;
+import org.apache.kafka.clients.consumer.KafkaConsumer;
+import org.apache.kafka.common.TopicPartition;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.nio.charset.StandardCharsets;
+import java.time.Duration;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicLong;
+import static org.apache.inlong.agent.constant.CommonConstants.DEFAULT_PROXY_INLONG_GROUP_ID;
+import static org.apache.inlong.agent.constant.CommonConstants.DEFAULT_PROXY_INLONG_STREAM_ID;
+import static org.apache.inlong.agent.constant.CommonConstants.PROXY_INLONG_GROUP_ID;
+import static org.apache.inlong.agent.constant.CommonConstants.PROXY_INLONG_STREAM_ID;
+import static org.apache.inlong.agent.constant.JobConstants.JOB_KAFKA_OFFSET;
+
+public class KafkaReader<K, V> implements Reader {
+    private static final Logger LOGGER = LoggerFactory.getLogger(KafkaReader.class);
+
+    KafkaConsumer<K, V> consumer;
+    private Iterator<ConsumerRecord<K, V>> iterator;
+    private List<Validator> validators = new ArrayList<>();
+    public static final int NEVER_STOP_SIGN = -1;
+    private long timeout;
+    private long waitTimeout = 1000;
+    private long lastTime = 0;
+    // metric
+    private static final String KAFKA_READER_TAG_NAME = "AgentKafkaMetric";
+    private final PluginMetric kafkaMetric;
+    //total readRecords
+    private static AtomicLong currentTotalReadRecords = new AtomicLong(0);
+
+    private static AtomicLong lastTotalReadRecords = new AtomicLong(0);
+    // total readBytes
+    private static AtomicLong currentTotalReadBytes = new AtomicLong(0);
+    private static AtomicLong lastTotalReadBytes = new AtomicLong(0);
+    long lastTimestamp;
+    // tps: records/s
+    long recordSpeed;
+    // bps: bytes/s
+    long byteSpeed;
+    // sleepTime
+    long flowControlInterval;
+    private String inlongGroupId;
+    private String inlongStreamId;
+    private String snapshot;
+    private static final String KAFKA_SOURCE_READ_RECORD_SPEED = "job.kafkajob.record.speed.limit";
+    private static final String KAFKA_SOURCE_READ_BYTE_SPEED = "job.kafkajob.byte.speed.limit";
+    private static final String KAFKA_SOURCE_READ_MIN_INTERVAL = "kafka.min.interval.limit";
+    private static final String JOB_KAFKAJOB_READ_TIMEOUT = "job.kafkajob.read.timeout";
+
+    /**
+     * init attribute
+     * @param consumer
+     * @param paraMap
+     */
+    public KafkaReader(KafkaConsumer<K, V> consumer,Map<String,String> paraMap) {
+        this.consumer = consumer;
+        // metrics total readRecords
+        if (ConfigUtil.isPrometheusEnabled()) {
+            kafkaMetric = new PluginPrometheusMetric(AgentUtils.getUniqId(
+                    KAFKA_READER_TAG_NAME, currentTotalReadRecords.incrementAndGet()));
+        } else {
+            kafkaMetric = new PluginJmxMetric(AgentUtils.getUniqId(
+                    KAFKA_READER_TAG_NAME, currentTotalReadRecords.incrementAndGet()));
+        }
+
+        this.recordSpeed = Long.valueOf(paraMap.getOrDefault(KAFKA_SOURCE_READ_RECORD_SPEED,"10000"));
+        this.byteSpeed = Long.valueOf(paraMap.getOrDefault(KAFKA_SOURCE_READ_BYTE_SPEED,String.valueOf(1024 * 1024)));
+        this.flowControlInterval = Long.valueOf(paraMap.getOrDefault(KAFKA_SOURCE_READ_MIN_INTERVAL,"1000"));
+        this.lastTimestamp = System.currentTimeMillis();
+
+        LOGGER.info("KAFKA_SOURCE_READ_RECORD_SPEED = {}", this.recordSpeed);
+        LOGGER.info("KAFKA_SOURCE_READ_BYTE_SPEED = {}", this.byteSpeed);
+    }
+
+    @Override
+    public Message read() {
+
+        if (iterator != null && iterator.hasNext()) {
+            ConsumerRecord<K, V> record = iterator.next();
+            // body
+            String recordValue = record.value().toString();
+            if (validateMessage(recordValue)) {
+                AuditUtils.add(AuditUtils.AUDIT_ID_AGENT_READ_SUCCESS,
+                        inlongGroupId, inlongStreamId, System.currentTimeMillis());
+                // header
+                Map<String,String> headerMap = new HashMap<>();
+                headerMap.put("record.offset", String.valueOf(record.offset()));
+                headerMap.put("record.key", String.valueOf(record.key()));
+                // control speed
+                kafkaMetric.incReadNum();
+                //commit offset
+                consumer.commitAsync();
+                //commit succeed,then record current offset
+                snapshot = String.valueOf(record.offset());
+                DefaultMessage message = new DefaultMessage(recordValue.getBytes(StandardCharsets.UTF_8), headerMap);
+                recordReadLimit(1L, message.getBody().length);
+                return message;
+            }
+        }
+        AgentUtils.silenceSleepInMs(waitTimeout);
+
+        return null;
+    }
+
+    @Override
+    public boolean isFinished() {
+        if (iterator == null) {
+            //fetch data
+            fetchData(5000);
+            return false;
+        }
+        if (iterator.hasNext()) {
+            lastTime = 0;
+            return false;
+        }
+        //fetch data
+        boolean fetchDataSuccess = fetchData(5000);
+        if (fetchDataSuccess && iterator.hasNext()) {
+            lastTime = 0;
+            return false;
+        } else {
+            if (lastTime == 0) {
+                lastTime = System.currentTimeMillis();
+            }
+            if (timeout == NEVER_STOP_SIGN) {
+                return false;
+            }
+            return System.currentTimeMillis() - lastTime > timeout;
+        }
+    }
+
+    @Override
+    public String getReadSource() {
+        Set<TopicPartition> assignment = consumer.assignment();
+        //consumer.
+        Iterator<TopicPartition> iterator = assignment.iterator();
+        while (iterator.hasNext()) {
+            TopicPartition topicPartition = iterator.next();
+            return topicPartition.topic() + "_" + topicPartition.partition();

Review comment:
       Each consumer is assigned exactly one topic and one partition, so the first element of the assignment fully identifies the read source and returning inside the loop is intended.
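Under that assumption (one `TopicPartition` per consumer), a self-contained sketch of the label this method produces, without the Kafka client dependency:

```java
public class ReadSourceSketch {
    // mirrors topicPartition.topic() + "_" + topicPartition.partition()
    static String readSource(String topic, int partition) {
        return topic + "_" + partition;
    }

    public static void main(String[] args) {
        // with a single assignment, the first element is the whole answer
        System.out.println(readSource("test2", 0)); // prints test2_0
    }
}
```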







[GitHub] [incubator-inlong] EMsnap commented on a change in pull request #2725: [INLONG-2666][Agent] Agent supports collecting the data from Kafka

Posted by GitBox <gi...@apache.org>.
EMsnap commented on a change in pull request #2725:
URL: https://github.com/apache/incubator-inlong/pull/2725#discussion_r814714942



##########
File path: inlong-agent/agent-plugins/src/test/java/org/apache/inlong/agent/plugin/sources/TestKafkaReader.java
##########
@@ -0,0 +1,73 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.inlong.agent.plugin.sources;
+
+import org.apache.inlong.agent.conf.JobProfile;
+import org.apache.inlong.agent.plugin.Message;
+import org.apache.inlong.agent.plugin.Reader;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import java.util.List;
+
+public class TestKafkaReader {
+    private static final Logger LOGGER = LoggerFactory.getLogger(TestKafkaReader.class);
+
+    @Test
+    public void testKafkaReader() {
+        KafkaSource kafkaSource = new KafkaSource();
+        JobProfile conf = JobProfile.parseJsonStr("{}");
+        conf.set("job.kafkajob.topic","test2");
+        conf.set("job.kafkajob.bootstrap.servers","10.91.78.107:9092");
+        conf.set("job.kafkajob.group.id","test_group1");
+//        conf.set("job.kafkajob.record.speed.limit","1");
+//        conf.set("job.kafkajob.byte.speed.limit","1");
+//        conf.set("job.kafkajob.read.timeout", "-1");
+        conf.set("job.kafkajob.partition.offset", "0#5");
+        conf.set("proxy.inlongGroupId", "");
+        conf.set("proxy.inlongStreamId", "");
+
+        List<Reader> readers = kafkaSource.split(conf);
+        LOGGER.info("total readers by split after:{}",readers.size());
+
+        readers.forEach(reader -> {
+            reader.init(conf);
+            String readSource = reader.getReadSource();
+            System.out.println(readSource);

Review comment:
       Please don't use System.out; change it to the logger.
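A minimal sketch of the suggested change. The PR itself uses slf4j; this standalone version uses `java.util.logging` so it runs without extra dependencies, and keeps the message formatting in a helper for easy testing.

```java
import java.util.logging.Logger;

public class LoggerSketch {
    private static final Logger LOGGER = Logger.getLogger(LoggerSketch.class.getName());

    // formatting kept in a helper so it is easy to verify
    static String describe(String readSource) {
        return "read source: " + readSource;
    }

    public static void main(String[] args) {
        // LOGGER.info replaces System.out.println(readSource)
        LOGGER.info(describe("test2_0"));
    }
}
```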



