Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/07/18 20:40:53 UTC

[GitHub] [hudi] the-other-tim-brown opened a new pull request, #6135: [HUDI-4418] Add support for ProtoKafkaSource

the-other-tim-brown opened a new pull request, #6135:
URL: https://github.com/apache/hudi/pull/6135

   
   ## What is the purpose of the pull request
   
   - Adds support for a `ProtoKafkaSource`, as proposed in RFC 57
   
   ## Brief change log
   
     - Adds `PROTO` to the `Source.SourceType` enum
     - Handles the `PROTO` type in `SourceFormatAdapter` by converting proto `Message` objects to Avro. Conversion to Row currently goes Proto -> Avro -> Row
     - Adds `ProtoClassBasedSchemaProvider` to generate schemas for a generated proto class that is on the classpath
     - Adds `ProtoKafkaSource`, which deserializes `byte[]` payloads into a proto class that is on the classpath
     - Adds `ProtoConversionUtil`, which exposes methods for creating schemas and translating Proto messages to Avro `GenericRecord`s
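   For context, the change log above could translate into a DeltaStreamer invocation along these lines. This is a hedged sketch only: the class names follow the PR description, but the `proto.className` property key and the topic/class values are illustrative assumptions, not confirmed by this thread.
   
   ```shell
   # Hypothetical HoodieDeltaStreamer run using the new proto source.
   # The schema-provider property key and all example values are assumptions.
   spark-submit \
     --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
     hudi-utilities-bundle.jar \
     --source-class org.apache.hudi.utilities.sources.ProtoKafkaSource \
     --schemaprovider-class org.apache.hudi.utilities.schema.ProtoClassBasedSchemaProvider \
     --target-base-path /tmp/hudi/proto_table \
     --target-table proto_table \
     --hoodie-conf bootstrap.servers=localhost:9092 \
     --hoodie-conf hoodie.deltastreamer.source.kafka.topic=my_proto_topic \
     --hoodie-conf hoodie.deltastreamer.schemaprovider.proto.className=com.example.MyProtoMessage
   ```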
   
   ## Verify this pull request
   
   This change adds tests and can be verified as follows:
   
     - Added `TestProtoKafkaSource` to validate the flow from Kafka to Hudi
     - Added `TestProtoClassBasedSchemaProvider` to validate schemas are properly generated
     - Added `TestProtoConversionUtil` to validate the conversion from Proto to Avro is working as expected for all proto types
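   Assuming the test classes sit in the `hudi-utilities` module (the file paths quoted later in this thread suggest so), they could be run in isolation with a Surefire filter; the exact module layout is an assumption here:
   
   ```shell
   # Run only the test classes added by this PR, building needed upstream
   # modules first (-am); module name assumed from quoted file paths.
   mvn test -pl hudi-utilities -am \
     -Dtest='TestProtoKafkaSource,TestProtoClassBasedSchemaProvider,TestProtoConversionUtil'
   ```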
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1232365884

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * 55997e718024a9ca75494b3db2fe5027003bf635 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11057) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] the-other-tim-brown commented on a diff in pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
the-other-tim-brown commented on code in PR #6135:
URL: https://github.com/apache/hudi/pull/6135#discussion_r957539914


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/ProtoClassBasedSchemaProvider.java:
##########
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package org.apache.hudi.utilities.schema;
+
+import org.apache.hudi.DataSourceUtils;
+import org.apache.hudi.common.config.TypedProperties;
+import org.apache.hudi.common.util.ReflectionUtils;
+import org.apache.hudi.exception.HoodieException;
+import org.apache.hudi.utilities.sources.helpers.ProtoConversionUtil;
+
+import org.apache.avro.Schema;
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+import org.apache.spark.api.java.JavaSparkContext;
+
+import java.util.Arrays;
+
+/**
+ * A schema provider that takes in a class name for a generated protobuf class that is on the classpath.
+ */
+public class ProtoClassBasedSchemaProvider extends SchemaProvider {
+  private static final Logger LOG = LogManager.getLogger(ProtoClassBasedSchemaProvider.class);

Review Comment:
   Removed



##########
hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/BaseTestKafkaSource.java:
##########
@@ -0,0 +1,280 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.sources;
+
+import org.apache.hudi.AvroConversionUtils;
+import org.apache.hudi.common.config.TypedProperties;
+import org.apache.hudi.common.util.Option;
+import org.apache.hudi.exception.HoodieNotSupportedException;
+import org.apache.hudi.testutils.SparkClientFunctionalTestHarness;
+import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerMetrics;
+import org.apache.hudi.utilities.deltastreamer.SourceFormatAdapter;
+import org.apache.hudi.utilities.exception.HoodieDeltaStreamerException;
+import org.apache.hudi.utilities.schema.SchemaProvider;
+import org.apache.hudi.utilities.sources.helpers.KafkaOffsetGen;
+
+import org.apache.avro.generic.GenericRecord;
+import org.apache.kafka.clients.consumer.ConsumerConfig;
+import org.apache.kafka.clients.consumer.KafkaConsumer;
+import org.apache.kafka.clients.consumer.OffsetAndMetadata;
+import org.apache.kafka.common.TopicPartition;
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.sql.Dataset;
+import org.apache.spark.sql.Row;
+import org.apache.spark.streaming.kafka010.KafkaTestUtils;
+import org.junit.jupiter.api.AfterAll;
+import org.junit.jupiter.api.BeforeAll;
+import org.junit.jupiter.api.Test;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Properties;
+
+import static org.apache.hudi.utilities.sources.helpers.KafkaOffsetGen.Config.ENABLE_KAFKA_COMMIT_OFFSET;
+import static org.junit.jupiter.api.Assertions.assertEquals;
+import static org.junit.jupiter.api.Assertions.assertNotNull;
+import static org.junit.jupiter.api.Assertions.assertThrows;
+import static org.mockito.Mockito.mock;
+
+/**
+ * Generic tests for all {@link KafkaSource} to ensure all implementations properly handle offsets, fetch limits, failure modes, etc.
+ */
+abstract class BaseTestKafkaSource extends SparkClientFunctionalTestHarness {
+  protected static final String TEST_TOPIC_PREFIX = "hoodie_test_";
+  protected static KafkaTestUtils testUtils;
+
+  protected final HoodieDeltaStreamerMetrics metrics = mock(HoodieDeltaStreamerMetrics.class);
+
+  protected SchemaProvider schemaProvider;
+
+  @BeforeAll
+  public static void initClass() {
+    testUtils = new KafkaTestUtils();
+    testUtils.setup();
+  }
+
+  @AfterAll
+  public static void cleanupClass() {
+    testUtils.teardown();
+  }
+
+  abstract TypedProperties createPropsForKafkaSource(String topic, Long maxEventsToReadFromKafkaSource, String resetStrategy);
+
+  abstract SourceFormatAdapter createSource(TypedProperties props);
+
+  abstract void sendMessagesToKafka(String topic, int count, int numPartitions);
+
+  @Test
+  public void testKafkaSource() {
+
+    // topic setup.
+    final String topic = TEST_TOPIC_PREFIX + "testKafkaSource";
+    testUtils.createTopic(topic, 2);
+    TypedProperties props = createPropsForKafkaSource(topic, null, "earliest");
+    SourceFormatAdapter kafkaSource = createSource(props);
+
+    // 1. Extract without any checkpoint => get all the data, respecting sourceLimit
+    assertEquals(Option.empty(), kafkaSource.fetchNewDataInAvroFormat(Option.empty(), Long.MAX_VALUE).getBatch());
+    sendMessagesToKafka(topic, 1000, 2);
+    InputBatch<JavaRDD<GenericRecord>> fetch1 = kafkaSource.fetchNewDataInAvroFormat(Option.empty(), 900);
+    assertEquals(900, fetch1.getBatch().get().count());
+    // Test Avro To DataFrame<Row> path
+    Dataset<Row> fetch1AsRows = AvroConversionUtils.createDataFrame(JavaRDD.toRDD(fetch1.getBatch().get()),
+        schemaProvider.getSourceSchema().toString(), kafkaSource.getSource().getSparkSession());
+    assertEquals(900, fetch1AsRows.count());
+
+    // 2. Produce new data, extract new data
+    sendMessagesToKafka(topic, 1000, 2);
+    InputBatch<Dataset<Row>> fetch2 =
+        kafkaSource.fetchNewDataInRowFormat(Option.of(fetch1.getCheckpointForNextBatch()), Long.MAX_VALUE);
+    assertEquals(1100, fetch2.getBatch().get().count());
+
+    // 3. Extract with previous checkpoint => gives same data back (idempotent)
+    InputBatch<JavaRDD<GenericRecord>> fetch3 =
+        kafkaSource.fetchNewDataInAvroFormat(Option.of(fetch1.getCheckpointForNextBatch()), Long.MAX_VALUE);
+    assertEquals(fetch2.getBatch().get().count(), fetch3.getBatch().get().count());
+    assertEquals(fetch2.getCheckpointForNextBatch(), fetch3.getCheckpointForNextBatch());
+    // Same using Row API
+    InputBatch<Dataset<Row>> fetch3AsRows =
+        kafkaSource.fetchNewDataInRowFormat(Option.of(fetch1.getCheckpointForNextBatch()), Long.MAX_VALUE);
+    assertEquals(fetch2.getBatch().get().count(), fetch3AsRows.getBatch().get().count());
+    assertEquals(fetch2.getCheckpointForNextBatch(), fetch3AsRows.getCheckpointForNextBatch());
+
+    // 4. Extract with latest checkpoint => no new data returned
+    InputBatch<JavaRDD<GenericRecord>> fetch4 =
+        kafkaSource.fetchNewDataInAvroFormat(Option.of(fetch2.getCheckpointForNextBatch()), Long.MAX_VALUE);
+    assertEquals(Option.empty(), fetch4.getBatch());
+    // Same using Row API
+    InputBatch<Dataset<Row>> fetch4AsRows =
+        kafkaSource.fetchNewDataInRowFormat(Option.of(fetch2.getCheckpointForNextBatch()), Long.MAX_VALUE);
+    assertEquals(Option.empty(), fetch4AsRows.getBatch());
+  }
+
+  // test case with kafka offset reset strategy
+  @Test
+  public void testKafkaSourceResetStrategy() {
+    // topic setup.
+    final String topic = TEST_TOPIC_PREFIX + "testKafkaSourceResetStrategy";
+    testUtils.createTopic(topic, 2);
+
+    TypedProperties earliestProps = createPropsForKafkaSource(topic, null, "earliest");
+    SourceFormatAdapter earliestKafkaSource = createSource(earliestProps);
+
+    TypedProperties latestProps = createPropsForKafkaSource(topic, null, "latest");
+    SourceFormatAdapter latestKafkaSource = createSource(latestProps);
+
+    // 1. Extract with an empty Kafka checkpoint
+    // => get a checkpoint string like "hoodie_test,0:0,1:0"; the latest checkpoint should equal the earliest checkpoint
+    InputBatch<JavaRDD<GenericRecord>> earFetch0 = earliestKafkaSource.fetchNewDataInAvroFormat(Option.empty(), Long.MAX_VALUE);
+    InputBatch<JavaRDD<GenericRecord>> latFetch0 = latestKafkaSource.fetchNewDataInAvroFormat(Option.empty(), Long.MAX_VALUE);
+    assertEquals(earFetch0.getBatch(), latFetch0.getBatch());
+    assertEquals(earFetch0.getCheckpointForNextBatch(), latFetch0.getCheckpointForNextBatch());
+
+    sendMessagesToKafka(topic, 1000, 2);
+
+    // 2. Extract a new checkpoint with a null / empty-string prior checkpoint
+    // => earliest fetch with max source limit will get all of the data and an end-offset checkpoint
+    InputBatch<JavaRDD<GenericRecord>> earFetch1 = earliestKafkaSource.fetchNewDataInAvroFormat(Option.empty(), Long.MAX_VALUE);
+
+    // => [with a null prior checkpoint] latest reset fetch will get an end-offset checkpoint identical to earliest
+    InputBatch<JavaRDD<GenericRecord>> latFetch1 = latestKafkaSource.fetchNewDataInAvroFormat(Option.empty(), Long.MAX_VALUE);
+    assertEquals(earFetch1.getCheckpointForNextBatch(), latFetch1.getCheckpointForNextBatch());
+  }
+
+  @Test
+  public void testKafkaSourceInsertRecordsLessSourceLimit() {
+    // topic setup.
+    final String topic = TEST_TOPIC_PREFIX + "testKafkaSourceInsertRecordsLessSourceLimit";
+    testUtils.createTopic(topic, 2);
+    TypedProperties props = createPropsForKafkaSource(topic, Long.MAX_VALUE, "earliest");
+    SourceFormatAdapter kafkaSource = createSource(props);
+    props.setProperty("hoodie.deltastreamer.kafka.source.maxEvents", "500");
+
+    /*
+     1. maxEventsFromKafkaSourceProp set to more than generated insert records
+     and sourceLimit less than the generated insert records num.
+     */
+    sendMessagesToKafka(topic, 400, 2);
+    InputBatch<JavaRDD<GenericRecord>> fetch1 = kafkaSource.fetchNewDataInAvroFormat(Option.empty(), 300);
+    assertEquals(300, fetch1.getBatch().get().count());
+
+    /*
+     2. Produce new data, extract new data based on sourceLimit
+     and sourceLimit less than the generated insert records num.
+     */
+    sendMessagesToKafka(topic, 600, 2);
+    InputBatch<Dataset<Row>> fetch2 =
+        kafkaSource.fetchNewDataInRowFormat(Option.of(fetch1.getCheckpointForNextBatch()), 300);
+    assertEquals(300, fetch2.getBatch().get().count());
+  }
+
+  @Test
+  public void testCommitOffsetToKafka() {
+    // topic setup.
+    final String topic = TEST_TOPIC_PREFIX + "testCommitOffsetToKafka";
+    testUtils.createTopic(topic, 2);
+    List<TopicPartition> topicPartitions = new ArrayList<>();
+    TopicPartition topicPartition0 = new TopicPartition(topic, 0);
+    topicPartitions.add(topicPartition0);
+    TopicPartition topicPartition1 = new TopicPartition(topic, 1);
+    topicPartitions.add(topicPartition1);
+
+    TypedProperties props = createPropsForKafkaSource(topic, null, "earliest");
+    props.put(ENABLE_KAFKA_COMMIT_OFFSET.key(), "true");
+    SourceFormatAdapter kafkaSource = createSource(props);
+
+    // 1. Extract without any checkpoint => get all the data, respecting sourceLimit
+    assertEquals(Option.empty(), kafkaSource.fetchNewDataInAvroFormat(Option.empty(), Long.MAX_VALUE).getBatch());
+    sendMessagesToKafka(topic, 1000, 2);
+
+    InputBatch<JavaRDD<GenericRecord>> fetch1 = kafkaSource.fetchNewDataInAvroFormat(Option.empty(), 599);
+    // commit to kafka after first batch
+    kafkaSource.getSource().onCommit(fetch1.getCheckpointForNextBatch());
+    try (KafkaConsumer<Object, Object> consumer = new KafkaConsumer<>(props)) {
+      consumer.assign(topicPartitions);
+
+      OffsetAndMetadata offsetAndMetadata = consumer.committed(topicPartition0);
+      assertNotNull(offsetAndMetadata);
+      assertEquals(300, offsetAndMetadata.offset());
+      offsetAndMetadata = consumer.committed(topicPartition1);
+      assertNotNull(offsetAndMetadata);
+      assertEquals(299, offsetAndMetadata.offset());
+      // end offsets will point to 500 for each partition because we consumed fewer messages in the first batch
+      Map<TopicPartition, Long> endOffsets = consumer.endOffsets(topicPartitions);

Review Comment:
   Updated





[GitHub] [hudi] jinyius commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
jinyius commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1234668240

   ❤️  so much!




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1222780033

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * b9c91181d3f39f1e03a7ff7d5536eaa6813b7913 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10839) 
   * 14115a6f79de39f538ddfba407f84249c35ebca5 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1230742010

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * 95a7f4a9faca9e08e4771dd2f33e35bd1b5f6273 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11021) 
   * fc344fa87202e68178333b5cc0128aa37462706e UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1188305333

   ## CI report:
   
   * ec6177e8856b4b6809843c6295a5e15718d32b75 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10041) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] codope merged pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
codope merged PR #6135:
URL: https://github.com/apache/hudi/pull/6135




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1226143907

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * f70abbc3b45005d40e74252814edc0078a50030e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10909) 
   * 194396032e698ac4210d8652a969c7e58832d5db Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10925) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1222876742

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * 14115a6f79de39f538ddfba407f84249c35ebca5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10881) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1230577243

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * 794ed4861a42b5718bacfe299bfac2444744115d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10999) 
   * e6e1a68d195995b126c5d32286059d627006f08d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11020) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1221058019

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * a9adc8d80f44a3d5963b139f56f658bb512ae0df Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10835) 
   * b9c91181d3f39f1e03a7ff7d5536eaa6813b7913 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10839) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1191936235

   ## CI report:
   
   * 03f2c7a09ebc30b413d78fce7a983a9ed1b4d727 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10159) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1226732135

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * 194396032e698ac4210d8652a969c7e58832d5db Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10925) 
   * 9a4df85fd5b1787af587817cba959c536d904fde UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] codope commented on a diff in pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
codope commented on code in PR #6135:
URL: https://github.com/apache/hudi/pull/6135#discussion_r956968574


##########
hudi-utilities/pom.xml:
##########
@@ -60,6 +60,27 @@
         <groupId>org.apache.rat</groupId>
         <artifactId>apache-rat-plugin</artifactId>
       </plugin>
+      <plugin>

Review Comment:
   Generally, we define the plugin in the root pom and configure it in the child pom.
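
   For context, a minimal sketch of the pattern being suggested (the plugin coordinates and version below are illustrative assumptions, not taken from this PR):

   ```xml
   <!-- root pom.xml: declare the plugin version once under pluginManagement -->
   <pluginManagement>
     <plugins>
       <plugin>
         <groupId>org.xolstice.maven.plugins</groupId>
         <artifactId>protobuf-maven-plugin</artifactId>
         <version>0.6.1</version>
       </plugin>
     </plugins>
   </pluginManagement>

   <!-- hudi-utilities/pom.xml: reference the managed plugin, keep only
        module-specific configuration here -->
   <build>
     <plugins>
       <plugin>
         <groupId>org.xolstice.maven.plugins</groupId>
         <artifactId>protobuf-maven-plugin</artifactId>
       </plugin>
     </plugins>
   </build>
   ```

   This keeps the version in one place, so all modules that generate protobuf code stay in sync.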



##########
hudi-utilities/pom.xml:
##########
@@ -73,6 +94,25 @@
   </build>
 
   <dependencies>
+    <!-- Protobuf -->
+    <dependency>
+      <groupId>com.google.protobuf</groupId>

Review Comment:
   The same applies to these dependencies; let's define them in the root pom.
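
   Sketching the same idea for dependencies (the `${proto.version}` property is an assumed name for illustration):

   ```xml
   <!-- root pom.xml: pin the protobuf version under dependencyManagement -->
   <dependencyManagement>
     <dependencies>
       <dependency>
         <groupId>com.google.protobuf</groupId>
         <artifactId>protobuf-java</artifactId>
         <version>${proto.version}</version>
       </dependency>
     </dependencies>
   </dependencyManagement>

   <!-- hudi-utilities/pom.xml: declare the dependency without a version;
        Maven resolves it from the root's dependencyManagement -->
   <dependency>
     <groupId>com.google.protobuf</groupId>
     <artifactId>protobuf-java</artifactId>
   </dependency>
   ```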



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/ProtoClassBasedSchemaProvider.java:
##########
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package org.apache.hudi.utilities.schema;
+
+import org.apache.hudi.DataSourceUtils;
+import org.apache.hudi.common.config.TypedProperties;
+import org.apache.hudi.common.util.ReflectionUtils;
+import org.apache.hudi.exception.HoodieException;
+import org.apache.hudi.utilities.sources.helpers.ProtoConversionUtil;
+
+import org.apache.avro.Schema;
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+import org.apache.spark.api.java.JavaSparkContext;
+
+import java.util.Arrays;
+
+/**
+ * A schema provider that takes in a class name for a generated protobuf class that is on the classpath.
+ */
+public class ProtoClassBasedSchemaProvider extends SchemaProvider {
+  private static final Logger LOG = LogManager.getLogger(ProtoClassBasedSchemaProvider.class);
+  /**
+   * Configs supported.
+   */
+  public static class Config {
+    public static final String PROTO_SCHEMA_CLASS_NAME = "hoodie.deltastreamer.schemaprovider.proto.className";
+    public static final String PROTO_SCHEMA_FLATTEN_WRAPPED_PRIMITIVES = "hoodie.deltastreamer.schemaprovider.proto.flattenWrappers";
+  }
+
+  private final String schemaString;
+
+  /**
+   * To be lazily inited on executors.
+   */
+  private transient Schema schema;
+
+  public ProtoClassBasedSchemaProvider(TypedProperties props, JavaSparkContext jssc) {
+    super(props, jssc);
+    DataSourceUtils.checkRequiredProperties(props, Arrays.asList(

Review Comment:
   We can use `Collections.singletonList` instead of `Arrays.asList` here, since there is only one required property.
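A small illustration of the suggestion. The property key shown is the one from the diff above; the two list forms are interchangeable for a single element, `singletonList` just avoids the varargs array allocation and is explicitly immutable:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SingletonListDemo {
    public static void main(String[] args) {
        // Arrays.asList allocates a varargs array plus a wrapper list;
        // Collections.singletonList returns a compact immutable one-element list.
        List<String> viaAsList = Arrays.asList("hoodie.deltastreamer.schemaprovider.proto.className");
        List<String> viaSingleton = Collections.singletonList("hoodie.deltastreamer.schemaprovider.proto.className");
        // For a single required property the two behave identically
        System.out.println(viaAsList.equals(viaSingleton)); // true
    }
}
```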



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/ProtoConversionUtil.java:
##########
@@ -0,0 +1,393 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.sources.helpers;
+
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieException;
+
+import com.google.protobuf.BoolValue;
+import com.google.protobuf.ByteString;
+import com.google.protobuf.BytesValue;
+import com.google.protobuf.DescriptorProtos;
+import com.google.protobuf.Descriptors;
+import com.google.protobuf.DoubleValue;
+import com.google.protobuf.FloatValue;
+import com.google.protobuf.Int32Value;
+import com.google.protobuf.Int64Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.StringValue;
+import com.google.protobuf.UInt32Value;
+import com.google.protobuf.UInt64Value;
+import org.apache.avro.Schema;
+import org.apache.avro.generic.GenericData;
+import org.apache.avro.generic.GenericFixed;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.avro.util.Utf8;
+
+import java.io.File;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.function.Function;
+
+/**
+ * A utility class to help translate from Proto to Avro.
+ */
+public class ProtoConversionUtil {
+
+  /**
+   * Creates an Avro {@link Schema} for the provided class. Assumes that the class is a protobuf {@link Message}.
+   * @param clazz The protobuf class
+   * @param flattenWrappedPrimitives set to true to treat wrapped primitives like nullable fields instead of nested messages.
+   * @return An Avro schema
+   */
+  public static Schema getAvroSchemaForMessageClass(Class clazz, boolean flattenWrappedPrimitives) {
+    return AvroSupport.get().getSchema(clazz, flattenWrappedPrimitives);
+  }
+
+  /**
+   * Converts the provided {@link Message} into an avro {@link GenericRecord} with the provided schema.
+   * @param schema target schema to convert into
+   * @param message the source message to convert
+   * @return an Avro GenericRecord
+   */
+  public static GenericRecord convertToAvro(Schema schema, Message message) {
+    return AvroSupport.get().convert(schema, message);
+  }
+
+  /**
+   * This class provides support for generating schemas and converting from proto to avro. We don't directly use Avro's ProtobufData class so we can:
+   * 1. Customize how schemas are generated for protobufs. We treat Enums as strings and provide an option to treat wrapped primitives like {@link Int32Value} and {@link StringValue} as messages
+   * (default behavior) or as nullable versions of those primitives.
+   * 2. Convert directly from a protobuf {@link Message} to a {@link GenericRecord} while properly handling enums and wrapped primitives mentioned above.
+   */
+  private static class AvroSupport {
+    private static final AvroSupport INSTANCE = new AvroSupport();
+    private static final Map<Pair<Class, Boolean>, Schema> SCHEMA_CACHE = new ConcurrentHashMap<>();
+    private static final Map<Pair<Schema, Descriptors.Descriptor>, Descriptors.FieldDescriptor[]> FIELD_CACHE = new ConcurrentHashMap<>();
+
+
+    private static final Schema STRINGS = Schema.create(Schema.Type.STRING);
+
+    private static final Schema NULL = Schema.create(Schema.Type.NULL);
+    private static final Map<Descriptors.Descriptor, Schema.Type> WRAPPER_DESCRIPTORS_TO_TYPE = getWrapperDescriptorsToType();
+
+    private static Map<Descriptors.Descriptor, Schema.Type> getWrapperDescriptorsToType() {
+      Map<Descriptors.Descriptor, Schema.Type> wrapperDescriptorsToType = new HashMap<>();
+      wrapperDescriptorsToType.put(StringValue.getDescriptor(), Schema.Type.STRING);
+      wrapperDescriptorsToType.put(Int32Value.getDescriptor(), Schema.Type.INT);
+      wrapperDescriptorsToType.put(UInt32Value.getDescriptor(), Schema.Type.INT);
+      wrapperDescriptorsToType.put(Int64Value.getDescriptor(), Schema.Type.LONG);
+      wrapperDescriptorsToType.put(UInt64Value.getDescriptor(), Schema.Type.LONG);
+      wrapperDescriptorsToType.put(BoolValue.getDescriptor(), Schema.Type.BOOLEAN);
+      wrapperDescriptorsToType.put(BytesValue.getDescriptor(), Schema.Type.BYTES);
+      wrapperDescriptorsToType.put(DoubleValue.getDescriptor(), Schema.Type.DOUBLE);
+      wrapperDescriptorsToType.put(FloatValue.getDescriptor(), Schema.Type.FLOAT);
+      return wrapperDescriptorsToType;
+    }
+
+    private AvroSupport() {
+    }
+
+    public static AvroSupport get() {
+      return INSTANCE;
+    }
+
+    public GenericRecord convert(Schema schema, Message message) {
+      return (GenericRecord) convertObject(schema, message);
+    }
+
+    public Schema getSchema(Class c, boolean flattenWrappedPrimitives) {
+      return SCHEMA_CACHE.computeIfAbsent(Pair.of(c, flattenWrappedPrimitives), key -> {
+        try {
+          Object descriptor = c.getMethod("getDescriptor").invoke(null);
+          if (c.isEnum()) {
+            return getEnumSchema((Descriptors.EnumDescriptor) descriptor);
+          } else {
+            return getMessageSchema((Descriptors.Descriptor) descriptor, new HashMap<>(), flattenWrappedPrimitives);
+          }
+        } catch (Exception e) {
+          throw new RuntimeException(e);
+        }
+      });
+    }
+
+    private Schema getEnumSchema(Descriptors.EnumDescriptor d) {
+      List<String> symbols = new ArrayList<>(d.getValues().size());
+      for (Descriptors.EnumValueDescriptor e : d.getValues()) {
+        symbols.add(e.getName());
+      }
+      return Schema.createEnum(d.getName(), null, getNamespace(d.getFile(), d.getContainingType()), symbols);
+    }
+
+    private Schema getMessageSchema(Descriptors.Descriptor descriptor, Map<Descriptors.Descriptor, Schema> seen, boolean flattenWrappedPrimitives) {
+      if (seen.containsKey(descriptor)) {
+        return seen.get(descriptor);
+      }
+      Schema result = Schema.createRecord(descriptor.getName(), null,
+          getNamespace(descriptor.getFile(), descriptor.getContainingType()), false);
+
+      seen.put(descriptor, result);
+
+      List<Schema.Field> fields = new ArrayList<>(descriptor.getFields().size());
+      for (Descriptors.FieldDescriptor f : descriptor.getFields()) {
+        fields.add(new Schema.Field(f.getName(), getFieldSchema(f, seen, flattenWrappedPrimitives), null, getDefault(f)));
+      }
+      result.setFields(fields);
+      return result;
+    }
+
+    private Schema getFieldSchema(Descriptors.FieldDescriptor f, Map<Descriptors.Descriptor, Schema> seen, boolean flattenWrappedPrimitives) {
+      Function<Schema, Schema> schemaFinalizer =  f.isRepeated() ? Schema::createArray : Function.identity();
+      switch (f.getType()) {
+        case BOOL:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.BOOLEAN));
+        case FLOAT:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.FLOAT));
+        case DOUBLE:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.DOUBLE));
+        case ENUM:
+          return schemaFinalizer.apply(getEnumSchema(f.getEnumType()));
+        case STRING:
+          Schema s = Schema.create(Schema.Type.STRING);
+          GenericData.setStringType(s, GenericData.StringType.String);
+          return schemaFinalizer.apply(s);
+        case BYTES:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.BYTES));
+        case INT32:
+        case SINT32:
+        case FIXED32:
+        case SFIXED32:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.INT));
+        case UINT32:
+        case INT64:
+        case UINT64:
+        case SINT64:
+        case FIXED64:
+        case SFIXED64:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.LONG));
+        case MESSAGE:
+          if (flattenWrappedPrimitives && WRAPPER_DESCRIPTORS_TO_TYPE.containsKey(f.getMessageType())) {
+            // all wrapper types have a single field, so we can get the first field in the message's schema
+            return schemaFinalizer.apply(Schema.createUnion(Arrays.asList(NULL, getFieldSchema(f.getMessageType().getFields().get(0), seen, flattenWrappedPrimitives))));
+          }
+          // if message field is repeated (like a list), elements are non-null
+          if (f.isRepeated()) {
+            return schemaFinalizer.apply(getMessageSchema(f.getMessageType(), seen, flattenWrappedPrimitives));
+          }
+          // otherwise we create a nullable field schema
+          return schemaFinalizer.apply(Schema.createUnion(Arrays.asList(NULL, getMessageSchema(f.getMessageType(), seen, flattenWrappedPrimitives))));
+        case GROUP: // groups are deprecated

Review Comment:
   Let's make a note to call this out in the release notes.



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/ProtoKafkaSource.java:
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.sources;
+
+import org.apache.hudi.DataSourceUtils;
+import org.apache.hudi.common.config.TypedProperties;
+import org.apache.hudi.common.util.ReflectionUtils;
+import org.apache.hudi.exception.HoodieException;
+import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerMetrics;
+import org.apache.hudi.utilities.schema.ProtoClassBasedSchemaProvider;
+import org.apache.hudi.utilities.schema.SchemaProvider;
+import org.apache.hudi.utilities.sources.helpers.KafkaOffsetGen;
+
+import com.google.protobuf.Message;
+import org.apache.kafka.common.serialization.ByteArrayDeserializer;
+import org.apache.kafka.common.serialization.StringDeserializer;
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.api.java.JavaSparkContext;
+import org.apache.spark.sql.SparkSession;
+import org.apache.spark.streaming.kafka010.KafkaUtils;
+import org.apache.spark.streaming.kafka010.LocationStrategies;
+import org.apache.spark.streaming.kafka010.OffsetRange;
+
+import java.io.Serializable;
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.Method;
+import java.util.Arrays;
+
+/**
+ * Reads protobuf serialized Kafka data, based on a provided class name.
+ */
+public class ProtoKafkaSource extends KafkaSource<Message> {
+
+  private static final Logger LOG = LogManager.getLogger(ProtoKafkaSource.class);
+  // these are native kafka's config. do not change the config names.
+  private static final String NATIVE_KAFKA_KEY_DESERIALIZER_PROP = "key.deserializer";
+  private static final String NATIVE_KAFKA_VALUE_DESERIALIZER_PROP = "value.deserializer";
+  private final String className;
+
+  public ProtoKafkaSource(TypedProperties props, JavaSparkContext sparkContext,
+                          SparkSession sparkSession, SchemaProvider schemaProvider, HoodieDeltaStreamerMetrics metrics) {
+    super(props, sparkContext, sparkSession, schemaProvider, SourceType.PROTO, metrics);
+    DataSourceUtils.checkRequiredProperties(props, Arrays.asList(

Review Comment:
   Let's use `Collections.singletonList` here too?



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/SourceFormatAdapter.java:
##########
@@ -96,7 +101,7 @@ public InputBatch<JavaRDD<GenericRecord>> fetchNewDataInAvroFormat(Option<String
   public InputBatch<Dataset<Row>> fetchNewDataInRowFormat(Option<String> lastCkptStr, long sourceLimit) {
     switch (source.getSourceType()) {
       case ROW:
-        return ((RowSource) source).fetchNext(lastCkptStr, sourceLimit);

Review Comment:
   Why are we removing these casts?
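For context, a minimal sketch of why a generic base source can make such per-subtype casts unnecessary. The class and method names below are illustrative only, not Hudi's actual API:

```java
// When the base type carries the element type as a generic parameter,
// the compiler already knows the return type of fetchNext, so no
// (RowSource) cast is needed at the call site.
abstract class Source<T> {
    abstract T fetchNext(long limit);
}

class RowSource extends Source<String> {
    @Override
    String fetchNext(long limit) {
        return "row-batch:" + limit;
    }
}

public class GenericSourceSketch {
    public static void main(String[] args) {
        Source<String> source = new RowSource();
        // No cast needed; the generic parameter carries the type information
        System.out.println(source.fetchNext(10)); // row-batch:10
    }
}
```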



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/ProtoConversionUtil.java:
##########
@@ -0,0 +1,393 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.sources.helpers;
+
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieException;
+
+import com.google.protobuf.BoolValue;
+import com.google.protobuf.ByteString;
+import com.google.protobuf.BytesValue;
+import com.google.protobuf.DescriptorProtos;
+import com.google.protobuf.Descriptors;
+import com.google.protobuf.DoubleValue;
+import com.google.protobuf.FloatValue;
+import com.google.protobuf.Int32Value;
+import com.google.protobuf.Int64Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.StringValue;
+import com.google.protobuf.UInt32Value;
+import com.google.protobuf.UInt64Value;
+import org.apache.avro.Schema;
+import org.apache.avro.generic.GenericData;
+import org.apache.avro.generic.GenericFixed;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.avro.util.Utf8;
+
+import java.io.File;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.function.Function;
+
+/**
+ * A utility class to help translate from Proto to Avro.
+ */
+public class ProtoConversionUtil {
+
+  /**
+   * Creates an Avro {@link Schema} for the provided class. Assumes that the class is a protobuf {@link Message}.
+   * @param clazz The protobuf class
+   * @param flattenWrappedPrimitives set to true to treat wrapped primitives like nullable fields instead of nested messages.
+   * @return An Avro schema
+   */
+  public static Schema getAvroSchemaForMessageClass(Class clazz, boolean flattenWrappedPrimitives) {
+    return AvroSupport.get().getSchema(clazz, flattenWrappedPrimitives);
+  }
+
+  /**
+   * Converts the provided {@link Message} into an avro {@link GenericRecord} with the provided schema.
+   * @param schema target schema to convert into
+   * @param message the source message to convert
+   * @return an Avro GenericRecord
+   */
+  public static GenericRecord convertToAvro(Schema schema, Message message) {
+    return AvroSupport.get().convert(schema, message);
+  }
+
+  /**
+   * This class provides support for generating schemas and converting from proto to avro. We don't directly use Avro's ProtobufData class so we can:
+   * 1. Customize how schemas are generated for protobufs. We treat Enums as strings and provide an option to treat wrapped primitives like {@link Int32Value} and {@link StringValue} as messages
+   * (default behavior) or as nullable versions of those primitives.
+   * 2. Convert directly from a protobuf {@link Message} to a {@link GenericRecord} while properly handling enums and wrapped primitives mentioned above.
+   */
+  private static class AvroSupport {
+    private static final AvroSupport INSTANCE = new AvroSupport();
+    private static final Map<Pair<Class, Boolean>, Schema> SCHEMA_CACHE = new ConcurrentHashMap<>();

Review Comment:
   Let's add a comment explaining what these two maps hold.
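For reference, a minimal sketch of the caching pattern these maps follow. Names and the value type are illustrative, not the PR's exact ones; the point is that the key must include the flatten flag, since the same proto class yields different schemas depending on how wrappers are handled:

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SchemaCacheSketch {
    // Cache keyed by (class, flattenWrappedPrimitives): the same proto class can
    // map to two different schemas depending on wrapper handling, so the boolean
    // must be part of the key.
    private static final Map<Map.Entry<Class<?>, Boolean>, String> SCHEMA_CACHE = new ConcurrentHashMap<>();

    static String schemaFor(Class<?> clazz, boolean flattenWrappedPrimitives) {
        return SCHEMA_CACHE.computeIfAbsent(
            new SimpleImmutableEntry<>(clazz, flattenWrappedPrimitives),
            key -> key.getKey().getSimpleName() + (key.getValue() ? "-flattened" : "-wrapped"));
    }

    public static void main(String[] args) {
        System.out.println(schemaFor(String.class, true));  // String-flattened
        System.out.println(schemaFor(String.class, false)); // String-wrapped
    }
}
```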



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/ProtoKafkaSource.java:
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.sources;
+
+import org.apache.hudi.DataSourceUtils;
+import org.apache.hudi.common.config.TypedProperties;
+import org.apache.hudi.common.util.ReflectionUtils;
+import org.apache.hudi.exception.HoodieException;
+import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerMetrics;
+import org.apache.hudi.utilities.schema.ProtoClassBasedSchemaProvider;
+import org.apache.hudi.utilities.schema.SchemaProvider;
+import org.apache.hudi.utilities.sources.helpers.KafkaOffsetGen;
+
+import com.google.protobuf.Message;
+import org.apache.kafka.common.serialization.ByteArrayDeserializer;
+import org.apache.kafka.common.serialization.StringDeserializer;
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.api.java.JavaSparkContext;
+import org.apache.spark.sql.SparkSession;
+import org.apache.spark.streaming.kafka010.KafkaUtils;
+import org.apache.spark.streaming.kafka010.LocationStrategies;
+import org.apache.spark.streaming.kafka010.OffsetRange;
+
+import java.io.Serializable;
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.Method;
+import java.util.Arrays;
+
+/**
+ * Reads protobuf serialized Kafka data, based on a provided class name.
+ */
+public class ProtoKafkaSource extends KafkaSource<Message> {
+
+  private static final Logger LOG = LogManager.getLogger(ProtoKafkaSource.class);
+  // these are native kafka's config. do not change the config names.
+  private static final String NATIVE_KAFKA_KEY_DESERIALIZER_PROP = "key.deserializer";
+  private static final String NATIVE_KAFKA_VALUE_DESERIALIZER_PROP = "value.deserializer";
+  private final String className;
+
+  public ProtoKafkaSource(TypedProperties props, JavaSparkContext sparkContext,
+                          SparkSession sparkSession, SchemaProvider schemaProvider, HoodieDeltaStreamerMetrics metrics) {
+    super(props, sparkContext, sparkSession, schemaProvider, SourceType.PROTO, metrics);
+    DataSourceUtils.checkRequiredProperties(props, Arrays.asList(
+        ProtoClassBasedSchemaProvider.Config.PROTO_SCHEMA_CLASS_NAME));
+    props.put(NATIVE_KAFKA_KEY_DESERIALIZER_PROP, StringDeserializer.class);
+    props.put(NATIVE_KAFKA_VALUE_DESERIALIZER_PROP, ByteArrayDeserializer.class);
+    className = props.getString(ProtoClassBasedSchemaProvider.Config.PROTO_SCHEMA_CLASS_NAME);
+    this.offsetGen = new KafkaOffsetGen(props);
+  }
+
+  @Override
+  JavaRDD<Message> toRDD(OffsetRange[] offsetRanges) {
+    ProtoDeserializer deserializer = new ProtoDeserializer(className);
+    return KafkaUtils.<String, byte[]>createRDD(sparkContext, offsetGen.getKafkaParams(), offsetRanges,
+        LocationStrategies.PreferConsistent()).map(obj -> deserializer.parse(obj.value()));
+  }
+
+  private static class ProtoDeserializer implements Serializable {
+    private final String className;
+    private transient Class protoClass;
+    private transient Method parseMethod;
+
+    public ProtoDeserializer(String className) {
+      this.className = className;
+    }
+
+    public Message parse(byte[] bytes) {
+      try {
+        return (Message) getParseMethod().invoke(getClass(), bytes);
+      } catch (IllegalAccessException | InvocationTargetException ex) {
+        throw new HoodieException("Failed to parse proto message from kafka", ex);
+      }
+    }
+
+    private Class getProtoClass() {
+      if (protoClass == null) {
+        protoClass = ReflectionUtils.getClass(className);
+      }
+      return protoClass;
+    }
+
+    private Method getParseMethod() {
+      if (parseMethod == null) {
+        try {
+          parseMethod = getProtoClass().getMethod("parseFrom", byte[].class);

Review Comment:
   Should we also handle the HoodieException thrown by ReflectionUtils.getClass?
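A sketch of what wrapping both failure modes could look like. The exception class here is an illustrative stand-in for Hudi's HoodieException; the real code would rethrow Hudi's type from both the class lookup and the method lookup:

```java
import java.lang.reflect.Method;

public class ProtoParseSketch {
    // Illustrative stand-in for HoodieException
    static class DeserializationException extends RuntimeException {
        DeserializationException(String msg, Throwable cause) {
            super(msg, cause);
        }
    }

    // Both the class lookup and the parseFrom lookup can fail at runtime;
    // wrapping both gives the caller a single exception type to handle.
    static Method parseMethodFor(String className) {
        try {
            Class<?> protoClass = Class.forName(className);
            return protoClass.getMethod("parseFrom", byte[].class);
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            throw new DeserializationException("Unable to resolve parseFrom for " + className, e);
        }
    }

    public static void main(String[] args) {
        try {
            parseMethodFor("com.example.DoesNotExist");
        } catch (DeserializationException expected) {
            System.out.println("wrapped: " + expected.getCause().getClass().getSimpleName());
        }
    }
}
```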



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/ProtoConversionUtil.java:
##########
@@ -0,0 +1,393 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.sources.helpers;
+
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieException;
+
+import com.google.protobuf.BoolValue;
+import com.google.protobuf.ByteString;
+import com.google.protobuf.BytesValue;
+import com.google.protobuf.DescriptorProtos;
+import com.google.protobuf.Descriptors;
+import com.google.protobuf.DoubleValue;
+import com.google.protobuf.FloatValue;
+import com.google.protobuf.Int32Value;
+import com.google.protobuf.Int64Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.StringValue;
+import com.google.protobuf.UInt32Value;
+import com.google.protobuf.UInt64Value;
+import org.apache.avro.Schema;
+import org.apache.avro.generic.GenericData;
+import org.apache.avro.generic.GenericFixed;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.avro.util.Utf8;
+
+import java.io.File;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.function.Function;
+
+/**
+ * A utility class to help translate from Proto to Avro.
+ */
+public class ProtoConversionUtil {
+
+  /**
+   * Creates an Avro {@link Schema} for the provided class. Assumes that the class is a protobuf {@link Message}.
+   * @param clazz The protobuf class
+   * @param flattenWrappedPrimitives set to true to treat wrapped primitives like nullable fields instead of nested messages.
+   * @return An Avro schema
+   */
+  public static Schema getAvroSchemaForMessageClass(Class clazz, boolean flattenWrappedPrimitives) {
+    return AvroSupport.get().getSchema(clazz, flattenWrappedPrimitives);
+  }
+
+  /**
+   * Converts the provided {@link Message} into an avro {@link GenericRecord} with the provided schema.
+   * @param schema target schema to convert into
+   * @param message the source message to convert
+   * @return an Avro GenericRecord
+   */
+  public static GenericRecord convertToAvro(Schema schema, Message message) {
+    return AvroSupport.get().convert(schema, message);
+  }
+
+  /**
+   * This class provides support for generating schemas and converting from proto to avro. We don't directly use Avro's ProtobufData class so we can:
+   * 1. Customize how schemas are generated for protobufs. We treat Enums as strings and provide an option to treat wrapped primitives like {@link Int32Value} and {@link StringValue} as messages
+   * (default behavior) or as nullable versions of those primitives.
+   * 2. Convert directly from a protobuf {@link Message} to a {@link GenericRecord} while properly handling enums and wrapped primitives mentioned above.
+   */
+  private static class AvroSupport {
+    private static final AvroSupport INSTANCE = new AvroSupport();
+    private static final Map<Pair<Class, Boolean>, Schema> SCHEMA_CACHE = new ConcurrentHashMap<>();
+    private static final Map<Pair<Schema, Descriptors.Descriptor>, Descriptors.FieldDescriptor[]> FIELD_CACHE = new ConcurrentHashMap<>();
+
+
+    private static final Schema STRINGS = Schema.create(Schema.Type.STRING);
+
+    private static final Schema NULL = Schema.create(Schema.Type.NULL);
+    private static final Map<Descriptors.Descriptor, Schema.Type> WRAPPER_DESCRIPTORS_TO_TYPE = getWrapperDescriptorsToType();
+
+    private static Map<Descriptors.Descriptor, Schema.Type> getWrapperDescriptorsToType() {
+      Map<Descriptors.Descriptor, Schema.Type> wrapperDescriptorsToType = new HashMap<>();
+      wrapperDescriptorsToType.put(StringValue.getDescriptor(), Schema.Type.STRING);
+      wrapperDescriptorsToType.put(Int32Value.getDescriptor(), Schema.Type.INT);
+      wrapperDescriptorsToType.put(UInt32Value.getDescriptor(), Schema.Type.INT);
+      wrapperDescriptorsToType.put(Int64Value.getDescriptor(), Schema.Type.LONG);
+      wrapperDescriptorsToType.put(UInt64Value.getDescriptor(), Schema.Type.LONG);
+      wrapperDescriptorsToType.put(BoolValue.getDescriptor(), Schema.Type.BOOLEAN);
+      wrapperDescriptorsToType.put(BytesValue.getDescriptor(), Schema.Type.BYTES);
+      wrapperDescriptorsToType.put(DoubleValue.getDescriptor(), Schema.Type.DOUBLE);
+      wrapperDescriptorsToType.put(FloatValue.getDescriptor(), Schema.Type.FLOAT);
+      return wrapperDescriptorsToType;
+    }
+
+    private AvroSupport() {
+    }
+
+    public static AvroSupport get() {
+      return INSTANCE;
+    }
+
+    public GenericRecord convert(Schema schema, Message message) {
+      return (GenericRecord) convertObject(schema, message);
+    }
+
+    public Schema getSchema(Class c, boolean flattenWrappedPrimitives) {
+      return SCHEMA_CACHE.computeIfAbsent(Pair.of(c, flattenWrappedPrimitives), key -> {
+        try {
+          Object descriptor = c.getMethod("getDescriptor").invoke(null);
+          if (c.isEnum()) {
+            return getEnumSchema((Descriptors.EnumDescriptor) descriptor);
+          } else {
+            return getMessageSchema((Descriptors.Descriptor) descriptor, new HashMap<>(), flattenWrappedPrimitives);
+          }
+        } catch (Exception e) {
+          throw new RuntimeException(e);
+        }
+      });
+    }
+
+    private Schema getEnumSchema(Descriptors.EnumDescriptor d) {
+      List<String> symbols = new ArrayList<>(d.getValues().size());
+      for (Descriptors.EnumValueDescriptor e : d.getValues()) {
+        symbols.add(e.getName());
+      }
+      return Schema.createEnum(d.getName(), null, getNamespace(d.getFile(), d.getContainingType()), symbols);
+    }
+
+    private Schema getMessageSchema(Descriptors.Descriptor descriptor, Map<Descriptors.Descriptor, Schema> seen, boolean flattenWrappedPrimitives) {
+      if (seen.containsKey(descriptor)) {
+        return seen.get(descriptor);
+      }
+      Schema result = Schema.createRecord(descriptor.getName(), null,
+          getNamespace(descriptor.getFile(), descriptor.getContainingType()), false);
+
+      seen.put(descriptor, result);
+
+      List<Schema.Field> fields = new ArrayList<>(descriptor.getFields().size());
+      for (Descriptors.FieldDescriptor f : descriptor.getFields()) {
+        fields.add(new Schema.Field(f.getName(), getFieldSchema(f, seen, flattenWrappedPrimitives), null, getDefault(f)));
+      }
+      result.setFields(fields);
+      return result;
+    }
+
+    private Schema getFieldSchema(Descriptors.FieldDescriptor f, Map<Descriptors.Descriptor, Schema> seen, boolean flattenWrappedPrimitives) {
+      Function<Schema, Schema> schemaFinalizer = f.isRepeated() ? Schema::createArray : Function.identity();
+      switch (f.getType()) {
+        case BOOL:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.BOOLEAN));
+        case FLOAT:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.FLOAT));
+        case DOUBLE:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.DOUBLE));
+        case ENUM:
+          return schemaFinalizer.apply(getEnumSchema(f.getEnumType()));
+        case STRING:
+          Schema s = Schema.create(Schema.Type.STRING);
+          GenericData.setStringType(s, GenericData.StringType.String);
+          return schemaFinalizer.apply(s);
+        case BYTES:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.BYTES));
+        case INT32:
+        case SINT32:
+        case FIXED32:
+        case SFIXED32:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.INT));
+        case UINT32:
+        case INT64:
+        case UINT64:
+        case SINT64:
+        case FIXED64:
+        case SFIXED64:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.LONG));
+        case MESSAGE:
+          if (flattenWrappedPrimitives && WRAPPER_DESCRIPTORS_TO_TYPE.containsKey(f.getMessageType())) {
+            // all wrapper types have a single field, so we can get the first field in the message's schema
+            return schemaFinalizer.apply(Schema.createUnion(Arrays.asList(NULL, getFieldSchema(f.getMessageType().getFields().get(0), seen, flattenWrappedPrimitives))));
+          }
+          // if message field is repeated (like a list), elements are non-null
+          if (f.isRepeated()) {
+            return schemaFinalizer.apply(getMessageSchema(f.getMessageType(), seen, flattenWrappedPrimitives));
+          }
+          // otherwise we create a nullable field schema
+          return schemaFinalizer.apply(Schema.createUnion(Arrays.asList(NULL, getMessageSchema(f.getMessageType(), seen, flattenWrappedPrimitives))));
+        case GROUP: // groups are deprecated
+        default:
+          throw new RuntimeException("Unexpected type: " + f.getType());
+      }
+    }
+
+    private Object getDefault(Descriptors.FieldDescriptor f) {
+      if (f.isRepeated()) { // empty array as repeated fields' default value
+        return Collections.emptyList();
+      }
+
+      switch (f.getType()) { // generate default for type
+        case BOOL:
+          return false;
+        case FLOAT:
+          return 0.0F;
+        case DOUBLE:
+          return 0.0D;
+        case INT32:
+        case UINT32:
+        case SINT32:
+        case FIXED32:
+        case SFIXED32:
+        case INT64:
+        case UINT64:
+        case SINT64:
+        case FIXED64:
+        case SFIXED64:
+          return 0;
+        case STRING:
+        case BYTES:
+          return "";
+        case ENUM:
+          return f.getEnumType().getValues().get(0).getName();
+        case MESSAGE:
+          return Schema.Field.NULL_VALUE;
+        case GROUP: // groups are deprecated
+        default:
+          throw new RuntimeException("Unexpected type: " + f.getType());
+      }
+    }
+
+    private Descriptors.FieldDescriptor[] getOrderedFields(Schema schema, Message message) {
+      Descriptors.Descriptor descriptor = message.getDescriptorForType();
+      return FIELD_CACHE.computeIfAbsent(Pair.of(schema, descriptor), key -> {
+        Descriptors.FieldDescriptor[] fields = new Descriptors.FieldDescriptor[key.getLeft().getFields().size()];
+        for (Schema.Field f : key.getLeft().getFields()) {
+          fields[f.pos()] = key.getRight().findFieldByName(f.name());
+        }
+        return fields;
+      });
+    }
+
+    private Object convertObject(Schema schema, Object value) {
+      if (value == null) {
+        return null;
+      }
+
+      switch (schema.getType()) {
+        case ARRAY:
+          List<Object> arrayValue = (List<Object>) value;
+          List<Object> arrayCopy = new GenericData.Array<>(arrayValue.size(), schema);
+          for (Object obj : arrayValue) {
+            arrayCopy.add(convertObject(schema.getElementType(), obj));
+          }
+          return arrayCopy;
+        case BYTES:
+          ByteBuffer byteBufferValue;
+          if (value instanceof ByteString) {
+            byteBufferValue = ((ByteString) value).asReadOnlyByteBuffer();
+          } else if (value instanceof Message) {
+            byteBufferValue = ((ByteString) getWrappedValue(value)).asReadOnlyByteBuffer();
+          } else {
+            byteBufferValue = (ByteBuffer) value;
+          }
+          int start = byteBufferValue.position();
+          int length = byteBufferValue.limit() - start;
+          byte[] bytesCopy = new byte[length];
+          byteBufferValue.get(bytesCopy, 0, length);
+          byteBufferValue.position(start);
+          return ByteBuffer.wrap(bytesCopy, 0, length);
+        case ENUM:
+          return GenericData.get().createEnum(value.toString(), schema);
+        case FIXED:
+          return GenericData.get().createFixed(null, ((GenericFixed) value).bytes(), schema);
+        case BOOLEAN:
+        case DOUBLE:
+        case FLOAT:
+        case INT:
+          if (value instanceof Message) {
+            return getWrappedValue(value);
+          }
+          return value; // immutable
+        case LONG:
+          Object tmpValue = value;
+          if (value instanceof Message) {
+            tmpValue = getWrappedValue(value);
+          }
+          // unsigned 32-bit ints are read as Integers and need to be widened to long
+          if (tmpValue instanceof Integer) {
+            tmpValue = ((Integer) tmpValue).longValue();
+          }
+          return tmpValue;
+        case MAP:
+          Map<Object, Object> mapValue = (Map) value;
+          Map<Object, Object> mapCopy = new HashMap<>(mapValue.size());
+          for (Map.Entry<Object, Object> entry : mapValue.entrySet()) {
+            mapCopy.put(convertObject(STRINGS, entry.getKey()), convertObject(schema.getValueType(), entry.getValue()));
+          }
+          return mapCopy;
+        case NULL:
+          return null;
+        case RECORD:
+          GenericData.Record newRecord = new GenericData.Record(schema);
+          Message messageValue = (Message) value;
+          for (Schema.Field f : schema.getFields()) {
+            int position = f.pos();
+            Descriptors.FieldDescriptor fieldDescriptor = getOrderedFields(schema, messageValue)[position];
+            Object convertedValue;
+            if (fieldDescriptor.getType() == Descriptors.FieldDescriptor.Type.MESSAGE && !fieldDescriptor.isRepeated() && !messageValue.hasField(fieldDescriptor)) {
+              convertedValue = null;
+            } else {
+              convertedValue = convertObject(f.schema(), messageValue.getField(fieldDescriptor));
+            }
+            newRecord.put(position, convertedValue);
+          }
+          return newRecord;
+        case STRING:
+          if (value instanceof String) {
+            return value;
+          } else if (value instanceof StringValue) {
+            return ((StringValue) value).getValue();
+          } else {
+            return new Utf8(value.toString());
+          }
+        case UNION:
+          // Unions only occur for nullable fields when working with proto + avro, and null is always the first schema in the union
+          return convertObject(schema.getTypes().get(1), value);
+        default:
+          throw new HoodieException("Proto to Avro conversion failed for schema \"" + schema + "\" and value \"" + value + "\"");
+      }
+    }
+
+    /**
+     * Returns the wrapped value; assumes all wrapper messages contain a single field.
+     * @param value wrapper message like {@link Int32Value} or {@link StringValue}
+     * @return the wrapped object
+     */
+    private Object getWrappedValue(Object value) {
+      Message valueAsMessage = (Message) value;
+      return valueAsMessage.getField(valueAsMessage.getDescriptorForType().getFields().get(0));
+    }
+
+    private String getNamespace(Descriptors.FileDescriptor fd, Descriptors.Descriptor containing) {
+      DescriptorProtos.FileOptions fileOptions = fd.getOptions();
+      String classPackage = fileOptions.hasJavaPackage() ? fileOptions.getJavaPackage() : fd.getPackage();
+      String outer = "";
+      if (!fileOptions.getJavaMultipleFiles()) {
+        if (fileOptions.hasJavaOuterClassname()) {
+          outer = fileOptions.getJavaOuterClassname();
+        } else {
+          // no explicit outer classname: derive it from the proto file name, matching protobuf's Java codegen convention
+          outer = new File(fd.getName()).getName();
+          outer = outer.substring(0, outer.lastIndexOf('.'));
+          outer = toCamelCase(outer);

Review Comment:
   why do we need to do this? is it a convention?



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/ProtoConversionUtil.java:
##########
@@ -0,0 +1,393 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.sources.helpers;
+
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieException;
+
+import com.google.protobuf.BoolValue;
+import com.google.protobuf.ByteString;
+import com.google.protobuf.BytesValue;
+import com.google.protobuf.DescriptorProtos;
+import com.google.protobuf.Descriptors;
+import com.google.protobuf.DoubleValue;
+import com.google.protobuf.FloatValue;
+import com.google.protobuf.Int32Value;
+import com.google.protobuf.Int64Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.StringValue;
+import com.google.protobuf.UInt32Value;
+import com.google.protobuf.UInt64Value;
+import org.apache.avro.Schema;
+import org.apache.avro.generic.GenericData;
+import org.apache.avro.generic.GenericFixed;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.avro.util.Utf8;
+
+import java.io.File;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.function.Function;
+
+/**
+ * A utility class to help translate from Proto to Avro.
+ */
+public class ProtoConversionUtil {
+
+  /**
+   * Creates an Avro {@link Schema} for the provided class. Assumes that the class is a protobuf {@link Message}.
+   * @param clazz The protobuf class
+   * @param flattenWrappedPrimitives set to true to treat wrapped primitives as nullable fields instead of nested messages.
+   * @return An Avro schema
+   */
+  public static Schema getAvroSchemaForMessageClass(Class clazz, boolean flattenWrappedPrimitives) {
+    return AvroSupport.get().getSchema(clazz, flattenWrappedPrimitives);
+  }
+
+  /**
+   * Converts the provided {@link Message} into an avro {@link GenericRecord} with the provided schema.
+   * @param schema target schema to convert into
+   * @param message the source message to convert
+   * @return an Avro GenericRecord
+   */
+  public static GenericRecord convertToAvro(Schema schema, Message message) {
+    return AvroSupport.get().convert(schema, message);
+  }
+
+  /**
+   * This class provides support for generating schemas and converting from proto to avro. We don't directly use Avro's ProtobufData class so we can:
+   * 1. Customize how schemas are generated for protobufs. We treat Enums as strings and provide an option to treat wrapped primitives like {@link Int32Value} and {@link StringValue} as messages
+   * (default behavior) or as nullable versions of those primitives.
+   * 2. Convert directly from a protobuf {@link Message} to a {@link GenericRecord} while properly handling enums and wrapped primitives mentioned above.
+   */
+  private static class AvroSupport {
+    private static final AvroSupport INSTANCE = new AvroSupport();
+    private static final Map<Pair<Class, Boolean>, Schema> SCHEMA_CACHE = new ConcurrentHashMap<>();
+    private static final Map<Pair<Schema, Descriptors.Descriptor>, Descriptors.FieldDescriptor[]> FIELD_CACHE = new ConcurrentHashMap<>();
+
+    private static final Schema STRINGS = Schema.create(Schema.Type.STRING);
+
+    private static final Schema NULL = Schema.create(Schema.Type.NULL);
+    private static final Map<Descriptors.Descriptor, Schema.Type> WRAPPER_DESCRIPTORS_TO_TYPE = getWrapperDescriptorsToType();
+
+    private static Map<Descriptors.Descriptor, Schema.Type> getWrapperDescriptorsToType() {
+      Map<Descriptors.Descriptor, Schema.Type> wrapperDescriptorsToType = new HashMap<>();
+      wrapperDescriptorsToType.put(StringValue.getDescriptor(), Schema.Type.STRING);
+      wrapperDescriptorsToType.put(Int32Value.getDescriptor(), Schema.Type.INT);
+      wrapperDescriptorsToType.put(UInt32Value.getDescriptor(), Schema.Type.INT);
+      wrapperDescriptorsToType.put(Int64Value.getDescriptor(), Schema.Type.LONG);
+      wrapperDescriptorsToType.put(UInt64Value.getDescriptor(), Schema.Type.LONG);
+      wrapperDescriptorsToType.put(BoolValue.getDescriptor(), Schema.Type.BOOLEAN);
+      wrapperDescriptorsToType.put(BytesValue.getDescriptor(), Schema.Type.BYTES);
+      wrapperDescriptorsToType.put(DoubleValue.getDescriptor(), Schema.Type.DOUBLE);
+      wrapperDescriptorsToType.put(FloatValue.getDescriptor(), Schema.Type.FLOAT);
+      return wrapperDescriptorsToType;
+    }
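For illustration, the flattening option changes how a wrapper field shows up in the generated Avro schema. A hypothetical `google.protobuf.Int64Value` field named `total` maps to a nullable nested record by default, or to a nullable primitive when `flattenWrappedPrimitives` is true (field and record names below are illustrative sketches, not generated output):

```json
[
  {"name": "total", "type": ["null", {"type": "record", "name": "Int64Value", "fields": [{"name": "value", "type": "long"}]}], "default": null},
  {"name": "total", "type": ["null", "long"], "default": null}
]
```

The first entry is the default (nested-record) shape; the second is the flattened shape.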
+
+    private AvroSupport() {
+    }
+
+    public static AvroSupport get() {
+      return INSTANCE;
+    }
+
+    public GenericRecord convert(Schema schema, Message message) {
+      return (GenericRecord) convertObject(schema, message);
+    }
+
+    public Schema getSchema(Class c, boolean flattenWrappedPrimitives) {
+      return SCHEMA_CACHE.computeIfAbsent(Pair.of(c, flattenWrappedPrimitives), key -> {
+        try {
+          Object descriptor = c.getMethod("getDescriptor").invoke(null);
+          if (c.isEnum()) {
+            return getEnumSchema((Descriptors.EnumDescriptor) descriptor);
+          } else {
+            return getMessageSchema((Descriptors.Descriptor) descriptor, new HashMap<>(), flattenWrappedPrimitives);
+          }
+        } catch (Exception e) {
+          throw new RuntimeException(e);
+        }
+      });
+    }
+
+    private Schema getEnumSchema(Descriptors.EnumDescriptor d) {
+      List<String> symbols = new ArrayList<>(d.getValues().size());
+      for (Descriptors.EnumValueDescriptor e : d.getValues()) {
+        symbols.add(e.getName());
+      }
+      return Schema.createEnum(d.getName(), null, getNamespace(d.getFile(), d.getContainingType()), symbols);
+    }
+
+    private Schema getMessageSchema(Descriptors.Descriptor descriptor, Map<Descriptors.Descriptor, Schema> seen, boolean flattenWrappedPrimitives) {
+      if (seen.containsKey(descriptor)) {
+        return seen.get(descriptor);
+      }
+      Schema result = Schema.createRecord(descriptor.getName(), null,
+          getNamespace(descriptor.getFile(), descriptor.getContainingType()), false);
+
+      seen.put(descriptor, result);
+
+      List<Schema.Field> fields = new ArrayList<>(descriptor.getFields().size());
+      for (Descriptors.FieldDescriptor f : descriptor.getFields()) {
+        fields.add(new Schema.Field(f.getName(), getFieldSchema(f, seen, flattenWrappedPrimitives), null, getDefault(f)));
+      }
+      result.setFields(fields);
+      return result;
+    }
+
+    private Schema getFieldSchema(Descriptors.FieldDescriptor f, Map<Descriptors.Descriptor, Schema> seen, boolean flattenWrappedPrimitives) {
+      Function<Schema, Schema> schemaFinalizer = f.isRepeated() ? Schema::createArray : Function.identity();
+      switch (f.getType()) {
+        case BOOL:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.BOOLEAN));
+        case FLOAT:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.FLOAT));
+        case DOUBLE:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.DOUBLE));
+        case ENUM:
+          return schemaFinalizer.apply(getEnumSchema(f.getEnumType()));
+        case STRING:
+          Schema s = Schema.create(Schema.Type.STRING);
+          GenericData.setStringType(s, GenericData.StringType.String);
+          return schemaFinalizer.apply(s);
+        case BYTES:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.BYTES));
+        case INT32:
+        case SINT32:
+        case FIXED32:
+        case SFIXED32:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.INT));
+        case UINT32:
+        case INT64:
+        case UINT64:
+        case SINT64:
+        case FIXED64:
+        case SFIXED64:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.LONG));
+        case MESSAGE:
+          if (flattenWrappedPrimitives && WRAPPER_DESCRIPTORS_TO_TYPE.containsKey(f.getMessageType())) {
+            // all wrapper types have a single field, so we can get the first field in the message's schema
+            return schemaFinalizer.apply(Schema.createUnion(Arrays.asList(NULL, getFieldSchema(f.getMessageType().getFields().get(0), seen, flattenWrappedPrimitives))));
+          }
+          // if message field is repeated (like a list), elements are non-null
+          if (f.isRepeated()) {
+            return schemaFinalizer.apply(getMessageSchema(f.getMessageType(), seen, flattenWrappedPrimitives));
+          }
+          // otherwise we create a nullable field schema
+          return schemaFinalizer.apply(Schema.createUnion(Arrays.asList(NULL, getMessageSchema(f.getMessageType(), seen, flattenWrappedPrimitives))));
+        case GROUP: // groups are deprecated
+        default:
+          throw new RuntimeException("Unexpected type: " + f.getType());
+      }
+    }
+
+    private Object getDefault(Descriptors.FieldDescriptor f) {
+      if (f.isRepeated()) { // empty array as repeated fields' default value
+        return Collections.emptyList();
+      }
+
+      switch (f.getType()) { // generate default for type
+        case BOOL:
+          return false;
+        case FLOAT:
+          return 0.0F;
+        case DOUBLE:
+          return 0.0D;
+        case INT32:
+        case UINT32:
+        case SINT32:
+        case FIXED32:
+        case SFIXED32:
+        case INT64:
+        case UINT64:
+        case SINT64:
+        case FIXED64:
+        case SFIXED64:
+          return 0;
+        case STRING:
+        case BYTES:
+          return "";
+        case ENUM:
+          return f.getEnumType().getValues().get(0).getName();
+        case MESSAGE:
+          return Schema.Field.NULL_VALUE;
+        case GROUP: // groups are deprecated
+        default:
+          throw new RuntimeException("Unexpected type: " + f.getType());
+      }
+    }
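Putting the schema and default generation together: for a hypothetical message `message Sample { int32 id = 1; repeated string tags = 2; }`, the generated Avro record would look roughly like this (namespace omitted for brevity):

```json
{
  "type": "record",
  "name": "Sample",
  "fields": [
    {"name": "id", "type": "int", "default": 0},
    {"name": "tags", "type": {"type": "array", "items": "string"}, "default": []}
  ]
}
```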
+
+    private Descriptors.FieldDescriptor[] getOrderedFields(Schema schema, Message message) {
+      Descriptors.Descriptor descriptor = message.getDescriptorForType();
+      return FIELD_CACHE.computeIfAbsent(Pair.of(schema, descriptor), key -> {
+        Descriptors.FieldDescriptor[] fields = new Descriptors.FieldDescriptor[key.getLeft().getFields().size()];
+        for (Schema.Field f : key.getLeft().getFields()) {
+          fields[f.pos()] = key.getRight().findFieldByName(f.name());
+        }
+        return fields;
+      });
+    }
+
+    private Object convertObject(Schema schema, Object value) {
+      if (value == null) {
+        return null;
+      }
+
+      switch (schema.getType()) {
+        case ARRAY:
+          List<Object> arrayValue = (List<Object>) value;
+          List<Object> arrayCopy = new GenericData.Array<>(arrayValue.size(), schema);
+          for (Object obj : arrayValue) {
+            arrayCopy.add(convertObject(schema.getElementType(), obj));
+          }
+          return arrayCopy;
+        case BYTES:
+          ByteBuffer byteBufferValue;
+          if (value instanceof ByteString) {
+            byteBufferValue = ((ByteString) value).asReadOnlyByteBuffer();
+          } else if (value instanceof Message) {
+            byteBufferValue = ((ByteString) getWrappedValue(value)).asReadOnlyByteBuffer();
+          } else {
+            byteBufferValue = (ByteBuffer) value;
+          }
+          int start = byteBufferValue.position();
+          int length = byteBufferValue.limit() - start;
+          byte[] bytesCopy = new byte[length];
+          byteBufferValue.get(bytesCopy, 0, length);
+          byteBufferValue.position(start);
+          return ByteBuffer.wrap(bytesCopy, 0, length);
+        case ENUM:
+          return GenericData.get().createEnum(value.toString(), schema);
+        case FIXED:
+          return GenericData.get().createFixed(null, ((GenericFixed) value).bytes(), schema);
+        case BOOLEAN:
+        case DOUBLE:
+        case FLOAT:
+        case INT:
+          if (value instanceof Message) {
+            return getWrappedValue(value);
+          }
+          return value; // immutable
+        case LONG:
+          Object tmpValue = value;
+          if (value instanceof Message) {
+            tmpValue = getWrappedValue(value);
+          }
+          // unsigned 32-bit ints are read as Integers and need to be widened to long
+          if (tmpValue instanceof Integer) {
+            tmpValue = ((Integer) tmpValue).longValue();
+          }
+          return tmpValue;
+        case MAP:
+          Map<Object, Object> mapValue = (Map) value;
+          Map<Object, Object> mapCopy = new HashMap<>(mapValue.size());
+          for (Map.Entry<Object, Object> entry : mapValue.entrySet()) {
+            mapCopy.put(convertObject(STRINGS, entry.getKey()), convertObject(schema.getValueType(), entry.getValue()));
+          }
+          return mapCopy;
+        case NULL:
+          return null;
+        case RECORD:
+          GenericData.Record newRecord = new GenericData.Record(schema);
+          Message messageValue = (Message) value;
+          for (Schema.Field f : schema.getFields()) {
+            int position = f.pos();
+            Descriptors.FieldDescriptor fieldDescriptor = getOrderedFields(schema, messageValue)[position];
+            Object convertedValue;
+            if (fieldDescriptor.getType() == Descriptors.FieldDescriptor.Type.MESSAGE && !fieldDescriptor.isRepeated() && !messageValue.hasField(fieldDescriptor)) {
+              convertedValue = null;
+            } else {
+              convertedValue = convertObject(f.schema(), messageValue.getField(fieldDescriptor));
+            }
+            newRecord.put(position, convertedValue);
+          }
+          return newRecord;
+        case STRING:
+          if (value instanceof String) {
+            return value;
+          } else if (value instanceof StringValue) {
+            return ((StringValue) value).getValue();
+          } else {
+            return new Utf8(value.toString());
+          }
+        case UNION:
+          // Unions only occur for nullable fields when working with proto + avro, and null is always the first schema in the union
+          return convertObject(schema.getTypes().get(1), value);
+        default:
+          throw new HoodieException("Proto to Avro conversion failed for schema \"" + schema + "\" and value \"" + value + "\"");
+      }
+    }
+
+    /**
+     * Returns the wrapped value; assumes all wrapper messages contain a single field.
+     * @param value wrapper message like {@link Int32Value} or {@link StringValue}
+     * @return the wrapped object
+     */
+    private Object getWrappedValue(Object value) {
+      Message valueAsMessage = (Message) value;
+      return valueAsMessage.getField(valueAsMessage.getDescriptorForType().getFields().get(0));
+    }
+
+    private String getNamespace(Descriptors.FileDescriptor fd, Descriptors.Descriptor containing) {
+      DescriptorProtos.FileOptions fileOptions = fd.getOptions();
+      String classPackage = fileOptions.hasJavaPackage() ? fileOptions.getJavaPackage() : fd.getPackage();
+      String outer = "";
+      if (!fileOptions.getJavaMultipleFiles()) {
+        if (fileOptions.hasJavaOuterClassname()) {
+          outer = fileOptions.getJavaOuterClassname();
+        } else {
+          // no explicit outer classname: derive it from the proto file name, matching protobuf's Java codegen convention
+          outer = new File(fd.getName()).getName();
+          outer = outer.substring(0, outer.lastIndexOf('.'));
+          outer = toCamelCase(outer);
+        }
+      }
+      StringBuilder inner = new StringBuilder();
+      while (containing != null) {
+        if (inner.length() == 0) {
+          inner.insert(0, containing.getName());
+        } else {
+          inner.insert(0, containing.getName() + "$");
+        }
+        containing = containing.getContainingType();
+      }
+      String d1 = (!outer.isEmpty() || inner.length() != 0 ? "." : "");
+      String d2 = (!outer.isEmpty() && inner.length() != 0 ? "$" : "");
+      return classPackage + d1 + outer + d2 + inner;
+    }
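The delimiter logic at the end of `getNamespace` is easiest to see in isolation. Below is a self-contained sketch of the same string assembly (class and input names are hypothetical; the real inputs come from the file descriptor):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class NamespaceSketch {
  // classPackage: java_package option (or the proto package); outer: outer classname,
  // empty when java_multiple_files is set; innerParts: containing message names, outermost first.
  static String buildNamespace(String classPackage, String outer, List<String> innerParts) {
    String inner = String.join("$", innerParts);
    // "." separates the package from the class names; "$" separates outer and nested classes
    String d1 = (!outer.isEmpty() || !inner.isEmpty()) ? "." : "";
    String d2 = (!outer.isEmpty() && !inner.isEmpty()) ? "$" : "";
    return classPackage + d1 + outer + d2 + inner;
  }

  public static void main(String[] args) {
    // nested message Parent.Child in a file compiled to outer class SampleProto
    System.out.println(buildNamespace("com.example", "SampleProto", Arrays.asList("Parent", "Child")));
    // top-level lookup with java_multiple_files = true (no outer class, no nesting)
    System.out.println(buildNamespace("com.example", "", Collections.emptyList()));
  }
}
```

With an outer class and nested containing types this yields names like `com.example.SampleProto$Parent$Child`, which lines up with protobuf's generated Java class naming.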
+
+    private String toCamelCase(String s) {

Review Comment:
   can move this method to StringUtils



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/ProtoConversionUtil.java:
##########
@@ -0,0 +1,393 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.sources.helpers;
+
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieException;
+
+import com.google.protobuf.BoolValue;
+import com.google.protobuf.ByteString;
+import com.google.protobuf.BytesValue;
+import com.google.protobuf.DescriptorProtos;
+import com.google.protobuf.Descriptors;
+import com.google.protobuf.DoubleValue;
+import com.google.protobuf.FloatValue;
+import com.google.protobuf.Int32Value;
+import com.google.protobuf.Int64Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.StringValue;
+import com.google.protobuf.UInt32Value;
+import com.google.protobuf.UInt64Value;
+import org.apache.avro.Schema;
+import org.apache.avro.generic.GenericData;
+import org.apache.avro.generic.GenericFixed;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.avro.util.Utf8;
+
+import java.io.File;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.function.Function;
+
+/**
+ * A utility class to help translate from Proto to Avro.
+ */
+public class ProtoConversionUtil {
+
+  /**
+   * Creates an Avro {@link Schema} for the provided class. Assumes that the class is a protobuf {@link Message}.
+   * @param clazz The protobuf class
+   * @param flattenWrappedPrimitives set to true to treat wrapped primitives like nullable fields instead of nested messages.
+   * @return An Avro schema
+   */
+  public static Schema getAvroSchemaForMessageClass(Class clazz, boolean flattenWrappedPrimitives) {
+    return AvroSupport.get().getSchema(clazz, flattenWrappedPrimitives);
+  }
+
+  /**
+   * Converts the provided {@link Message} into an avro {@link GenericRecord} with the provided schema.
+   * @param schema target schema to convert into
+   * @param message the source message to convert
+   * @return an Avro GenericRecord
+   */
+  public static GenericRecord convertToAvro(Schema schema, Message message) {
+    return AvroSupport.get().convert(schema, message);
+  }
+
+  /**
+   * This class provides support for generating schemas and converting from proto to avro. We don't directly use Avro's ProtobufData class so we can:
+   * 1. Customize how schemas are generated for protobufs. We treat Enums as strings and provide an option to treat wrapped primitives like {@link Int32Value} and {@link StringValue} as messages
+   * (default behavior) or as nullable versions of those primitives.
+   * 2. Convert directly from a protobuf {@link Message} to a {@link GenericRecord} while properly handling enums and wrapped primitives mentioned above.
+   */
+  private static class AvroSupport {
+    private static final AvroSupport INSTANCE = new AvroSupport();
+    private static final Map<Pair<Class, Boolean>, Schema> SCHEMA_CACHE = new ConcurrentHashMap<>();
+    private static final Map<Pair<Schema, Descriptors.Descriptor>, Descriptors.FieldDescriptor[]> FIELD_CACHE = new ConcurrentHashMap<>();
+
+
+    private static final Schema STRINGS = Schema.create(Schema.Type.STRING);
+
+    private static final Schema NULL = Schema.create(Schema.Type.NULL);
+    private static final Map<Descriptors.Descriptor, Schema.Type> WRAPPER_DESCRIPTORS_TO_TYPE = getWrapperDescriptorsToType();
+
+    private static Map<Descriptors.Descriptor, Schema.Type> getWrapperDescriptorsToType() {
+      Map<Descriptors.Descriptor, Schema.Type> wrapperDescriptorsToType = new HashMap<>();
+      wrapperDescriptorsToType.put(StringValue.getDescriptor(), Schema.Type.STRING);
+      wrapperDescriptorsToType.put(Int32Value.getDescriptor(), Schema.Type.INT);
+      wrapperDescriptorsToType.put(UInt32Value.getDescriptor(), Schema.Type.INT);
+      wrapperDescriptorsToType.put(Int64Value.getDescriptor(), Schema.Type.LONG);
+      wrapperDescriptorsToType.put(UInt64Value.getDescriptor(), Schema.Type.LONG);
+      wrapperDescriptorsToType.put(BoolValue.getDescriptor(), Schema.Type.BOOLEAN);
+      wrapperDescriptorsToType.put(BytesValue.getDescriptor(), Schema.Type.BYTES);
+      wrapperDescriptorsToType.put(DoubleValue.getDescriptor(), Schema.Type.DOUBLE);
+      wrapperDescriptorsToType.put(FloatValue.getDescriptor(), Schema.Type.FLOAT);
+      return wrapperDescriptorsToType;
+    }
+
+    private AvroSupport() {
+    }
+
+    public static AvroSupport get() {
+      return INSTANCE;
+    }
+
+    public GenericRecord convert(Schema schema, Message message) {
+      return (GenericRecord) convertObject(schema, message);
+    }
+
+    public Schema getSchema(Class c, boolean flattenWrappedPrimitives) {
+      return SCHEMA_CACHE.computeIfAbsent(Pair.of(c, flattenWrappedPrimitives), key -> {
+        try {
+          Object descriptor = c.getMethod("getDescriptor").invoke(null);
+          if (c.isEnum()) {
+            return getEnumSchema((Descriptors.EnumDescriptor) descriptor);
+          } else {
+            return getMessageSchema((Descriptors.Descriptor) descriptor, new HashMap<>(), flattenWrappedPrimitives);
+          }
+        } catch (Exception e) {
+          throw new RuntimeException(e);
+        }
+      });
+    }
+
+    private Schema getEnumSchema(Descriptors.EnumDescriptor d) {
+      List<String> symbols = new ArrayList<>(d.getValues().size());
+      for (Descriptors.EnumValueDescriptor e : d.getValues()) {
+        symbols.add(e.getName());
+      }
+      return Schema.createEnum(d.getName(), null, getNamespace(d.getFile(), d.getContainingType()), symbols);
+    }
+
+    private Schema getMessageSchema(Descriptors.Descriptor descriptor, Map<Descriptors.Descriptor, Schema> seen, boolean flattenWrappedPrimitives) {
+      if (seen.containsKey(descriptor)) {
+        return seen.get(descriptor);
+      }
+      Schema result = Schema.createRecord(descriptor.getName(), null,
+          getNamespace(descriptor.getFile(), descriptor.getContainingType()), false);
+
+      seen.put(descriptor, result);
+
+      List<Schema.Field> fields = new ArrayList<>(descriptor.getFields().size());
+      for (Descriptors.FieldDescriptor f : descriptor.getFields()) {
+        fields.add(new Schema.Field(f.getName(), getFieldSchema(f, seen, flattenWrappedPrimitives), null, getDefault(f)));
+      }
+      result.setFields(fields);
+      return result;
+    }
+
+    private Schema getFieldSchema(Descriptors.FieldDescriptor f, Map<Descriptors.Descriptor, Schema> seen, boolean flattenWrappedPrimitives) {
+      Function<Schema, Schema> schemaFinalizer = f.isRepeated() ? Schema::createArray : Function.identity();
+      switch (f.getType()) {
+        case BOOL:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.BOOLEAN));
+        case FLOAT:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.FLOAT));
+        case DOUBLE:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.DOUBLE));
+        case ENUM:
+          return schemaFinalizer.apply(getEnumSchema(f.getEnumType()));
+        case STRING:
+          Schema s = Schema.create(Schema.Type.STRING);
+          GenericData.setStringType(s, GenericData.StringType.String);
+          return schemaFinalizer.apply(s);
+        case BYTES:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.BYTES));
+        case INT32:
+        case SINT32:
+        case FIXED32:
+        case SFIXED32:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.INT));
+        case UINT32:
+        case INT64:
+        case UINT64:
+        case SINT64:
+        case FIXED64:
+        case SFIXED64:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.LONG));
+        case MESSAGE:
+          if (flattenWrappedPrimitives && WRAPPER_DESCRIPTORS_TO_TYPE.containsKey(f.getMessageType())) {
+            // all wrapper types have a single field, so we can get the first field in the message's schema
+            return schemaFinalizer.apply(Schema.createUnion(Arrays.asList(NULL, getFieldSchema(f.getMessageType().getFields().get(0), seen, flattenWrappedPrimitives))));
+          }
+          // if message field is repeated (like a list), elements are non-null
+          if (f.isRepeated()) {
+            return schemaFinalizer.apply(getMessageSchema(f.getMessageType(), seen, flattenWrappedPrimitives));
+          }
+          // otherwise we create a nullable field schema
+          return schemaFinalizer.apply(Schema.createUnion(Arrays.asList(NULL, getMessageSchema(f.getMessageType(), seen, flattenWrappedPrimitives))));
+        case GROUP: // groups are deprecated
+        default:
+          throw new RuntimeException("Unexpected type: " + f.getType());
+      }
+    }
+
+    private Object getDefault(Descriptors.FieldDescriptor f) {
+      if (f.isRepeated()) { // empty array as repeated fields' default value
+        return Collections.emptyList();
+      }
+
+      switch (f.getType()) { // generate default for type
+        case BOOL:
+          return false;
+        case FLOAT:
+          return 0.0F;
+        case DOUBLE:
+          return 0.0D;
+        case INT32:
+        case UINT32:
+        case SINT32:
+        case FIXED32:
+        case SFIXED32:
+        case INT64:
+        case UINT64:
+        case SINT64:
+        case FIXED64:
+        case SFIXED64:
+          return 0;
+        case STRING:
+        case BYTES:
+          return "";
+        case ENUM:
+          return f.getEnumType().getValues().get(0).getName();
+        case MESSAGE:
+          return Schema.Field.NULL_VALUE;
+        case GROUP: // groups are deprecated
+        default:
+          throw new RuntimeException("Unexpected type: " + f.getType());
+      }
+    }
+
+    private Descriptors.FieldDescriptor[] getOrderedFields(Schema schema, Message message) {
+      Descriptors.Descriptor descriptor = message.getDescriptorForType();
+      return FIELD_CACHE.computeIfAbsent(Pair.of(schema, descriptor), key -> {
+        Descriptors.FieldDescriptor[] fields = new Descriptors.FieldDescriptor[key.getLeft().getFields().size()];
+        for (Schema.Field f : key.getLeft().getFields()) {
+          fields[f.pos()] = key.getRight().findFieldByName(f.name());
+        }
+        return fields;
+      });
+    }
+
+    private Object convertObject(Schema schema, Object value) {
+      if (value == null) {
+        return null;
+      }
+
+      switch (schema.getType()) {
+        case ARRAY:
+          List<Object> arrayValue = (List<Object>) value;
+          List<Object> arrayCopy = new GenericData.Array<>(arrayValue.size(), schema);
+          for (Object obj : arrayValue) {
+            arrayCopy.add(convertObject(schema.getElementType(), obj));
+          }
+          return arrayCopy;
+        case BYTES:
+          ByteBuffer byteBufferValue;
+          if (value instanceof ByteString) {
+            byteBufferValue = ((ByteString) value).asReadOnlyByteBuffer();
+          } else if (value instanceof Message) {
+            byteBufferValue = ((ByteString) getWrappedValue(value)).asReadOnlyByteBuffer();
+          } else {
+            byteBufferValue = (ByteBuffer) value;
+          }
+          int start = byteBufferValue.position();
+          int length = byteBufferValue.limit() - start;
+          byte[] bytesCopy = new byte[length];
+          byteBufferValue.get(bytesCopy, 0, length);
+          byteBufferValue.position(start);
+          return ByteBuffer.wrap(bytesCopy, 0, length);
+        case ENUM:
+          return GenericData.get().createEnum(value.toString(), schema);
+        case FIXED:
+          return GenericData.get().createFixed(null, ((GenericFixed) value).bytes(), schema);
+        case BOOLEAN:
+        case DOUBLE:
+        case FLOAT:
+        case INT:
+          if (value instanceof Message) {
+            return getWrappedValue(value);
+          }
+          return value; // immutable
+        case LONG:
+          Object tmpValue = value;
+          if (value instanceof Message) {
+            tmpValue = getWrappedValue(value);
+          }
+          // unsigned 32-bit ints need to be widened to long
+          if (tmpValue instanceof Integer) {
+            tmpValue = ((Integer) tmpValue).longValue();
+          }
+          return tmpValue;
+        case MAP:
+          Map<Object, Object> mapValue = (Map) value;
+          Map<Object, Object> mapCopy = new HashMap<>(mapValue.size());
+          for (Map.Entry<Object, Object> entry : mapValue.entrySet()) {
+            mapCopy.put(convertObject(STRINGS, entry.getKey()), convertObject(schema.getValueType(), entry.getValue()));
+          }
+          return mapCopy;
+        case NULL:
+          return null;
+        case RECORD:
+          GenericData.Record newRecord = new GenericData.Record(schema);
+          Message messageValue = (Message) value;
+          for (Schema.Field f : schema.getFields()) {
+            int position = f.pos();
+            Descriptors.FieldDescriptor fieldDescriptor = getOrderedFields(schema, messageValue)[position];
+            Object convertedValue;
+            if (fieldDescriptor.getType() == Descriptors.FieldDescriptor.Type.MESSAGE && !fieldDescriptor.isRepeated() && !messageValue.hasField(fieldDescriptor)) {
+              convertedValue = null;
+            } else {
+              convertedValue = convertObject(f.schema(), messageValue.getField(fieldDescriptor));
+            }
+            newRecord.put(position, convertedValue);
+          }
+          return newRecord;
+        case STRING:
+          if (value instanceof String) {
+            return value;
+          } else if (value instanceof StringValue) {
+            return ((StringValue) value).getValue();
+          } else {
+            return new Utf8(value.toString());
+          }
+        case UNION:
+          // Unions only occur for nullable fields when converting from proto to avro, and null is always the first schema in the union
+          return convertObject(schema.getTypes().get(1), value);
+        default:
+          throw new HoodieException("Proto to Avro conversion failed for schema \"" + schema + "\" and value \"" + value + "\"");
+      }
+    }
+
+    /**
+     * Returns the wrapped value; assumes all wrapper messages contain a single field.
+     * @param value wrapper message like {@link Int32Value} or {@link StringValue}
+     * @return the wrapped object
+     */
+    private Object getWrappedValue(Object value) {
+      Message valueAsMessage = (Message) value;
+      return valueAsMessage.getField(valueAsMessage.getDescriptorForType().getFields().get(0));
+    }
+
+    private String getNamespace(Descriptors.FileDescriptor fd, Descriptors.Descriptor containing) {
+      DescriptorProtos.FileOptions fileOptions = fd.getOptions();
+      String classPackage = fileOptions.hasJavaPackage() ? fileOptions.getJavaPackage() : fd.getPackage();
+      String outer = "";
+      if (!fileOptions.getJavaMultipleFiles()) {
+        if (fileOptions.hasJavaOuterClassname()) {
+          outer = fileOptions.getJavaOuterClassname();
+        } else {
+          outer = new File(fd.getName()).getName();
+          outer = outer.substring(0, outer.lastIndexOf('.'));
+          outer = toCamelCase(outer);
+        }
+      }
+      StringBuilder inner = new StringBuilder();
+      while (containing != null) {
+        if (inner.length() == 0) {
+          inner.insert(0, containing.getName());
+        } else {
+          inner.insert(0, containing.getName() + "$");
+        }
+        containing = containing.getContainingType();
+      }
+      String d1 = (!outer.isEmpty() || inner.length() != 0 ? "." : "");
+      String d2 = (!outer.isEmpty() && inner.length() != 0 ? "$" : "");
+      return classPackage + d1 + outer + d2 + inner;
+    }
+
+    private String toCamelCase(String s) {
+      String[] parts = s.split("_");
+      StringBuilder camelCaseString = new StringBuilder(s.length());
+      for (String part : parts) {
+        camelCaseString.append(cap(part));
+      }
+      return camelCaseString.toString();
+    }
+
+    private String cap(String s) {
+      return s.substring(0, 1).toUpperCase() + s.substring(1).toLowerCase();
+    }
+  }
+}

Review Comment:
   nit: newline



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/schema/ProtoClassBasedSchemaProvider.java:
##########
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+package org.apache.hudi.utilities.schema;
+
+import org.apache.hudi.DataSourceUtils;
+import org.apache.hudi.common.config.TypedProperties;
+import org.apache.hudi.common.util.ReflectionUtils;
+import org.apache.hudi.exception.HoodieException;
+import org.apache.hudi.utilities.sources.helpers.ProtoConversionUtil;
+
+import org.apache.avro.Schema;
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+import org.apache.spark.api.java.JavaSparkContext;
+
+import java.util.Arrays;
+
+/**
+ * A schema provider that takes in a class name for a generated protobuf class that is on the classpath.
+ */
+public class ProtoClassBasedSchemaProvider extends SchemaProvider {
+  private static final Logger LOG = LogManager.getLogger(ProtoClassBasedSchemaProvider.class);

Review Comment:
   not being used anywhere?



##########
hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/BaseTestKafkaSource.java:
##########
@@ -0,0 +1,280 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.sources;
+
+import org.apache.hudi.AvroConversionUtils;
+import org.apache.hudi.common.config.TypedProperties;
+import org.apache.hudi.common.util.Option;
+import org.apache.hudi.exception.HoodieNotSupportedException;
+import org.apache.hudi.testutils.SparkClientFunctionalTestHarness;
+import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerMetrics;
+import org.apache.hudi.utilities.deltastreamer.SourceFormatAdapter;
+import org.apache.hudi.utilities.exception.HoodieDeltaStreamerException;
+import org.apache.hudi.utilities.schema.SchemaProvider;
+import org.apache.hudi.utilities.sources.helpers.KafkaOffsetGen;
+
+import org.apache.avro.generic.GenericRecord;
+import org.apache.kafka.clients.consumer.ConsumerConfig;
+import org.apache.kafka.clients.consumer.KafkaConsumer;
+import org.apache.kafka.clients.consumer.OffsetAndMetadata;
+import org.apache.kafka.common.TopicPartition;
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.sql.Dataset;
+import org.apache.spark.sql.Row;
+import org.apache.spark.streaming.kafka010.KafkaTestUtils;
+import org.junit.jupiter.api.AfterAll;
+import org.junit.jupiter.api.BeforeAll;
+import org.junit.jupiter.api.Test;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Properties;
+
+import static org.apache.hudi.utilities.sources.helpers.KafkaOffsetGen.Config.ENABLE_KAFKA_COMMIT_OFFSET;
+import static org.junit.jupiter.api.Assertions.assertEquals;
+import static org.junit.jupiter.api.Assertions.assertNotNull;
+import static org.junit.jupiter.api.Assertions.assertThrows;
+import static org.mockito.Mockito.mock;
+
+/**
+ * Generic tests for all {@link KafkaSource} to ensure all implementations properly handle offsets, fetch limits, failure modes, etc.
+ */
+abstract class BaseTestKafkaSource extends SparkClientFunctionalTestHarness {
+  protected static final String TEST_TOPIC_PREFIX = "hoodie_test_";
+  protected static KafkaTestUtils testUtils;
+
+  protected final HoodieDeltaStreamerMetrics metrics = mock(HoodieDeltaStreamerMetrics.class);
+
+  protected SchemaProvider schemaProvider;
+
+  @BeforeAll
+  public static void initClass() {
+    testUtils = new KafkaTestUtils();
+    testUtils.setup();
+  }
+
+  @AfterAll
+  public static void cleanupClass() {
+    testUtils.teardown();
+  }
+
+  abstract TypedProperties createPropsForKafkaSource(String topic, Long maxEventsToReadFromKafkaSource, String resetStrategy);
+
+  abstract SourceFormatAdapter createSource(TypedProperties props);
+
+  abstract void sendMessagesToKafka(String topic, int count, int numPartitions);
+
+  @Test
+  public void testKafkaSource() {
+
+    // topic setup.
+    final String topic = TEST_TOPIC_PREFIX + "testKafkaSource";
+    testUtils.createTopic(topic, 2);
+    TypedProperties props = createPropsForKafkaSource(topic, null, "earliest");
+    SourceFormatAdapter kafkaSource = createSource(props);
+
+    // 1. Extract without any checkpoint => get all the data, respecting sourceLimit
+    assertEquals(Option.empty(), kafkaSource.fetchNewDataInAvroFormat(Option.empty(), Long.MAX_VALUE).getBatch());
+    sendMessagesToKafka(topic, 1000, 2);
+    InputBatch<JavaRDD<GenericRecord>> fetch1 = kafkaSource.fetchNewDataInAvroFormat(Option.empty(), 900);
+    assertEquals(900, fetch1.getBatch().get().count());
+    // Test Avro To DataFrame<Row> path
+    Dataset<Row> fetch1AsRows = AvroConversionUtils.createDataFrame(JavaRDD.toRDD(fetch1.getBatch().get()),
+        schemaProvider.getSourceSchema().toString(), kafkaSource.getSource().getSparkSession());
+    assertEquals(900, fetch1AsRows.count());
+
+    // 2. Produce new data, extract new data
+    sendMessagesToKafka(topic, 1000, 2);
+    InputBatch<Dataset<Row>> fetch2 =
+        kafkaSource.fetchNewDataInRowFormat(Option.of(fetch1.getCheckpointForNextBatch()), Long.MAX_VALUE);
+    assertEquals(1100, fetch2.getBatch().get().count());
+
+    // 3. Extract with previous checkpoint => gives same data back (idempotent)
+    InputBatch<JavaRDD<GenericRecord>> fetch3 =
+        kafkaSource.fetchNewDataInAvroFormat(Option.of(fetch1.getCheckpointForNextBatch()), Long.MAX_VALUE);
+    assertEquals(fetch2.getBatch().get().count(), fetch3.getBatch().get().count());
+    assertEquals(fetch2.getCheckpointForNextBatch(), fetch3.getCheckpointForNextBatch());
+    // Same using Row API
+    InputBatch<Dataset<Row>> fetch3AsRows =
+        kafkaSource.fetchNewDataInRowFormat(Option.of(fetch1.getCheckpointForNextBatch()), Long.MAX_VALUE);
+    assertEquals(fetch2.getBatch().get().count(), fetch3AsRows.getBatch().get().count());
+    assertEquals(fetch2.getCheckpointForNextBatch(), fetch3AsRows.getCheckpointForNextBatch());
+
+    // 4. Extract with latest checkpoint => no new data returned
+    InputBatch<JavaRDD<GenericRecord>> fetch4 =
+        kafkaSource.fetchNewDataInAvroFormat(Option.of(fetch2.getCheckpointForNextBatch()), Long.MAX_VALUE);
+    assertEquals(Option.empty(), fetch4.getBatch());
+    // Same using Row API
+    InputBatch<Dataset<Row>> fetch4AsRows =
+        kafkaSource.fetchNewDataInRowFormat(Option.of(fetch2.getCheckpointForNextBatch()), Long.MAX_VALUE);
+    assertEquals(Option.empty(), fetch4AsRows.getBatch());
+  }
+
+  // test case with kafka offset reset strategy
+  @Test
+  public void testKafkaSourceResetStrategy() {
+    // topic setup.
+    final String topic = TEST_TOPIC_PREFIX + "testKafkaSourceResetStrategy";
+    testUtils.createTopic(topic, 2);
+
+    TypedProperties earliestProps = createPropsForKafkaSource(topic, null, "earliest");
+    SourceFormatAdapter earliestKafkaSource = createSource(earliestProps);
+
+    TypedProperties latestProps = createPropsForKafkaSource(topic, null, "latest");
+    SourceFormatAdapter latestKafkaSource = createSource(latestProps);
+
+    // 1. Extract with no data yet in Kafka
+    // => get a checkpoint string like "hoodie_test,0:0,1:0"; the latest checkpoint should equal the earliest checkpoint
+    InputBatch<JavaRDD<GenericRecord>> earFetch0 = earliestKafkaSource.fetchNewDataInAvroFormat(Option.empty(), Long.MAX_VALUE);
+    InputBatch<JavaRDD<GenericRecord>> latFetch0 = latestKafkaSource.fetchNewDataInAvroFormat(Option.empty(), Long.MAX_VALUE);
+    assertEquals(earFetch0.getBatch(), latFetch0.getBatch());
+    assertEquals(earFetch0.getCheckpointForNextBatch(), latFetch0.getCheckpointForNextBatch());
+
+    sendMessagesToKafka(topic, 1000, 2);
+
+    // 2. Extract a new checkpoint with a null / empty string previous checkpoint
+    // => an earliest fetch with max source limit will get all of the data and an end-offset checkpoint
+    InputBatch<JavaRDD<GenericRecord>> earFetch1 = earliestKafkaSource.fetchNewDataInAvroFormat(Option.empty(), Long.MAX_VALUE);
+
+    // => [with a null previous checkpoint] a latest reset fetch will get the same end-offset checkpoint as earliest
+    InputBatch<JavaRDD<GenericRecord>> latFetch1 = latestKafkaSource.fetchNewDataInAvroFormat(Option.empty(), Long.MAX_VALUE);
+    assertEquals(earFetch1.getCheckpointForNextBatch(), latFetch1.getCheckpointForNextBatch());
+  }
+
+  @Test
+  public void testKafkaSourceInsertRecordsLessSourceLimit() {
+    // topic setup.
+    final String topic = TEST_TOPIC_PREFIX + "testKafkaSourceInsertRecordsLessSourceLimit";
+    testUtils.createTopic(topic, 2);
+    TypedProperties props = createPropsForKafkaSource(topic, Long.MAX_VALUE, "earliest");
+    SourceFormatAdapter kafkaSource = createSource(props);
+    props.setProperty("hoodie.deltastreamer.kafka.source.maxEvents", "500");
+
+    /*
+     1. maxEventsFromKafkaSourceProp set to more than generated insert records
+     and sourceLimit less than the generated insert records num.
+     */
+    sendMessagesToKafka(topic, 400, 2);
+    InputBatch<JavaRDD<GenericRecord>> fetch1 = kafkaSource.fetchNewDataInAvroFormat(Option.empty(), 300);
+    assertEquals(300, fetch1.getBatch().get().count());
+
+    /*
+     2. Produce new data, extract new data based on sourceLimit
+     and sourceLimit less than the generated insert records num.
+     */
+    sendMessagesToKafka(topic, 600, 2);
+    InputBatch<Dataset<Row>> fetch2 =
+        kafkaSource.fetchNewDataInRowFormat(Option.of(fetch1.getCheckpointForNextBatch()), 300);
+    assertEquals(300, fetch2.getBatch().get().count());
+  }
+
+  @Test
+  public void testCommitOffsetToKafka() {
+    // topic setup.
+    final String topic = TEST_TOPIC_PREFIX + "testCommitOffsetToKafka";
+    testUtils.createTopic(topic, 2);
+    List<TopicPartition> topicPartitions = new ArrayList<>();
+    TopicPartition topicPartition0 = new TopicPartition(topic, 0);
+    topicPartitions.add(topicPartition0);
+    TopicPartition topicPartition1 = new TopicPartition(topic, 1);
+    topicPartitions.add(topicPartition1);
+
+    TypedProperties props = createPropsForKafkaSource(topic, null, "earliest");
+    props.put(ENABLE_KAFKA_COMMIT_OFFSET.key(), "true");
+    SourceFormatAdapter kafkaSource = createSource(props);
+
+    // 1. Extract without any checkpoint => get all the data, respecting sourceLimit
+    assertEquals(Option.empty(), kafkaSource.fetchNewDataInAvroFormat(Option.empty(), Long.MAX_VALUE).getBatch());
+    sendMessagesToKafka(topic, 1000, 2);
+
+    InputBatch<JavaRDD<GenericRecord>> fetch1 = kafkaSource.fetchNewDataInAvroFormat(Option.empty(), 599);
+    // commit to kafka after first batch
+    kafkaSource.getSource().onCommit(fetch1.getCheckpointForNextBatch());
+    try (KafkaConsumer consumer = new KafkaConsumer(props)) {
+      consumer.assign(topicPartitions);
+
+      OffsetAndMetadata offsetAndMetadata = consumer.committed(topicPartition0);
+      assertNotNull(offsetAndMetadata);
+      assertEquals(300, offsetAndMetadata.offset());
+      offsetAndMetadata = consumer.committed(topicPartition1);
+      assertNotNull(offsetAndMetadata);
+      assertEquals(299, offsetAndMetadata.offset());
+      // end offsets will point to 500 for each partition because we consumed fewer messages in the first batch
+      Map endOffsets = consumer.endOffsets(topicPartitions);

Review Comment:
   let's define key and value types
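   For instance, the raw `Map` could be declared with explicit key and value types — Kafka's `KafkaConsumer#endOffsets` returns `Map<TopicPartition, Long>`. A minimal dependency-free sketch of the typed access pattern (the `TypedOffsetsSketch` class and its String "topic-partition" keys are hypothetical stand-ins for `TopicPartition`):

   ```java
   import java.util.HashMap;
   import java.util.Map;

   public class TypedOffsetsSketch {

       // Hypothetical stand-in for consumer.endOffsets(topicPartitions): returns
       // per-partition end offsets with explicit key/value types instead of a raw Map.
       // A plain String "topic-partition" key substitutes for Kafka's TopicPartition
       // so the sketch stays dependency-free.
       public static Map<String, Long> endOffsets() {
           Map<String, Long> offsets = new HashMap<>();
           offsets.put("hoodie_test-0", 500L);
           offsets.put("hoodie_test-1", 500L);
           return offsets;
       }

       // With generics declared, values can be read without casts.
       public static long totalEndOffset() {
           return endOffsets().values().stream().mapToLong(Long::longValue).sum();
       }

       public static void main(String[] args) {
           System.out.println(totalEndOffset());
       }
   }
   ```

   In the test itself this would read `Map<TopicPartition, Long> endOffsets = consumer.endOffsets(topicPartitions);`.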



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] the-other-tim-brown commented on a diff in pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
the-other-tim-brown commented on code in PR #6135:
URL: https://github.com/apache/hudi/pull/6135#discussion_r957524622


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/ProtoConversionUtil.java:
##########
@@ -0,0 +1,393 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.sources.helpers;
+
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieException;
+
+import com.google.protobuf.BoolValue;
+import com.google.protobuf.ByteString;
+import com.google.protobuf.BytesValue;
+import com.google.protobuf.DescriptorProtos;
+import com.google.protobuf.Descriptors;
+import com.google.protobuf.DoubleValue;
+import com.google.protobuf.FloatValue;
+import com.google.protobuf.Int32Value;
+import com.google.protobuf.Int64Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.StringValue;
+import com.google.protobuf.UInt32Value;
+import com.google.protobuf.UInt64Value;
+import org.apache.avro.Schema;
+import org.apache.avro.generic.GenericData;
+import org.apache.avro.generic.GenericFixed;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.avro.util.Utf8;
+
+import java.io.File;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.function.Function;
+
+/**
+ * A utility class to help translate from Proto to Avro.
+ */
+public class ProtoConversionUtil {
+
+  /**
+   * Creates an Avro {@link Schema} for the provided class. Assumes that the class is a protobuf {@link Message}.
+   * @param clazz The protobuf class
+   * @param flattenWrappedPrimitives set to true to treat wrapped primitives like nullable fields instead of nested messages.
+   * @return An Avro schema
+   */
+  public static Schema getAvroSchemaForMessageClass(Class clazz, boolean flattenWrappedPrimitives) {
+    return AvroSupport.get().getSchema(clazz, flattenWrappedPrimitives);
+  }
+
+  /**
+   * Converts the provided {@link Message} into an avro {@link GenericRecord} with the provided schema.
+   * @param schema target schema to convert into
+   * @param message the source message to convert
+   * @return an Avro GenericRecord
+   */
+  public static GenericRecord convertToAvro(Schema schema, Message message) {
+    return AvroSupport.get().convert(schema, message);
+  }
+
+  /**
+   * This class provides support for generating schemas and converting from proto to avro. We don't directly use Avro's ProtobufData class so we can:
+   * 1. Customize how schemas are generated for protobufs. We treat Enums as strings and provide an option to treat wrapped primitives like {@link Int32Value} and {@link StringValue} as messages
+   * (default behavior) or as nullable versions of those primitives.
+   * 2. Convert directly from a protobuf {@link Message} to a {@link GenericRecord} while properly handling enums and wrapped primitives mentioned above.
+   */
+  private static class AvroSupport {
+    private static final AvroSupport INSTANCE = new AvroSupport();
+    private static final Map<Pair<Class, Boolean>, Schema> SCHEMA_CACHE = new ConcurrentHashMap<>();
+    private static final Map<Pair<Schema, Descriptors.Descriptor>, Descriptors.FieldDescriptor[]> FIELD_CACHE = new ConcurrentHashMap<>();
+
+
+    private static final Schema STRINGS = Schema.create(Schema.Type.STRING);
+
+    private static final Schema NULL = Schema.create(Schema.Type.NULL);
+    private static final Map<Descriptors.Descriptor, Schema.Type> WRAPPER_DESCRIPTORS_TO_TYPE = getWrapperDescriptorsToType();
+
+    private static Map<Descriptors.Descriptor, Schema.Type> getWrapperDescriptorsToType() {
+      Map<Descriptors.Descriptor, Schema.Type> wrapperDescriptorsToType = new HashMap<>();
+      wrapperDescriptorsToType.put(StringValue.getDescriptor(), Schema.Type.STRING);
+      wrapperDescriptorsToType.put(Int32Value.getDescriptor(), Schema.Type.INT);
+      wrapperDescriptorsToType.put(UInt32Value.getDescriptor(), Schema.Type.INT);
+      wrapperDescriptorsToType.put(Int64Value.getDescriptor(), Schema.Type.LONG);
+      wrapperDescriptorsToType.put(UInt64Value.getDescriptor(), Schema.Type.LONG);
+      wrapperDescriptorsToType.put(BoolValue.getDescriptor(), Schema.Type.BOOLEAN);
+      wrapperDescriptorsToType.put(BytesValue.getDescriptor(), Schema.Type.BYTES);
+      wrapperDescriptorsToType.put(DoubleValue.getDescriptor(), Schema.Type.DOUBLE);
+      wrapperDescriptorsToType.put(FloatValue.getDescriptor(), Schema.Type.FLOAT);
+      return wrapperDescriptorsToType;
+    }
+
+    private AvroSupport() {
+    }
+
+    public static AvroSupport get() {
+      return INSTANCE;
+    }
+
+    public GenericRecord convert(Schema schema, Message message) {
+      return (GenericRecord) convertObject(schema, message);
+    }
+
+    public Schema getSchema(Class c, boolean flattenWrappedPrimitives) {
+      return SCHEMA_CACHE.computeIfAbsent(Pair.of(c, flattenWrappedPrimitives), key -> {
+        try {
+          Object descriptor = c.getMethod("getDescriptor").invoke(null);
+          if (c.isEnum()) {
+            return getEnumSchema((Descriptors.EnumDescriptor) descriptor);
+          } else {
+            return getMessageSchema((Descriptors.Descriptor) descriptor, new HashMap<>(), flattenWrappedPrimitives);
+          }
+        } catch (Exception e) {
+          throw new RuntimeException(e);
+        }
+      });
+    }
+
+    private Schema getEnumSchema(Descriptors.EnumDescriptor d) {
+      List<String> symbols = new ArrayList<>(d.getValues().size());
+      for (Descriptors.EnumValueDescriptor e : d.getValues()) {
+        symbols.add(e.getName());
+      }
+      return Schema.createEnum(d.getName(), null, getNamespace(d.getFile(), d.getContainingType()), symbols);
+    }
+
+    private Schema getMessageSchema(Descriptors.Descriptor descriptor, Map<Descriptors.Descriptor, Schema> seen, boolean flattenWrappedPrimitives) {
+      if (seen.containsKey(descriptor)) {
+        return seen.get(descriptor);
+      }
+      Schema result = Schema.createRecord(descriptor.getName(), null,
+          getNamespace(descriptor.getFile(), descriptor.getContainingType()), false);
+
+      seen.put(descriptor, result);
+
+      List<Schema.Field> fields = new ArrayList<>(descriptor.getFields().size());
+      for (Descriptors.FieldDescriptor f : descriptor.getFields()) {
+        fields.add(new Schema.Field(f.getName(), getFieldSchema(f, seen, flattenWrappedPrimitives), null, getDefault(f)));
+      }
+      result.setFields(fields);
+      return result;
+    }
+
+    private Schema getFieldSchema(Descriptors.FieldDescriptor f, Map<Descriptors.Descriptor, Schema> seen, boolean flattenWrappedPrimitives) {
+      Function<Schema, Schema> schemaFinalizer =  f.isRepeated() ? Schema::createArray : Function.identity();
+      switch (f.getType()) {
+        case BOOL:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.BOOLEAN));
+        case FLOAT:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.FLOAT));
+        case DOUBLE:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.DOUBLE));
+        case ENUM:
+          return schemaFinalizer.apply(getEnumSchema(f.getEnumType()));
+        case STRING:
+          Schema s = Schema.create(Schema.Type.STRING);
+          GenericData.setStringType(s, GenericData.StringType.String);
+          return schemaFinalizer.apply(s);
+        case BYTES:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.BYTES));
+        case INT32:
+        case SINT32:
+        case FIXED32:
+        case SFIXED32:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.INT));
+        case UINT32:
+        case INT64:
+        case UINT64:
+        case SINT64:
+        case FIXED64:
+        case SFIXED64:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.LONG));
+        case MESSAGE:
+          if (flattenWrappedPrimitives && WRAPPER_DESCRIPTORS_TO_TYPE.containsKey(f.getMessageType())) {
+            // all wrapper types have a single field, so we can get the first field in the message's schema
+            return schemaFinalizer.apply(Schema.createUnion(Arrays.asList(NULL, getFieldSchema(f.getMessageType().getFields().get(0), seen, flattenWrappedPrimitives))));
+          }
+          // if message field is repeated (like a list), elements are non-null
+          if (f.isRepeated()) {
+            return schemaFinalizer.apply(getMessageSchema(f.getMessageType(), seen, flattenWrappedPrimitives));
+          }
+          // otherwise we create a nullable field schema
+          return schemaFinalizer.apply(Schema.createUnion(Arrays.asList(NULL, getMessageSchema(f.getMessageType(), seen, flattenWrappedPrimitives))));
+        case GROUP: // groups are deprecated
+        default:
+          throw new RuntimeException("Unexpected type: " + f.getType());
+      }
+    }
+
+    private Object getDefault(Descriptors.FieldDescriptor f) {
+      if (f.isRepeated()) { // empty array as repeated fields' default value
+        return Collections.emptyList();
+      }
+
+      switch (f.getType()) { // generate default for type
+        case BOOL:
+          return false;
+        case FLOAT:
+          return 0.0F;
+        case DOUBLE:
+          return 0.0D;
+        case INT32:
+        case UINT32:
+        case SINT32:
+        case FIXED32:
+        case SFIXED32:
+        case INT64:
+        case UINT64:
+        case SINT64:
+        case FIXED64:
+        case SFIXED64:
+          return 0;
+        case STRING:
+        case BYTES:
+          return "";
+        case ENUM:
+          return f.getEnumType().getValues().get(0).getName();
+        case MESSAGE:
+          return Schema.Field.NULL_VALUE;
+        case GROUP: // groups are deprecated
+        default:
+          throw new RuntimeException("Unexpected type: " + f.getType());
+      }
+    }
+
+    private Descriptors.FieldDescriptor[] getOrderedFields(Schema schema, Message message) {
+      Descriptors.Descriptor descriptor = message.getDescriptorForType();
+      return FIELD_CACHE.computeIfAbsent(Pair.of(schema, descriptor), key -> {
+        Descriptors.FieldDescriptor[] fields = new Descriptors.FieldDescriptor[key.getLeft().getFields().size()];
+        for (Schema.Field f : key.getLeft().getFields()) {
+          fields[f.pos()] = key.getRight().findFieldByName(f.name());
+        }
+        return fields;
+      });
+    }
+
+    private Object convertObject(Schema schema, Object value) {
+      if (value == null) {
+        return null;
+      }
+
+      switch (schema.getType()) {
+        case ARRAY:
+          List<Object> arrayValue = (List<Object>) value;
+          List<Object> arrayCopy = new GenericData.Array<>(arrayValue.size(), schema);
+          for (Object obj : arrayValue) {
+            arrayCopy.add(convertObject(schema.getElementType(), obj));
+          }
+          return arrayCopy;
+        case BYTES:
+          ByteBuffer byteBufferValue;
+          if (value instanceof ByteString) {
+            byteBufferValue = ((ByteString) value).asReadOnlyByteBuffer();
+          } else if (value instanceof Message) {
+            byteBufferValue = ((ByteString) getWrappedValue(value)).asReadOnlyByteBuffer();
+          } else {
+            byteBufferValue = (ByteBuffer) value;
+          }
+          int start = byteBufferValue.position();
+          int length = byteBufferValue.limit() - start;
+          byte[] bytesCopy = new byte[length];
+          byteBufferValue.get(bytesCopy, 0, length);
+          byteBufferValue.position(start);
+          return ByteBuffer.wrap(bytesCopy, 0, length);
+        case ENUM:
+          return GenericData.get().createEnum(value.toString(), schema);
+        case FIXED:
+          return GenericData.get().createFixed(null, ((GenericFixed) value).bytes(), schema);
+        case BOOLEAN:
+        case DOUBLE:
+        case FLOAT:
+        case INT:
+          if (value instanceof Message) {
+            return getWrappedValue(value);
+          }
+          return value; // immutable
+        case LONG:
+          Object tmpValue = value;
+          if (value instanceof Message) {
+            tmpValue = getWrappedValue(value);
+          }
+          // unsigned ints need to be cast to long
+          if (tmpValue instanceof Integer) {
+            tmpValue = ((Integer) tmpValue).longValue();
+          }
+          return tmpValue;
+        case MAP:
+          Map<Object, Object> mapValue = (Map) value;
+          Map<Object, Object> mapCopy = new HashMap<>(mapValue.size());
+          for (Map.Entry<Object, Object> entry : mapValue.entrySet()) {
+            mapCopy.put(convertObject(STRINGS, entry.getKey()), convertObject(schema.getValueType(), entry.getValue()));
+          }
+          return mapCopy;
+        case NULL:
+          return null;
+        case RECORD:
+          GenericData.Record newRecord = new GenericData.Record(schema);
+          Message messageValue = (Message) value;
+          for (Schema.Field f : schema.getFields()) {
+            int position = f.pos();
+            Descriptors.FieldDescriptor fieldDescriptor = getOrderedFields(schema, messageValue)[position];
+            Object convertedValue;
+            if (fieldDescriptor.getType() == Descriptors.FieldDescriptor.Type.MESSAGE && !fieldDescriptor.isRepeated() && !messageValue.hasField(fieldDescriptor)) {
+              convertedValue = null;
+            } else {
+              convertedValue = convertObject(f.schema(), messageValue.getField(fieldDescriptor));
+            }
+            newRecord.put(position, convertedValue);
+          }
+          return newRecord;
+        case STRING:
+          if (value instanceof String) {
+            return value;
+          } else if (value instanceof StringValue) {
+            return ((StringValue) value).getValue();
+          } else {
+            return new Utf8(value.toString());
+          }
+        case UNION:
+          // Unions only occur for nullable fields when working with proto + avro, and null is always the first schema in the union
+          return convertObject(schema.getTypes().get(1), value);
+        default:
+          throw new HoodieException("Proto to Avro conversion failed for schema \"" + schema + "\" and value \"" + value + "\"");
+      }
+    }
+
+    /**
+     * Returns the wrapped value; assumes all wrapper messages contain a single field.
+     * @param value wrapper message like {@link Int32Value} or {@link StringValue}
+     * @return the wrapped object
+     */
+    private Object getWrappedValue(Object value) {
+      Message valueAsMessage = (Message) value;
+      return valueAsMessage.getField(valueAsMessage.getDescriptorForType().getFields().get(0));
+    }
+
+    private String getNamespace(Descriptors.FileDescriptor fd, Descriptors.Descriptor containing) {
+      DescriptorProtos.FileOptions fileOptions = fd.getOptions();
+      String classPackage = fileOptions.hasJavaPackage() ? fileOptions.getJavaPackage() : fd.getPackage();
+      String outer = "";
+      if (!fileOptions.getJavaMultipleFiles()) {
+        if (fileOptions.hasJavaOuterClassname()) {
+          outer = fileOptions.getJavaOuterClassname();
+        } else {
+          outer = new File(fd.getName()).getName();
+          outer = outer.substring(0, outer.lastIndexOf('.'));
+          outer = toCamelCase(outer);

Review Comment:
   Now that you mention it, I don't think this is really necessary since we're not persisting these schemas anywhere. It was mainly to make the output match Java naming conventions if the provided proto wasn't already following them.
   
   I think I could simplify all of this by just using the `classPackage` var above. What do you think? The namespace mainly exists so schemas don't collide, unless I'm missing something.
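   The simplification floated above could look roughly like this. `namespaceFor` is a hypothetical standalone helper (not part of this PR) that keeps only the `classPackage` portion of the namespace logic:
   
   ```java
   public class NamespaceSketch {
     // Hypothetical helper sketching the proposed simplification: since the
     // generated Avro schemas are never persisted, the namespace only needs to
     // be unique enough to avoid collisions, so the file's java_package option
     // (with the proto package as a fallback) is sufficient on its own.
     static String namespaceFor(String javaPackage, String protoPackage) {
       return javaPackage.isEmpty() ? protoPackage : javaPackage;
     }
   
     public static void main(String[] args) {
       // java_package option set: use it directly.
       System.out.println(namespaceFor("com.example.protos", "example")); // com.example.protos
       // No java_package option: fall back to the proto package.
       System.out.println(namespaceFor("", "example")); // example
     }
   }
   ```
   
   This drops the outer-class handling entirely, which only matters if the schemas need to mirror the generated Java class names.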



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1230679184

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * 95a7f4a9faca9e08e4771dd2f33e35bd1b5f6273 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11021) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1191881288

   ## CI report:
   
   * 376eeddeef27bb387a187ba8ac8100b6e129f582 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10112) 
   * 03f2c7a09ebc30b413d78fce7a983a9ed1b4d727 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1192024204

   ## CI report:
   
   * 51bbad445f6a51bb6e619b64f9db27d99b2d70d9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10165) 
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * ffc9419327e0e9e91c4e803f9771cf71f7f5e581 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1192022079

   ## CI report:
   
   * 51bbad445f6a51bb6e619b64f9db27d99b2d70d9 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10165) 
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1221109203

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * b9c91181d3f39f1e03a7ff7d5536eaa6813b7913 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10839) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] the-other-tim-brown commented on a diff in pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
the-other-tim-brown commented on code in PR #6135:
URL: https://github.com/apache/hudi/pull/6135#discussion_r950460768


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/KafkaSource.java:
##########
@@ -0,0 +1,75 @@
+/*

Review Comment:
   Created a common base class for Kafka sources since there was a lot of repeated code between the JSON, Avro, and now Proto implementations.
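   As a rough illustration of that refactor (the class and method names below are hypothetical stand-ins, not the actual Hudi classes): the batch handling shared by all formats lives in an abstract base class, and each format-specific source supplies only its deserializer.
   
   ```java
   import java.nio.charset.StandardCharsets;
   import java.util.ArrayList;
   import java.util.List;
   
   // Hypothetical sketch of the "common base class for Kafka sources" idea:
   // the per-batch flow that was repeated across the JSON, Avro, and Proto
   // sources moves into the base class; subclasses only deserialize.
   abstract class KafkaSourceSketch<T> {
     protected abstract T deserialize(byte[] payload);
   
     // Template method: the shared per-batch flow.
     final List<T> readBatch(List<byte[]> rawRecords) {
       List<T> out = new ArrayList<>(rawRecords.size());
       for (byte[] record : rawRecords) {
         out.add(deserialize(record));
       }
       return out;
     }
   }
   
   // A Proto source would call Message.parseFrom here; this stand-in decodes
   // UTF-8 strings so the sketch stays self-contained.
   class StringKafkaSourceSketch extends KafkaSourceSketch<String> {
     @Override
     protected String deserialize(byte[] payload) {
       return new String(payload, StandardCharsets.UTF_8);
     }
   }
   
   public class Demo {
     public static void main(String[] args) {
       KafkaSourceSketch<String> source = new StringKafkaSourceSketch();
       List<String> batch = source.readBatch(
           List.of("a".getBytes(StandardCharsets.UTF_8), "b".getBytes(StandardCharsets.UTF_8)));
       System.out.println(batch); // prints [a, b]
     }
   }
   ```
   
   The template-method shape is what makes the dedupe safe: the Kafka plumbing stays in one place, and adding a new format only means adding a new deserializer.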





[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1229547656

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * 794ed4861a42b5718bacfe299bfac2444744115d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10999) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1224389948

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10041",
       "triggerID" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10074",
       "triggerID" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10112",
       "triggerID" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "triggerType" : "PUSH"
     }, {
       "hash" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10159",
       "triggerID" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "triggerType" : "PUSH"
     }, {
       "hash" : "51bbad445f6a51bb6e619b64f9db27d99b2d70d9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10165",
       "triggerID" : "51bbad445f6a51bb6e619b64f9db27d99b2d70d9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d36fed637603d9959e8d049ac0815b9c729eb246",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "d36fed637603d9959e8d049ac0815b9c729eb246",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ffc9419327e0e9e91c4e803f9771cf71f7f5e581",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10168",
       "triggerID" : "ffc9419327e0e9e91c4e803f9771cf71f7f5e581",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9adc8d80f44a3d5963b139f56f658bb512ae0df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10835",
       "triggerID" : "a9adc8d80f44a3d5963b139f56f658bb512ae0df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9c91181d3f39f1e03a7ff7d5536eaa6813b7913",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10839",
       "triggerID" : "b9c91181d3f39f1e03a7ff7d5536eaa6813b7913",
       "triggerType" : "PUSH"
     }, {
       "hash" : "14115a6f79de39f538ddfba407f84249c35ebca5",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10881",
       "triggerID" : "14115a6f79de39f538ddfba407f84249c35ebca5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10905",
       "triggerID" : "1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * 14115a6f79de39f538ddfba407f84249c35ebca5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10881) 
   * 1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10905) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1224979442

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10041",
       "triggerID" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10074",
       "triggerID" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10112",
       "triggerID" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "triggerType" : "PUSH"
     }, {
       "hash" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10159",
       "triggerID" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "triggerType" : "PUSH"
     }, {
       "hash" : "51bbad445f6a51bb6e619b64f9db27d99b2d70d9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10165",
       "triggerID" : "51bbad445f6a51bb6e619b64f9db27d99b2d70d9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d36fed637603d9959e8d049ac0815b9c729eb246",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "d36fed637603d9959e8d049ac0815b9c729eb246",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ffc9419327e0e9e91c4e803f9771cf71f7f5e581",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10168",
       "triggerID" : "ffc9419327e0e9e91c4e803f9771cf71f7f5e581",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9adc8d80f44a3d5963b139f56f658bb512ae0df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10835",
       "triggerID" : "a9adc8d80f44a3d5963b139f56f658bb512ae0df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9c91181d3f39f1e03a7ff7d5536eaa6813b7913",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10839",
       "triggerID" : "b9c91181d3f39f1e03a7ff7d5536eaa6813b7913",
       "triggerType" : "PUSH"
     }, {
       "hash" : "14115a6f79de39f538ddfba407f84249c35ebca5",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10881",
       "triggerID" : "14115a6f79de39f538ddfba407f84249c35ebca5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10905",
       "triggerID" : "1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f70abbc3b45005d40e74252814edc0078a50030e",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "f70abbc3b45005d40e74252814edc0078a50030e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * 1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10905) 
   * f70abbc3b45005d40e74252814edc0078a50030e UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1188297360

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10041",
       "triggerID" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * ec6177e8856b4b6809843c6295a5e15718d32b75 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10041) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1230571315

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10041",
       "triggerID" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10074",
       "triggerID" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10112",
       "triggerID" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "triggerType" : "PUSH"
     }, {
       "hash" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10159",
       "triggerID" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "triggerType" : "PUSH"
     }, {
       "hash" : "51bbad445f6a51bb6e619b64f9db27d99b2d70d9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10165",
       "triggerID" : "51bbad445f6a51bb6e619b64f9db27d99b2d70d9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d36fed637603d9959e8d049ac0815b9c729eb246",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "d36fed637603d9959e8d049ac0815b9c729eb246",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ffc9419327e0e9e91c4e803f9771cf71f7f5e581",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10168",
       "triggerID" : "ffc9419327e0e9e91c4e803f9771cf71f7f5e581",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9adc8d80f44a3d5963b139f56f658bb512ae0df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10835",
       "triggerID" : "a9adc8d80f44a3d5963b139f56f658bb512ae0df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9c91181d3f39f1e03a7ff7d5536eaa6813b7913",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10839",
       "triggerID" : "b9c91181d3f39f1e03a7ff7d5536eaa6813b7913",
       "triggerType" : "PUSH"
     }, {
       "hash" : "14115a6f79de39f538ddfba407f84249c35ebca5",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10881",
       "triggerID" : "14115a6f79de39f538ddfba407f84249c35ebca5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10905",
       "triggerID" : "1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f70abbc3b45005d40e74252814edc0078a50030e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10909",
       "triggerID" : "f70abbc3b45005d40e74252814edc0078a50030e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "194396032e698ac4210d8652a969c7e58832d5db",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10925",
       "triggerID" : "194396032e698ac4210d8652a969c7e58832d5db",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9a4df85fd5b1787af587817cba959c536d904fde",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10933",
       "triggerID" : "9a4df85fd5b1787af587817cba959c536d904fde",
       "triggerType" : "PUSH"
     }, {
       "hash" : "794ed4861a42b5718bacfe299bfac2444744115d",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10999",
       "triggerID" : "794ed4861a42b5718bacfe299bfac2444744115d",
       "triggerType" : "PUSH"
     }, {
       "hash" : "e6e1a68d195995b126c5d32286059d627006f08d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "e6e1a68d195995b126c5d32286059d627006f08d",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * 794ed4861a42b5718bacfe299bfac2444744115d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10999) 
   * e6e1a68d195995b126c5d32286059d627006f08d UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1229507784

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10041",
       "triggerID" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10074",
       "triggerID" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10112",
       "triggerID" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "triggerType" : "PUSH"
     }, {
       "hash" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10159",
       "triggerID" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "triggerType" : "PUSH"
     }, {
       "hash" : "51bbad445f6a51bb6e619b64f9db27d99b2d70d9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10165",
       "triggerID" : "51bbad445f6a51bb6e619b64f9db27d99b2d70d9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d36fed637603d9959e8d049ac0815b9c729eb246",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "d36fed637603d9959e8d049ac0815b9c729eb246",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ffc9419327e0e9e91c4e803f9771cf71f7f5e581",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10168",
       "triggerID" : "ffc9419327e0e9e91c4e803f9771cf71f7f5e581",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9adc8d80f44a3d5963b139f56f658bb512ae0df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10835",
       "triggerID" : "a9adc8d80f44a3d5963b139f56f658bb512ae0df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9c91181d3f39f1e03a7ff7d5536eaa6813b7913",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10839",
       "triggerID" : "b9c91181d3f39f1e03a7ff7d5536eaa6813b7913",
       "triggerType" : "PUSH"
     }, {
       "hash" : "14115a6f79de39f538ddfba407f84249c35ebca5",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10881",
       "triggerID" : "14115a6f79de39f538ddfba407f84249c35ebca5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10905",
       "triggerID" : "1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f70abbc3b45005d40e74252814edc0078a50030e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10909",
       "triggerID" : "f70abbc3b45005d40e74252814edc0078a50030e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "194396032e698ac4210d8652a969c7e58832d5db",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10925",
       "triggerID" : "194396032e698ac4210d8652a969c7e58832d5db",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9a4df85fd5b1787af587817cba959c536d904fde",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10933",
       "triggerID" : "9a4df85fd5b1787af587817cba959c536d904fde",
       "triggerType" : "PUSH"
     }, {
       "hash" : "794ed4861a42b5718bacfe299bfac2444744115d",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "794ed4861a42b5718bacfe299bfac2444744115d",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * 9a4df85fd5b1787af587817cba959c536d904fde Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10933) 
   * 794ed4861a42b5718bacfe299bfac2444744115d UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1192026284

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10041",
       "triggerID" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10074",
       "triggerID" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10112",
       "triggerID" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "triggerType" : "PUSH"
     }, {
       "hash" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10159",
       "triggerID" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "triggerType" : "PUSH"
     }, {
       "hash" : "51bbad445f6a51bb6e619b64f9db27d99b2d70d9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10165",
       "triggerID" : "51bbad445f6a51bb6e619b64f9db27d99b2d70d9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d36fed637603d9959e8d049ac0815b9c729eb246",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "d36fed637603d9959e8d049ac0815b9c729eb246",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ffc9419327e0e9e91c4e803f9771cf71f7f5e581",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10168",
       "triggerID" : "ffc9419327e0e9e91c4e803f9771cf71f7f5e581",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * ffc9419327e0e9e91c4e803f9771cf71f7f5e581 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10168) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] the-other-tim-brown commented on a diff in pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
the-other-tim-brown commented on code in PR #6135:
URL: https://github.com/apache/hudi/pull/6135#discussion_r950480894


##########
hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/BaseTestKafkaSource.java:
##########
@@ -0,0 +1,280 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.sources;
+
+import org.apache.hudi.AvroConversionUtils;
+import org.apache.hudi.common.config.TypedProperties;
+import org.apache.hudi.common.util.Option;
+import org.apache.hudi.exception.HoodieNotSupportedException;
+import org.apache.hudi.testutils.SparkClientFunctionalTestHarness;
+import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerMetrics;
+import org.apache.hudi.utilities.deltastreamer.SourceFormatAdapter;
+import org.apache.hudi.utilities.exception.HoodieDeltaStreamerException;
+import org.apache.hudi.utilities.schema.SchemaProvider;
+import org.apache.hudi.utilities.sources.helpers.KafkaOffsetGen;
+
+import org.apache.avro.generic.GenericRecord;
+import org.apache.kafka.clients.consumer.ConsumerConfig;
+import org.apache.kafka.clients.consumer.KafkaConsumer;
+import org.apache.kafka.clients.consumer.OffsetAndMetadata;
+import org.apache.kafka.common.TopicPartition;
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.sql.Dataset;
+import org.apache.spark.sql.Row;
+import org.apache.spark.streaming.kafka010.KafkaTestUtils;
+import org.junit.jupiter.api.AfterAll;
+import org.junit.jupiter.api.BeforeAll;
+import org.junit.jupiter.api.Test;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Properties;
+
+import static org.apache.hudi.utilities.sources.helpers.KafkaOffsetGen.Config.ENABLE_KAFKA_COMMIT_OFFSET;
+import static org.junit.jupiter.api.Assertions.assertEquals;
+import static org.junit.jupiter.api.Assertions.assertNotNull;
+import static org.junit.jupiter.api.Assertions.assertThrows;
+import static org.mockito.Mockito.mock;
+
+/**
+ * Generic tests for all {@link KafkaSource} to ensure all implementations properly handle offsets, fetch limits, failure modes, etc.
+ */
+abstract class BaseTestKafkaSource extends SparkClientFunctionalTestHarness {

Review Comment:
   All of the tests in this class were taken from TestJsonKafkaSource and modified to be a bit more generic
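
   The refactoring described above follows the template-method pattern: the abstract base class owns the generic assertions, and each format-specific test (JSON, Avro, Proto) supplies only the pieces that differ. A minimal, self-contained sketch of that pattern is below — class and method names are illustrative only, not the actual Hudi test API:

   ```java
   import java.nio.charset.StandardCharsets;
   import java.util.Arrays;
   import java.util.List;

   // Base class holding checks shared by every source implementation.
   abstract class BaseKafkaSourceSketch {
       // Format-specific hook: serialize a record for the wire format under test.
       protected abstract byte[] serialize(String record);

       // Format-specific hook: read the record back from its wire format.
       protected abstract String deserialize(byte[] payload);

       // Generic check inherited by every subclass: a round trip must be lossless.
       public boolean roundTripIsLossless(List<String> records) {
           for (String r : records) {
               if (!r.equals(deserialize(serialize(r)))) {
                   return false;
               }
           }
           return true;
       }
   }

   // A trivial "format" standing in for the JSON/Proto-specific subclasses.
   class Utf8SourceSketch extends BaseKafkaSourceSketch {
       protected byte[] serialize(String record) {
           return record.getBytes(StandardCharsets.UTF_8);
       }

       protected String deserialize(byte[] payload) {
           return new String(payload, StandardCharsets.UTF_8);
       }
   }

   class Main {
       public static void main(String[] args) {
           BaseKafkaSourceSketch source = new Utf8SourceSketch();
           System.out.println(source.roundTripIsLossless(Arrays.asList("a", "b", "c")));
       }
   }
   ```

   Adding a new source format then only requires implementing the two abstract hooks; all inherited checks run against it unchanged, which is what lets TestProtoKafkaSource reuse the offset/limit/failure tests.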





[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1222785954

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * b9c91181d3f39f1e03a7ff7d5536eaa6813b7913 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10839) 
   * 14115a6f79de39f538ddfba407f84249c35ebca5 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10881) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1220957975

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * ffc9419327e0e9e91c4e803f9771cf71f7f5e581 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10168) 
   * a9adc8d80f44a3d5963b139f56f658bb512ae0df Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10835) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] the-other-tim-brown commented on a diff in pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
the-other-tim-brown commented on code in PR #6135:
URL: https://github.com/apache/hudi/pull/6135#discussion_r958741140


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/ProtoConversionUtil.java:
##########
@@ -0,0 +1,393 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.sources.helpers;
+
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieException;
+
+import com.google.protobuf.BoolValue;
+import com.google.protobuf.ByteString;
+import com.google.protobuf.BytesValue;
+import com.google.protobuf.DescriptorProtos;
+import com.google.protobuf.Descriptors;
+import com.google.protobuf.DoubleValue;
+import com.google.protobuf.FloatValue;
+import com.google.protobuf.Int32Value;
+import com.google.protobuf.Int64Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.StringValue;
+import com.google.protobuf.UInt32Value;
+import com.google.protobuf.UInt64Value;
+import org.apache.avro.Schema;
+import org.apache.avro.generic.GenericData;
+import org.apache.avro.generic.GenericFixed;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.avro.util.Utf8;
+
+import java.io.File;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.function.Function;
+
+/**
+ * A utility class to help translate from Proto to Avro.
+ */
+public class ProtoConversionUtil {
+
+  /**
+   * Creates an Avro {@link Schema} for the provided class. Assumes that the class is a protobuf {@link Message}.
+   * @param clazz The protobuf class
+   * @param flattenWrappedPrimitives set to true to treat wrapped primitives as nullable fields instead of nested messages.
+   * @return An Avro schema
+   */
+  public static Schema getAvroSchemaForMessageClass(Class clazz, boolean flattenWrappedPrimitives) {
+    return AvroSupport.get().getSchema(clazz, flattenWrappedPrimitives);
+  }
+
+  /**
+   * Converts the provided {@link Message} into an Avro {@link GenericRecord} with the provided schema.
+   * @param schema target schema to convert into
+   * @param message the source message to convert
+   * @return an Avro GenericRecord
+   */
+  public static GenericRecord convertToAvro(Schema schema, Message message) {
+    return AvroSupport.get().convert(schema, message);
+  }
+
+  /**
+   * This class provides support for generating schemas and converting from Proto to Avro. We avoid using Avro's ProtobufData class directly so that we can:
+   * 1. Customize how schemas are generated for protobufs. We treat Enums as strings and provide an option to treat wrapped primitives like {@link Int32Value} and {@link StringValue} as messages
+   * (default behavior) or as nullable versions of those primitives.
+   * 2. Convert directly from a protobuf {@link Message} to a {@link GenericRecord} while properly handling enums and wrapped primitives mentioned above.
+   */
+  private static class AvroSupport {
+    private static final AvroSupport INSTANCE = new AvroSupport();
+    private static final Map<Pair<Class, Boolean>, Schema> SCHEMA_CACHE = new ConcurrentHashMap<>();
+    private static final Map<Pair<Schema, Descriptors.Descriptor>, Descriptors.FieldDescriptor[]> FIELD_CACHE = new ConcurrentHashMap<>();
+
+
+    private static final Schema STRINGS = Schema.create(Schema.Type.STRING);
+
+    private static final Schema NULL = Schema.create(Schema.Type.NULL);
+    private static final Map<Descriptors.Descriptor, Schema.Type> WRAPPER_DESCRIPTORS_TO_TYPE = getWrapperDescriptorsToType();
+
+    private static Map<Descriptors.Descriptor, Schema.Type> getWrapperDescriptorsToType() {
+      Map<Descriptors.Descriptor, Schema.Type> wrapperDescriptorsToType = new HashMap<>();
+      wrapperDescriptorsToType.put(StringValue.getDescriptor(), Schema.Type.STRING);
+      wrapperDescriptorsToType.put(Int32Value.getDescriptor(), Schema.Type.INT);
+      wrapperDescriptorsToType.put(UInt32Value.getDescriptor(), Schema.Type.INT);
+      wrapperDescriptorsToType.put(Int64Value.getDescriptor(), Schema.Type.LONG);
+      wrapperDescriptorsToType.put(UInt64Value.getDescriptor(), Schema.Type.LONG);
+      wrapperDescriptorsToType.put(BoolValue.getDescriptor(), Schema.Type.BOOLEAN);
+      wrapperDescriptorsToType.put(BytesValue.getDescriptor(), Schema.Type.BYTES);
+      wrapperDescriptorsToType.put(DoubleValue.getDescriptor(), Schema.Type.DOUBLE);
+      wrapperDescriptorsToType.put(FloatValue.getDescriptor(), Schema.Type.FLOAT);
+      return wrapperDescriptorsToType;
+    }
+
+    private AvroSupport() {
+    }
+
+    public static AvroSupport get() {
+      return INSTANCE;
+    }
+
+    public GenericRecord convert(Schema schema, Message message) {
+      return (GenericRecord) convertObject(schema, message);
+    }
+
+    public Schema getSchema(Class c, boolean flattenWrappedPrimitives) {
+      return SCHEMA_CACHE.computeIfAbsent(Pair.of(c, flattenWrappedPrimitives), key -> {
+        try {
+          Object descriptor = c.getMethod("getDescriptor").invoke(null);
+          if (c.isEnum()) {
+            return getEnumSchema((Descriptors.EnumDescriptor) descriptor);
+          } else {
+            return getMessageSchema((Descriptors.Descriptor) descriptor, new HashMap<>(), flattenWrappedPrimitives);
+          }
+        } catch (Exception e) {
+          throw new RuntimeException(e);
+        }
+      });
+    }
+
+    private Schema getEnumSchema(Descriptors.EnumDescriptor d) {
+      List<String> symbols = new ArrayList<>(d.getValues().size());
+      for (Descriptors.EnumValueDescriptor e : d.getValues()) {
+        symbols.add(e.getName());
+      }
+      return Schema.createEnum(d.getName(), null, getNamespace(d.getFile(), d.getContainingType()), symbols);
+    }
+
+    private Schema getMessageSchema(Descriptors.Descriptor descriptor, Map<Descriptors.Descriptor, Schema> seen, boolean flattenWrappedPrimitives) {
+      if (seen.containsKey(descriptor)) {
+        return seen.get(descriptor);
+      }
+      Schema result = Schema.createRecord(descriptor.getName(), null,
+          getNamespace(descriptor.getFile(), descriptor.getContainingType()), false);
+
+      seen.put(descriptor, result);
+
+      List<Schema.Field> fields = new ArrayList<>(descriptor.getFields().size());
+      for (Descriptors.FieldDescriptor f : descriptor.getFields()) {
+        fields.add(new Schema.Field(f.getName(), getFieldSchema(f, seen, flattenWrappedPrimitives), null, getDefault(f)));
+      }
+      result.setFields(fields);
+      return result;
+    }
+
+    private Schema getFieldSchema(Descriptors.FieldDescriptor f, Map<Descriptors.Descriptor, Schema> seen, boolean flattenWrappedPrimitives) {
+      Function<Schema, Schema> schemaFinalizer =  f.isRepeated() ? Schema::createArray : Function.identity();
+      switch (f.getType()) {
+        case BOOL:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.BOOLEAN));
+        case FLOAT:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.FLOAT));
+        case DOUBLE:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.DOUBLE));
+        case ENUM:
+          return schemaFinalizer.apply(getEnumSchema(f.getEnumType()));
+        case STRING:
+          Schema s = Schema.create(Schema.Type.STRING);
+          GenericData.setStringType(s, GenericData.StringType.String);
+          return schemaFinalizer.apply(s);
+        case BYTES:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.BYTES));
+        case INT32:
+        case SINT32:
+        case FIXED32:
+        case SFIXED32:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.INT));
+        case UINT32:
+        case INT64:
+        case UINT64:
+        case SINT64:
+        case FIXED64:
+        case SFIXED64:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.LONG));
+        case MESSAGE:
+          if (flattenWrappedPrimitives && WRAPPER_DESCRIPTORS_TO_TYPE.containsKey(f.getMessageType())) {
+            // all wrapper types have a single field, so we can get the first field in the message's schema
+            return schemaFinalizer.apply(Schema.createUnion(Arrays.asList(NULL, getFieldSchema(f.getMessageType().getFields().get(0), seen, flattenWrappedPrimitives))));
+          }
+          // if message field is repeated (like a list), elements are non-null
+          if (f.isRepeated()) {
+            return schemaFinalizer.apply(getMessageSchema(f.getMessageType(), seen, flattenWrappedPrimitives));
+          }
+          // otherwise we create a nullable field schema
+          return schemaFinalizer.apply(Schema.createUnion(Arrays.asList(NULL, getMessageSchema(f.getMessageType(), seen, flattenWrappedPrimitives))));
+        case GROUP: // groups are deprecated
+        default:
+          throw new RuntimeException("Unexpected type: " + f.getType());
+      }
+    }
+
+    private Object getDefault(Descriptors.FieldDescriptor f) {
+      if (f.isRepeated()) { // empty array as repeated fields' default value
+        return Collections.emptyList();
+      }
+
+      switch (f.getType()) { // generate default for type
+        case BOOL:
+          return false;
+        case FLOAT:
+          return 0.0F;
+        case DOUBLE:
+          return 0.0D;
+        case INT32:
+        case UINT32:
+        case SINT32:
+        case FIXED32:
+        case SFIXED32:
+        case INT64:
+        case UINT64:
+        case SINT64:
+        case FIXED64:
+        case SFIXED64:
+          return 0;
+        case STRING:
+        case BYTES:
+          return "";
+        case ENUM:
+          return f.getEnumType().getValues().get(0).getName();
+        case MESSAGE:
+          return Schema.Field.NULL_VALUE;
+        case GROUP: // groups are deprecated
+        default:
+          throw new RuntimeException("Unexpected type: " + f.getType());
+      }
+    }
+
+    private Descriptors.FieldDescriptor[] getOrderedFields(Schema schema, Message message) {
+      Descriptors.Descriptor descriptor = message.getDescriptorForType();
+      return FIELD_CACHE.computeIfAbsent(Pair.of(schema, descriptor), key -> {
+        Descriptors.FieldDescriptor[] fields = new Descriptors.FieldDescriptor[key.getLeft().getFields().size()];
+        for (Schema.Field f : key.getLeft().getFields()) {
+          fields[f.pos()] = key.getRight().findFieldByName(f.name());
+        }
+        return fields;
+      });
+    }
+
+    private Object convertObject(Schema schema, Object value) {
+      if (value == null) {
+        return null;
+      }
+
+      switch (schema.getType()) {
+        case ARRAY:
+          List<Object> arrayValue = (List<Object>) value;
+          List<Object> arrayCopy = new GenericData.Array<>(arrayValue.size(), schema);
+          for (Object obj : arrayValue) {
+            arrayCopy.add(convertObject(schema.getElementType(), obj));
+          }
+          return arrayCopy;
+        case BYTES:
+          ByteBuffer byteBufferValue;
+          if (value instanceof ByteString) {
+            byteBufferValue = ((ByteString) value).asReadOnlyByteBuffer();
+          } else if (value instanceof Message) {
+            byteBufferValue = ((ByteString) getWrappedValue(value)).asReadOnlyByteBuffer();
+          } else {
+            byteBufferValue = (ByteBuffer) value;
+          }
+          int start = byteBufferValue.position();
+          int length = byteBufferValue.limit() - start;
+          byte[] bytesCopy = new byte[length];
+          byteBufferValue.get(bytesCopy, 0, length);
+          byteBufferValue.position(start);
+          return ByteBuffer.wrap(bytesCopy, 0, length);
+        case ENUM:
+          return GenericData.get().createEnum(value.toString(), schema);
+        case FIXED:
+          return GenericData.get().createFixed(null, ((GenericFixed) value).bytes(), schema);
+        case BOOLEAN:
+        case DOUBLE:
+        case FLOAT:
+        case INT:
+          if (value instanceof Message) {
+            return getWrappedValue(value);
+          }
+          return value; // immutable
+        case LONG:
+          Object tmpValue = value;
+          if (value instanceof Message) {
+            tmpValue = getWrappedValue(value);
+          }
+          // unsigned 32-bit ints arrive as signed Integers carrying the raw bits; widen to long without sign extension
+          if (tmpValue instanceof Integer) {
+            tmpValue = Integer.toUnsignedLong((Integer) tmpValue);
+          }
+          return tmpValue;
+        case MAP:
+          Map<Object, Object> mapValue = (Map) value;
+          Map<Object, Object> mapCopy = new HashMap<>(mapValue.size());
+          for (Map.Entry<Object, Object> entry : mapValue.entrySet()) {
+            mapCopy.put(convertObject(STRINGS, entry.getKey()), convertObject(schema.getValueType(), entry.getValue()));
+          }
+          return mapCopy;
+        case NULL:
+          return null;
+        case RECORD:
+          GenericData.Record newRecord = new GenericData.Record(schema);
+          Message messageValue = (Message) value;
+          for (Schema.Field f : schema.getFields()) {
+            int position = f.pos();
+            Descriptors.FieldDescriptor fieldDescriptor = getOrderedFields(schema, messageValue)[position];
+            Object convertedValue;
+            if (fieldDescriptor.getType() == Descriptors.FieldDescriptor.Type.MESSAGE && !fieldDescriptor.isRepeated() && !messageValue.hasField(fieldDescriptor)) {
+              convertedValue = null;
+            } else {
+              convertedValue = convertObject(f.schema(), messageValue.getField(fieldDescriptor));
+            }
+            newRecord.put(position, convertedValue);
+          }
+          return newRecord;
+        case STRING:
+          if (value instanceof String) {
+            return value;
+          } else if (value instanceof StringValue) {
+            return ((StringValue) value).getValue();
+          } else {
+            return new Utf8(value.toString());
+          }
+        case UNION:
+          // Unions only occur for nullable fields when working with proto + avro and null is the first schema in the union
+          return convertObject(schema.getTypes().get(1), value);
+        default:
+          throw new HoodieException("Proto to Avro conversion failed for schema \"" + schema + "\" and value \"" + value + "\"");
+      }
+    }
+
+    /**
+     * Returns the wrapped value; assumes all wrapper messages contain a single field.
+     * @param value wrapper message like {@link Int32Value} or {@link StringValue}
+     * @return the wrapped object
+     */
+    private Object getWrappedValue(Object value) {
+      Message valueAsMessage = (Message) value;
+      return valueAsMessage.getField(valueAsMessage.getDescriptorForType().getFields().get(0));
+    }
+
+    private String getNamespace(Descriptors.FileDescriptor fd, Descriptors.Descriptor containing) {
+      DescriptorProtos.FileOptions fileOptions = fd.getOptions();
+      String classPackage = fileOptions.hasJavaPackage() ? fileOptions.getJavaPackage() : fd.getPackage();
+      String outer = "";
+      if (!fileOptions.getJavaMultipleFiles()) {
+        if (fileOptions.hasJavaOuterClassname()) {
+          outer = fileOptions.getJavaOuterClassname();
+        } else {
+          outer = new File(fd.getName()).getName();
+          outer = outer.substring(0, outer.lastIndexOf('.'));
+          outer = toCamelCase(outer);
+        }
+      }
+      StringBuilder inner = new StringBuilder();
+      while (containing != null) {
+        if (inner.length() == 0) {
+          inner.insert(0, containing.getName());
+        } else {
+          inner.insert(0, containing.getName() + "$");
+        }
+        containing = containing.getContainingType();
+      }
+      String d1 = (!outer.isEmpty() || inner.length() != 0 ? "." : "");
+      String d2 = (!outer.isEmpty() && inner.length() != 0 ? "$" : "");
+      return classPackage + d1 + outer + d2 + inner;
+    }
+
+    private String toCamelCase(String s) {

Review Comment:
   This method was removed in recent changes
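The LONG branch of `convertObject` in the diff above notes that unsigned ints need widening to long. A minimal standalone sketch of lossless (zero-extending) widening, assuming uint32 values arrive from protobuf-java as signed Integers carrying the raw bits; the class and method names here are hypothetical:

```java
// Hypothetical illustration: protobuf-java surfaces uint32 fields as a signed
// Integer holding the raw bits, so mapping them into Avro's long must widen
// without sign extension (a plain cast would turn 4294967295 into -1).
public class UnsignedWideningDemo {
  static long widenUnsigned(int rawUint32Bits) {
    // keeps the low 32 bits and zero-fills the high 32 bits
    return Integer.toUnsignedLong(rawUint32Bits);
  }

  public static void main(String[] args) {
    System.out.println(widenUnsigned(-1)); // raw bits of uint32 4294967295
    System.out.println((long) -1);         // sign-extended: -1, lossy
  }
}
```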





[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1229522698

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * 9a4df85fd5b1787af587817cba959c536d904fde Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10933) 
   * 794ed4861a42b5718bacfe299bfac2444744115d Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10999) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1230659283

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * e6e1a68d195995b126c5d32286059d627006f08d Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11020) 
   * 95a7f4a9faca9e08e4771dd2f33e35bd1b5f6273 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] codope commented on a diff in pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
codope commented on code in PR #6135:
URL: https://github.com/apache/hudi/pull/6135#discussion_r960604640


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/SourceFormatAdapter.java:
##########
@@ -96,7 +101,7 @@ public InputBatch<JavaRDD<GenericRecord>> fetchNewDataInAvroFormat(Option<String
   public InputBatch<Dataset<Row>> fetchNewDataInRowFormat(Option<String> lastCkptStr, long sourceLimit) {
     switch (source.getSourceType()) {
       case ROW:
-        return ((RowSource) source).fetchNext(lastCkptStr, sourceLimit);

Review Comment:
   got it 👍 





[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1226410197

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * 194396032e698ac4210d8652a969c7e58832d5db Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10925) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1188293329

   ## CI report:
   
   * ec6177e8856b4b6809843c6295a5e15718d32b75 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1226857383

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * 9a4df85fd5b1787af587817cba959c536d904fde Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10933) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] the-other-tim-brown commented on a diff in pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
the-other-tim-brown commented on code in PR #6135:
URL: https://github.com/apache/hudi/pull/6135#discussion_r957632497


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/SourceFormatAdapter.java:
##########
@@ -96,7 +101,7 @@ public InputBatch<JavaRDD<GenericRecord>> fetchNewDataInAvroFormat(Option<String
   public InputBatch<Dataset<Row>> fetchNewDataInRowFormat(Option<String> lastCkptStr, long sourceLimit) {
     switch (source.getSourceType()) {
       case ROW:
-        return ((RowSource) source).fetchNext(lastCkptStr, sourceLimit);

Review Comment:
   This is to allow us to use base classes other than RowSource, JsonSource, and AvroSource. In this PR, I've added a KafkaSource base class so we can reuse the logic for fetching and committing offsets.
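   The layering described above can be sketched roughly as follows (a simplified, hypothetical model: plain Lists and a long offset stand in for Spark's JavaRDD and Hudi's KafkaOffsetGen; these are not the real Hudi signatures):

   ```java
   import java.util.ArrayList;
   import java.util.List;

   // Sketch of the template-method split: the abstract base owns the shared
   // offset bookkeeping, while each format-specific subclass (Avro, Json,
   // Proto) only supplies how raw Kafka records become typed objects.
   abstract class KafkaSourceSketch<T> {
       protected long committedOffset = 0L;

       // Shared logic: compute the range to read, decode it, commit it.
       public final List<T> fetchNext(long sourceLimit) {
           long start = committedOffset;
           long end = start + sourceLimit;
           List<T> batch = toBatch(start, end); // format-specific decode
           committedOffset = end;               // shared offset commit
           return batch;
       }

       // Subclasses translate the offset range into typed records (the real
       // code returns a JavaRDD<T> built via KafkaUtils.createRDD).
       abstract List<T> toBatch(long startOffset, long endOffset);
   }

   // Hypothetical subclass standing in for ProtoKafkaSource: it only
   // implements the decode step and inherits the offset handling.
   class ProtoLikeSource extends KafkaSourceSketch<String> {
       @Override
       List<String> toBatch(long start, long end) {
           List<String> out = new ArrayList<>();
           for (long o = start; o < end; o++) {
               out.add("record-" + o);
           }
           return out;
       }
   }
   ```

   The point of the design is that adding a new wire format means writing only the decode step, never re-implementing offset management.
   
   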





[GitHub] [hudi] codope commented on a diff in pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
codope commented on code in PR #6135:
URL: https://github.com/apache/hudi/pull/6135#discussion_r960610229


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/ProtoKafkaSource.java:
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.sources;
+
+import org.apache.hudi.DataSourceUtils;
+import org.apache.hudi.common.config.TypedProperties;
+import org.apache.hudi.common.util.ReflectionUtils;
+import org.apache.hudi.exception.HoodieException;
+import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerMetrics;
+import org.apache.hudi.utilities.schema.ProtoClassBasedSchemaProvider;
+import org.apache.hudi.utilities.schema.SchemaProvider;
+import org.apache.hudi.utilities.sources.helpers.KafkaOffsetGen;
+
+import com.google.protobuf.Message;
+import org.apache.kafka.common.serialization.ByteArrayDeserializer;
+import org.apache.kafka.common.serialization.StringDeserializer;
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.api.java.JavaSparkContext;
+import org.apache.spark.sql.SparkSession;
+import org.apache.spark.streaming.kafka010.KafkaUtils;
+import org.apache.spark.streaming.kafka010.LocationStrategies;
+import org.apache.spark.streaming.kafka010.OffsetRange;
+
+import java.io.Serializable;
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.Method;
+import java.util.Arrays;
+
+/**
+ * Reads protobuf serialized Kafka data, based on a provided class name.
+ */
+public class ProtoKafkaSource extends KafkaSource<Message> {
+
+  private static final Logger LOG = LogManager.getLogger(ProtoKafkaSource.class);
+  // these are native kafka's config. do not change the config names.
+  private static final String NATIVE_KAFKA_KEY_DESERIALIZER_PROP = "key.deserializer";
+  private static final String NATIVE_KAFKA_VALUE_DESERIALIZER_PROP = "value.deserializer";
+  private final String className;
+
+  public ProtoKafkaSource(TypedProperties props, JavaSparkContext sparkContext,
+                          SparkSession sparkSession, SchemaProvider schemaProvider, HoodieDeltaStreamerMetrics metrics) {
+    super(props, sparkContext, sparkSession, schemaProvider, SourceType.PROTO, metrics);
+    DataSourceUtils.checkRequiredProperties(props, Arrays.asList(
+        ProtoClassBasedSchemaProvider.Config.PROTO_SCHEMA_CLASS_NAME));
+    props.put(NATIVE_KAFKA_KEY_DESERIALIZER_PROP, StringDeserializer.class);
+    props.put(NATIVE_KAFKA_VALUE_DESERIALIZER_PROP, ByteArrayDeserializer.class);
+    className = props.getString(ProtoClassBasedSchemaProvider.Config.PROTO_SCHEMA_CLASS_NAME);
+    this.offsetGen = new KafkaOffsetGen(props);
+  }
+
+  @Override
+  JavaRDD<Message> toRDD(OffsetRange[] offsetRanges) {
+    ProtoDeserializer deserializer = new ProtoDeserializer(className);
+    return KafkaUtils.<String, byte[]>createRDD(sparkContext, offsetGen.getKafkaParams(), offsetRanges,
+        LocationStrategies.PreferConsistent()).map(obj -> deserializer.parse(obj.value()));
+  }
+
+  private static class ProtoDeserializer implements Serializable {
+    private final String className;
+    private transient Class protoClass;
+    private transient Method parseMethod;
+
+    public ProtoDeserializer(String className) {
+      this.className = className;
+    }
+
+    public Message parse(byte[] bytes) {
+      try {
+        return (Message) getParseMethod().invoke(getClass(), bytes);
+      } catch (IllegalAccessException | InvocationTargetException ex) {
+        throw new HoodieException("Failed to parse proto message from kafka", ex);
+      }
+    }
+
+    private Class getProtoClass() {
+      if (protoClass == null) {
+        protoClass = ReflectionUtils.getClass(className);
+      }
+      return protoClass;
+    }
+
+    private Method getParseMethod() {
+      if (parseMethod == null) {
+        try {
+          parseMethod = getProtoClass().getMethod("parseFrom", byte[].class);

Review Comment:
   I'm just curious whether this needs to be handled, though the stacktrace should be sufficient.
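   For context, the lazy reflective lookup under discussion follows this general pattern (a self-contained sketch: `FakeProto` is a stand-in for a generated protobuf class, and RuntimeException stands in for HoodieException; the actual PR code may differ):

   ```java
   import java.io.Serializable;
   import java.lang.reflect.InvocationTargetException;
   import java.lang.reflect.Method;
   import java.nio.charset.StandardCharsets;

   // Stand-in for a generated protobuf class: any class exposing a static
   // parseFrom(byte[]) factory works with the reflective deserializer below.
   class FakeProto {
       private final String value;
       private FakeProto(String value) { this.value = value; }
       public static FakeProto parseFrom(byte[] bytes) {
           return new FakeProto(new String(bytes, StandardCharsets.UTF_8));
       }
       public String getValue() { return value; }
   }

   // Sketch of the deserializer pattern: keep only the class name
   // (Serializable), and lazily re-resolve the Method after the object is
   // shipped to executors, since Method itself is not serializable.
   class ReflectiveDeserializer implements Serializable {
       private final String className;
       private transient Method parseMethod;

       ReflectiveDeserializer(String className) { this.className = className; }

       public Object parse(byte[] bytes) {
           try {
               // parseFrom is static, so the receiver passed to invoke() is null.
               return getParseMethod().invoke(null, (Object) bytes);
           } catch (IllegalAccessException | InvocationTargetException ex) {
               throw new RuntimeException("Failed to parse message", ex);
           }
       }

       private Method getParseMethod() {
           if (parseMethod == null) {
               try {
                   parseMethod = Class.forName(className)
                       .getMethod("parseFrom", byte[].class);
               } catch (ClassNotFoundException | NoSuchMethodException ex) {
                   // The case discussed above: wrapping gives a clearer error
                   // than letting the bare reflective stack trace surface.
                   throw new RuntimeException(
                       "No parseFrom(byte[]) found on " + className, ex);
               }
           }
           return parseMethod;
       }
   }
   ```
   
   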





[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1231988224

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * fc344fa87202e68178333b5cc0128aa37462706e Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11022) 
   * 55997e718024a9ca75494b3db2fe5027003bf635 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=11057) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] the-other-tim-brown commented on a diff in pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
the-other-tim-brown commented on code in PR #6135:
URL: https://github.com/apache/hudi/pull/6135#discussion_r957533515


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/ProtoKafkaSource.java:
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.sources;
+
+import org.apache.hudi.DataSourceUtils;
+import org.apache.hudi.common.config.TypedProperties;
+import org.apache.hudi.common.util.ReflectionUtils;
+import org.apache.hudi.exception.HoodieException;
+import org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerMetrics;
+import org.apache.hudi.utilities.schema.ProtoClassBasedSchemaProvider;
+import org.apache.hudi.utilities.schema.SchemaProvider;
+import org.apache.hudi.utilities.sources.helpers.KafkaOffsetGen;
+
+import com.google.protobuf.Message;
+import org.apache.kafka.common.serialization.ByteArrayDeserializer;
+import org.apache.kafka.common.serialization.StringDeserializer;
+import org.apache.log4j.LogManager;
+import org.apache.log4j.Logger;
+import org.apache.spark.api.java.JavaRDD;
+import org.apache.spark.api.java.JavaSparkContext;
+import org.apache.spark.sql.SparkSession;
+import org.apache.spark.streaming.kafka010.KafkaUtils;
+import org.apache.spark.streaming.kafka010.LocationStrategies;
+import org.apache.spark.streaming.kafka010.OffsetRange;
+
+import java.io.Serializable;
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.Method;
+import java.util.Arrays;
+
+/**
+ * Reads protobuf-serialized Kafka data, based on a provided class name.
+ */
+public class ProtoKafkaSource extends KafkaSource<Message> {
+
+  private static final Logger LOG = LogManager.getLogger(ProtoKafkaSource.class);
+  // These are native Kafka config keys. Do not change the names.
+  private static final String NATIVE_KAFKA_KEY_DESERIALIZER_PROP = "key.deserializer";
+  private static final String NATIVE_KAFKA_VALUE_DESERIALIZER_PROP = "value.deserializer";
+  private final String className;
+
+  public ProtoKafkaSource(TypedProperties props, JavaSparkContext sparkContext,
+                          SparkSession sparkSession, SchemaProvider schemaProvider, HoodieDeltaStreamerMetrics metrics) {
+    super(props, sparkContext, sparkSession, schemaProvider, SourceType.PROTO, metrics);
+    DataSourceUtils.checkRequiredProperties(props, Arrays.asList(
+        ProtoClassBasedSchemaProvider.Config.PROTO_SCHEMA_CLASS_NAME));
+    props.put(NATIVE_KAFKA_KEY_DESERIALIZER_PROP, StringDeserializer.class);
+    props.put(NATIVE_KAFKA_VALUE_DESERIALIZER_PROP, ByteArrayDeserializer.class);
+    className = props.getString(ProtoClassBasedSchemaProvider.Config.PROTO_SCHEMA_CLASS_NAME);
+    this.offsetGen = new KafkaOffsetGen(props);
+  }
+
+  @Override
+  JavaRDD<Message> toRDD(OffsetRange[] offsetRanges) {
+    ProtoDeserializer deserializer = new ProtoDeserializer(className);
+    return KafkaUtils.<String, byte[]>createRDD(sparkContext, offsetGen.getKafkaParams(), offsetRanges,
+        LocationStrategies.PreferConsistent()).map(obj -> deserializer.parse(obj.value()));
+  }
+
+  private static class ProtoDeserializer implements Serializable {
+    private final String className;
+    private transient Class protoClass;
+    private transient Method parseMethod;
+
+    public ProtoDeserializer(String className) {
+      this.className = className;
+    }
+
+    public Message parse(byte[] bytes) {
+      try {
+        return (Message) getParseMethod().invoke(getClass(), bytes);
+      } catch (IllegalAccessException | InvocationTargetException ex) {
+        throw new HoodieException("Failed to parse proto message from kafka", ex);
+      }
+    }
+
+    private Class getProtoClass() {
+      if (protoClass == null) {
+        protoClass = ReflectionUtils.getClass(className);
+      }
+      return protoClass;
+    }
+
+    private Method getParseMethod() {
+      if (parseMethod == null) {
+        try {
+          parseMethod = getProtoClass().getMethod("parseFrom", byte[].class);

Review Comment:
   Are you suggesting we catch it so we can add more context to the error? For example:
   ```
   catch (HoodieException ex) {
     throw new HoodieException("Unable to get proto class specified: " + className, ex);
   }
   ```
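
   The pattern under discussion — lazily resolving a static `parseFrom`-style method via reflection and wrapping lookup failures with the class name — can be sketched standalone. This is an illustrative sketch, not Hudi's actual code: a JDK class stands in for the generated proto class so it runs without protobuf on the classpath.

   ```java
   import java.lang.reflect.Method;

   // Illustrative sketch of ProtoDeserializer's lazy reflection pattern.
   // java.lang.Integer stands in for a generated proto class here.
   public class ReflectiveParseDemo {

     // Looks up a public static single-String-arg method and invokes it,
     // wrapping lookup/invocation failures with the class name for context,
     // as suggested in the review thread.
     public static Object invokeStatic(String className, String methodName, String arg) {
       try {
         Method method = Class.forName(className).getMethod(methodName, String.class);
         // For static methods, the receiver passed to invoke() is ignored;
         // null is the conventional choice.
         return method.invoke(null, arg);
       } catch (ReflectiveOperationException ex) {
         throw new RuntimeException("Unable to invoke " + className + "#" + methodName, ex);
       }
     }

     public static void main(String[] args) {
       System.out.println(invokeStatic("java.lang.Integer", "valueOf", "42"));
     }
   }
   ```

   Note that `Method.invoke` ignores the receiver argument for static methods, which is why passing `null` (rather than some class object) is the conventional call shape.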





[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1191991418

   ## CI report:
   
   * 03f2c7a09ebc30b413d78fce7a983a9ed1b4d727 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10159) 
   * 51bbad445f6a51bb6e619b64f9db27d99b2d70d9 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10165) 
   


[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1221012492

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * ffc9419327e0e9e91c4e803f9771cf71f7f5e581 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10168) 
   * a9adc8d80f44a3d5963b139f56f658bb512ae0df Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10835) 
   * b9c91181d3f39f1e03a7ff7d5536eaa6813b7913 UNKNOWN
   


[GitHub] [hudi] the-other-tim-brown commented on a diff in pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
the-other-tim-brown commented on code in PR #6135:
URL: https://github.com/apache/hudi/pull/6135#discussion_r957635996


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/deltastreamer/SourceFormatAdapter.java:
##########
@@ -96,7 +101,7 @@ public InputBatch<JavaRDD<GenericRecord>> fetchNewDataInAvroFormat(Option<String
   public InputBatch<Dataset<Row>> fetchNewDataInRowFormat(Option<String> lastCkptStr, long sourceLimit) {
     switch (source.getSourceType()) {
       case ROW:
-        return ((RowSource) source).fetchNext(lastCkptStr, sourceLimit);

Review Comment:
   I'm updating the code to instead cast to the `Source<T>` that these abstract classes were shorthand for.
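
   The refactor described here — dropping the per-format abstract subclass cast in favor of the generic `Source<T>` it abbreviates — can be illustrated with a minimal sketch. All class names below are illustrative stand-ins, not Hudi's actual types:

   ```java
   // Minimal sketch of casting to the generic Source<T> instead of a
   // per-format abstract subclass. String stands in for Dataset<Row>.
   abstract class Source<T> {
     abstract T fetchNext();
   }

   // Shorthand subclass that fixes the type parameter.
   class RowSource extends Source<String> {
     @Override
     String fetchNext() {
       return "row-batch";
     }
   }

   public class SourceCastDemo {
     @SuppressWarnings("unchecked")
     public static void main(String[] args) {
       Source<?> source = new RowSource();
       // Before: ((RowSource) source).fetchNext() ties the adapter to the subclass.
       // After: cast to the generic Source<T> the subclass was shorthand for.
       String batch = ((Source<String>) source).fetchNext();
       System.out.println(batch);
     }
   }
   ```

   The generic cast keeps the adapter working with any `Source<T>` implementation, not just the named abstract subclasses.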





[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1189609813

   ## CI report:
   
   * ec6177e8856b4b6809843c6295a5e15718d32b75 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10041) 
   * d4d4d720b26bb123e72090529cd08cdcd593aea6 UNKNOWN
   


[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1189675293

   ## CI report:
   
   * d4d4d720b26bb123e72090529cd08cdcd593aea6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10074) 
   


[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1190848064

   ## CI report:
   
   * d4d4d720b26bb123e72090529cd08cdcd593aea6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10074) 
   * 376eeddeef27bb387a187ba8ac8100b6e129f582 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10112) 
   


[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1224544985

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * 1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10905) 
   


[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1226064289

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * f70abbc3b45005d40e74252814edc0078a50030e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10909) 
   * 194396032e698ac4210d8652a969c7e58832d5db UNKNOWN
   


[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1221016076

   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * ffc9419327e0e9e91c4e803f9771cf71f7f5e581 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10168) 
   * a9adc8d80f44a3d5963b139f56f658bb512ae0df Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10835) 
   * b9c91181d3f39f1e03a7ff7d5536eaa6813b7913 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10839) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1220954214

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10041",
       "triggerID" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10074",
       "triggerID" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10112",
       "triggerID" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "triggerType" : "PUSH"
     }, {
       "hash" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10159",
       "triggerID" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "triggerType" : "PUSH"
     }, {
       "hash" : "51bbad445f6a51bb6e619b64f9db27d99b2d70d9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10165",
       "triggerID" : "51bbad445f6a51bb6e619b64f9db27d99b2d70d9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d36fed637603d9959e8d049ac0815b9c729eb246",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "d36fed637603d9959e8d049ac0815b9c729eb246",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ffc9419327e0e9e91c4e803f9771cf71f7f5e581",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10168",
       "triggerID" : "ffc9419327e0e9e91c4e803f9771cf71f7f5e581",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9adc8d80f44a3d5963b139f56f658bb512ae0df",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "a9adc8d80f44a3d5963b139f56f658bb512ae0df",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * ffc9419327e0e9e91c4e803f9771cf71f7f5e581 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10168) 
   * a9adc8d80f44a3d5963b139f56f658bb512ae0df UNKNOWN
   




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1190988505

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10041",
       "triggerID" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10074",
       "triggerID" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10112",
       "triggerID" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 376eeddeef27bb387a187ba8ac8100b6e129f582 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10112) 
   




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1191885225

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10041",
       "triggerID" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10074",
       "triggerID" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10112",
       "triggerID" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "triggerType" : "PUSH"
     }, {
       "hash" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10159",
       "triggerID" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 376eeddeef27bb387a187ba8ac8100b6e129f582 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10112) 
   * 03f2c7a09ebc30b413d78fce7a983a9ed1b4d727 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10159) 
   




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1224382862

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10041",
       "triggerID" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10074",
       "triggerID" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10112",
       "triggerID" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "triggerType" : "PUSH"
     }, {
       "hash" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10159",
       "triggerID" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "triggerType" : "PUSH"
     }, {
       "hash" : "51bbad445f6a51bb6e619b64f9db27d99b2d70d9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10165",
       "triggerID" : "51bbad445f6a51bb6e619b64f9db27d99b2d70d9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d36fed637603d9959e8d049ac0815b9c729eb246",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "d36fed637603d9959e8d049ac0815b9c729eb246",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ffc9419327e0e9e91c4e803f9771cf71f7f5e581",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10168",
       "triggerID" : "ffc9419327e0e9e91c4e803f9771cf71f7f5e581",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9adc8d80f44a3d5963b139f56f658bb512ae0df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10835",
       "triggerID" : "a9adc8d80f44a3d5963b139f56f658bb512ae0df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9c91181d3f39f1e03a7ff7d5536eaa6813b7913",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10839",
       "triggerID" : "b9c91181d3f39f1e03a7ff7d5536eaa6813b7913",
       "triggerType" : "PUSH"
     }, {
       "hash" : "14115a6f79de39f538ddfba407f84249c35ebca5",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10881",
       "triggerID" : "14115a6f79de39f538ddfba407f84249c35ebca5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * 14115a6f79de39f538ddfba407f84249c35ebca5 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10881) 
   * 1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90 UNKNOWN
   




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1225016353

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10041",
       "triggerID" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10074",
       "triggerID" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10112",
       "triggerID" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "triggerType" : "PUSH"
     }, {
       "hash" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10159",
       "triggerID" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "triggerType" : "PUSH"
     }, {
       "hash" : "51bbad445f6a51bb6e619b64f9db27d99b2d70d9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10165",
       "triggerID" : "51bbad445f6a51bb6e619b64f9db27d99b2d70d9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d36fed637603d9959e8d049ac0815b9c729eb246",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "d36fed637603d9959e8d049ac0815b9c729eb246",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ffc9419327e0e9e91c4e803f9771cf71f7f5e581",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10168",
       "triggerID" : "ffc9419327e0e9e91c4e803f9771cf71f7f5e581",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9adc8d80f44a3d5963b139f56f658bb512ae0df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10835",
       "triggerID" : "a9adc8d80f44a3d5963b139f56f658bb512ae0df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9c91181d3f39f1e03a7ff7d5536eaa6813b7913",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10839",
       "triggerID" : "b9c91181d3f39f1e03a7ff7d5536eaa6813b7913",
       "triggerType" : "PUSH"
     }, {
       "hash" : "14115a6f79de39f538ddfba407f84249c35ebca5",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10881",
       "triggerID" : "14115a6f79de39f538ddfba407f84249c35ebca5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10905",
       "triggerID" : "1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f70abbc3b45005d40e74252814edc0078a50030e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10909",
       "triggerID" : "f70abbc3b45005d40e74252814edc0078a50030e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * f70abbc3b45005d40e74252814edc0078a50030e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10909) 
   




[GitHub] [hudi] the-other-tim-brown commented on a diff in pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
the-other-tim-brown commented on code in PR #6135:
URL: https://github.com/apache/hudi/pull/6135#discussion_r958740952


##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/ProtoConversionUtil.java:
##########
@@ -0,0 +1,393 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.utilities.sources.helpers;
+
+import org.apache.hudi.common.util.collection.Pair;
+import org.apache.hudi.exception.HoodieException;
+
+import com.google.protobuf.BoolValue;
+import com.google.protobuf.ByteString;
+import com.google.protobuf.BytesValue;
+import com.google.protobuf.DescriptorProtos;
+import com.google.protobuf.Descriptors;
+import com.google.protobuf.DoubleValue;
+import com.google.protobuf.FloatValue;
+import com.google.protobuf.Int32Value;
+import com.google.protobuf.Int64Value;
+import com.google.protobuf.Message;
+import com.google.protobuf.StringValue;
+import com.google.protobuf.UInt32Value;
+import com.google.protobuf.UInt64Value;
+import org.apache.avro.Schema;
+import org.apache.avro.generic.GenericData;
+import org.apache.avro.generic.GenericFixed;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.avro.util.Utf8;
+
+import java.io.File;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.function.Function;
+
+/**
+ * A utility class to help translate from Proto to Avro.
+ */
+public class ProtoConversionUtil {
+
+  /**
+   * Creates an Avro {@link Schema} for the provided class. Assumes that the class is a protobuf {@link Message}.
+   * @param clazz The protobuf class
+   * @param flattenWrappedPrimitives set to true to treat wrapped primitives like nullable fields instead of nested messages.
+   * @return An Avro schema
+   */
+  public static Schema getAvroSchemaForMessageClass(Class clazz, boolean flattenWrappedPrimitives) {
+    return AvroSupport.get().getSchema(clazz, flattenWrappedPrimitives);
+  }
+
+  /**
+   * Converts the provided {@link Message} into an avro {@link GenericRecord} with the provided schema.
+   * @param schema target schema to convert into
+   * @param message the source message to convert
+   * @return an Avro GenericRecord
+   */
+  public static GenericRecord convertToAvro(Schema schema, Message message) {
+    return AvroSupport.get().convert(schema, message);
+  }
+
+  /**
+   * This class provides support for generating schemas and converting from proto to avro. We don't directly use Avro's ProtobufData class so we can:
+   * 1. Customize how schemas are generated for protobufs. We treat Enums as strings and provide an option to treat wrapped primitives like {@link Int32Value} and {@link StringValue} as messages
+   * (default behavior) or as nullable versions of those primitives.
+   * 2. Convert directly from a protobuf {@link Message} to a {@link GenericRecord} while properly handling enums and wrapped primitives mentioned above.
+   */
+  private static class AvroSupport {
+    private static final AvroSupport INSTANCE = new AvroSupport();
+    private static final Map<Pair<Class, Boolean>, Schema> SCHEMA_CACHE = new ConcurrentHashMap<>();
+    private static final Map<Pair<Schema, Descriptors.Descriptor>, Descriptors.FieldDescriptor[]> FIELD_CACHE = new ConcurrentHashMap<>();
+
+
+    private static final Schema STRINGS = Schema.create(Schema.Type.STRING);
+
+    private static final Schema NULL = Schema.create(Schema.Type.NULL);
+    private static final Map<Descriptors.Descriptor, Schema.Type> WRAPPER_DESCRIPTORS_TO_TYPE = getWrapperDescriptorsToType();
+
+    private static Map<Descriptors.Descriptor, Schema.Type> getWrapperDescriptorsToType() {
+      Map<Descriptors.Descriptor, Schema.Type> wrapperDescriptorsToType = new HashMap<>();
+      wrapperDescriptorsToType.put(StringValue.getDescriptor(), Schema.Type.STRING);
+      wrapperDescriptorsToType.put(Int32Value.getDescriptor(), Schema.Type.INT);
+      wrapperDescriptorsToType.put(UInt32Value.getDescriptor(), Schema.Type.INT);
+      wrapperDescriptorsToType.put(Int64Value.getDescriptor(), Schema.Type.LONG);
+      wrapperDescriptorsToType.put(UInt64Value.getDescriptor(), Schema.Type.LONG);
+      wrapperDescriptorsToType.put(BoolValue.getDescriptor(), Schema.Type.BOOLEAN);
+      wrapperDescriptorsToType.put(BytesValue.getDescriptor(), Schema.Type.BYTES);
+      wrapperDescriptorsToType.put(DoubleValue.getDescriptor(), Schema.Type.DOUBLE);
+      wrapperDescriptorsToType.put(FloatValue.getDescriptor(), Schema.Type.FLOAT);
+      return wrapperDescriptorsToType;
+    }
+
+    private AvroSupport() {
+    }
+
+    public static AvroSupport get() {
+      return INSTANCE;
+    }
+
+    public GenericRecord convert(Schema schema, Message message) {
+      return (GenericRecord) convertObject(schema, message);
+    }
+
+    public Schema getSchema(Class c, boolean flattenWrappedPrimitives) {
+      return SCHEMA_CACHE.computeIfAbsent(Pair.of(c, flattenWrappedPrimitives), key -> {
+        try {
+          Object descriptor = c.getMethod("getDescriptor").invoke(null);
+          if (c.isEnum()) {
+            return getEnumSchema((Descriptors.EnumDescriptor) descriptor);
+          } else {
+            return getMessageSchema((Descriptors.Descriptor) descriptor, new HashMap<>(), flattenWrappedPrimitives);
+          }
+        } catch (Exception e) {
+          throw new RuntimeException(e);
+        }
+      });
+    }
+
+    private Schema getEnumSchema(Descriptors.EnumDescriptor d) {
+      List<String> symbols = new ArrayList<>(d.getValues().size());
+      for (Descriptors.EnumValueDescriptor e : d.getValues()) {
+        symbols.add(e.getName());
+      }
+      return Schema.createEnum(d.getName(), null, getNamespace(d.getFile(), d.getContainingType()), symbols);
+    }
+
+    private Schema getMessageSchema(Descriptors.Descriptor descriptor, Map<Descriptors.Descriptor, Schema> seen, boolean flattenWrappedPrimitives) {
+      if (seen.containsKey(descriptor)) {
+        return seen.get(descriptor);
+      }
+      Schema result = Schema.createRecord(descriptor.getName(), null,
+          getNamespace(descriptor.getFile(), descriptor.getContainingType()), false);
+
+      seen.put(descriptor, result);
+
+      List<Schema.Field> fields = new ArrayList<>(descriptor.getFields().size());
+      for (Descriptors.FieldDescriptor f : descriptor.getFields()) {
+        fields.add(new Schema.Field(f.getName(), getFieldSchema(f, seen, flattenWrappedPrimitives), null, getDefault(f)));
+      }
+      result.setFields(fields);
+      return result;
+    }
+
+    private Schema getFieldSchema(Descriptors.FieldDescriptor f, Map<Descriptors.Descriptor, Schema> seen, boolean flattenWrappedPrimitives) {
+      Function<Schema, Schema> schemaFinalizer =  f.isRepeated() ? Schema::createArray : Function.identity();
+      switch (f.getType()) {
+        case BOOL:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.BOOLEAN));
+        case FLOAT:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.FLOAT));
+        case DOUBLE:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.DOUBLE));
+        case ENUM:
+          return schemaFinalizer.apply(getEnumSchema(f.getEnumType()));
+        case STRING:
+          Schema s = Schema.create(Schema.Type.STRING);
+          GenericData.setStringType(s, GenericData.StringType.String);
+          return schemaFinalizer.apply(s);
+        case BYTES:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.BYTES));
+        case INT32:
+        case SINT32:
+        case FIXED32:
+        case SFIXED32:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.INT));
+        case UINT32:
+        case INT64:
+        case UINT64:
+        case SINT64:
+        case FIXED64:
+        case SFIXED64:
+          return schemaFinalizer.apply(Schema.create(Schema.Type.LONG));
+        case MESSAGE:
+          if (flattenWrappedPrimitives && WRAPPER_DESCRIPTORS_TO_TYPE.containsKey(f.getMessageType())) {
+            // all wrapper types have a single field, so we can get the first field in the message's schema
+            return schemaFinalizer.apply(Schema.createUnion(Arrays.asList(NULL, getFieldSchema(f.getMessageType().getFields().get(0), seen, flattenWrappedPrimitives))));
+          }
+          // if message field is repeated (like a list), elements are non-null
+          if (f.isRepeated()) {
+            return schemaFinalizer.apply(getMessageSchema(f.getMessageType(), seen, flattenWrappedPrimitives));
+          }
+          // otherwise we create a nullable field schema
+          return schemaFinalizer.apply(Schema.createUnion(Arrays.asList(NULL, getMessageSchema(f.getMessageType(), seen, flattenWrappedPrimitives))));
+        case GROUP: // groups are deprecated
+        default:
+          throw new RuntimeException("Unexpected type: " + f.getType());
+      }
+    }
+
+    private Object getDefault(Descriptors.FieldDescriptor f) {
+      if (f.isRepeated()) { // empty array as repeated fields' default value
+        return Collections.emptyList();
+      }
+
+      switch (f.getType()) { // generate default for type
+        case BOOL:
+          return false;
+        case FLOAT:
+          return 0.0F;
+        case DOUBLE:
+          return 0.0D;
+        case INT32:
+        case UINT32:
+        case SINT32:
+        case FIXED32:
+        case SFIXED32:
+        case INT64:
+        case UINT64:
+        case SINT64:
+        case FIXED64:
+        case SFIXED64:
+          return 0;
+        case STRING:
+        case BYTES:
+          return "";
+        case ENUM:
+          return f.getEnumType().getValues().get(0).getName();
+        case MESSAGE:
+          return Schema.Field.NULL_VALUE;
+        case GROUP: // groups are deprecated
+        default:
+          throw new RuntimeException("Unexpected type: " + f.getType());
+      }
+    }
+
+    private Descriptors.FieldDescriptor[] getOrderedFields(Schema schema, Message message) {
+      Descriptors.Descriptor descriptor = message.getDescriptorForType();
+      return FIELD_CACHE.computeIfAbsent(Pair.of(schema, descriptor), key -> {
+        Descriptors.FieldDescriptor[] fields = new Descriptors.FieldDescriptor[key.getLeft().getFields().size()];
+        for (Schema.Field f : key.getLeft().getFields()) {
+          fields[f.pos()] = key.getRight().findFieldByName(f.name());
+        }
+        return fields;
+      });
+    }
+
+    private Object convertObject(Schema schema, Object value) {
+      if (value == null) {
+        return null;
+      }
+
+      switch (schema.getType()) {
+        case ARRAY:
+          List<Object> arrayValue = (List<Object>) value;
+          List<Object> arrayCopy = new GenericData.Array<>(arrayValue.size(), schema);
+          for (Object obj : arrayValue) {
+            arrayCopy.add(convertObject(schema.getElementType(), obj));
+          }
+          return arrayCopy;
+        case BYTES:
+          ByteBuffer byteBufferValue;
+          if (value instanceof ByteString) {
+            byteBufferValue = ((ByteString) value).asReadOnlyByteBuffer();
+          } else if (value instanceof Message) {
+            byteBufferValue = ((ByteString) getWrappedValue(value)).asReadOnlyByteBuffer();
+          } else {
+            byteBufferValue = (ByteBuffer) value;
+          }
+          int start = byteBufferValue.position();
+          int length = byteBufferValue.limit() - start;
+          byte[] bytesCopy = new byte[length];
+          byteBufferValue.get(bytesCopy, 0, length);
+          byteBufferValue.position(start);
+          return ByteBuffer.wrap(bytesCopy, 0, length);
+        case ENUM:
+          return GenericData.get().createEnum(value.toString(), schema);
+        case FIXED:
+          return GenericData.get().createFixed(null, ((GenericFixed) value).bytes(), schema);
+        case BOOLEAN:
+        case DOUBLE:
+        case FLOAT:
+        case INT:
+          if (value instanceof Message) {
+            return getWrappedValue(value);
+          }
+          return value; // immutable
+        case LONG:
+          Object tmpValue = value;
+          if (value instanceof Message) {
+            tmpValue = getWrappedValue(value);
+          }
+          // unsigned ints need to be cast to long
+          if (tmpValue instanceof Integer) {
+            tmpValue = Long.valueOf((Integer) tmpValue);
+          }
+          return tmpValue;
+        case MAP:
+          Map<Object, Object> mapValue = (Map) value;
+          Map<Object, Object> mapCopy = new HashMap<>(mapValue.size());
+          for (Map.Entry<Object, Object> entry : mapValue.entrySet()) {
+            mapCopy.put(convertObject(STRINGS, entry.getKey()), convertObject(schema.getValueType(), entry.getValue()));
+          }
+          return mapCopy;
+        case NULL:
+          return null;
+        case RECORD:
+          GenericData.Record newRecord = new GenericData.Record(schema);
+          Message messageValue = (Message) value;
+          for (Schema.Field f : schema.getFields()) {
+            int position = f.pos();
+            Descriptors.FieldDescriptor fieldDescriptor = getOrderedFields(schema, messageValue)[position];
+            Object convertedValue;
+            if (fieldDescriptor.getType() == Descriptors.FieldDescriptor.Type.MESSAGE && !fieldDescriptor.isRepeated() && !messageValue.hasField(fieldDescriptor)) {
+              convertedValue = null;
+            } else {
+              convertedValue = convertObject(f.schema(), messageValue.getField(fieldDescriptor));
+            }
+            newRecord.put(position, convertedValue);
+          }
+          return newRecord;
+        case STRING:
+          if (value instanceof String) {
+            return value;
+          } else if (value instanceof StringValue) {
+            return ((StringValue) value).getValue();
+          } else {
+            return new Utf8(value.toString());
+          }
+        case UNION:
+          // Unions only occur for nullable fields when working with proto + avro, and null is always the first schema in the union
+          return convertObject(schema.getTypes().get(1), value);
+        default:
+          throw new HoodieException("Proto to Avro conversion failed for schema \"" + schema + "\" and value \"" + value + "\"");
+      }
+    }
+
+    /**
+     * Returns the wrapped field; assumes all wrapper messages contain a single value.
+     * @param value wrapper message like {@link Int32Value} or {@link StringValue}
+     * @return the wrapped object
+     */
+    private Object getWrappedValue(Object value) {
+      Message valueAsMessage = (Message) value;
+      return valueAsMessage.getField(valueAsMessage.getDescriptorForType().getFields().get(0));
+    }
+
+    private String getNamespace(Descriptors.FileDescriptor fd, Descriptors.Descriptor containing) {
+      DescriptorProtos.FileOptions fileOptions = fd.getOptions();
+      String classPackage = fileOptions.hasJavaPackage() ? fileOptions.getJavaPackage() : fd.getPackage();
+      String outer = "";
+      if (!fileOptions.getJavaMultipleFiles()) {
+        if (fileOptions.hasJavaOuterClassname()) {
+          outer = fileOptions.getJavaOuterClassname();
+        } else {
+          outer = new File(fd.getName()).getName();
+          outer = outer.substring(0, outer.lastIndexOf('.'));
+          outer = toCamelCase(outer);

Review Comment:
   I simplified this namespace generation to just use the proto package
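   To illustrate the simplification described above, here is a minimal, hypothetical sketch of the "just use the proto package" rule: prefer the file-level `java_package` option when set, otherwise fall back to the package declared in the `.proto` file. The class and method names below are illustrative only, not the actual Hudi code.

   ```java
   // Hypothetical sketch of the simplified namespace rule discussed above:
   // prefer the file-level java_package option when present, otherwise
   // fall back to the proto package declared in the .proto file.
   public class NamespaceRuleSketch {
     public static String namespaceFor(String javaPackageOption, String protoPackage) {
       // javaPackageOption is null/empty when the .proto file sets no java_package
       return (javaPackageOption == null || javaPackageOption.isEmpty())
           ? protoPackage
           : javaPackageOption;
     }
   }
   ```

   For example, a file containing `option java_package = "com.example.protos";` would yield the namespace `com.example.protos`, while a file with only `package example.v1;` would yield `example.v1`.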



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1191988965

   ## CI report:
   
   * 03f2c7a09ebc30b413d78fce7a983a9ed1b4d727 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10159) 
   * 51bbad445f6a51bb6e619b64f9db27d99b2d70d9 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>












[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1189612534

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10041",
       "triggerID" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10074",
       "triggerID" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * ec6177e8856b4b6809843c6295a5e15718d32b75 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10041) 
   * d4d4d720b26bb123e72090529cd08cdcd593aea6 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10074) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1190845452

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10041",
       "triggerID" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10074",
       "triggerID" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d4d4d720b26bb123e72090529cd08cdcd593aea6 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10074) 
   * 376eeddeef27bb387a187ba8ac8100b6e129f582 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] the-other-tim-brown commented on a diff in pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
the-other-tim-brown commented on code in PR #6135:
URL: https://github.com/apache/hudi/pull/6135#discussion_r950480662


##########
hudi-utilities/src/test/java/org/apache/hudi/utilities/sources/TestJsonKafkaSourcePostProcessor.java:
##########
@@ -48,17 +56,41 @@
 import static org.junit.jupiter.api.Assertions.assertNotEquals;
 import static org.junit.jupiter.api.Assertions.assertNull;
 import static org.junit.jupiter.api.Assertions.assertTrue;
+import static org.mockito.Mockito.mock;
 
-public class TestJsonKafkaSourcePostProcessor extends TestJsonKafkaSource {

Review Comment:
   Because this class extended `TestJsonKafkaSource`, all of the inherited tests were being run a second time. I made some changes to this class to avoid that.
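   To illustrate the point above, here is a minimal, hypothetical sketch (a plain-Java stand-in for JUnit's discovery mechanism, not the actual Hudi test code; all class and method names are invented) showing why `@Test` methods declared in a superclass are discovered and executed again for every concrete subclass:

   ```java
   import java.lang.annotation.Retention;
   import java.lang.annotation.RetentionPolicy;
   import java.lang.reflect.Method;

   public class InheritedTestDemo {

     // Hypothetical stand-in for JUnit's @Test annotation, so the sketch is
     // self-contained with only the JDK.
     @Retention(RetentionPolicy.RUNTIME)
     @interface Test {}

     // Plays the role of TestJsonKafkaSource: its tests are discovered once here...
     static class BaseSourceTest {
       @Test void testRead() {}
       @Test void testCommit() {}
     }

     // ...and again here, because JUnit-style runners also discover inherited
     // test methods on every concrete subclass.
     static class PostProcessorTest extends BaseSourceTest {
       @Test void testPostProcess() {}
     }

     // Mimics test discovery: walk the class hierarchy and count @Test methods.
     static int discoveredTests(Class<?> clazz) {
       int count = 0;
       for (Class<?> c = clazz; c != null && c != Object.class; c = c.getSuperclass()) {
         for (Method m : c.getDeclaredMethods()) {
           if (m.isAnnotationPresent(Test.class)) {
             count++;
           }
         }
       }
       return count;
     }

     public static void main(String[] args) {
       System.out.println(discoveredTests(BaseSourceTest.class));    // the 2 base tests
       System.out.println(discoveredTests(PostProcessorTest.class)); // 3: the base tests run again
     }
   }
   ```

   Dropping the `extends` relationship, as this change does, is one way to avoid the duplicate runs; sharing setup through composition or common helper methods is another.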





[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1224982778

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10041",
       "triggerID" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10074",
       "triggerID" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10112",
       "triggerID" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "triggerType" : "PUSH"
     }, {
       "hash" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10159",
       "triggerID" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "triggerType" : "PUSH"
     }, {
       "hash" : "51bbad445f6a51bb6e619b64f9db27d99b2d70d9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10165",
       "triggerID" : "51bbad445f6a51bb6e619b64f9db27d99b2d70d9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d36fed637603d9959e8d049ac0815b9c729eb246",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "d36fed637603d9959e8d049ac0815b9c729eb246",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ffc9419327e0e9e91c4e803f9771cf71f7f5e581",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10168",
       "triggerID" : "ffc9419327e0e9e91c4e803f9771cf71f7f5e581",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9adc8d80f44a3d5963b139f56f658bb512ae0df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10835",
       "triggerID" : "a9adc8d80f44a3d5963b139f56f658bb512ae0df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9c91181d3f39f1e03a7ff7d5536eaa6813b7913",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10839",
       "triggerID" : "b9c91181d3f39f1e03a7ff7d5536eaa6813b7913",
       "triggerType" : "PUSH"
     }, {
       "hash" : "14115a6f79de39f538ddfba407f84249c35ebca5",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10881",
       "triggerID" : "14115a6f79de39f538ddfba407f84249c35ebca5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10905",
       "triggerID" : "1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f70abbc3b45005d40e74252814edc0078a50030e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10909",
       "triggerID" : "f70abbc3b45005d40e74252814edc0078a50030e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * 1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10905) 
   * f70abbc3b45005d40e74252814edc0078a50030e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10909) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6135: [HUDI-4418] Add support for ProtoKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6135:
URL: https://github.com/apache/hudi/pull/6135#issuecomment-1226770645

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10041",
       "triggerID" : "ec6177e8856b4b6809843c6295a5e15718d32b75",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10074",
       "triggerID" : "d4d4d720b26bb123e72090529cd08cdcd593aea6",
       "triggerType" : "PUSH"
     }, {
       "hash" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10112",
       "triggerID" : "376eeddeef27bb387a187ba8ac8100b6e129f582",
       "triggerType" : "PUSH"
     }, {
       "hash" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10159",
       "triggerID" : "03f2c7a09ebc30b413d78fce7a983a9ed1b4d727",
       "triggerType" : "PUSH"
     }, {
       "hash" : "51bbad445f6a51bb6e619b64f9db27d99b2d70d9",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10165",
       "triggerID" : "51bbad445f6a51bb6e619b64f9db27d99b2d70d9",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d36fed637603d9959e8d049ac0815b9c729eb246",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "d36fed637603d9959e8d049ac0815b9c729eb246",
       "triggerType" : "PUSH"
     }, {
       "hash" : "ffc9419327e0e9e91c4e803f9771cf71f7f5e581",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10168",
       "triggerID" : "ffc9419327e0e9e91c4e803f9771cf71f7f5e581",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a9adc8d80f44a3d5963b139f56f658bb512ae0df",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10835",
       "triggerID" : "a9adc8d80f44a3d5963b139f56f658bb512ae0df",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b9c91181d3f39f1e03a7ff7d5536eaa6813b7913",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10839",
       "triggerID" : "b9c91181d3f39f1e03a7ff7d5536eaa6813b7913",
       "triggerType" : "PUSH"
     }, {
       "hash" : "14115a6f79de39f538ddfba407f84249c35ebca5",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10881",
       "triggerID" : "14115a6f79de39f538ddfba407f84249c35ebca5",
       "triggerType" : "PUSH"
     }, {
       "hash" : "1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10905",
       "triggerID" : "1879403e5a33bfcaa6d9d1d3e6e2cbc226403f90",
       "triggerType" : "PUSH"
     }, {
       "hash" : "f70abbc3b45005d40e74252814edc0078a50030e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10909",
       "triggerID" : "f70abbc3b45005d40e74252814edc0078a50030e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "194396032e698ac4210d8652a969c7e58832d5db",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10925",
       "triggerID" : "194396032e698ac4210d8652a969c7e58832d5db",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9a4df85fd5b1787af587817cba959c536d904fde",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10933",
       "triggerID" : "9a4df85fd5b1787af587817cba959c536d904fde",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d36fed637603d9959e8d049ac0815b9c729eb246 UNKNOWN
   * 194396032e698ac4210d8652a969c7e58832d5db Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10925) 
   * 9a4df85fd5b1787af587817cba959c536d904fde Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=10933) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>

