Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/08/19 11:04:43 UTC

[GitHub] [hudi] dongkelun opened a new pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

dongkelun opened a new pull request #3502:
URL: https://github.com/apache/hudi/pull/3502


   ## What is the purpose of the pull request
   Add support for ByteArrayDeserializer in AvroKafkaSource.
   
   
   
   ## Verify this pull request
   
   When the 'value.serializer' of the Kafka Avro producer is 'org.apache.kafka.common.serialization.ByteArraySerializer', use the following configuration:
   
   --source-class org.apache.hudi.utilities.sources.AvroKafkaSource \
   --schemaprovider-class org.apache.hudi.utilities.schema.JdbcbasedSchemaProvider \
   --hoodie-conf "hoodie.deltastreamer.source.kafka.value.deserializer.class=org.apache.kafka.common.serialization.ByteArrayDeserializer"
   
   Currently, this throws an exception:
   java.lang.ClassCastException: [B cannot be cast to org.apache.avro.generic.GenericRecord
   
   With ByteArrayDeserializer supported, the configuration above works properly, and there is no need to provide 'schema.registry.url'. For example, we can use the JdbcbasedSchemaProvider to get the sourceSchema.
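
The `[B` in the exception above is the JVM's runtime name for `byte[]`: with ByteArrayDeserializer each Kafka value arrives as raw bytes, so an unchecked cast to a record type fails. A minimal sketch of that failure follows. The `DecodedRecord` interface and the helper names are hypothetical stand-ins for `GenericRecord` and the source's cast, so that the example runs without Avro or Kafka on the classpath.

```java
public class ByteArrayCastDemo {

    // Hypothetical stand-in for org.apache.avro.generic.GenericRecord.
    interface DecodedRecord {}

    // Mimics an unchecked cast applied to each Kafka value.
    static DecodedRecord castValue(Object value) {
        return (DecodedRecord) value;
    }

    // Returns the ClassCastException message, or null if the cast succeeded.
    static String tryCast(Object value) {
        try {
            castValue(value);
            return null;
        } catch (ClassCastException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // What a byte-array deserializer hands over: raw bytes, not a record.
        Object value = new byte[] {0x06, 'a', 'b', 'c'};
        System.out.println(value.getClass().getName());
        System.out.println(tryCast(value));
    }
}
```

The exact wording of the message differs between JDK versions, but it names `[B` as the source type in each case.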
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] dongkelun commented on a change in pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
dongkelun commented on a change in pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#discussion_r697981489



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/AvroKafkaSource.java
##########
@@ -91,12 +96,21 @@ public AvroKafkaSource(TypedProperties props, JavaSparkContext sparkContext, Spa
       return new InputBatch<>(Option.empty(), CheckpointUtils.offsetsToStr(offsetRanges));
     }
     JavaRDD<GenericRecord> newDataRDD = toRDD(offsetRanges);
-    return new InputBatch<>(Option.of(newDataRDD), KafkaOffsetGen.CheckpointUtils.offsetsToStr(offsetRanges));
+    return new InputBatch<>(Option.of(newDataRDD), CheckpointUtils.offsetsToStr(offsetRanges));
   }
 
   private JavaRDD<GenericRecord> toRDD(OffsetRange[] offsetRanges) {
-    return KafkaUtils.createRDD(sparkContext, offsetGen.getKafkaParams(), offsetRanges,
-            LocationStrategies.PreferConsistent()).map(obj -> (GenericRecord) obj.value());
+    if (deserializerClassName.equals(ByteArrayDeserializer.class.getName())) {
+      if (schemaProvider == null) {
+        throw new HoodieException("Please provide a valid schema provider class!");

Review comment:
       OK, good idea. Would 'when using ByteArrayDeserializer' be better?







[GitHub] [hudi] hudi-bot edited a comment on pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#issuecomment-901822779


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809",
       "triggerID" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b1de704c4a618aea6026eca4e832888d78b8b6b2",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1934",
       "triggerID" : "b1de704c4a618aea6026eca4e832888d78b8b6b2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9843bda5ccd7222233a681d23b62eac310c30d29",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "9843bda5ccd7222233a681d23b62eac310c30d29",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * b1de704c4a618aea6026eca4e832888d78b8b6b2 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1934) 
   * 9843bda5ccd7222233a681d23b62eac310c30d29 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run travis` re-run the last Travis build
    - `@hudi-bot run azure` re-run the last Azure build
   </details>





[GitHub] [hudi] hudi-bot commented on pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#issuecomment-901822779


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 61dfee3f3db371b4c4ec55f630d2a5b40467ea12 UNKNOWN
   



[GitHub] [hudi] hudi-bot edited a comment on pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#issuecomment-901822779


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809",
       "triggerID" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b1de704c4a618aea6026eca4e832888d78b8b6b2",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1934",
       "triggerID" : "b1de704c4a618aea6026eca4e832888d78b8b6b2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9843bda5ccd7222233a681d23b62eac310c30d29",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1935",
       "triggerID" : "9843bda5ccd7222233a681d23b62eac310c30d29",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * b1de704c4a618aea6026eca4e832888d78b8b6b2 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1934) 
   * 9843bda5ccd7222233a681d23b62eac310c30d29 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1935) 
   



[GitHub] [hudi] dongkelun commented on a change in pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
dongkelun commented on a change in pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#discussion_r694591562



##########
File path: hudi-utilities/pom.xml
##########
@@ -254,7 +254,7 @@
     <dependency>
       <groupId>com.twitter</groupId>
       <artifactId>bijection-avro_${scala.binary.version}</artifactId>
-      <version>0.9.3</version>
+      <version>0.9.7</version>

Review comment:
       Hello, with version 0.9.3, calling recordinjection.convert throws an exception:
   ```
   Exception in thread "main" java.io.FileNotFoundException: \Users\pnarang\workspace\bijection\bijection-avro\target\scala-2.11\scoverage-data\scoverage.measurements.
   	at java.io.FileOutputStream.open0(Native Method)
   	at java.io.FileOutputStream.open(FileOutputStream.java:270)
   	at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
   	at java.io.FileWriter.<init>(FileWriter.java:107)
   	at scoverage.Invoker$$anonfun$1.apply(Invoker.scala:42)
   	at scoverage.Invoker$$anonfun$1.apply(Invoker.scala:42)
   	at scala.collection.concurrent.TrieMap.getOrElseUpdate(TrieMap.scala:901)
   	at scoverage.Invoker$.invoked(Invoker.scala:42)
   	at com.twitter.bijection.avro.GenericAvroCodecs$.toBinary(AvroCodecs.scala:222)
   ```
   I think this is because scalac-scoverage-runtime is used in the source code of version 0.9.3; it may be a bug in that version. After upgrading to 0.9.7, which is the same version used in hudi-flink, the exception above is no longer thrown.
   







[GitHub] [hudi] hudi-bot edited a comment on pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#issuecomment-901822779


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809",
       "triggerID" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b1de704c4a618aea6026eca4e832888d78b8b6b2",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1934",
       "triggerID" : "b1de704c4a618aea6026eca4e832888d78b8b6b2",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * b1de704c4a618aea6026eca4e832888d78b8b6b2 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1934) 
   



[GitHub] [hudi] yanghua commented on a change in pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
yanghua commented on a change in pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#discussion_r697977680



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/AvroKafkaSource.java
##########
@@ -91,12 +96,21 @@ public AvroKafkaSource(TypedProperties props, JavaSparkContext sparkContext, Spa
       return new InputBatch<>(Option.empty(), CheckpointUtils.offsetsToStr(offsetRanges));
     }
     JavaRDD<GenericRecord> newDataRDD = toRDD(offsetRanges);
-    return new InputBatch<>(Option.of(newDataRDD), KafkaOffsetGen.CheckpointUtils.offsetsToStr(offsetRanges));
+    return new InputBatch<>(Option.of(newDataRDD), CheckpointUtils.offsetsToStr(offsetRanges));
   }
 
   private JavaRDD<GenericRecord> toRDD(OffsetRange[] offsetRanges) {
-    return KafkaUtils.createRDD(sparkContext, offsetGen.getKafkaParams(), offsetRanges,
-            LocationStrategies.PreferConsistent()).map(obj -> (GenericRecord) obj.value());
+    if (deserializerClassName.equals(ByteArrayDeserializer.class.getName())) {
+      if (schemaProvider == null) {
+        throw new HoodieException("Please provide a valid schema provider class!");

Review comment:
       Could you give more information in the exception message, e.g. "when using `ByteArrayDeserializer`"?
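
A minimal sketch of the guard with a more descriptive message, as suggested in this review comment. It is illustrative only: `validate` is a hypothetical helper, `IllegalStateException` stands in for `HoodieException`, and the schema provider is typed as `Object` so the example has no Hudi or Kafka dependency.

```java
public class DeserializerGuardDemo {

    static final String BYTE_ARRAY_DESERIALIZER =
        "org.apache.kafka.common.serialization.ByteArrayDeserializer";

    // Hypothetical helper: fails fast when raw bytes must be decoded
    // but no schema provider is available to supply the schema.
    static void validate(String deserializerClassName, Object schemaProvider) {
        if (BYTE_ARRAY_DESERIALIZER.equals(deserializerClassName) && schemaProvider == null) {
            throw new IllegalStateException(
                "Please provide a valid schema provider class when using ByteArrayDeserializer!");
        }
    }

    public static void main(String[] args) {
        // A provider is present, so this passes silently.
        validate(BYTE_ARRAY_DESERIALIZER, new Object());
        try {
            validate(BYTE_ARRAY_DESERIALIZER, null);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Naming the deserializer in the message tells the user which configuration triggered the requirement.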







[GitHub] [hudi] hudi-bot edited a comment on pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#issuecomment-901822779


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809",
       "triggerID" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 61dfee3f3db371b4c4ec55f630d2a5b40467ea12 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809) 
   



[GitHub] [hudi] dongkelun commented on pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
dongkelun commented on pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#issuecomment-907640353


   There is no problem with the local virtual machine verification. The problem mentioned above occurred because the producer's schema is not completely consistent with the schema obtained by Hudi's JdbcbasedSchemaProvider. The specific differences are as follows.
   The producer schema's field:
   ```json
   {"name": "userId", "type": "string"}
   ```
   The JdbcbasedSchemaProvider's field:
   ```json
   {"name": "userid", "type":["null","string"],"default":null}
   ```
   
   After changing the producer schema's field to '{"name": "userid", "type": ["null", "string"], "default": null}' or '{"name": "userid", "type": ["null", "string"]}', it runs normally.
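
One plausible reason the nullability difference matters, assuming the bytes are decoded using only the provider's schema: in Avro's binary encoding a union value is prefixed with its branch index, while a plain string is not. The sketch below hand-rolls that part of the encoding with the JDK only. It is not the Avro library, and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class AvroUnionEncodingDemo {

    // Avro writes int/long values as zig-zag encoded variable-length integers.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    // A plain "string" is its UTF-8 length followed by the UTF-8 bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // Field declared as {"type": "string"}.
    static byte[] encodePlainString(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, s);
        return out.toByteArray();
    }

    // Field declared as {"type": ["null", "string"]}: branch index 1, then the string.
    static byte[] encodeNullableString(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLong(out, 1);
        writeString(out, s);
        return out.toByteArray();
    }

    // Decodes the first zig-zag varint, which a reader expecting a union
    // would interpret as the branch index.
    static long readFirstLong(byte[] data) {
        long z = 0;
        int shift = 0;
        int i = 0;
        int b;
        do {
            b = data[i++] & 0xFF;
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1);
    }

    public static void main(String[] args) {
        byte[] plain = encodePlainString("abc");
        byte[] nullable = encodeNullableString("abc");
        System.out.println("plain length = " + plain.length
            + ", nullable length = " + nullable.length);
        System.out.println("branch index read from plain bytes = " + readFirstLong(plain));
    }
}
```

A reader that expects the two-branch union takes the string length in the plain encoding for a branch index, so the two layouts are not interchangeable.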





[GitHub] [hudi] pratyakshsharma commented on pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
pratyakshsharma commented on pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#issuecomment-907596058


   @dongkelun So essentially I wanted to understand the reason behind using ByteArraySerDes, because I have not seen it used so far in my working experience. StringSerializer or the Confluent Avro serializer are the most common choices. So I just wanted to know if one gets any additional benefit from using ByteArraySerializer, in terms of data size etc.
   
   LGTM otherwise. :) cc @yanghua 





[GitHub] [hudi] dongkelun commented on pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
dongkelun commented on pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#issuecomment-907561959


   > @dongkelun Can you please help me understand the use case? I want to know what kind of data you are pushing using ByteArraySerializer.
   
   @pratyakshsharma Hi, the main parameters of the Kafka producer are:
   
   ```scala
       val props = new Properties()
       props.put("bootstrap.servers", "192.168.44.128:9092")
       props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
       props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer")
   ```
   See this example for the detailed code: [AvroSerializerProducerTest](https://github.com/dongkelun/AvroDemo/blob/master/src/main/java/com/dkl/blog/avro/kafka/AvroSerializerProducerTest.scala). And see this for the consumer demo: [kafkaConsume](https://github.com/dongkelun/AvroDemo/blob/master/src/main/java/com/dkl/blog/avro/kafka/kafkaConsume.scala)





[GitHub] [hudi] hudi-bot edited a comment on pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#issuecomment-901822779


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809",
       "triggerID" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b1de704c4a618aea6026eca4e832888d78b8b6b2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1934",
       "triggerID" : "b1de704c4a618aea6026eca4e832888d78b8b6b2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9843bda5ccd7222233a681d23b62eac310c30d29",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1935",
       "triggerID" : "9843bda5ccd7222233a681d23b62eac310c30d29",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 9843bda5ccd7222233a681d23b62eac310c30d29 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1935) 
   



[GitHub] [hudi] yanghua commented on a change in pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
yanghua commented on a change in pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#discussion_r697977419



##########
File path: hudi-utilities/pom.xml
##########
@@ -254,7 +254,7 @@
     <dependency>
       <groupId>com.twitter</groupId>
       <artifactId>bijection-avro_${scala.binary.version}</artifactId>
-      <version>0.9.3</version>
+      <version>0.9.7</version>

Review comment:
       Sounds good.







[GitHub] [hudi] hudi-bot edited a comment on pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#issuecomment-901822779


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809",
       "triggerID" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b1de704c4a618aea6026eca4e832888d78b8b6b2",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "b1de704c4a618aea6026eca4e832888d78b8b6b2",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 61dfee3f3db371b4c4ec55f630d2a5b40467ea12 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809) 
   * b1de704c4a618aea6026eca4e832888d78b8b6b2 UNKNOWN
   



[GitHub] [hudi] dongkelun commented on pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
dongkelun commented on pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#issuecomment-907601265


   @pratyakshsharma Our department uses NiFi to extract data and push it to Kafka, and the NiFi processor uses ByteArraySerializer, so I want to support ByteArrayDeserializer in Hudi. As for the benefits, I'm not very clear on them yet, but I don't think it's very convenient to have to register a 'schema.registry.url'. Also, I found a problem during the local virtual machine verification today. It passed verification in the company's test environment before, so I don't know if there is a problem with my environment settings. I'll verify and solve it later.





[GitHub] [hudi] hudi-bot edited a comment on pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#issuecomment-901822779


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809",
       "triggerID" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 61dfee3f3db371b4c4ec55f630d2a5b40467ea12 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809) 
   



[GitHub] [hudi] yanghua merged pull request #3502: [HUDI-2320] Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
yanghua merged pull request #3502:
URL: https://github.com/apache/hudi/pull/3502


   





[GitHub] [hudi] pratyakshsharma commented on pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
pratyakshsharma commented on pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#issuecomment-907446794


   @dongkelun Can you please help me understand the use case? I want to know what kind of data you are pushing using ByteArraySerializer.





[GitHub] [hudi] hudi-bot edited a comment on pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#issuecomment-901822779


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809",
       "triggerID" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b1de704c4a618aea6026eca4e832888d78b8b6b2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1934",
       "triggerID" : "b1de704c4a618aea6026eca4e832888d78b8b6b2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9843bda5ccd7222233a681d23b62eac310c30d29",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1935",
       "triggerID" : "9843bda5ccd7222233a681d23b62eac310c30d29",
       "triggerType" : "PUSH"
     }, {
       "hash" : "dd6c65593db545ee86dab43fb2356c6994eb1292",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1936",
       "triggerID" : "dd6c65593db545ee86dab43fb2356c6994eb1292",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 9843bda5ccd7222233a681d23b62eac310c30d29 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1935) 
   * dd6c65593db545ee86dab43fb2356c6994eb1292 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1936) 
   



[GitHub] [hudi] yanghua commented on a change in pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
yanghua commented on a change in pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#discussion_r694567919



##########
File path: hudi-utilities/pom.xml
##########
@@ -254,7 +254,7 @@
     <dependency>
       <groupId>com.twitter</groupId>
       <artifactId>bijection-avro_${scala.binary.version}</artifactId>
-      <version>0.9.3</version>
+      <version>0.9.7</version>

Review comment:
       Must we bump up this version number?







[GitHub] [hudi] hudi-bot edited a comment on pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#issuecomment-901822779


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809",
       "triggerID" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b1de704c4a618aea6026eca4e832888d78b8b6b2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1934",
       "triggerID" : "b1de704c4a618aea6026eca4e832888d78b8b6b2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9843bda5ccd7222233a681d23b62eac310c30d29",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1935",
       "triggerID" : "9843bda5ccd7222233a681d23b62eac310c30d29",
       "triggerType" : "PUSH"
     }, {
       "hash" : "dd6c65593db545ee86dab43fb2356c6994eb1292",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1936",
       "triggerID" : "dd6c65593db545ee86dab43fb2356c6994eb1292",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * dd6c65593db545ee86dab43fb2356c6994eb1292 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1936) 
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#issuecomment-901822779


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809",
       "triggerID" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b1de704c4a618aea6026eca4e832888d78b8b6b2",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1934",
       "triggerID" : "b1de704c4a618aea6026eca4e832888d78b8b6b2",
       "triggerType" : "PUSH"
     }, {
       "hash" : "9843bda5ccd7222233a681d23b62eac310c30d29",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1935",
       "triggerID" : "9843bda5ccd7222233a681d23b62eac310c30d29",
       "triggerType" : "PUSH"
     }, {
       "hash" : "dd6c65593db545ee86dab43fb2356c6994eb1292",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "dd6c65593db545ee86dab43fb2356c6994eb1292",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 9843bda5ccd7222233a681d23b62eac310c30d29 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1935) 
   * dd6c65593db545ee86dab43fb2356c6994eb1292 UNKNOWN
   





[GitHub] [hudi] hudi-bot edited a comment on pull request #3502: [HUDI-2320]Add support ByteArrayDeserializer in AvroKafkaSource

Posted by GitBox <gi...@apache.org>.
hudi-bot edited a comment on pull request #3502:
URL: https://github.com/apache/hudi/pull/3502#issuecomment-901822779


   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809",
       "triggerID" : "61dfee3f3db371b4c4ec55f630d2a5b40467ea12",
       "triggerType" : "PUSH"
     }, {
       "hash" : "b1de704c4a618aea6026eca4e832888d78b8b6b2",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1934",
       "triggerID" : "b1de704c4a618aea6026eca4e832888d78b8b6b2",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 61dfee3f3db371b4c4ec55f630d2a5b40467ea12 Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1809) 
   * b1de704c4a618aea6026eca4e832888d78b8b6b2 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=1934) 
   

