Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/10/01 11:56:01 UTC

[GitHub] [hudi] tandonraghavs edited a comment on issue #2131: [SUPPORT] HUDI with Mongo Oplogs (Debezium)

tandonraghavs edited a comment on issue #2131:
URL: https://github.com/apache/hudi/issues/2131#issuecomment-702082811


   @bvaradar I am using my custom class as the `PAYLOAD_CLASS_OPT_KEY` value. The problem is that **preCombine** has no reference to the **Schema** and only gives me bytes, so how do I get the GenericRecord out of it?
   This is why I am not able to implement any custom logic in _preCombine_ the way I did in _combineAndGetUpdateValue_.
   
   I am using Hudi via the Spark Datasource (0.5.3).
   
   Due to the scale of the data I don't want to run compaction after every commit, so I am using _INLINE_COMPACT_NUM_DELTA_COMMITS_PROP_.
   
   - How do I get hold of the Schema in preCombine?
   
   Sample code of my Spark job.
   **jsonDf** -> This is a simple JSON string which contains the records.
   
   ````
   // Parse each JSON string into an Avro GenericRecord
   Dataset<GenericRecord> data = jsonDf.map(
           (MapFunction<String, GenericRecord>) record -> generateHoodieRecord(record, schemaStr),
           Encoders.bean(GenericRecord.class));

   // Convert the Avro records into a DataFrame using the same schema string
   Dataset<Row> ds = AvroConversionUtils.createDataFrame(data.rdd(), schemaStr, sparkSession);

   ds
           .write().format("org.apache.hudi")
           .options ()...
           .mode(SaveMode.Append)
           .save(tablePath);
   ````
   This ticket also discusses the same problem: https://issues.apache.org/jira/browse/HUDI-898
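
To make the question concrete: one pattern that might work around the missing schema argument is to let the payload carry its schema as a plain string next to the bytes, so that a `preCombine(another)`-shaped method can decode both sides on its own. The sketch below is dependency-free and purely illustrative. `SchemaCarryingPayload`, `decode`, `encode`, `toRecord`, and the "non-empty field of the newer record wins" rule are all hypothetical names and logic, not Hudi or Avro APIs. The comma-separated "schema" and record bytes only stand in for an Avro schema JSON and Avro binary data. In a real payload the string would presumably be parsed with Avro (for example `new Schema.Parser().parse(schemaStr)`) and captured from the record's `getSchema()` when the payload is constructed; neither is attempted here.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a custom payload class; NOT the Hudi API.
// A real implementation would implement HoodieRecordPayload and decode with Avro.
class SchemaCarryingPayload implements java.io.Serializable {
    private final byte[] recordBytes; // serialized record, opaque without a schema
    private final String schemaStr;   // schema kept as a String so it travels with the payload
    private final long orderingVal;   // e.g. an oplog timestamp (assumption)

    SchemaCarryingPayload(byte[] recordBytes, String schemaStr, long orderingVal) {
        this.recordBytes = recordBytes;
        this.schemaStr = schemaStr;
        this.orderingVal = orderingVal;
    }

    // Stand-in decoder: the "schema" is a comma-separated list of field names and
    // the bytes hold the comma-separated values in the same order.
    static Map<String, String> decode(byte[] bytes, String schemaStr) {
        String[] names = schemaStr.split(",");
        String[] values = new String(bytes, StandardCharsets.UTF_8).split(",", -1);
        Map<String, String> record = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            record.put(names[i], i < values.length ? values[i] : "");
        }
        return record;
    }

    // Stand-in encoder: writes the values back in field order.
    static byte[] encode(Map<String, String> record) {
        return String.join(",", record.values()).getBytes(StandardCharsets.UTF_8);
    }

    // Same shape as preCombine(another): no schema parameter, so the carried one is used.
    SchemaCarryingPayload preCombine(SchemaCarryingPayload another) {
        SchemaCarryingPayload newer = this.orderingVal >= another.orderingVal ? this : another;
        SchemaCarryingPayload older = (newer == this) ? another : this;
        Map<String, String> newRec = decode(newer.recordBytes, newer.schemaStr);
        Map<String, String> oldRec = decode(older.recordBytes, older.schemaStr);
        // Example custom logic (an assumption): a non-empty field of the newer record wins,
        // otherwise the older value is kept.
        Map<String, String> merged = new LinkedHashMap<>();
        newRec.forEach((k, v) -> merged.put(k, v.isEmpty() ? oldRec.getOrDefault(k, "") : v));
        return new SchemaCarryingPayload(encode(merged), newer.schemaStr, newer.orderingVal);
    }

    // Decodes this payload with its own carried schema.
    Map<String, String> toRecord() {
        return decode(recordBytes, schemaStr);
    }
}
```

The design choice here is to store the schema as a `String` rather than as a schema object, on the assumption that the string form is always safe to serialize along with the payload. The cost is that every payload instance carries a copy of the schema text, which may matter at scale.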

