Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/05/05 15:11:13 UTC

[GitHub] [incubator-hudi] vinothchandar commented on pull request #1566: [HUDI-603]: DeltaStreamer can now fetch schema before every run in continuous mode

vinothchandar commented on pull request #1566:
URL: https://github.com/apache/incubator-hudi/pull/1566#issuecomment-624114147


   @pratyakshsharma @afilipchik IIUC, using the Confluent Avro Kafka decoders will integrate with the schema registry (SR), fetching the latest schema and decoding with it for us, and we will use that same schema for the write as well. There is another PR tracking and fixing this (a lot of these schema PRs interplay quite a bit :))
   
   
   On the initial suggestion, @pratyakshsharma, I was merely suggesting a better contract for `SchemaProvider`, where `getSourceSchema()` is supposed to return the latest source schema as of that time, not a cached copy based on what was fetched in the constructor. Existing schema providers do a mix of these.
   
   Filebased/JdbcBased fetch the schema once in the constructor and keep serving it, whereas SchemaRegistry/RowBased fetch again whenever `getSourceSchema()` is called. So there is no need to create the schemaRegistryProvider instance every run; simply call `getSourceSchema()` every run?
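
To make the two behaviours concrete, here is a minimal sketch. It uses hypothetical stand-in types (`SchemaProviderSketch`, `CachingProvider`, `RefreshingProvider`, `versionedSource`), not Hudi's actual classes, and it represents a schema as a plain string. It only illustrates the difference between fetching once in the constructor and fetching on each `getSourceSchema()` call.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class SchemaProviderSketch {

  // Hypothetical stand-in for the provider contract; not Hudi's actual class.
  interface SchemaProvider {
    String getSourceSchema();
  }

  // Fetches once in the constructor and keeps serving the cached copy.
  static class CachingProvider implements SchemaProvider {
    private final String schema;

    CachingProvider(Supplier<String> fetcher) {
      this.schema = fetcher.get();
    }

    @Override
    public String getSourceSchema() {
      return schema;
    }
  }

  // Fetches again every time getSourceSchema() is called.
  static class RefreshingProvider implements SchemaProvider {
    private final Supplier<String> fetcher;

    RefreshingProvider(Supplier<String> fetcher) {
      this.fetcher = fetcher;
    }

    @Override
    public String getSourceSchema() {
      return fetcher.get();
    }
  }

  // Simulated schema source (an assumption for this sketch): every fetch
  // returns a newer schema version.
  static Supplier<String> versionedSource() {
    AtomicInteger version = new AtomicInteger();
    return () -> "schema-v" + version.incrementAndGet();
  }

  public static void main(String[] args) {
    SchemaProvider cached = new CachingProvider(versionedSource());
    SchemaProvider fresh = new RefreshingProvider(versionedSource());

    // One provider instance is reused across runs; only the call is repeated.
    for (int run = 1; run <= 3; run++) {
      System.out.println("run " + run
          + ": cached=" + cached.getSourceSchema()
          + ", fresh=" + fresh.getSourceSchema());
    }
  }
}
```

With a provider of the second kind, a single instance can be kept for the lifetime of the continuous-mode job and still observe schema changes on each run.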
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org