Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/08/18 15:21:55 UTC

[GitHub] [hudi] hughfdjackson opened a new issue #1979: [SUPPORT]: Incremental read returns all upserted rows, even if no material change has occurred.

hughfdjackson opened a new issue #1979:
URL: https://github.com/apache/hudi/issues/1979


   **Describe the problem you faced**
   
   My team are interested in writing to Hudi tables using a repeated batch process that often upserts data that's identical to what's already there.  For instance, we may be: 
   
   - recalculating the number of times a particular set of events has occurred
   - re-running a query over the last week of data, to include potentially late arriving data. 
   
   We also have some consumers that want to consume these tables incrementally (to ingest the latest results into local databases, or monitor the changes).  Ideally, these consumers would only see the 1% of records that have changed, rather than all records involved in the upsert. 
   
   However, in our testing, it seems like the incremental query returns _all_ records that were involved in the upsert, even if they were overwriting identical data.  
   
   (As far as I can tell, this happens here: https://github.com/apache/hudi/blob/release-0.5.3/hudi-client/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java#L238-L244, no matter which `PAYLOAD_CLASS_OPT_KEY` class is used).
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. clone hudi git repo, checkout `release-0.5.3-rc2` and run `mvn clean package -DskipTests -DskipITs`
   2. Copy `packaging/hudi-spark-bundle/target/hudi-spark-bundle_2.11-0.5.3-rc2.jar` to EMR master node
   3. Run the following spark shell on master, with the command: `spark-shell --conf "spark.serializer=org.apache.spark.serializer.KryoSerializer" --conf "spark.sql.hive.convertMetastoreParquet=false" --jars hudi-spark-bundle_2.11-0.5.3-rc2.jar,/usr/lib/spark/external/lib/spark-avro.jar -i spark-shell-script`
   
   where the contents of `spark-shell-script` are:
   ```scala
   import org.apache.hudi.QuickstartUtils._
   import scala.collection.JavaConversions._
   import org.apache.spark.sql.SaveMode._
   import org.apache.spark.sql.SaveMode
   import org.apache.hudi.DataSourceReadOptions._
   import org.apache.hudi.DataSourceWriteOptions._
   import org.apache.hudi.config.HoodieWriteConfig._
   import org.apache.spark.sql.DataFrame
   import org.apache.hudi.common.table.HoodieTableMetaClient
   import org.apache.hudi.table.HoodieTable
   import org.apache.hudi.config.HoodieWriteConfig
     
   // Helper functions
   val basePath = "s3://{s3BucketNameAndPrefixPath}"
   val tableName = "hudi_incremental_read_test"
   def write(df: DataFrame, saveMode: SaveMode = Append) = df.write.format("hudi")
       .option(PRECOMBINE_FIELD_OPT_KEY, "ts")
       .option(RECORDKEY_FIELD_OPT_KEY, "uuid")
       .option(PARTITIONPATH_FIELD_OPT_KEY, "partitionpath")
       .option("hoodie.consistency.check.enabled", "true")
       .option(TABLE_NAME, tableName)
       .mode(saveMode)
       .save(basePath)
   def incrementalRead(beginInstant: String) = { 
       println(s"READING FROM $beginInstant")   
       spark.read
        .format("hudi")
        .option(QUERY_TYPE_OPT_KEY, QUERY_TYPE_INCREMENTAL_OPT_VAL)
        .option(BEGIN_INSTANTTIME_OPT_KEY, beginInstant)
        .load(basePath)
   } 
   def latestCommitInstant() = { 
     val metaClient = new HoodieTableMetaClient(spark.sparkContext.hadoopConfiguration, basePath, true)
     val hoodieTable = HoodieTable.getHoodieTable(metaClient, HoodieWriteConfig.newBuilder().withPath(basePath).build(), spark.sparkContext)
     
     hoodieTable.getMetaClient.getCommitTimeline.filterCompletedInstants().lastInstant.get.getTimestamp
   }
   
   def justBefore(commitTime: String) = (commitTime.toLong - 1).toString
   val dataGen = new DataGenerator
   val inserts = convertToStringList(dataGen.generateInserts(10))
   val df = spark.read.json(spark.sparkContext.parallelize(inserts, 2))
   
   write(df, saveMode=Overwrite)
   
   println("""
   ----------- INCREMENTAL READ -------
   """)
   println("The whole table is new, so I'm expecting all 10 rows to be returned on incremental read")
   incrementalRead(justBefore(latestCommitInstant)).show()
   
   // generate an update for a single row
   val updates = convertToStringList(dataGen.generateUpdates(1))
   val updatesDF = spark.read.json(spark.sparkContext.parallelize(updates, 2))
   
   println("""
   ----------- INCREMENTAL READ -------
   """)
   println("Now we're updating a row, we expect to see the updated row only on incremental read, which we do")
   write(updatesDF)
   incrementalRead(justBefore(latestCommitInstant)).show()
   
   println("""
   ----------- INCREMENTAL READ -------
   """)
   println("Re-upserting the same row twice causes it to be 'emitted' twice to the incremental reader, even though the contents of the second reading are identical from the first (metadata aside)")
   write(updatesDF)
   incrementalRead(justBefore(latestCommitInstant)).show()
   ```
   
   That results in: 
   
   ```
   ----------- INCREMENTAL READ -------
   The whole table is new, so I'm expecting all 10 rows to be returned on incremental read
   READING FROM 20200818091617
   +-------------------+--------------------+--------------------+----------------------+--------------------+-------------------+-------------------+----------+-------------------+-------------------+------------------+--------------------+---------+---+--------------------+
   |_hoodie_commit_time|_hoodie_commit_seqno|  _hoodie_record_key|_hoodie_partition_path|   _hoodie_file_name|          begin_lat|          begin_lon|    driver|            end_lat|            end_lon|              fare|       partitionpath|    rider| ts|                uuid|
   +-------------------+--------------------+--------------------+----------------------+--------------------+-------------------+-------------------+----------+-------------------+-------------------+------------------+--------------------+---------+---+--------------------+
   |     20200818091618|  20200818091618_1_1|ecde6618-0cbc-4b6...|  americas/united_s...|3e9b3e64-3895-46a...|0.21624150367601136|0.14285051259466197|driver-213| 0.5890949624813784| 0.0966823831927115| 93.56018115236618|americas/united_s...|rider-213|0.0|ecde6618-0cbc-4b6...|
   |     20200818091618|  20200818091618_1_2|c9a45eda-fe53-480...|  americas/united_s...|3e9b3e64-3895-46a...| 0.8742041526408587| 0.7528268153249502|driver-213| 0.9197827128888302|  0.362464770874404|19.179139106643607|americas/united_s...|rider-213|0.0|c9a45eda-fe53-480...|
   |     20200818091618|  20200818091618_1_3|35808b31-2d1e-474...|  americas/united_s...|3e9b3e64-3895-46a...| 0.5731835407930634| 0.4923479652912024|driver-213|0.08988581780930216|0.42520899698713666| 64.27696295884016|americas/united_s...|rider-213|0.0|35808b31-2d1e-474...|
   |     20200818091618|  20200818091618_1_4|67e1c9d5-a3c0-4f7...|  americas/united_s...|3e9b3e64-3895-46a...|0.11488393157088261| 0.6273212202489661|driver-213| 0.7454678537511295| 0.3954939864908973| 27.79478688582596|americas/united_s...|rider-213|0.0|67e1c9d5-a3c0-4f7...|
   |     20200818091618|  20200818091618_1_5|8fdf91c8-b0ca-46c...|  americas/united_s...|3e9b3e64-3895-46a...| 0.1856488085068272| 0.9694586417848392|driver-213|0.38186367037201974|0.25252652214479043| 33.92216483948643|americas/united_s...|rider-213|0.0|8fdf91c8-b0ca-46c...|
   |     20200818091618|  20200818091618_0_1|2efbfbf1-aa1f-40f...|  americas/brazil/s...|a71d09b8-7cc8-408...| 0.4726905879569653|0.46157858450465483|driver-213|  0.754803407008858| 0.9671159942018241|34.158284716382845|americas/brazil/s...|rider-213|0.0|2efbfbf1-aa1f-40f...|
   |     20200818091618|  20200818091618_0_2|2bbebad3-1a3c-4f1...|  americas/brazil/s...|a71d09b8-7cc8-408...| 0.0750588760043035|0.03844104444445928|driver-213|0.04376353354538354| 0.6346040067610669| 66.62084366450246|americas/brazil/s...|rider-213|0.0|2bbebad3-1a3c-4f1...|
   |     20200818091618|  20200818091618_0_3|2c3d179c-899f-42f...|  americas/brazil/s...|a71d09b8-7cc8-408...| 0.6100070562136587| 0.8779402295427752|driver-213| 0.3407870505929602| 0.5030798142293655|  43.4923811219014|americas/brazil/s...|rider-213|0.0|2c3d179c-899f-42f...|
   |     20200818091618|  20200818091618_2_1|3c9add87-8347-41d...|    asia/india/chennai|df2d7f47-0d10-43b...|  0.651058505660742| 0.8192868687714224|driver-213|0.20714896002914462|0.06224031095826987| 41.06290929046368|  asia/india/chennai|rider-213|0.0|3c9add87-8347-41d...|
   |     20200818091618|  20200818091618_2_2|8cd8ff41-791e-43a...|    asia/india/chennai|df2d7f47-0d10-43b...|   0.40613510977307| 0.5644092139040959|driver-213|  0.798706304941517|0.02698359227182834|17.851135255091155|  asia/india/chennai|rider-213|0.0|8cd8ff41-791e-43a...|
   +-------------------+--------------------+--------------------+----------------------+--------------------+-------------------+-------------------+----------+-------------------+-------------------+------------------+--------------------+---------+---+--------------------+
   ----------- INCREMENTAL READ -------
   Now we're updating a row, we expect to see the updated row only on incremental read, which we do
   20/08/18 09:17:36 WARN IncrementalTimelineSyncFileSystemView: Incremental Sync of timeline is turned off or deemed unsafe. Will revert to full syncing
   READING FROM 20200818091705
   +-------------------+--------------------+--------------------+----------------------+--------------------+------------------+------------------+----------+------------------+------------------+------------------+--------------------+---------+---+--------------------+
   |_hoodie_commit_time|_hoodie_commit_seqno|  _hoodie_record_key|_hoodie_partition_path|   _hoodie_file_name|         begin_lat|         begin_lon|    driver|           end_lat|           end_lon|              fare|       partitionpath|    rider| ts|                uuid|
   +-------------------+--------------------+--------------------+----------------------+--------------------+------------------+------------------+----------+------------------+------------------+------------------+--------------------+---------+---+--------------------+
   |     20200818091706|  20200818091706_0_3|35808b31-2d1e-474...|  americas/united_s...|3e9b3e64-3895-46a...|0.7340133901254792|0.5142184937933181|driver-284|0.7814655558162802|0.6592596683641996|49.527694252432056|americas/united_s...|rider-284|0.0|35808b31-2d1e-474...|
   +-------------------+--------------------+--------------------+----------------------+--------------------+------------------+------------------+----------+------------------+------------------+------------------+--------------------+---------+---+--------------------+
   ----------- INCREMENTAL READ -------
   Re-upserting the same row twice causes it to be 'emitted' twice to the incremental reader, even though the contents of the second reading are identical from the first (metadata aside)
   20/08/18 09:18:04 WARN IncrementalTimelineSyncFileSystemView: Incremental Sync of timeline is turned off or deemed unsafe. Will revert to full syncing
   READING FROM 20200818091736
   +-------------------+--------------------+--------------------+----------------------+--------------------+------------------+------------------+----------+------------------+------------------+------------------+--------------------+---------+---+--------------------+
   |_hoodie_commit_time|_hoodie_commit_seqno|  _hoodie_record_key|_hoodie_partition_path|   _hoodie_file_name|         begin_lat|         begin_lon|    driver|           end_lat|           end_lon|              fare|       partitionpath|    rider| ts|                uuid|
   +-------------------+--------------------+--------------------+----------------------+--------------------+------------------+------------------+----------+------------------+------------------+------------------+--------------------+---------+---+--------------------+
   |     20200818091737|  20200818091737_0_4|35808b31-2d1e-474...|  americas/united_s...|3e9b3e64-3895-46a...|0.7340133901254792|0.5142184937933181|driver-284|0.7814655558162802|0.6592596683641996|49.527694252432056|americas/united_s...|rider-284|0.0|35808b31-2d1e-474...|
   +-------------------+--------------------+--------------------+----------------------+--------------------+------------------+------------------+----------+------------------+------------------+------------------+--------------------+---------+---+--------------------+
   ```
   
   **Expected behavior**
   
   Ideally (in our use case), upserting a row whose contents are identical wouldn't cause an incremental reader to read the data again. 
   
   **Environment Description**
   
   * Hudi version : 0.5.3-rc2, built from source
   
   * Spark version : 2.4.4 (Using Scala version 2.11.12, OpenJDK 64-Bit Server VM, 1.8.0_252)
   
   * Hive version : 2.3.6
   
   * Hadoop version : 2.8.5-amzn-5
   
   * Storage (HDFS/S3/GCS..) : S3
   
   * Running on Docker? (yes/no) : no
   
   * EMR Version : emr-5.29.0


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hudi] hughfdjackson edited a comment on issue #1979: [SUPPORT]: Is it possible to incrementally read only upserted rows where a material change has occurred?

Posted by GitBox <gi...@apache.org>.

hughfdjackson edited a comment on issue #1979:
URL: https://github.com/apache/hudi/issues/1979#issuecomment-679026628


   @bvaradar - As a follow-up question, your reply confirms that what we're looking for (ideally) isn't a Hudi feature currently.  Is it something you might be interested in supporting?
   
   In many use cases, the behaviour would likely be nearly identical to the current behaviour* - for snapshot queries, or for incrementally reading tables where the writer ensures only material changes** are written (e.g. some stream processing, or insert-only batch processes).  In the remaining use-cases like ours, it would cut back on a lot of noise + processing.   
   
   If so, I can talk to my team about contributing towards the project, since it would be valuable to us.  
   
   ----
   
   \* Implementation dependent, of course!  It may be that it'd require another metadata field to be added to support that sort of behaviour, for instance.  
   
   \** I'm using 'material changes' here to describe an upsert that impacts on the non-`_hoodie` columns.  Either a deletion, or a change in value to one of those columns.



[GitHub] [hudi] bvaradar commented on issue #1979: [SUPPORT]: Is it possible to incrementally read only upserted rows where a material change has occurred?

Posted by GitBox <gi...@apache.org>.

bvaradar commented on issue #1979:
URL: https://github.com/apache/hudi/issues/1979#issuecomment-678595102


   Right, this dataset is essentially a log, but if you are only worried about incremental queries, then you will be reading only the records added by the new commits.  Also, note that your dataset will keep growing, so the application of this approach is limited.
   
   In general, I don't see another way to do this in a generic way. 



[GitHub] [hudi] hughfdjackson commented on issue #1979: [SUPPORT]: Is it possible to incrementally read only upserted rows where a material change has occurred?

Posted by GitBox <gi...@apache.org>.

hughfdjackson commented on issue #1979:
URL: https://github.com/apache/hudi/issues/1979#issuecomment-679026628


   @bvaradar - As a follow-up question, your reply confirms that what we're looking for (ideally) isn't a Hudi feature currently.  Is it something you might be interested in supporting?
   
   In many use cases, the behaviour would likely be identical to the current - for snapshot queries, or for incrementally reading tables where the writer ensures only material changes* are written (e.g. some stream processing, or insert-only batch processes).  In the remaining use-cases like ours, it would cut back on a lot of noise + processing.   
   
   If so, I can talk to my team about contributing towards the project, since it would be valuable to us.  
   
   ----
   
   \* I'm using 'material changes' here to describe an upsert that impacts on the non-`_hoodie` columns.  Either a deletion, or a change in value to one of those columns.



[GitHub] [hudi] bvaradar closed issue #1979: [SUPPORT]: Is it possible to incrementally read only upserted rows where a material change has occurred?

Posted by GitBox <gi...@apache.org>.

bvaradar closed issue #1979:
URL: https://github.com/apache/hudi/issues/1979


   



[GitHub] [hudi] bvaradar commented on issue #1979: [SUPPORT]: Is it possible to incrementally read only upserted rows where a material change has occurred?

Posted by GitBox <gi...@apache.org>.

bvaradar commented on issue #1979:
URL: https://github.com/apache/hudi/issues/1979#issuecomment-676490788


   One option to make this work currently is to also add the columns that get updated to the composite record key.  We can then use Hudi's key uniqueness constraint to achieve the result. This way, you have the option to filter out duplicates first and then upsert the rest of the records in the batch. 
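   
   A minimal sketch of that filtering step, in plain Scala with no Hudi or Spark dependency. `Event`, `compositeKey` and `newRowsOnly` are hypothetical names used only for illustration, and the key format is an assumption rather than anything Hudi produces:
   
   ```scala
   object CompositeKeySketch {
     // Hypothetical row: one identifying column and one column that may change.
     final case class Event(uuid: String, fare: Double)

     // Assumed composite key: the id plus every column whose change matters.
     def compositeKey(e: Event): String = s"${e.uuid}:${e.fare}"

     // Keep only the rows whose composite key is not already in the table.
     // A row that is identical to what is stored has the same key and is dropped.
     def newRowsOnly(existingKeys: Set[String], batch: Seq[Event]): Seq[Event] =
       batch.filterNot(e => existingKeys.contains(compositeKey(e)))
   }
   ```
   
   Only the rows returned by `newRowsOnly` would then be written, so an incremental reader would see just those. Note that a changed row gets a new key, so it is stored alongside the old version rather than replacing it.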



[GitHub] [hudi] bvaradar commented on issue #1979: [SUPPORT]: Is it possible to incrementally read only upserted rows where a material change has occurred?

Posted by GitBox <gi...@apache.org>.

bvaradar commented on issue #1979:
URL: https://github.com/apache/hudi/issues/1979#issuecomment-680998998


   @hughfdjackson : Good point about incrementally reading multiple commits. The variation you suggested seems to make sense. 



[GitHub] [hudi] hughfdjackson commented on issue #1979: [SUPPORT]: Is it possible to incrementally read only upserted rows where a material change has occurred?

Posted by GitBox <gi...@apache.org>.

hughfdjackson commented on issue #1979:
URL: https://github.com/apache/hudi/issues/1979#issuecomment-688168867


   @bvaradar  - thanks for your help.  I think we're going to try the above approach, but it's something we might return to later. 
   
   Closing the issue for now sounds like a good idea. 



[GitHub] [hudi] hughfdjackson commented on issue #1979: [SUPPORT]: Is it possible to incrementally read only upserted rows where a material change has occurred?

Posted by GitBox <gi...@apache.org>.

hughfdjackson commented on issue #1979:
URL: https://github.com/apache/hudi/issues/1979#issuecomment-679017460


   Hi @bvaradar - thanks for the reply! And for the suggestion.
   
   In our use case, we're interested in both incremental read of material changes, and in using the Hudi table with regular snapshot queries. I would expect 30-50% incremental reads, and 50-70% snapshot queries.
   
   If I'm understanding correctly, your suggestion would essentially lead to an event log of all material changes to an entity. If you do a snapshot query against that data, you'd end up with lots of duplicates, so each query would need to include de-duplication to reproduce a materialised view with the latest data for each entity.
   
   Is that right?



[GitHub] [hudi] bvaradar commented on issue #1979: [SUPPORT]: Is it possible to incrementally read only upserted rows where a material change has occurred?

Posted by GitBox <gi...@apache.org>.

bvaradar commented on issue #1979:
URL: https://github.com/apache/hudi/issues/1979#issuecomment-679522323


   @hughfdjackson : In general getting incremental read to discard duplicates is not possible for MOR table types as we defer the merging of records to compaction.
   
   I was thinking about alternate ways to achieve your use-case for COW tables by using an application-level boolean flag. Let me know if this makes sense:
   
   1. Introduce an additional boolean column "changed". Default value is false.
   2. Have your own implementation of HoodieRecordPayload plugged in.
   3. a. In HoodieRecordPayload.getInsertValue(), return an Avro record with changed = true. This function is called the first time a new record is inserted.
   3. b. In HoodieRecordPayload.combineAndGetUpdateValue(), if you determine there is no material change, set changed = false; otherwise set it to true.
   
   In your incremental query, add the filter changed = true to filter out the records without material changes?
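   
   The flag logic in steps 3a and 3b can be sketched in plain Scala. This is not a HoodieRecordPayload implementation: rows are modelled as a hypothetical `Map[String, Any]`, and `onInsert` / `onUpdate` are illustrative stand-ins for the decisions that getInsertValue() and combineAndGetUpdateValue() would make:
   
   ```scala
   object ChangedFlagSketch {
     // Hypothetical row model: column name -> value.
     type Row = Map[String, Any]

     // Ignore Hudi metadata columns and the flag itself when comparing rows.
     private def material(row: Row): Row =
       row.filter { case (name, _) => !name.startsWith("_hoodie") && name != "changed" }

     // Step 3a: a first-time insert always counts as a change.
     def onInsert(incoming: Row): Row = incoming + ("changed" -> true)

     // Step 3b: flag the row only if a non-metadata column differs
     // from what is already stored.
     def onUpdate(stored: Row, incoming: Row): Row =
       incoming + ("changed" -> (material(stored) != material(incoming)))
   }
   ```
   
   An incremental reader would then keep only rows where `changed` is true.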



[GitHub] [hudi] rubenssoto commented on issue #1979: [SUPPORT]: Is it possible to incrementally read only upserted rows where a material change has occurred?

Posted by GitBox <gi...@apache.org>.

rubenssoto commented on issue #1979:
URL: https://github.com/apache/hudi/issues/1979#issuecomment-678577218


   @bvaradar but if you use an updated column in the primary key, then when a new record arrives with the same ID but a newer updated date, Hudi will treat it as a new record.
   
   Does an incremental query read all commits or only new values? 



[GitHub] [hudi] hughfdjackson commented on issue #1979: [SUPPORT]: Is it possible to incrementally read only upserted rows where a material change has occurred?

Posted by GitBox <gi...@apache.org>.

hughfdjackson commented on issue #1979:
URL: https://github.com/apache/hudi/issues/1979#issuecomment-680685136


   Hi @bvaradar -  
   
   > In general getting incremental read to discard duplicates is not possible for MOR table types as we defer the merging of records to compaction.
   
   That's interesting - as your comment suggests, I've only looked at CoW tables in any depth.  I look forward to delving into MoR's design in a bit more detail so I can get my head around what the implications of such a feature would be there + understand your comment better. 
   
   > I was thinking about alternate ways to achieve your use-case for COW table by using an application level boolean flag. Let me know if this makes sense:
   > 
   >     Introduce additional boolean column "changed". Default Value is false.
   >     Have your own implementation of HoodieRecordPayload plugged-in.
   >     3a In HoodieRecordPayload.getInsertValue(), return an avro record with changed = true. This function is called first time when the new record is inserted.
   >     3(b) In HoodieRecordPayload.combineAndGetUpdateValue(), if you determine, there is no material change, set changed = false else set it to true.
   > 
   > In your incremental query, add the filter changed = true to filter out those without material changes ?
   
   That does make sense, although I think a boolean column may lead to missing changes if the incremental read spans two or more commits to the same row.  I'm spiking a variation on that suggestion with my team, wherein: 
   
   1. Introduce a 'last_updated_timestamp', default to null (i.e. the update was in this commit)
   2. Have your own implementation of HoodieRecordPayload plugged-in.
   3. a. In HoodieRecordPayload.getInsertValue(), return an avro record with last_updated_timestamp = null.*
   3. b. In HoodieRecordPayload.combineAndGetUpdateValue(), if you determine, there is no material change, set last_updated_timestamp to that of the old record (if it exists) _or_ to the old record's commit_time. 
   
   In the incremental query, we're filtering for `null` (which indicates that one of the commits within the timeline last updated the record) or for `last_updated_timestamp` within the beginInstant and endInstant bounds. 
   
   We've not tested it extensively, but it looks like a promising workaround so far. 
   
   ---
   
   \* It'd be 'cleaner' to set this equal to the commit time of the write, but in our HoodieRecordPayload class, that's not available unfortunately.  The 'null means insert' + special case handling in HoodieRecordPayload.combineAndGetUpdateValue() is a work-around for that.
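   
   A minimal sketch of this variation in plain Scala, under a few assumptions: `Rec` is a hypothetical record model (not a Hudi class), `None` stands in for null, instants are fixed-width `yyyyMMddHHmmss` strings so that string order matches time order, and the read window is taken to be exclusive of beginInstant and inclusive of endInstant:
   
   ```scala
   object LastUpdatedSketch {
     // commitTime: the commit time of the stored record, as seen when it is read back.
     // lastUpdated: the application-level last_updated_timestamp (None = null).
     // data: the non-_hoodie columns.
     final case class Rec(commitTime: Option[String],
                          lastUpdated: Option[String],
                          data: Map[String, Any])

     // Step 3a: inserts carry a null timestamp, meaning "changed in this commit".
     def onInsert(data: Map[String, Any]): Rec = Rec(None, None, data)

     // Step 3b: with no material change, carry forward the old timestamp,
     // falling back to the old record's commit time; otherwise reset to null.
     def onUpdate(stored: Rec, incoming: Map[String, Any]): Rec =
       if (stored.data == incoming)
         Rec(None, stored.lastUpdated.orElse(stored.commitTime), incoming)
       else
         Rec(None, None, incoming)

     // Reader-side filter: keep rows with a null timestamp, or with a
     // timestamp that falls inside the window being read.
     def isMaterial(rec: Rec, beginInstant: String, endInstant: String): Boolean =
       rec.lastUpdated.forall(ts => ts > beginInstant && ts <= endInstant)
   }
   ```
   
   Because an unchanged row keeps the instant of its last real change, a read window that spans several commits still picks it up when that change falls inside the window.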



[GitHub] [hudi] bvaradar commented on issue #1979: [SUPPORT]: Is it possible to incrementally read only upserted rows where a material change has occurred?

Posted by GitBox <gi...@apache.org>.

bvaradar commented on issue #1979:
URL: https://github.com/apache/hudi/issues/1979#issuecomment-682061419


   Will close the ticket for now. Please reopen if we need to discuss more on this topic.

