Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/05/17 08:52:22 UTC

[GitHub] [hudi] PavelPetukhov opened a new issue #2959: No data stored after migrating to Hudi 0.8.0

PavelPetukhov opened a new issue #2959:
URL: https://github.com/apache/hudi/issues/2959


   While working with Hudi 0.7.0 we were able to store data from Kafka topics to HDFS.
   We tried to migrate to 0.8.0, but we've discovered strange behavior:
   spark-submit finishes with status SUCCEEDED, but no data is actually stored in HDFS.
   Only a .hoodie folder is created in the desired location, with entries like .aux, .temp, deltacommit.inflight, deltacommit.requested, hoodie.properties, archived.
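   
   (A quick sanity check here, with <base_path> as a placeholder for the table's target-base-path: a completed write normally leaves a non-empty <instant>.deltacommit file on the timeline, while only .inflight/.requested files mean the commit never finished.)
   
    hdfs dfs -ls /<base_path>/.hoodie
    # expected after a successful write: e.g. 20210526183328.deltacommit (non-empty)
    # only 20210526183328.deltacommit.inflight / .requested => the write did not complete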
   
   The spark-submit command looks like this (only Hudi-related configuration is attached; we can send the full command if necessary):
   (
     Please note that 0.7.0 with the same config worked (data was stored as expected); the only changes were:
     hudi-utilities-bundle_2.11:0.7.0 changed to hudi-utilities-bundle_2.12:0.8.0
     spark-avro_2.11:2.4.7 changed to spark-avro_2.12:2.4.7
     hoodie-utilities.jar is now hudi-utilities-bundle_2.12-0.8.0.jar instead of hudi-utilities-bundle_2.11-0.7.0.jar
   )
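   
   (Summarized as a spark-submit diff, using the coordinates from the commands above:
   
    # 0.7.0 (working):
    --packages org.apache.hudi:hudi-utilities-bundle_2.11:0.7.0,org.apache.spark:spark-avro_2.11:2.4.7
    # 0.8.0 (no data stored):
    --packages org.apache.hudi:hudi-utilities-bundle_2.12:0.8.0,org.apache.spark:spark-avro_2.12:2.4.7
   )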
   
   
   /usr/local/spark/bin/spark-submit --conf "spark.yarn.submit.waitAppCompletion=false" \
   --packages org.apache.hudi:hudi-utilities-bundle_2.12:0.8.0,org.apache.spark:spark-avro_2.12:2.4.7 \
   --master yarn \
   --deploy-mode cluster \
   --name xxx \
   --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
   /app/hoodie-utilities.jar \
   --op BULK_INSERT \
   --table-type MERGE_ON_READ \
   --source-class org.apache.hudi.utilities.sources.AvroKafkaSource \
   --source-ordering-field __null_ts_ms \
   --schemaprovider-class org.apache.hudi.utilities.schema.SchemaRegistryProvider \
   --enable-hive-sync \
   --target-base-path xxx \
   --target-table xxx \
   --hoodie-conf "hoodie.datasource.hive_sync.enable=true" \
   --hoodie-conf "hoodie.datasource.hive_sync.table=foo" \
   --hoodie-conf "hoodie.datasource.hive_sync.partition_fields=date:TIMESTAMP" \
   --hoodie-conf "hoodie.datasource.hive_sync.partition_extractor_class=org.apache.hudi.hive.SlashEncodedDayPartitionValueExtractor" \
   --hoodie-conf "hoodie.datasource.hive_sync.jdbcurl=" \
   --hoodie-conf "hoodie.upsert.shuffle.parallelism=2" \
   --hoodie-conf "hoodie.insert.shuffle.parallelism=2" \
   --hoodie-conf "hoodie.delete.shuffle.parallelism=2" \
   --hoodie-conf "hoodie.bulkinsert.shuffle.parallelism=2" \
   --hoodie-conf "hoodie.embed.timeline.server=true" \
   --hoodie-conf "hoodie.filesystem.view.type=EMBEDDED_KV_STORE" \
   --hoodie-conf "hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.CustomKeyGenerator" \
   --hoodie-conf "hoodie.deltastreamer.keygen.timebased.timestamp.type=DATE_STRING" \
   --hoodie-conf "hoodie.deltastreamer.keygen.timebased.input.dateformat=yyyy-MM-dd'T'HH:mm:ssZ,yyyy-MM-dd'T'HH:mm:ss.SSSZ" \
   --hoodie-conf "hoodie.deltastreamer.keygen.timebased.input.dateformat.list.delimiter.regex=" \
   --hoodie-conf "hoodie.deltastreamer.keygen.timebased.input.timezone=" \
   --hoodie-conf "hoodie.deltastreamer.keygen.timebased.output.dateformat=yyyy/MM/dd" \
   --hoodie-conf "hoodie.deltastreamer.schemaprovider.registry.url=xxx \
   --hoodie-conf "xxx" \
   --hoodie-conf "auto.offset.reset=earliest" \
   --hoodie-conf "group.id=hudi_group" \
   --hoodie-conf "schema.registry.url=xxx" \
   --hoodie-conf "hoodie.parquet.small.file.limit=0" \
   --hoodie-conf "hoodie.clustering.inline=true" \
   --hoodie-conf "hoodie.clustering.inline.max.commits=4" \
   --hoodie-conf "hoodie.clustering.plan.strategy.target.file.max.bytes=1073741824" \
   --hoodie-conf "hoodie.clustering.plan.strategy.small.file.limit=629145600" \
   --hoodie-conf "hoodie.datasource.write.recordkey.field=id" \
   --hoodie-conf "hoodie.datasource.write.partitionpath.field=date:TIMESTAMP" \
   --hoodie-conf "hoodie.deltastreamer.source.kafka.topic=xxx" \
   
   * Hudi version : 0.8.0
   
   * Spark version : 2.4.7
   
   * Storage (HDFS/S3/GCS..) : hdfs
   
   
   
   





[GitHub] [hudi] PavelPetukhov edited a comment on issue #2959: No data stored after migrating to Hudi 0.8.0

Posted by GitBox <gi...@apache.org>.
PavelPetukhov edited a comment on issue #2959:
URL: https://github.com/apache/hudi/issues/2959#issuecomment-848885930


   @n3nash 
   
   The .hoodie directory structure is the following:
   hdfs dfs -ls /user/hdfs/raw_data/public/ml_training_data/foo/.hoodie
   Found 7 items
   drwxr-xr-x   - hdfs hadoop          0 2021-05-26 18:33 /path_to_location/foo/.hoodie/.aux
   drwxr-xr-x   - hdfs hadoop          0 2021-05-26 18:33 /path_to_location/foo/.hoodie/.temp
   drwxr-xr-x   - hdfs hadoop          0 2021-05-26 18:33 /path_to_location/foo/.hoodie/20210526183328.deltacommit
   -rw-r--r--   3 hdfs hadoop        518 2021-05-26 18:33 /path_to_location/foo/.hoodie/20210526183328.deltacommit.inflight
   -rw-r--r--   3 hdfs hadoop          0 2021-05-26 18:33 /path_to_location/foo/.hoodie/20210526183328.deltacommit.requested
   drwxr-xr-x   - hdfs hadoop          0 2021-05-26 18:33 /path_to_location/foo/.hoodie/archived
   -rw-r--r--   3 hdfs hadoop        391 2021-05-26 18:33 /path_to_location/foo/.hoodie/hoodie.properties
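   
   (Side note: the same state can be inspected with hudi-cli, if it is available on the box; with only an inflight/requested instant on the timeline, `commits show` should list no completed commits:
   
    hudi-cli
    connect --path /path_to_location/foo
    commits show
   )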
   
   
   Also, I have removed everything unrelated, so the command now looks like this:
   
   /usr/local/spark/bin/spark-submit --conf "spark.yarn.submit.waitAppCompletion=false" \
   --conf "spark.dynamicAllocation.minExecutors=1" \
   --conf "spark.dynamicAllocation.maxExecutors=10" \
   --conf "spark.dynamicAllocation.enabled=true" \
   --conf "spark.dynamicAllocation.shuffleTracking.enabled=true" \
   --conf "spark.shuffle.service.enabled=true" \
   --conf "spark.eventLog.enabled=true" \
   --conf "spark.eventLog.dir=hdfs://xxx/eventLogging" \
   --conf "spark.executor.memoryOverhead=384" \
   --conf "spark.driver.memoryOverhead=384" \
   --conf "spark.driver.extraJavaOptions=-DsparkAappName=xxx -DlogIndex=GOLANG_JSON -DappName=data-lake-extractors-streamer -DlogFacility=stdout" \
   --packages org.apache.spark:spark-avro_2.12:2.4.7 \
   --master yarn \
   --deploy-mode cluster \
   --name xxx \
   --driver-memory 2G \
   --executor-memory 2G \
   --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
   hdfs://xxx/user/hudi/hudi-utilities-bundle_2.12-0.8.0.jar \
   --op UPSERT \
   --table-type MERGE_ON_READ \
   --source-class org.apache.hudi.utilities.sources.AvroKafkaSource \
   --source-ordering-field __null_ts_ms \
   --schemaprovider-class org.apache.hudi.utilities.schema.SchemaRegistryProvider \
   --target-base-path /user/hdfs/raw_data/public/xxx/yyy \
   --target-table xxx \
   --hoodie-conf "hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.CustomKeyGenerator" \
   --hoodie-conf "hoodie.deltastreamer.keygen.timebased.timestamp.type=DATE_STRING" \
   --hoodie-conf "hoodie.deltastreamer.keygen.timebased.output.dateformat=yyyy/MM/dd" \
   --hoodie-conf "hoodie.deltastreamer.keygen.timebased.input.dateformat=yyyy-MM-ddTHH:mm:ssZ,yyyy-MM-ddTHH:mm:ss.SSSZ" \
   --hoodie-conf "hoodie.deltastreamer.keygen.timebased.input.dateformat.list.delimiter.regex=" \
   --hoodie-conf "hoodie.deltastreamer.keygen.timebased.input.timezone=" \
   --hoodie-conf "hoodie.upsert.shuffle.parallelism=2" \
   --hoodie-conf "hoodie.insert.shuffle.parallelism=2" \
   --hoodie-conf "hoodie.delete.shuffle.parallelism=2" \
   --hoodie-conf "hoodie.bulkinsert.shuffle.parallelism=2" \
   --hoodie-conf "hoodie.embed.timeline.server=true" \
   --hoodie-conf "hoodie.filesystem.view.type=EMBEDDED_KV_STORE" \
   --hoodie-conf "hoodie.deltastreamer.schemaprovider.registry.url=http://xxx/subjects/xxx-value/versions/latest" \
   --hoodie-conf "bootstrap.servers=xxx" \
   --hoodie-conf "auto.offset.reset=earliest" \
   --hoodie-conf "group.id=hudi_group" \
   --hoodie-conf "schema.registry.url=http://xxx" \
   --hoodie-conf "hoodie.datasource.write.recordkey.field=id" \
   --hoodie-conf "hoodie.datasource.write.partitionpath.field=date:TIMESTAMP" \
   --hoodie-conf "hoodie.deltastreamer.source.kafka.topic=xxx" \
   





[GitHub] [hudi] PavelPetukhov edited a comment on issue #2959: No data stored after migrating to Hudi 0.8.0

Posted by GitBox <gi...@apache.org>.
PavelPetukhov edited a comment on issue #2959:
URL: https://github.com/apache/hudi/issues/2959#issuecomment-848891756


   This is our full log:
   [spark_log.txt](https://github.com/apache/hudi/files/6548390/spark_log.txt)
   





[GitHub] [hudi] nsivabalan commented on issue #2959: No data stored after migrating to Hudi 0.8.0

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #2959:
URL: https://github.com/apache/hudi/issues/2959#issuecomment-918433157


   @PavelPetukhov : sorry for the very late turnaround. Were you able to get it resolved? Erasing of all data seems very strange. Definitely interested in looking into the issue.





[GitHub] [hudi] vinothchandar commented on issue #2959: No data stored after migrating to Hudi 0.8.0

Posted by GitBox <gi...@apache.org>.
vinothchandar commented on issue #2959:
URL: https://github.com/apache/hudi/issues/2959#issuecomment-926254291


   I can't find any exceptions in the log. Please feel free to reopen if this is still an issue.






[GitHub] [hudi] PavelPetukhov edited a comment on issue #2959: No data stored after migrating to Hudi 0.8.0

Posted by GitBox <gi...@apache.org>.
PavelPetukhov edited a comment on issue #2959:
URL: https://github.com/apache/hudi/issues/2959#issuecomment-848930327


   Below is our full log:
   
   
   
   Log Type: stderr
   Log Upload Time: Wed May 26 18:33:34 +0300 2021
   
   Log Length: 104910
   
   21/05/26 18:33:18 INFO util.SignalUtils: Registered signal handler for TERM
   21/05/26 18:33:18 INFO util.SignalUtils: Registered signal handler for HUP
   21/05/26 18:33:18 INFO util.SignalUtils: Registered signal handler for INT
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing view acls to: yarn,hdfs
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing modify acls to: yarn,hdfs
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing view acls groups to: 
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing modify acls groups to: 
   21/05/26 18:33:18 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(yarn, hdfs); groups with view permissions: Set(); users  with modify permissions: Set(yarn, hdfs); groups with modify permissions: Set()
   21/05/26 18:33:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
   21/05/26 18:33:18 INFO yarn.ApplicationMaster: Preparing Local resources
   21/05/26 18:33:19 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
   21/05/26 18:33:19 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1618828995116_0162_000001
   21/05/26 18:33:19 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
   21/05/26 18:33:19 INFO yarn.ApplicationMaster: Waiting for spark context initialization...
   21/05/26 18:33:19 WARN deltastreamer.SchedulerConfGenerator: Job Scheduling Configs will not be in effect as spark.scheduler.mode is not set to FAIR at instantiation time. Continuing without scheduling configs
   21/05/26 18:33:19 INFO spark.SparkContext: Running Spark version 2.4.7
   21/05/26 18:33:19 INFO spark.SparkContext: Submitted application: xxx
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing view acls to: yarn,hdfs
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing modify acls to: yarn,hdfs
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing view acls groups to: 
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing modify acls groups to: 
   21/05/26 18:33:19 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(yarn, hdfs); groups with view permissions: Set(); users  with modify permissions: Set(yarn, hdfs); groups with modify permissions: Set()
   21/05/26 18:33:20 INFO util.Utils: Successfully started service 'sparkDriver' on port 37691.
   21/05/26 18:33:20 INFO spark.SparkEnv: Registering MapOutputTracker
   21/05/26 18:33:20 INFO spark.SparkEnv: Registering BlockManagerMaster
   21/05/26 18:33:20 INFO storage.BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
   21/05/26 18:33:20 INFO storage.BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
   21/05/26 18:33:20 INFO storage.DiskBlockManager: Created local directory at /data/hadoop/yarn/local/usercache/hdfs/appcache/application_1618828995116_0162/blockmgr-9de167db-4756-414e-9126-32cb562e91aa
   21/05/26 18:33:20 INFO memory.MemoryStore: MemoryStore started with capacity 912.3 MB
   21/05/26 18:33:20 INFO spark.SparkEnv: Registering OutputCommitCoordinator
   21/05/26 18:33:20 INFO util.log: Logging initialized @2935ms
   21/05/26 18:33:20 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /jobs, /jobs/json, /jobs/job, /jobs/job/json, /stages, /stages/json, /stages/stage, /stages/stage/json, /stages/pool, /stages/pool/json, /storage, /storage/json, /storage/rdd, /storage/rdd/json, /environment, /environment/json, /executors, /executors/json, /executors/threadDump, /executors/threadDump/json, /static, /, /api, /jobs/job/kill, /stages/stage/kill.
   21/05/26 18:33:20 INFO server.Server: jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
   21/05/26 18:33:20 INFO server.Server: Started @3069ms
   21/05/26 18:33:20 INFO server.AbstractConnector: Started ServerConnector@7a0e94b4{HTTP/1.1,[http/1.1]}{0.0.0.0:32822}
   21/05/26 18:33:20 INFO util.Utils: Successfully started service 'SparkUI' on port 32822.
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@43837fbc{/jobs,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@d91ba30{/jobs/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4854d5d9{/jobs/job,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@672e7ec3{/jobs/job/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@67ee182c{/stages,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@97af315{/stages/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1936a0e0{/stages/stage,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@447ef19e{/stages/stage/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@68e36851{/stages/pool,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@352fe12b{/stages/pool/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3d39f28d{/storage,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@e7806b5{/storage/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7d2a56cb{/storage/rdd,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@37c6c6fc{/storage/rdd/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4599e713{/environment,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@b9a0cbb{/environment/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@24299f0d{/executors,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@25594c52{/executors/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2f728695{/executors/threadDump,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7456a814{/executors/threadDump/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1cef9064{/static,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@16ba2eda{/,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@dac88e2{/api,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@145850ef{/jobs/job/kill,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6d678cf2{/stages/stage/kill,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://xxx:32822
   21/05/26 18:33:20 INFO cluster.YarnClusterScheduler: Created YarnClusterScheduler
   21/05/26 18:33:20 INFO cluster.SchedulerExtensionServices: Starting Yarn extension services with app application_1618828995116_0162 and attemptId Some(appattempt_1618828995116_0162_000001)
   21/05/26 18:33:20 WARN util.Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
   21/05/26 18:33:20 INFO util.Utils: Using initial executors = 1, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
   21/05/26 18:33:20 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 38417.
   21/05/26 18:33:20 INFO netty.NettyBlockTransferService: Server created on xxx:38417
   21/05/26 18:33:20 INFO storage.BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
   21/05/26 18:33:20 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO storage.BlockManagerMasterEndpoint: Registering block manager xxx:38417 with 912.3 MB RAM, BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO storage.BlockManager: external shuffle service port = 7337
   21/05/26 18:33:20 INFO storage.BlockManager: Initialized BlockManager: BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /metrics/json.
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1b3c78ce{/metrics/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:21 INFO scheduler.EventLoggingListener: Logging events to hdfs://xxx:8020/eventLogging/application_1618828995116_0162_1
   21/05/26 18:33:21 WARN util.Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
   21/05/26 18:33:21 INFO util.Utils: Using initial executors = 1, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
   21/05/26 18:33:21 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
   21/05/26 18:33:21 INFO client.RMProxy: Connecting to ResourceManager at xxx/10.246.4.117:8030
   21/05/26 18:33:21 INFO yarn.YarnRMClient: Registering the ApplicationMaster
   21/05/26 18:33:21 INFO yarn.ApplicationMaster: 
   ===============================================================================
   YARN executor launch context:
     env:
       CLASSPATH -> {{PWD}}<CPS>{{PWD}}/__spark_conf__<CPS>{{PWD}}/__spark_libs__/*<CPS>/usr/hdp/2.6.0.3-8/hadoop/conf<CPS>/usr/hdp/2.6.0.3-8/hadoop/*<CPS>/usr/hdp/2.6.0.3-8/hadoop/lib/*<CPS>/usr/hdp/current/hadoop-hdfs-client/*<CPS>/usr/hdp/current/hadoop-hdfs-client/lib/*<CPS>/usr/hdp/current/hadoop-yarn-client/*<CPS>/usr/hdp/current/hadoop-yarn-client/lib/*<CPS>/usr/hdp/current/ext/hadoop/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*<CPS>{{PWD}}/__spark_conf__/__hadoop_conf__
       SPARK_YARN_STAGING_DIR -> hdfs://xxx:8020/user/hd_xyz/.sparkStaging/application_1618828995116_0162
       SPARK_USER -> hdfs
   
     command:
       {{JAVA_HOME}}/bin/java \ 
         -server \ 
         -Xmx2048m \ 
         -Djava.io.tmpdir={{PWD}}/tmp \ 
         '-Dspark.driver.port=37691' \ 
         '-Dspark.ui.port=0' \ 
         -Dspark.yarn.app.container.log.dir=<LOG_DIR> \ 
         -XX:OnOutOfMemoryError='kill %p' \ 
         org.apache.spark.executor.CoarseGrainedExecutorBackend \ 
         --driver-url \ 
         spark://CoarseGrainedScheduler@xxx:37691 \ 
         --executor-id \ 
         <executorId> \ 
         --hostname \ 
         <hostname> \ 
         --cores \ 
         1 \ 
         --app-id \ 
         application_1618828995116_0162 \ 
         --user-class-path \ 
         file:$PWD/__app__.jar \ 
         --user-class-path \ 
         file:$PWD/org.apache.spark_spark-avro_2.12-2.4.7.jar \ 
         --user-class-path \ 
         file:$PWD/org.spark-project.spark_unused-1.0.0.jar \ 
         1><LOG_DIR>/stdout \ 
         2><LOG_DIR>/stderr
   
     resources:
       org.apache.spark_spark-avro_2.12-2.4.7.jar -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/hd_xyz/.sparkStaging/application_1618828995116_0162/org.apache.spark_spark-avro_2.12-2.4.7.jar" } size: 107269 timestamp: 1622043191967 type: FILE visibility: PRIVATE
       __app__.jar -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/jars/hudi/hudi-utilities-bundle_2.12-0.8.0.jar" } size: 40399204 timestamp: 1622022896130 type: FILE visibility: PUBLIC
       __spark_conf__ -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/hd_xyz/.sparkStaging/application_1618828995116_0162/__spark_conf__.zip" } size: 205423 timestamp: 1622043193955 type: ARCHIVE visibility: PRIVATE
       org.spark-project.spark_unused-1.0.0.jar -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/hd_xyz/.sparkStaging/application_1618828995116_0162/org.spark-project.spark_unused-1.0.0.jar" } size: 2777 timestamp: 1622043192905 type: FILE visibility: PRIVATE
       __spark_libs__ -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/hd_xyz/.sparkStaging/application_1618828995116_0162/__spark_libs__2858796966972713370.zip" } size: 242613518 timestamp: 1622043190403 type: ARCHIVE visibility: PRIVATE
   
   ===============================================================================
   21/05/26 18:33:21 WARN util.Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
   21/05/26 18:33:21 INFO util.Utils: Using initial executors = 1, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
   21/05/26 18:33:21 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(spark://YarnAM@xxx:37691)
   21/05/26 18:33:21 INFO yarn.YarnAllocator: Will request 1 executor container(s), each with 1 core(s) and 2432 MB memory (including 384 MB of overhead)
   21/05/26 18:33:21 INFO yarn.YarnAllocator: Submitted 1 unlocalized container requests.
   21/05/26 18:33:21 INFO yarn.ApplicationMaster: Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
   21/05/26 18:33:22 INFO impl.AMRMClientImpl: Received new token for : xxx:45454
   21/05/26 18:33:22 INFO yarn.YarnAllocator: Launching container container_e03_1618828995116_0162_01_000002 on host xxx for executor with ID 1
   21/05/26 18:33:22 INFO yarn.YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them.
   21/05/26 18:33:22 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
   21/05/26 18:33:22 INFO impl.ContainerManagementProtocolProxy: Opening proxy : xxx:45454
   21/05/26 18:33:25 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.246.3.9:49980) with ID 1
   21/05/26 18:33:25 INFO spark.ExecutorAllocationManager: New executor 1 has registered (new total is 1)
   21/05/26 18:33:25 INFO cluster.YarnClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
   21/05/26 18:33:25 INFO cluster.YarnClusterScheduler: YarnClusterScheduler.postStartHook done
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO utilities.UtilHelpers: Adding overridden properties to file properties.
   21/05/26 18:33:25 WARN spark.SparkContext: Using an existing SparkContext; some configuration may not take effect.
   21/05/26 18:33:25 INFO storage.BlockManagerMasterEndpoint: Registering block manager xxx:35696 with 912.3 MB RAM, BlockManagerId(1, xxx, 35696, None)
    21/05/26 18:33:25 INFO deltastreamer.HoodieDeltaStreamer: Creating delta streamer with configs : {hoodie.deltastreamer.keygen.timebased.input.timezone=, hoodie.embed.timeline.server=true, schema.registry.url=http://xxx, hoodie.filesystem.view.type=EMBEDDED_KV_STORE, hoodie.deltastreamer.keygen.timebased.input.dateformat=yyyy-MM-ddTHH:mm:ssZ,yyyy-MM-ddTHH:mm:ss.SSSZ, hoodie.delete.shuffle.parallelism=2, hoodie.bulkinsert.shuffle.parallelism=2, hoodie.deltastreamer.keygen.timebased.output.dateformat=yyyy/MM/dd, group.id=hudi_group_080, auto.offset.reset=earliest, hoodie.insert.shuffle.parallelism=2, hoodie.deltastreamer.keygen.timebased.timestamp.type=DATE_STRING, hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.CustomKeyGenerator, hoodie.deltastreamer.source.kafka.topic=xxx, bootstrap.servers=xxx:9092, hoodie.deltastreamer.keygen.timebased.input.dateformat.list.delimiter.regex=, hoodie.deltastreamer.schemaprovider.registry.url=http://xxx/subjects/xxx-value/versions/latest, hoodie.datasource.write.recordkey.field=id, hoodie.upsert.shuffle.parallelism=2, hoodie.datasource.write.partitionpath.field=date:TIMESTAMP}
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Initializing /user/hd_xyz/yyy/ml_xxx/foo as hoodie table /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Finished initializing Table of type MERGE_ON_READ from /user/hd_xyz/yyy/ml_xxx/foo
    21/05/26 18:33:25 INFO deltastreamer.DeltaSync: Registering Schema :[{"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}, {"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}]
   21/05/26 18:33:25 INFO deltastreamer.HoodieDeltaStreamer: Delta Streamer running only single round
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:26 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:26 INFO deltastreamer.DeltaSync: Checkpoint to resume from : Optional.empty
   21/05/26 18:33:26 INFO consumer.ConsumerConfig: ConsumerConfig values: 
   	allow.auto.create.topics = true
   	auto.commit.interval.ms = 5000
   	auto.offset.reset = earliest
   	bootstrap.servers = [xxx]
   	check.crcs = true
   	client.dns.lookup = default
   	client.id = 
   	client.rack = 
   	connections.max.idle.ms = 540000
   	default.api.timeout.ms = 60000
   	enable.auto.commit = true
   	exclude.internal.topics = true
   	fetch.max.bytes = 52428800
   	fetch.max.wait.ms = 500
   	fetch.min.bytes = 1
   	group.id = hudi_group_080
   	group.instance.id = null
   	heartbeat.interval.ms = 3000
   	interceptor.classes = []
   	internal.leave.group.on.close = true
   	isolation.level = read_uncommitted
   	key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
   	max.partition.fetch.bytes = 1048576
   	max.poll.interval.ms = 300000
   	max.poll.records = 500
   	metadata.max.age.ms = 300000
   	metric.reporters = []
   	metrics.num.samples = 2
   	metrics.recording.level = INFO
   	metrics.sample.window.ms = 30000
   	partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
   	receive.buffer.bytes = 65536
   	reconnect.backoff.max.ms = 1000
   	reconnect.backoff.ms = 50
   	request.timeout.ms = 30000
   	retry.backoff.ms = 100
   	sasl.client.callback.handler.class = null
   	sasl.jaas.config = null
   	sasl.kerberos.kinit.cmd = /usr/bin/kinit
   	sasl.kerberos.min.time.before.relogin = 60000
   	sasl.kerberos.service.name = null
   	sasl.kerberos.ticket.renew.jitter = 0.05
   	sasl.kerberos.ticket.renew.window.factor = 0.8
   	sasl.login.callback.handler.class = null
   	sasl.login.class = null
   	sasl.login.refresh.buffer.seconds = 300
   	sasl.login.refresh.min.period.seconds = 60
   	sasl.login.refresh.window.factor = 0.8
   	sasl.login.refresh.window.jitter = 0.05
   	sasl.mechanism = GSSAPI
   	security.protocol = PLAINTEXT
   	security.providers = null
   	send.buffer.bytes = 131072
   	session.timeout.ms = 10000
   	ssl.cipher.suites = null
   	ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
   	ssl.endpoint.identification.algorithm = https
   	ssl.key.password = null
   	ssl.keymanager.algorithm = SunX509
   	ssl.keystore.location = null
   	ssl.keystore.password = null
   	ssl.keystore.type = JKS
   	ssl.protocol = TLS
   	ssl.provider = null
   	ssl.secure.random.implementation = null
   	ssl.trustmanager.algorithm = PKIX
   	ssl.truststore.location = null
   	ssl.truststore.password = null
   	ssl.truststore.type = JKS
   	value.deserializer = class io.confluent.kafka.serializers.KafkaAvroDeserializer
   
   21/05/26 18:33:26 INFO serializers.KafkaAvroDeserializerConfig: KafkaAvroDeserializerConfig values: 
   	schema.registry.url = [xxx]
   	max.schemas.per.subject = 1000
   	specific.avro.reader = false
   
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.timestamp.type' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.output.dateformat' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.input.dateformat.list.delimiter.regex' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.input.dateformat' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.datasource.write.partitionpath.field' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.delete.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.datasource.write.recordkey.field' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.upsert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.datasource.write.keygenerator.class' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.source.kafka.topic' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.schemaprovider.registry.url' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.insert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.embed.timeline.server' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.bulkinsert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.input.timezone' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.filesystem.view.type' was supplied but isn't a known config.
   21/05/26 18:33:26 INFO utils.AppInfoParser: Kafka version: 2.4.1
   21/05/26 18:33:26 INFO utils.AppInfoParser: Kafka commitId: c57222ae8cd7866b
   21/05/26 18:33:26 INFO utils.AppInfoParser: Kafka startTimeMs: 1622043206225
   21/05/26 18:33:26 INFO clients.Metadata: [Consumer clientId=consumer-hudi_group_080-1, groupId=hudi_group_080] Cluster ID: 5XoPi9AYT0mbHVQEj6VEaw
   21/05/26 18:33:27 INFO helpers.KafkaOffsetGen: SourceLimit not configured, set numEvents to default value : 5000000
   21/05/26 18:33:27 INFO sources.AvroKafkaSource: About to read 0 from Kafka for topic :xxx
   21/05/26 18:33:27 INFO deltastreamer.DeltaSync: No new data, perform empty commit.
   21/05/26 18:33:27 INFO deltastreamer.DeltaSync: Setting up new Hoodie Write Client
    21/05/26 18:33:27 INFO deltastreamer.DeltaSync: Registering Schema :[{"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}, {"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}]
   21/05/26 18:33:27 INFO embedded.EmbeddedTimelineService: Starting Timeline service !!
   21/05/26 18:33:27 INFO embedded.EmbeddedTimelineService: Overriding hostIp to (xxx) found in spark-conf. It was null
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating View Manager with storage type :EMBEDDED_KV_STORE
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating embedded rocks-db based Table View
   21/05/26 18:33:27 INFO util.log: Logging initialized @9978ms to org.apache.hudi.org.eclipse.jetty.util.log.Slf4jLog
   21/05/26 18:33:27 INFO javalin.Javalin: 
              __                      __ _
             / /____ _ _   __ ____ _ / /(_)____
        __  / // __ `/| | / // __ `// // // __ \
       / /_/ // /_/ / | |/ // /_/ // // // / / /
       \____/ \__,_/  |___/ \__,_//_//_//_/ /_/
   
           https://javalin.io/documentation
   
   21/05/26 18:33:27 INFO javalin.Javalin: Starting Javalin ...
   21/05/26 18:33:27 INFO javalin.Javalin: Listening on http://localhost:37089/
   21/05/26 18:33:27 INFO javalin.Javalin: Javalin started in 179ms \o/
   21/05/26 18:33:27 INFO service.TimelineService: Starting Timeline server on port :37089
   21/05/26 18:33:27 INFO embedded.EmbeddedTimelineService: Started embedded timeline server at xxx:37089
   21/05/26 18:33:27 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:27 INFO client.AbstractHoodieClient: Timeline Server already running. Not restarting the service
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:27 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:28 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:28 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:28 INFO client.AbstractHoodieWriteClient: Generate a new instant time: 20210526183328 action: deltacommit
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Creating a new instant [==>20210526183328__deltacommit__REQUESTED]
   21/05/26 18:33:28 INFO deltastreamer.DeltaSync: Starting commit  : 20210526183328
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:28 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Loaded instants [[==>20210526183328__deltacommit__REQUESTED]]
   21/05/26 18:33:28 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:28 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:28 INFO client.SparkRDDWriteClient: Successfully synced to metadata table
   21/05/26 18:33:28 INFO client.AsyncCleanerService: Auto cleaning is not enabled. Not running cleaner now
   21/05/26 18:33:28 INFO spark.SparkContext: Starting job: countByKey at SparkHoodieBloomIndex.java:114
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Registering RDD 1 (mapToPair at SparkWriteHelper.java:54) as input to shuffle 1
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Registering RDD 5 (countByKey at SparkHoodieBloomIndex.java:114) as input to shuffle 0
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Got job 0 (countByKey at SparkHoodieBloomIndex.java:114) with 2 output partitions
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Final stage: ResultStage 2 (countByKey at SparkHoodieBloomIndex.java:114)
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 1)
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 1)
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 1 (MapPartitionsRDD[5] at countByKey at SparkHoodieBloomIndex.java:114), which has no missing parents
   21/05/26 18:33:28 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 6.2 KB, free 912.3 MB)
   21/05/26 18:33:28 INFO yarn.YarnAllocator: Driver requested a total number of 2 executor(s).
   21/05/26 18:33:28 INFO yarn.YarnAllocator: Will request 1 executor container(s), each with 1 core(s) and 2432 MB memory (including 384 MB of overhead)
   21/05/26 18:33:28 INFO yarn.YarnAllocator: Submitted 1 unlocalized container requests.
   21/05/26 18:33:28 INFO spark.ExecutorAllocationManager: Requesting 1 new executor because tasks are backlogged (new desired total will be 2)
   21/05/26 18:33:28 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.3 KB, free 912.3 MB)
   21/05/26 18:33:28 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx:38417 (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:28 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 1 (MapPartitionsRDD[5] at countByKey at SparkHoodieBloomIndex.java:114) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:28 INFO cluster.YarnClusterScheduler: Adding task set 1.0 with 2 tasks
   21/05/26 18:33:28 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 1.0 (TID 0, xxx, executor 1, partition 0, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:28 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx:35696 (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:29 INFO impl.AMRMClientImpl: Received new token for : xxx:45454
   21/05/26 18:33:29 INFO yarn.YarnAllocator: Launching container container_e03_1618828995116_0162_01_000004 on host xxx for executor with ID 2
   21/05/26 18:33:29 INFO yarn.YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them.
   21/05/26 18:33:29 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
   21/05/26 18:33:29 INFO impl.ContainerManagementProtocolProxy: Opening proxy : xxx:45454
   21/05/26 18:33:29 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to 10.246.3.9:49980
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added rdd_3_0 in memory on xxx:35696 (size: 0.0 B, free: 912.3 MB)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 1.0 (TID 1, xxx, executor 1, partition 1, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added rdd_3_1 in memory on xxx:35696 (size: 0.0 B, free: 912.3 MB)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 1.0 (TID 0) in 1023 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 1.0 (TID 1) in 70 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: ShuffleMapStage 1 (countByKey at SparkHoodieBloomIndex.java:114) finished in 1.177 s
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: looking for newly runnable stages
   21/05/26 18:33:29 INFO cluster.YarnClusterScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool 
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 2)
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: Submitting ResultStage 2 (ShuffledRDD[6] at countByKey at SparkHoodieBloomIndex.java:114), which has no missing parents
   21/05/26 18:33:29 INFO memory.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.8 KB, free 912.3 MB)
   21/05/26 18:33:29 INFO memory.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.2 KB, free 912.3 MB)
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on xxx:38417 (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:29 INFO spark.SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 2 (ShuffledRDD[6] at countByKey at SparkHoodieBloomIndex.java:114) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:29 INFO cluster.YarnClusterScheduler: Adding task set 2.0 with 2 tasks
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 2.0 (TID 2, xxx, executor 1, partition 0, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on xxx:35696 (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:29 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to 10.246.3.9:49980
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 2.0 (TID 3, xxx, executor 1, partition 1, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 2.0 (TID 2) in 85 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 2.0 (TID 3) in 32 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:29 INFO cluster.YarnClusterScheduler: Removed TaskSet 2.0, whose tasks have all completed, from pool 
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: ResultStage 2 (countByKey at SparkHoodieBloomIndex.java:114) finished in 0.126 s
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: Job 0 finished: countByKey at SparkHoodieBloomIndex.java:114, took 1.627903 s
   21/05/26 18:33:29 INFO yarn.YarnAllocator: Driver requested a total number of 1 executor(s).
   21/05/26 18:33:30 INFO spark.SparkContext: Starting job: collect at HoodieSparkEngineContext.java:78
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Got job 1 (collect at HoodieSparkEngineContext.java:78) with 1 output partitions
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Final stage: ResultStage 3 (collect at HoodieSparkEngineContext.java:78)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Parents of final stage: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Missing parents: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ResultStage 3 (MapPartitionsRDD[8] at flatMap at HoodieSparkEngineContext.java:78), which has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_2 stored as values in memory (estimated size 368.5 KB, free 911.9 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 101.0 KB, free 911.8 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on xxx:38417 (size: 101.0 KB, free: 912.2 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 3 (MapPartitionsRDD[8] at flatMap at HoodieSparkEngineContext.java:78) (first 15 tasks are for partitions Vector(0))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 3.0 with 1 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 3.0 (TID 4, xxx, executor 1, partition 0, PROCESS_LOCAL, 7710 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on xxx:35696 (size: 101.0 KB, free: 912.2 MB)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 3.0 (TID 4) in 178 ms on xxx (executor 1) (1/1)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 3.0, whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ResultStage 3 (collect at HoodieSparkEngineContext.java:78) finished in 0.233 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Job 1 finished: collect at HoodieSparkEngineContext.java:78, took 0.236923 s
   21/05/26 18:33:30 INFO spark.SparkContext: Starting job: collect at HoodieSparkEngineContext.java:73
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Got job 2 (collect at HoodieSparkEngineContext.java:73) with 1 output partitions
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Final stage: ResultStage 4 (collect at HoodieSparkEngineContext.java:73)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Parents of final stage: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Missing parents: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ResultStage 4 (MapPartitionsRDD[10] at map at HoodieSparkEngineContext.java:73), which has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_3 stored as values in memory (estimated size 368.3 KB, free 911.5 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 100.9 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_3_piece0 in memory on xxx:38417 (size: 100.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 4 (MapPartitionsRDD[10] at map at HoodieSparkEngineContext.java:73) (first 15 tasks are for partitions Vector(0))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 4.0 with 1 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 4.0 (TID 5, xxx, executor 1, partition 0, PROCESS_LOCAL, 7710 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_3_piece0 in memory on xxx:35696 (size: 100.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 4.0 (TID 5) in 94 ms on xxx (executor 1) (1/1)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 4.0, whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ResultStage 4 (collect at HoodieSparkEngineContext.java:73) finished in 0.167 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Job 2 finished: collect at HoodieSparkEngineContext.java:73, took 0.174163 s
   21/05/26 18:33:30 INFO spark.SparkContext: Starting job: countByKey at SparkHoodieBloomIndex.java:149
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Registering RDD 14 (countByKey at SparkHoodieBloomIndex.java:149) as input to shuffle 2
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Got job 3 (countByKey at SparkHoodieBloomIndex.java:149) with 2 output partitions
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Final stage: ResultStage 7 (countByKey at SparkHoodieBloomIndex.java:149)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 6)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 6)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 6 (MapPartitionsRDD[14] at countByKey at SparkHoodieBloomIndex.java:149), which has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_4 stored as values in memory (estimated size 7.5 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 3.9 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_4_piece0 in memory on xxx:38417 (size: 3.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 6 (MapPartitionsRDD[14] at countByKey at SparkHoodieBloomIndex.java:149) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 6.0 with 2 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 6.0 (TID 6, xxx, executor 1, partition 0, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_4_piece0 in memory on xxx:35696 (size: 3.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 6.0 (TID 7, xxx, executor 1, partition 1, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 6.0 (TID 6) in 60 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 6.0 (TID 7) in 36 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 6.0, whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ShuffleMapStage 6 (countByKey at SparkHoodieBloomIndex.java:149) finished in 0.121 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: looking for newly runnable stages
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 7)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ResultStage 7 (ShuffledRDD[15] at countByKey at SparkHoodieBloomIndex.java:149), which has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_5 stored as values in memory (estimated size 3.8 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 2.2 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_5_piece0 in memory on xxx:38417 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 5 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 7 (ShuffledRDD[15] at countByKey at SparkHoodieBloomIndex.java:149) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 7.0 with 2 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 7.0 (TID 8, xxx, executor 1, partition 0, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_5_piece0 in memory on xxx:35696 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 2 to 10.246.3.9:49980
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 7.0 (TID 9, xxx, executor 1, partition 1, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 7.0 (TID 8) in 47 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 7.0 (TID 9) in 20 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 7.0, whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ResultStage 7 (countByKey at SparkHoodieBloomIndex.java:149) finished in 0.081 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Job 3 finished: countByKey at SparkHoodieBloomIndex.java:149, took 0.219895 s
   21/05/26 18:33:30 INFO bloom.SparkHoodieBloomIndex: InputParallelism: ${2}, IndexParallelism: ${0}
   21/05/26 18:33:30 INFO bloom.BucketizedBloomCheckPartitioner: TotalBuckets 0, min_buckets/partition 1
   21/05/26 18:33:30 INFO rdd.MapPartitionsRDD: Removing RDD 3 from persistence list
   21/05/26 18:33:30 INFO storage.BlockManager: Removing RDD 3
   21/05/26 18:33:31 INFO rdd.MapPartitionsRDD: Removing RDD 22 from persistence list
   21/05/26 18:33:31 INFO storage.BlockManager: Removing RDD 22
   21/05/26 18:33:31 INFO spark.SparkContext: Starting job: countByKey at BaseSparkCommitActionExecutor.java:158
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 16 (mapToPair at SparkHoodieBloomIndex.java:266) as input to shuffle 6
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 23 (mapToPair at SparkHoodieBloomIndex.java:287) as input to shuffle 3
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 22 (flatMapToPair at SparkHoodieBloomIndex.java:274) as input to shuffle 4
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 31 (countByKey at BaseSparkCommitActionExecutor.java:158) as input to shuffle 5
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Got job 4 (countByKey at BaseSparkCommitActionExecutor.java:158) with 2 output partitions
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Final stage: ResultStage 13 (countByKey at BaseSparkCommitActionExecutor.java:158)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 12)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 12)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 10 (MapPartitionsRDD[23] at mapToPair at SparkHoodieBloomIndex.java:287), which has no missing parents
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_6 stored as values in memory (estimated size 5.9 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_6_piece0 stored as bytes in memory (estimated size 3.3 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_6_piece0 in memory on xxx:38417 (size: 3.3 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Created broadcast 6 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 10 (MapPartitionsRDD[23] at mapToPair at SparkHoodieBloomIndex.java:287) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Adding task set 10.0 with 2 tasks
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 10.0 (TID 10, xxx, executor 1, partition 0, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_6_piece0 in memory on xxx:35696 (size: 3.3 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 10.0 (TID 11, xxx, executor 1, partition 1, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 10.0 (TID 10) in 50 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 10.0 (TID 11) in 24 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Removed TaskSet 10.0, whose tasks have all completed, from pool 
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: ShuffleMapStage 10 (mapToPair at SparkHoodieBloomIndex.java:287) finished in 0.092 s
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: looking for newly runnable stages
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: waiting: Set(ShuffleMapStage 12, ResultStage 13)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 12 (MapPartitionsRDD[31] at countByKey at BaseSparkCommitActionExecutor.java:158), which has no missing parents
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_7 stored as values in memory (estimated size 7.1 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_7_piece0 stored as bytes in memory (estimated size 3.8 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_7_piece0 in memory on xxx:38417 (size: 3.8 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Created broadcast 7 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 12 (MapPartitionsRDD[31] at countByKey at BaseSparkCommitActionExecutor.java:158) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Adding task set 12.0 with 2 tasks
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 12.0 (TID 12, xxx, executor 1, partition 0, PROCESS_LOCAL, 7730 bytes)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_7_piece0 in memory on xxx:35696 (size: 3.8 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 3 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 4 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added rdd_29_0 in memory on xxx:35696 (size: 0.0 B, free: 912.1 MB)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 12.0 (TID 13, xxx, executor 1, partition 1, PROCESS_LOCAL, 7730 bytes)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 12.0 (TID 12) in 105 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added rdd_29_1 in memory on xxx:35696 (size: 0.0 B, free: 912.1 MB)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 12.0 (TID 13) in 24 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Removed TaskSet 12.0, whose tasks have all completed, from pool 
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: ShuffleMapStage 12 (countByKey at BaseSparkCommitActionExecutor.java:158) finished in 0.146 s
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: looking for newly runnable stages
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 13)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting ResultStage 13 (ShuffledRDD[32] at countByKey at BaseSparkCommitActionExecutor.java:158), which has no missing parents
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_8 stored as values in memory (estimated size 3.8 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_8_piece0 stored as bytes in memory (estimated size 2.2 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_8_piece0 in memory on xxx:38417 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Created broadcast 8 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 13 (ShuffledRDD[32] at countByKey at BaseSparkCommitActionExecutor.java:158) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Adding task set 13.0 with 2 tasks
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 13.0 (TID 14, xxx, executor 1, partition 0, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_8_piece0 in memory on xxx:35696 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 5 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 13.0 (TID 15, xxx, executor 1, partition 1, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 13.0 (TID 14) in 31 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 13.0 (TID 15) in 12 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Removed TaskSet 13.0, whose tasks have all completed, from pool 
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: ResultStage 13 (countByKey at BaseSparkCommitActionExecutor.java:158) finished in 0.064 s
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Job 4 finished: countByKey at BaseSparkCommitActionExecutor.java:158, took 0.320123 s
   21/05/26 18:33:31 INFO commit.BaseSparkCommitActionExecutor: Workload profile :WorkloadProfile {globalStat=WorkloadStat {numInserts=0, numUpdates=0}, partitionStat={}, operationType=UPSERT}
   21/05/26 18:33:31 INFO timeline.HoodieActiveTimeline: Checking for file exists ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit.requested
   21/05/26 18:33:31 INFO timeline.HoodieActiveTimeline: Create new file for toInstant ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit.inflight
   21/05/26 18:33:31 INFO commit.UpsertPartitioner: AvgRecordSize => 1024
   21/05/26 18:33:31 INFO view.AbstractTableFileSystemView: Took 3 ms to read  0 instants, 0 replaced file groups
   21/05/26 18:33:31 INFO util.ClusteringUtils: Found 0 files in pending clustering operations
   21/05/26 18:33:31 INFO commit.UpsertPartitioner: Total Buckets :0, buckets info => {}, 
   Partition to insert buckets => {}, 
   UpdateLocations mapped to buckets =>{}
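
   Note the workload profile just above: numInserts=0, numUpdates=0, and UpsertPartitioner
   builds zero buckets, so this deltacommit is being written with no data at all -- which
   matches seeing only the .hoodie folder in HDFS. A quick way to confirm the commit really
   carried zero records is to dump the completed commit file; the path and instant below are
   taken from this log, and the exact JSON field names may differ between Hudi versions:

      # dump the commit metadata written for this run
      hdfs dfs -cat /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit
      # expect empty per-partition write stats, and the Kafka checkpoint stored in
      # extraMetadata (under "deltastreamer.checkpoint.key" in recent versions)
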
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 175
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 62
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 9
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 148
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 105
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 143
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 2
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 55
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 209
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 154
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 147
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 163
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 69
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 34
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 100
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned shuffle 5
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 1
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 193
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 169
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 27
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 16
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 115
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 120
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 106
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 174
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 210
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 96
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 6
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 57
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 133
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 11
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 74
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 107
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 164
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 172
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 176
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 194
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 109
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 37
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 177
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 128
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 182
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 205
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 30
   21/05/26 18:33:31 INFO commit.BaseCommitActionExecutor: Auto commit disabled for 20210526183328
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 102
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 180
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 150
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 186
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 89
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 223
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 47
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 158
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 162
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 88
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 39
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 8
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 29
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 124
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 75
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 165
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 217
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 134
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_5_piece0 on xxx:35696 in memory (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_5_piece0 on xxx:38417 in memory (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 35
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 216
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 22
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 114
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 152
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 42
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 94
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 145
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 126
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 144
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 168
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_3_piece0 on xxx:38417 in memory (size: 100.9 KB, free: 912.2 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_3_piece0 on xxx:35696 in memory (size: 100.9 KB, free: 912.2 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 149
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 38
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 70
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 15
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 118
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 166
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 207
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 170
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 171
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 65
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 5
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 97
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 110
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 222
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 87
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_6_piece0 on xxx:38417 in memory (size: 3.3 KB, free: 912.2 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_6_piece0 on xxx:35696 in memory (size: 3.3 KB, free: 912.2 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 192
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 201
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 117
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 123
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 12
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 60
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 84
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 127
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 91
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 136
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 45
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 200
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 64
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_2_piece0 on xxx:38417 in memory (size: 101.0 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_2_piece0 on xxx:35696 in memory (size: 101.0 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 92
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 0
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 81
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 185
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 214
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 21
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 31
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 67
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 112
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 178
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 208
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 78
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 73
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 131
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_8_piece0 on xxx:38417 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_8_piece0 on xxx:35696 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 61
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 3
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_7_piece0 on xxx:38417 in memory (size: 3.8 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_7_piece0 on xxx:35696 in memory (size: 3.8 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Starting job: sum at DeltaSync.java:448
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Job 5 finished: sum at DeltaSync.java:448, took 0.000044 s
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 36
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 80
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 103
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 108
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 183
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 72
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 54
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 132
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 99
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 19
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 93
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 179
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 215
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 66
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 77
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 151
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 116
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 191
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 17
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 14
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 18
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 125
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 204
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 146
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 50
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 56
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 52
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 101
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 221
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 213
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 181
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 190
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 85
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned shuffle 2
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 156
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 161
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 53
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 197
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 20
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 41
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 44
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 140
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 218
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 188
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 122
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 195
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 167
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 220
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 43
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 199
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 155
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 24
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 219
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 71
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 198
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 23
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 135
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 26
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 141
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 121
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 157
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 13
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 130
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned shuffle 0
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 7
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 138
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 63
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 187
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 32
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 196
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 48
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 206
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 119
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 160
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 90
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 40
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 113
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_0_piece0 on xxx:38417 in memory (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_0_piece0 on xxx:35696 in memory (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 68
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 224
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 28
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 202
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 10
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 139
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 76
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 49
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 137
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 58
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 on xxx:38417 in memory (size: 3.9 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 on xxx:35696 in memory (size: 3.9 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 4
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 211
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 212
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 83
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 203
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 33
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 86
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 82
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_1_piece0 on xxx:38417 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_1_piece0 on xxx:35696 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 95
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 142
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 111
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 98
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 184
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 46
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 129
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 104
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 159
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 59
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 25
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 173
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 79
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 153
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 189
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 51
   21/05/26 18:33:32 INFO spark.SparkContext: Starting job: sum at DeltaSync.java:449
   21/05/26 18:33:32 INFO scheduler.DAGScheduler: Job 6 finished: sum at DeltaSync.java:449, took 0.000035 s
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO spark.SparkContext: Starting job: collect at SparkRDDWriteClient.java:120
   21/05/26 18:33:32 INFO scheduler.DAGScheduler: Job 7 finished: collect at SparkRDDWriteClient.java:120, took 0.000039 s
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants [[==>20210526183328__deltacommit__INFLIGHT]]
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:32 INFO util.CommitUtils: Creating  metadata for UPSERT numWriteStats:0numReplaceFileIds:0
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants [[==>20210526183328__deltacommit__INFLIGHT]]
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:32 INFO client.AbstractHoodieWriteClient: Committing 20210526183328 action deltacommit
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Marking instant complete [==>20210526183328__deltacommit__INFLIGHT]
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Checking for file exists ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit.inflight
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Create new file for toInstant ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Completed [==>20210526183328__deltacommit__INFLIGHT]
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants [[==>20210526183328__deltacommit__REQUESTED], [==>20210526183328__deltacommit__INFLIGHT], [20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:32 INFO table.HoodieTimelineArchiveLog: No Instants to archive
   21/05/26 18:33:32 INFO client.AbstractHoodieWriteClient: Auto cleaning is enabled. Running cleaner now
   21/05/26 18:33:32 INFO client.AbstractHoodieWriteClient: Scheduling cleaning at instant time :20210526183332
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants [[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote view for basePath /user/hd_xyz/yyy/ml_xxx/foo. Server=xxx:37089, Timeout=300
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating InMemory based view for basePath /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO view.AbstractTableFileSystemView: Took 0 ms to read  0 instants, 0 replaced file groups
   21/05/26 18:33:32 INFO util.ClusteringUtils: Found 0 files in pending clustering operations
   21/05/26 18:33:32 INFO view.RemoteHoodieTableFileSystemView: Sending request : (http://xxx:37089/v1/hoodie/view/compactions/pending/?basepath=%2Fuser%2Fhdfs%2Fxyz%2Fpublic%2Fml_xxx%2Ffoo&lastinstantts=20210526183328&timelinehash=3cb19d4eacc8a39b3d4198ed17d5dac7ca1a076cc50020fab31fed29c6ccddb1)
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:33 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO timeline.HoodieActiveTimeline: Loaded instants [[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:33 INFO collection.RocksDBDAO: DELETING RocksDB persisted at /tmp/hoodie_timeline_rocksdb/_user_hdfs_xyz_public_ml_xxx_foo/a138e066-6b6b-4f72-8865-4c30301cbe11
   21/05/26 18:33:33 INFO collection.RocksDBDAO: No column family found. Loading default
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl_open.cc:230] Creating manifest 1 
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/version_set.cc:3406] Recovering from manifest file: MANIFEST-000001
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [default]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/version_set.cc:3610] Recovered from manifest file:/tmp/hoodie_timeline_rocksdb/_user_hdfs_xyz_public_ml_xxx_foo/a138e066-6b6b-4f72-8865-4c30301cbe11/MANIFEST-000001 succeeded,manifest_file_number is 1, next_file_number is 3, last_sequence is 0, log_number is 0,prev_log_number is 0,max_column_family is 0,min_log_number_to_keep is 0
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/version_set.cc:3618] Column family [default] (ID 0), log number is 0
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl_open.cc:1287] DB pointer 0x7f3aaccf1f20
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/version_set.cc:2936] Creating manifest 6
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_view__user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_view__user_hdfs_xyz_public_ml_xxx_foo] (ID 1)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_pending_compaction__user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_pending_compaction__user_hdfs_xyz_public_ml_xxx_foo] (ID 2)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_bootstrap_basefile__user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_bootstrap_basefile__user_hdfs_xyz_public_ml_xxx_foo] (ID 3)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_partitions__user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_partitions__user_hdfs_xyz_public_ml_xxx_foo] (ID 4)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_replaced_fg_user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_replaced_fg_user_hdfs_xyz_public_ml_xxx_foo] (ID 5)
   21/05/26 18:33:33 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.246.4.117:53684) with ID 2
   21/05/26 18:33:33 INFO spark.ExecutorAllocationManager: New executor 2 has registered (new total is 2)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_pending_clustering_fg_user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_pending_clustering_fg_user_hdfs_xyz_public_ml_xxx_foo] (ID 6)
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting replacedFileGroups to ROCKSDB based file-system view at /tmp/hoodie_timeline_rocksdb, Total file-groups=0
   21/05/26 18:33:33 INFO collection.RocksDBDAO: Prefix DELETE (query=part=) on hudi_replaced_fg_user_hdfs_xyz_public_ml_xxx_foo
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting replacedFileGroups to ROCKSDB based file-system view complete
   21/05/26 18:33:33 INFO view.AbstractTableFileSystemView: Took 9 ms to read  0 instants, 0 replaced file groups
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Initializing pending compaction operations. Count=0
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Initializing external data file mapping. Count=0
   21/05/26 18:33:33 INFO util.ClusteringUtils: Found 0 files in pending clustering operations
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting file groups in pending clustering to ROCKSDB based file-system view at /tmp/hoodie_timeline_rocksdb, Total file-groups=0
   21/05/26 18:33:33 INFO collection.RocksDBDAO: Prefix DELETE (query=part=) on hudi_pending_clustering_fg_user_hdfs_xyz_public_ml_xxx_foo
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting replacedFileGroups to ROCKSDB based file-system view complete
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Created ROCKSDB based file-system view at /tmp/hoodie_timeline_rocksdb
   21/05/26 18:33:33 INFO collection.RocksDBDAO: Prefix Search for (query=) on hudi_pending_compaction__user_hdfs_xyz_public_ml_xxx_foo. Total Time Taken (msec)=1. Serialization Time taken(micro)=0, num entries=0
   21/05/26 18:33:33 INFO service.RequestHandler: TimeTakenMillis[Total=791, Refresh=779, handle=11, Check=1], Success=true, Query=basepath=%2Fuser%2Fhdfs%2Fxyz%2Fpublic%2Fml_xxx%2Ffoo&lastinstantts=20210526183328&timelinehash=3cb19d4eacc8a39b3d4198ed17d5dac7ca1a076cc50020fab31fed29c6ccddb1, Host=xxx:37089, synced=false
   21/05/26 18:33:33 INFO storage.BlockManagerMasterEndpoint: Registering block manager xxx:36920 with 912.3 MB RAM, BlockManagerId(2, xxx, 36920, None)
   21/05/26 18:33:33 INFO clean.CleanPlanner: No earliest commit to retain. No need to scan partitions !!
   21/05/26 18:33:33 INFO clean.CleanPlanner: Nothing to clean here. It is already clean
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Cleaner started
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Cleaned failed attempts if any
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:33 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO timeline.HoodieActiveTimeline: Loaded instants [[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:33 INFO client.SparkRDDWriteClient: Successfully synced to metadata table
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Committed 20210526183328
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Scheduling table service COMPACT
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Scheduling compaction at instant time :20210526183333
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:33 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO timeline.HoodieActiveTimeline: Loaded instants [[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:33 INFO compact.SparkScheduleCompactionActionExecutor: Checking if compaction needs to be run on /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO deltastreamer.DeltaSync: Commit 20210526183328 successful!
   21/05/26 18:33:33 INFO rdd.MapPartitionsRDD: Removing RDD 29 from persistence list
   21/05/26 18:33:33 INFO storage.BlockManager: Removing RDD 29
   21/05/26 18:33:34 INFO rdd.MapPartitionsRDD: Removing RDD 37 from persistence list
   21/05/26 18:33:34 INFO storage.BlockManager: Removing RDD 37
   21/05/26 18:33:34 INFO deltastreamer.DeltaSync: Shutting down embedded timeline server
   21/05/26 18:33:34 INFO embedded.EmbeddedTimelineService: Closing Timeline server
   21/05/26 18:33:34 INFO service.TimelineService: Closing Timeline Service
   21/05/26 18:33:34 INFO javalin.Javalin: Stopping Javalin ...
   21/05/26 18:33:34 INFO javalin.Javalin: Javalin has stopped
   21/05/26 18:33:34 INFO view.RocksDbBasedFileSystemView: Closing Rocksdb !!
   21/05/26 18:33:34 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:365] Shutdown: canceling all background work
   21/05/26 18:33:34 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:521] Shutdown complete
   21/05/26 18:33:34 INFO view.RocksDbBasedFileSystemView: Closed Rocksdb !!
   21/05/26 18:33:34 INFO service.TimelineService: Closed Timeline Service
   21/05/26 18:33:34 INFO embedded.EmbeddedTimelineService: Closed Timeline server
   21/05/26 18:33:34 INFO deltastreamer.HoodieDeltaStreamer: Shut down delta streamer
   21/05/26 18:33:34 INFO server.AbstractConnector: Stopped Spark@7a0e94b4{HTTP/1.1,[http/1.1]}{0.0.0.0:0}
   21/05/26 18:33:34 INFO ui.SparkUI: Stopped Spark web UI at http://xxx:32822
   21/05/26 18:33:34 INFO yarn.YarnAllocator: Driver requested a total number of 0 executor(s).
   21/05/26 18:33:34 INFO cluster.YarnClusterSchedulerBackend: Shutting down all executors
   21/05/26 18:33:34 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
   21/05/26 18:33:34 INFO cluster.SchedulerExtensionServices: Stopping SchedulerExtensionServices
   (serviceOption=None,
    services=List(),
    started=false)
   21/05/26 18:33:34 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
   21/05/26 18:33:34 INFO memory.MemoryStore: MemoryStore cleared
   21/05/26 18:33:34 INFO storage.BlockManager: BlockManager stopped
   21/05/26 18:33:34 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
   21/05/26 18:33:34 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
   21/05/26 18:33:34 INFO spark.SparkContext: Successfully stopped SparkContext
   21/05/26 18:33:34 INFO yarn.ApplicationMaster: Final app status: SUCCEEDED, exitCode: 0
   21/05/26 18:33:34 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with SUCCEEDED
   21/05/26 18:33:34 INFO impl.AMRMClientImpl: Waiting for application to be successfully unregistered.
   21/05/26 18:33:34 INFO yarn.ApplicationMaster: Deleting staging directory hdfs://xxx:8020/user/hd_xyz/.sparkStaging/application_1618828995116_0162
   21/05/26 18:33:34 INFO util.ShutdownHookManager: Shutdown hook called
   21/05/26 18:33:34 INFO util.ShutdownHookManager: Deleting directory /data/hadoop/yarn/local/usercache/hdfs/appcache/application_1618828995116_0162/spark-4c7e81b9-e526-4325-abf0-d163828b92b5
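   
   The tail above shows deltacommit 20210526183328 completing and the application exiting SUCCEEDED, yet only the .hoodie folder appears under the target path. The full log below shows why: "About to read 0 from Kafka for topic" followed by "No new data, perform empty commit", i.e. the write path ran but had nothing to write. As a quick sanity check (a sketch assuming the same paths as in the log; Hudi stores a completed delta commit as <instantTime>.deltacommit under .hoodie), the table and the commit metadata can be inspected directly from HDFS:
   
   # Recursively list the table base path; if only metadata was written,
   # this shows just the .hoodie folder and no partition directories.
   hdfs dfs -ls -R /user/hd_xyz/yyy/ml_xxx/foo
   
   # Dump the completed deltacommit; an empty commit carries no write stats.
   hdfs dfs -cat /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit
   
   If the commit metadata shows zero records written, the job succeeded exactly as logged and the gap is upstream: the Kafka source handed DeltaStreamer no records in this round.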
   



[GitHub] [hudi] PavelPetukhov commented on issue #2959: No data stored after migrating to Hudi 0.8.0

Posted by GitBox <gi...@apache.org>.
PavelPetukhov commented on issue #2959:
URL: https://github.com/apache/hudi/issues/2959#issuecomment-848930327


   Below is our full log:
   
   
   
   Log Type: stderr
   Log Upload Time: Wed May 26 18:33:34 +0300 2021
   Log Length: 104910
   21/05/26 18:33:18 INFO util.SignalUtils: Registered signal handler for TERM
   21/05/26 18:33:18 INFO util.SignalUtils: Registered signal handler for HUP
   21/05/26 18:33:18 INFO util.SignalUtils: Registered signal handler for INT
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing view acls to: yarn,hdfs
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing modify acls to: yarn,hdfs
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing view acls groups to: 
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing modify acls groups to: 
   21/05/26 18:33:18 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(yarn, hdfs); groups with view permissions: Set(); users  with modify permissions: Set(yarn, hdfs); groups with modify permissions: Set()
   21/05/26 18:33:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
   21/05/26 18:33:18 INFO yarn.ApplicationMaster: Preparing Local resources
   21/05/26 18:33:19 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
   21/05/26 18:33:19 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1618828995116_0162_000001
   21/05/26 18:33:19 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
   21/05/26 18:33:19 INFO yarn.ApplicationMaster: Waiting for spark context initialization...
   21/05/26 18:33:19 WARN deltastreamer.SchedulerConfGenerator: Job Scheduling Configs will not be in effect as spark.scheduler.mode is not set to FAIR at instantiation time. Continuing without scheduling configs
   21/05/26 18:33:19 INFO spark.SparkContext: Running Spark version 2.4.7
   21/05/26 18:33:19 INFO spark.SparkContext: Submitted application: xxx
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing view acls to: yarn,hdfs
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing modify acls to: yarn,hdfs
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing view acls groups to: 
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing modify acls groups to: 
   21/05/26 18:33:19 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(yarn, hdfs); groups with view permissions: Set(); users  with modify permissions: Set(yarn, hdfs); groups with modify permissions: Set()
   21/05/26 18:33:20 INFO util.Utils: Successfully started service 'sparkDriver' on port 37691.
   21/05/26 18:33:20 INFO spark.SparkEnv: Registering MapOutputTracker
   21/05/26 18:33:20 INFO spark.SparkEnv: Registering BlockManagerMaster
   21/05/26 18:33:20 INFO storage.BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
   21/05/26 18:33:20 INFO storage.BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
   21/05/26 18:33:20 INFO storage.DiskBlockManager: Created local directory at /data/hadoop/yarn/local/usercache/hdfs/appcache/application_1618828995116_0162/blockmgr-9de167db-4756-414e-9126-32cb562e91aa
   21/05/26 18:33:20 INFO memory.MemoryStore: MemoryStore started with capacity 912.3 MB
   21/05/26 18:33:20 INFO spark.SparkEnv: Registering OutputCommitCoordinator
   21/05/26 18:33:20 INFO util.log: Logging initialized @2935ms
   21/05/26 18:33:20 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /jobs, /jobs/json, /jobs/job, /jobs/job/json, /stages, /stages/json, /stages/stage, /stages/stage/json, /stages/pool, /stages/pool/json, /storage, /storage/json, /storage/rdd, /storage/rdd/json, /environment, /environment/json, /executors, /executors/json, /executors/threadDump, /executors/threadDump/json, /static, /, /api, /jobs/job/kill, /stages/stage/kill.
   21/05/26 18:33:20 INFO server.Server: jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
   21/05/26 18:33:20 INFO server.Server: Started @3069ms
   21/05/26 18:33:20 INFO server.AbstractConnector: Started ServerConnector@7a0e94b4{HTTP/1.1,[http/1.1]}{0.0.0.0:32822}
   21/05/26 18:33:20 INFO util.Utils: Successfully started service 'SparkUI' on port 32822.
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@43837fbc{/jobs,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@d91ba30{/jobs/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4854d5d9{/jobs/job,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@672e7ec3{/jobs/job/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@67ee182c{/stages,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@97af315{/stages/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1936a0e0{/stages/stage,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@447ef19e{/stages/stage/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@68e36851{/stages/pool,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@352fe12b{/stages/pool/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3d39f28d{/storage,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@e7806b5{/storage/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7d2a56cb{/storage/rdd,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@37c6c6fc{/storage/rdd/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4599e713{/environment,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@b9a0cbb{/environment/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@24299f0d{/executors,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@25594c52{/executors/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2f728695{/executors/threadDump,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7456a814{/executors/threadDump/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1cef9064{/static,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@16ba2eda{/,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@dac88e2{/api,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@145850ef{/jobs/job/kill,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6d678cf2{/stages/stage/kill,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://xxx:32822
   21/05/26 18:33:20 INFO cluster.YarnClusterScheduler: Created YarnClusterScheduler
   21/05/26 18:33:20 INFO cluster.SchedulerExtensionServices: Starting Yarn extension services with app application_1618828995116_0162 and attemptId Some(appattempt_1618828995116_0162_000001)
   21/05/26 18:33:20 WARN util.Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
   21/05/26 18:33:20 INFO util.Utils: Using initial executors = 1, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
   21/05/26 18:33:20 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 38417.
   21/05/26 18:33:20 INFO netty.NettyBlockTransferService: Server created on xxx:38417
   21/05/26 18:33:20 INFO storage.BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
   21/05/26 18:33:20 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO storage.BlockManagerMasterEndpoint: Registering block manager xxx:38417 with 912.3 MB RAM, BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO storage.BlockManager: external shuffle service port = 7337
   21/05/26 18:33:20 INFO storage.BlockManager: Initialized BlockManager: BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /metrics/json.
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1b3c78ce{/metrics/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:21 INFO scheduler.EventLoggingListener: Logging events to hdfs://xxx:8020/eventLogging/application_1618828995116_0162_1
   21/05/26 18:33:21 WARN util.Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
   21/05/26 18:33:21 INFO util.Utils: Using initial executors = 1, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
   21/05/26 18:33:21 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
   21/05/26 18:33:21 INFO client.RMProxy: Connecting to ResourceManager at xxx/10.246.4.117:8030
   21/05/26 18:33:21 INFO yarn.YarnRMClient: Registering the ApplicationMaster
   21/05/26 18:33:21 INFO yarn.ApplicationMaster: 
   ===============================================================================
   YARN executor launch context:
     env:
       CLASSPATH -> {{PWD}}<CPS>{{PWD}}/__spark_conf__<CPS>{{PWD}}/__spark_libs__/*<CPS>/usr/hdp/2.6.0.3-8/hadoop/conf<CPS>/usr/hdp/2.6.0.3-8/hadoop/*<CPS>/usr/hdp/2.6.0.3-8/hadoop/lib/*<CPS>/usr/hdp/current/hadoop-hdfs-client/*<CPS>/usr/hdp/current/hadoop-hdfs-client/lib/*<CPS>/usr/hdp/current/hadoop-yarn-client/*<CPS>/usr/hdp/current/hadoop-yarn-client/lib/*<CPS>/usr/hdp/current/ext/hadoop/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*<CPS>{{PWD}}/__spark_conf__/__hadoop_conf__
       SPARK_YARN_STAGING_DIR -> hdfs://xxx:8020/user/hd_xyz/.sparkStaging/application_1618828995116_0162
       SPARK_USER -> hdfs
   
     command:
       {{JAVA_HOME}}/bin/java \ 
         -server \ 
         -Xmx2048m \ 
         -Djava.io.tmpdir={{PWD}}/tmp \ 
         '-Dspark.driver.port=37691' \ 
         '-Dspark.ui.port=0' \ 
         -Dspark.yarn.app.container.log.dir=<LOG_DIR> \ 
         -XX:OnOutOfMemoryError='kill %p' \ 
         org.apache.spark.executor.CoarseGrainedExecutorBackend \ 
         --driver-url \ 
         spark://CoarseGrainedScheduler@xxx:37691 \ 
         --executor-id \ 
         <executorId> \ 
         --hostname \ 
         <hostname> \ 
         --cores \ 
         1 \ 
         --app-id \ 
         application_1618828995116_0162 \ 
         --user-class-path \ 
         file:$PWD/__app__.jar \ 
         --user-class-path \ 
         file:$PWD/org.apache.spark_spark-avro_2.12-2.4.7.jar \ 
         --user-class-path \ 
         file:$PWD/org.spark-project.spark_unused-1.0.0.jar \ 
         1><LOG_DIR>/stdout \ 
         2><LOG_DIR>/stderr
   
     resources:
       org.apache.spark_spark-avro_2.12-2.4.7.jar -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/hd_xyz/.sparkStaging/application_1618828995116_0162/org.apache.spark_spark-avro_2.12-2.4.7.jar" } size: 107269 timestamp: 1622043191967 type: FILE visibility: PRIVATE
       __app__.jar -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/jars/hudi/hudi-utilities-bundle_2.12-0.8.0.jar" } size: 40399204 timestamp: 1622022896130 type: FILE visibility: PUBLIC
       __spark_conf__ -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/hd_xyz/.sparkStaging/application_1618828995116_0162/__spark_conf__.zip" } size: 205423 timestamp: 1622043193955 type: ARCHIVE visibility: PRIVATE
       org.spark-project.spark_unused-1.0.0.jar -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/hd_xyz/.sparkStaging/application_1618828995116_0162/org.spark-project.spark_unused-1.0.0.jar" } size: 2777 timestamp: 1622043192905 type: FILE visibility: PRIVATE
       __spark_libs__ -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/hd_xyz/.sparkStaging/application_1618828995116_0162/__spark_libs__2858796966972713370.zip" } size: 242613518 timestamp: 1622043190403 type: ARCHIVE visibility: PRIVATE
   
   ===============================================================================
   21/05/26 18:33:21 WARN util.Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
   21/05/26 18:33:21 INFO util.Utils: Using initial executors = 1, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
   21/05/26 18:33:21 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(spark://YarnAM@xxx:37691)
   21/05/26 18:33:21 INFO yarn.YarnAllocator: Will request 1 executor container(s), each with 1 core(s) and 2432 MB memory (including 384 MB of overhead)
   21/05/26 18:33:21 INFO yarn.YarnAllocator: Submitted 1 unlocalized container requests.
   21/05/26 18:33:21 INFO yarn.ApplicationMaster: Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
   21/05/26 18:33:22 INFO impl.AMRMClientImpl: Received new token for : xxx:45454
   21/05/26 18:33:22 INFO yarn.YarnAllocator: Launching container container_e03_1618828995116_0162_01_000002 on host xxx for executor with ID 1
   21/05/26 18:33:22 INFO yarn.YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them.
   21/05/26 18:33:22 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
   21/05/26 18:33:22 INFO impl.ContainerManagementProtocolProxy: Opening proxy : xxx:45454
   21/05/26 18:33:25 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.246.3.9:49980) with ID 1
   21/05/26 18:33:25 INFO spark.ExecutorAllocationManager: New executor 1 has registered (new total is 1)
   21/05/26 18:33:25 INFO cluster.YarnClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
   21/05/26 18:33:25 INFO cluster.YarnClusterScheduler: YarnClusterScheduler.postStartHook done
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO utilities.UtilHelpers: Adding overridden properties to file properties.
   21/05/26 18:33:25 WARN spark.SparkContext: Using an existing SparkContext; some configuration may not take effect.
   21/05/26 18:33:25 INFO storage.BlockManagerMasterEndpoint: Registering block manager xxx:35696 with 912.3 MB RAM, BlockManagerId(1, xxx, 35696, None)
   21/05/26 18:33:25 INFO deltastreamer.HoodieDeltaStreamer: Creating delta streamer with configs : {hoodie.deltastreamer.keygen.timebased.input.timezone=, hoodie.embed.timeline.server=true, schema.registry.url=http://xxx, hoodie.filesystem.view.type=EMBEDDED_KV_STORE, hoodie.deltastreamer.keygen.timebased.input.dateformat=yyyy-MM-ddTHH:mm:ssZ,yyyy-MM-ddTHH:mm:ss.SSSZ, hoodie.delete.shuffle.parallelism=2, hoodie.bulkinsert.shuffle.parallelism=2, hoodie.deltastreamer.keygen.timebased.output.dateformat=yyyy/MM/dd, group.id=hudi_group_080, auto.offset.reset=earliest, hoodie.insert.shuffle.parallelism=2, hoodie.deltastreamer.keygen.timebased.timestamp.type=DATE_STRING, hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.CustomKeyGenerator, hoodie.deltastreamer.source.kafka.topic=xxx, bootstrap.servers=xxx:9092, hoodie.deltastreamer.keygen.timebased.input.dateformat.list.delimiter.regex=, hoodie.deltastreamer.schemaprovider.registry.url=http://xxx/subjects/xxx-value/versions/latest, hoodie.datasource.write.recordkey.field=id, hoodie.upsert.shuffle.parallelism=2, hoodie.datasource.write.partitionpath.field=date:TIMESTAMP}
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Initializing /user/hd_xyz/yyy/ml_xxx/foo as hoodie table /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Finished initializing Table of type MERGE_ON_READ from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO deltastreamer.DeltaSync: Registering Schema :[{"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}, {"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}]
   21/05/26 18:33:25 INFO deltastreamer.HoodieDeltaStreamer: Delta Streamer running only single round
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:26 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:26 INFO deltastreamer.DeltaSync: Checkpoint to resume from : Optional.empty
   21/05/26 18:33:26 INFO consumer.ConsumerConfig: ConsumerConfig values: 
   	allow.auto.create.topics = true
   	auto.commit.interval.ms = 5000
   	auto.offset.reset = earliest
   	bootstrap.servers = [xxx]
   	check.crcs = true
   	client.dns.lookup = default
   	client.id = 
   	client.rack = 
   	connections.max.idle.ms = 540000
   	default.api.timeout.ms = 60000
   	enable.auto.commit = true
   	exclude.internal.topics = true
   	fetch.max.bytes = 52428800
   	fetch.max.wait.ms = 500
   	fetch.min.bytes = 1
   	group.id = hudi_group_080
   	group.instance.id = null
   	heartbeat.interval.ms = 3000
   	interceptor.classes = []
   	internal.leave.group.on.close = true
   	isolation.level = read_uncommitted
   	key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
   	max.partition.fetch.bytes = 1048576
   	max.poll.interval.ms = 300000
   	max.poll.records = 500
   	metadata.max.age.ms = 300000
   	metric.reporters = []
   	metrics.num.samples = 2
   	metrics.recording.level = INFO
   	metrics.sample.window.ms = 30000
   	partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
   	receive.buffer.bytes = 65536
   	reconnect.backoff.max.ms = 1000
   	reconnect.backoff.ms = 50
   	request.timeout.ms = 30000
   	retry.backoff.ms = 100
   	sasl.client.callback.handler.class = null
   	sasl.jaas.config = null
   	sasl.kerberos.kinit.cmd = /usr/bin/kinit
   	sasl.kerberos.min.time.before.relogin = 60000
   	sasl.kerberos.service.name = null
   	sasl.kerberos.ticket.renew.jitter = 0.05
   	sasl.kerberos.ticket.renew.window.factor = 0.8
   	sasl.login.callback.handler.class = null
   	sasl.login.class = null
   	sasl.login.refresh.buffer.seconds = 300
   	sasl.login.refresh.min.period.seconds = 60
   	sasl.login.refresh.window.factor = 0.8
   	sasl.login.refresh.window.jitter = 0.05
   	sasl.mechanism = GSSAPI
   	security.protocol = PLAINTEXT
   	security.providers = null
   	send.buffer.bytes = 131072
   	session.timeout.ms = 10000
   	ssl.cipher.suites = null
   	ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
   	ssl.endpoint.identification.algorithm = https
   	ssl.key.password = null
   	ssl.keymanager.algorithm = SunX509
   	ssl.keystore.location = null
   	ssl.keystore.password = null
   	ssl.keystore.type = JKS
   	ssl.protocol = TLS
   	ssl.provider = null
   	ssl.secure.random.implementation = null
   	ssl.trustmanager.algorithm = PKIX
   	ssl.truststore.location = null
   	ssl.truststore.password = null
   	ssl.truststore.type = JKS
   	value.deserializer = class io.confluent.kafka.serializers.KafkaAvroDeserializer
   
   21/05/26 18:33:26 INFO serializers.KafkaAvroDeserializerConfig: KafkaAvroDeserializerConfig values: 
   	schema.registry.url = [xxx]
   	max.schemas.per.subject = 1000
   	specific.avro.reader = false
   
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.timestamp.type' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.output.dateformat' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.input.dateformat.list.delimiter.regex' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.input.dateformat' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.datasource.write.partitionpath.field' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.delete.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.datasource.write.recordkey.field' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.upsert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.datasource.write.keygenerator.class' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.source.kafka.topic' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.schemaprovider.registry.url' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.insert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.embed.timeline.server' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.bulkinsert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.input.timezone' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.filesystem.view.type' was supplied but isn't a known config.
   21/05/26 18:33:26 INFO utils.AppInfoParser: Kafka version: 2.4.1
   21/05/26 18:33:26 INFO utils.AppInfoParser: Kafka commitId: c57222ae8cd7866b
   21/05/26 18:33:26 INFO utils.AppInfoParser: Kafka startTimeMs: 1622043206225
   21/05/26 18:33:26 INFO clients.Metadata: [Consumer clientId=consumer-hudi_group_080-1, groupId=hudi_group_080] Cluster ID: 5XoPi9AYT0mbHVQEj6VEaw
   21/05/26 18:33:27 INFO helpers.KafkaOffsetGen: SourceLimit not configured, set numEvents to default value : 5000000
   21/05/26 18:33:27 INFO sources.AvroKafkaSource: About to read 0 from Kafka for topic :xxx
   21/05/26 18:33:27 INFO deltastreamer.DeltaSync: No new data, perform empty commit.
   21/05/26 18:33:27 INFO deltastreamer.DeltaSync: Setting up new Hoodie Write Client
   21/05/26 18:33:27 INFO deltastreamer.DeltaSync: Registering Schema :[{"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}, {"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}]
   21/05/26 18:33:27 INFO embedded.EmbeddedTimelineService: Starting Timeline service !!
   21/05/26 18:33:27 INFO embedded.EmbeddedTimelineService: Overriding hostIp to (xxx) found in spark-conf. It was null
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating View Manager with storage type :EMBEDDED_KV_STORE
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating embedded rocks-db based Table View
   21/05/26 18:33:27 INFO util.log: Logging initialized @9978ms to org.apache.hudi.org.eclipse.jetty.util.log.Slf4jLog
   21/05/26 18:33:27 INFO javalin.Javalin: 
              __                      __ _
             / /____ _ _   __ ____ _ / /(_)____
        __  / // __ `/| | / // __ `// // // __ \
       / /_/ // /_/ / | |/ // /_/ // // // / / /
       \____/ \__,_/  |___/ \__,_//_//_//_/ /_/
   
           https://javalin.io/documentation
   
   21/05/26 18:33:27 INFO javalin.Javalin: Starting Javalin ...
   21/05/26 18:33:27 INFO javalin.Javalin: Listening on http://localhost:37089/
   21/05/26 18:33:27 INFO javalin.Javalin: Javalin started in 179ms \o/
   21/05/26 18:33:27 INFO service.TimelineService: Starting Timeline server on port :37089
   21/05/26 18:33:27 INFO embedded.EmbeddedTimelineService: Started embedded timeline server at xxx:37089
   21/05/26 18:33:27 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:27 INFO client.AbstractHoodieClient: Timeline Server already running. Not restarting the service
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:27 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:28 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:28 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:28 INFO client.AbstractHoodieWriteClient: Generate a new instant time: 20210526183328 action: deltacommit
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Creating a new instant [==>20210526183328__deltacommit__REQUESTED]
   21/05/26 18:33:28 INFO deltastreamer.DeltaSync: Starting commit  : 20210526183328
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:28 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Loaded instants [[==>20210526183328__deltacommit__REQUESTED]]
   21/05/26 18:33:28 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:28 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:28 INFO client.SparkRDDWriteClient: Successfully synced to metadata table
   21/05/26 18:33:28 INFO client.AsyncCleanerService: Auto cleaning is not enabled. Not running cleaner now
   21/05/26 18:33:28 INFO spark.SparkContext: Starting job: countByKey at SparkHoodieBloomIndex.java:114
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Registering RDD 1 (mapToPair at SparkWriteHelper.java:54) as input to shuffle 1
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Registering RDD 5 (countByKey at SparkHoodieBloomIndex.java:114) as input to shuffle 0
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Got job 0 (countByKey at SparkHoodieBloomIndex.java:114) with 2 output partitions
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Final stage: ResultStage 2 (countByKey at SparkHoodieBloomIndex.java:114)
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 1)
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 1)
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 1 (MapPartitionsRDD[5] at countByKey at SparkHoodieBloomIndex.java:114), which has no missing parents
   21/05/26 18:33:28 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 6.2 KB, free 912.3 MB)
   21/05/26 18:33:28 INFO yarn.YarnAllocator: Driver requested a total number of 2 executor(s).
   21/05/26 18:33:28 INFO yarn.YarnAllocator: Will request 1 executor container(s), each with 1 core(s) and 2432 MB memory (including 384 MB of overhead)
   21/05/26 18:33:28 INFO yarn.YarnAllocator: Submitted 1 unlocalized container requests.
   21/05/26 18:33:28 INFO spark.ExecutorAllocationManager: Requesting 1 new executor because tasks are backlogged (new desired total will be 2)
   21/05/26 18:33:28 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.3 KB, free 912.3 MB)
   21/05/26 18:33:28 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx:38417 (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:28 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 1 (MapPartitionsRDD[5] at countByKey at SparkHoodieBloomIndex.java:114) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:28 INFO cluster.YarnClusterScheduler: Adding task set 1.0 with 2 tasks
   21/05/26 18:33:28 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 1.0 (TID 0, xxx, executor 1, partition 0, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:28 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx:35696 (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:29 INFO impl.AMRMClientImpl: Received new token for : xxx:45454
   21/05/26 18:33:29 INFO yarn.YarnAllocator: Launching container container_e03_1618828995116_0162_01_000004 on host xxx for executor with ID 2
   21/05/26 18:33:29 INFO yarn.YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them.
   21/05/26 18:33:29 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
   21/05/26 18:33:29 INFO impl.ContainerManagementProtocolProxy: Opening proxy : xxx:45454
   21/05/26 18:33:29 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to 10.246.3.9:49980
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added rdd_3_0 in memory on xxx:35696 (size: 0.0 B, free: 912.3 MB)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 1.0 (TID 1, xxx, executor 1, partition 1, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added rdd_3_1 in memory on xxx:35696 (size: 0.0 B, free: 912.3 MB)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 1.0 (TID 0) in 1023 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 1.0 (TID 1) in 70 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: ShuffleMapStage 1 (countByKey at SparkHoodieBloomIndex.java:114) finished in 1.177 s
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: looking for newly runnable stages
   21/05/26 18:33:29 INFO cluster.YarnClusterScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool 
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 2)
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: Submitting ResultStage 2 (ShuffledRDD[6] at countByKey at SparkHoodieBloomIndex.java:114), which has no missing parents
   21/05/26 18:33:29 INFO memory.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.8 KB, free 912.3 MB)
   21/05/26 18:33:29 INFO memory.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.2 KB, free 912.3 MB)
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on xxx:38417 (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:29 INFO spark.SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 2 (ShuffledRDD[6] at countByKey at SparkHoodieBloomIndex.java:114) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:29 INFO cluster.YarnClusterScheduler: Adding task set 2.0 with 2 tasks
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 2.0 (TID 2, xxx, executor 1, partition 0, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on xxx:35696 (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:29 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to 10.246.3.9:49980
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 2.0 (TID 3, xxx, executor 1, partition 1, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 2.0 (TID 2) in 85 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 2.0 (TID 3) in 32 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:29 INFO cluster.YarnClusterScheduler: Removed TaskSet 2.0, whose tasks have all completed, from pool 
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: ResultStage 2 (countByKey at SparkHoodieBloomIndex.java:114) finished in 0.126 s
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: Job 0 finished: countByKey at SparkHoodieBloomIndex.java:114, took 1.627903 s
   21/05/26 18:33:29 INFO yarn.YarnAllocator: Driver requested a total number of 1 executor(s).
   21/05/26 18:33:30 INFO spark.SparkContext: Starting job: collect at HoodieSparkEngineContext.java:78
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Got job 1 (collect at HoodieSparkEngineContext.java:78) with 1 output partitions
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Final stage: ResultStage 3 (collect at HoodieSparkEngineContext.java:78)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Parents of final stage: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Missing parents: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ResultStage 3 (MapPartitionsRDD[8] at flatMap at HoodieSparkEngineContext.java:78), which has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_2 stored as values in memory (estimated size 368.5 KB, free 911.9 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 101.0 KB, free 911.8 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on xxx:38417 (size: 101.0 KB, free: 912.2 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 3 (MapPartitionsRDD[8] at flatMap at HoodieSparkEngineContext.java:78) (first 15 tasks are for partitions Vector(0))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 3.0 with 1 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 3.0 (TID 4, xxx, executor 1, partition 0, PROCESS_LOCAL, 7710 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on xxx:35696 (size: 101.0 KB, free: 912.2 MB)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 3.0 (TID 4) in 178 ms on xxx (executor 1) (1/1)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 3.0, whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ResultStage 3 (collect at HoodieSparkEngineContext.java:78) finished in 0.233 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Job 1 finished: collect at HoodieSparkEngineContext.java:78, took 0.236923 s
   21/05/26 18:33:30 INFO spark.SparkContext: Starting job: collect at HoodieSparkEngineContext.java:73
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Got job 2 (collect at HoodieSparkEngineContext.java:73) with 1 output partitions
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Final stage: ResultStage 4 (collect at HoodieSparkEngineContext.java:73)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Parents of final stage: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Missing parents: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ResultStage 4 (MapPartitionsRDD[10] at map at HoodieSparkEngineContext.java:73), which has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_3 stored as values in memory (estimated size 368.3 KB, free 911.5 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 100.9 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_3_piece0 in memory on xxx:38417 (size: 100.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 4 (MapPartitionsRDD[10] at map at HoodieSparkEngineContext.java:73) (first 15 tasks are for partitions Vector(0))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 4.0 with 1 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 4.0 (TID 5, xxx, executor 1, partition 0, PROCESS_LOCAL, 7710 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_3_piece0 in memory on xxx:35696 (size: 100.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 4.0 (TID 5) in 94 ms on xxx (executor 1) (1/1)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 4.0, whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ResultStage 4 (collect at HoodieSparkEngineContext.java:73) finished in 0.167 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Job 2 finished: collect at HoodieSparkEngineContext.java:73, took 0.174163 s
   21/05/26 18:33:30 INFO spark.SparkContext: Starting job: countByKey at SparkHoodieBloomIndex.java:149
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Registering RDD 14 (countByKey at SparkHoodieBloomIndex.java:149) as input to shuffle 2
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Got job 3 (countByKey at SparkHoodieBloomIndex.java:149) with 2 output partitions
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Final stage: ResultStage 7 (countByKey at SparkHoodieBloomIndex.java:149)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 6)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 6)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 6 (MapPartitionsRDD[14] at countByKey at SparkHoodieBloomIndex.java:149), which has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_4 stored as values in memory (estimated size 7.5 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 3.9 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_4_piece0 in memory on xxx:38417 (size: 3.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 6 (MapPartitionsRDD[14] at countByKey at SparkHoodieBloomIndex.java:149) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 6.0 with 2 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 6.0 (TID 6, xxx, executor 1, partition 0, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_4_piece0 in memory on xxx:35696 (size: 3.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 6.0 (TID 7, xxx, executor 1, partition 1, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 6.0 (TID 6) in 60 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 6.0 (TID 7) in 36 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 6.0, whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ShuffleMapStage 6 (countByKey at SparkHoodieBloomIndex.java:149) finished in 0.121 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: looking for newly runnable stages
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 7)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ResultStage 7 (ShuffledRDD[15] at countByKey at SparkHoodieBloomIndex.java:149), which has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_5 stored as values in memory (estimated size 3.8 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 2.2 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_5_piece0 in memory on xxx:38417 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 5 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 7 (ShuffledRDD[15] at countByKey at SparkHoodieBloomIndex.java:149) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 7.0 with 2 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 7.0 (TID 8, xxx, executor 1, partition 0, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_5_piece0 in memory on xxx:35696 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 2 to 10.246.3.9:49980
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 7.0 (TID 9, xxx, executor 1, partition 1, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 7.0 (TID 8) in 47 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 7.0 (TID 9) in 20 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 7.0, whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ResultStage 7 (countByKey at SparkHoodieBloomIndex.java:149) finished in 0.081 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Job 3 finished: countByKey at SparkHoodieBloomIndex.java:149, took 0.219895 s
   21/05/26 18:33:30 INFO bloom.SparkHoodieBloomIndex: InputParallelism: ${2}, IndexParallelism: ${0}
   21/05/26 18:33:30 INFO bloom.BucketizedBloomCheckPartitioner: TotalBuckets 0, min_buckets/partition 1
   21/05/26 18:33:30 INFO rdd.MapPartitionsRDD: Removing RDD 3 from persistence list
   21/05/26 18:33:30 INFO storage.BlockManager: Removing RDD 3
   21/05/26 18:33:31 INFO rdd.MapPartitionsRDD: Removing RDD 22 from persistence list
   21/05/26 18:33:31 INFO storage.BlockManager: Removing RDD 22
   21/05/26 18:33:31 INFO spark.SparkContext: Starting job: countByKey at BaseSparkCommitActionExecutor.java:158
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 16 (mapToPair at SparkHoodieBloomIndex.java:266) as input to shuffle 6
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 23 (mapToPair at SparkHoodieBloomIndex.java:287) as input to shuffle 3
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 22 (flatMapToPair at SparkHoodieBloomIndex.java:274) as input to shuffle 4
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 31 (countByKey at BaseSparkCommitActionExecutor.java:158) as input to shuffle 5
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Got job 4 (countByKey at BaseSparkCommitActionExecutor.java:158) with 2 output partitions
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Final stage: ResultStage 13 (countByKey at BaseSparkCommitActionExecutor.java:158)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 12)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 12)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 10 (MapPartitionsRDD[23] at mapToPair at SparkHoodieBloomIndex.java:287), which has no missing parents
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_6 stored as values in memory (estimated size 5.9 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_6_piece0 stored as bytes in memory (estimated size 3.3 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_6_piece0 in memory on xxx:38417 (size: 3.3 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Created broadcast 6 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 10 (MapPartitionsRDD[23] at mapToPair at SparkHoodieBloomIndex.java:287) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Adding task set 10.0 with 2 tasks
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 10.0 (TID 10, xxx, executor 1, partition 0, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_6_piece0 in memory on xxx:35696 (size: 3.3 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 10.0 (TID 11, xxx, executor 1, partition 1, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 10.0 (TID 10) in 50 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 10.0 (TID 11) in 24 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Removed TaskSet 10.0, whose tasks have all completed, from pool 
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: ShuffleMapStage 10 (mapToPair at SparkHoodieBloomIndex.java:287) finished in 0.092 s
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: looking for newly runnable stages
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: waiting: Set(ShuffleMapStage 12, ResultStage 13)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 12 (MapPartitionsRDD[31] at countByKey at BaseSparkCommitActionExecutor.java:158), which has no missing parents
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_7 stored as values in memory (estimated size 7.1 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_7_piece0 stored as bytes in memory (estimated size 3.8 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_7_piece0 in memory on xxx:38417 (size: 3.8 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Created broadcast 7 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 12 (MapPartitionsRDD[31] at countByKey at BaseSparkCommitActionExecutor.java:158) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Adding task set 12.0 with 2 tasks
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 12.0 (TID 12, xxx, executor 1, partition 0, PROCESS_LOCAL, 7730 bytes)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_7_piece0 in memory on xxx:35696 (size: 3.8 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 3 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 4 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added rdd_29_0 in memory on xxx:35696 (size: 0.0 B, free: 912.1 MB)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 12.0 (TID 13, xxx, executor 1, partition 1, PROCESS_LOCAL, 7730 bytes)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 12.0 (TID 12) in 105 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added rdd_29_1 in memory on xxx:35696 (size: 0.0 B, free: 912.1 MB)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 12.0 (TID 13) in 24 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Removed TaskSet 12.0, whose tasks have all completed, from pool 
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: ShuffleMapStage 12 (countByKey at BaseSparkCommitActionExecutor.java:158) finished in 0.146 s
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: looking for newly runnable stages
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 13)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting ResultStage 13 (ShuffledRDD[32] at countByKey at BaseSparkCommitActionExecutor.java:158), which has no missing parents
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_8 stored as values in memory (estimated size 3.8 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_8_piece0 stored as bytes in memory (estimated size 2.2 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_8_piece0 in memory on xxx:38417 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Created broadcast 8 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 13 (ShuffledRDD[32] at countByKey at BaseSparkCommitActionExecutor.java:158) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Adding task set 13.0 with 2 tasks
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 13.0 (TID 14, xxx, executor 1, partition 0, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_8_piece0 in memory on xxx:35696 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 5 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 13.0 (TID 15, xxx, executor 1, partition 1, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 13.0 (TID 14) in 31 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 13.0 (TID 15) in 12 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Removed TaskSet 13.0, whose tasks have all completed, from pool 
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: ResultStage 13 (countByKey at BaseSparkCommitActionExecutor.java:158) finished in 0.064 s
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Job 4 finished: countByKey at BaseSparkCommitActionExecutor.java:158, took 0.320123 s
   21/05/26 18:33:31 INFO commit.BaseSparkCommitActionExecutor: Workload profile :WorkloadProfile {globalStat=WorkloadStat {numInserts=0, numUpdates=0}, partitionStat={}, operationType=UPSERT}
   21/05/26 18:33:31 INFO timeline.HoodieActiveTimeline: Checking for file exists ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit.requested
   21/05/26 18:33:31 INFO timeline.HoodieActiveTimeline: Create new file for toInstant ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit.inflight
   21/05/26 18:33:31 INFO commit.UpsertPartitioner: AvgRecordSize => 1024
   21/05/26 18:33:31 INFO view.AbstractTableFileSystemView: Took 3 ms to read  0 instants, 0 replaced file groups
   21/05/26 18:33:31 INFO util.ClusteringUtils: Found 0 files in pending clustering operations
   21/05/26 18:33:31 INFO commit.UpsertPartitioner: Total Buckets :0, buckets info => {}, 
   Partition to insert buckets => {}, 
   UpdateLocations mapped to buckets =>{}
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 175
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 62
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 9
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 148
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 105
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 143
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 2
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 55
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 209
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 154
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 147
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 163
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 69
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 34
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 100
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned shuffle 5
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 1
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 193
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 169
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 27
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 16
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 115
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 120
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 106
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 174
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 210
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 96
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 6
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 57
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 133
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 11
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 74
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 107
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 164
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 172
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 176
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 194
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 109
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 37
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 177
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 128
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 182
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 205
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 30
   21/05/26 18:33:31 INFO commit.BaseCommitActionExecutor: Auto commit disabled for 20210526183328
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 102
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 180
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 150
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 186
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 89
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 223
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 47
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 158
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 162
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 88
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 39
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 8
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 29
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 124
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 75
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 165
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 217
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 134
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_5_piece0 on xxx:35696 in memory (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_5_piece0 on xxx:38417 in memory (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 35
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 216
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 22
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 114
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 152
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 42
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 94
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 145
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 126
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 144
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 168
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_3_piece0 on xxx:38417 in memory (size: 100.9 KB, free: 912.2 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_3_piece0 on xxx:35696 in memory (size: 100.9 KB, free: 912.2 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 149
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 38
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 70
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 15
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 118
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 166
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 207
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 170
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 171
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 65
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 5
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 97
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 110
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 222
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 87
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_6_piece0 on xxx:38417 in memory (size: 3.3 KB, free: 912.2 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_6_piece0 on xxx:35696 in memory (size: 3.3 KB, free: 912.2 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 192
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 201
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 117
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 123
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 12
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 60
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 84
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 127
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 91
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 136
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 45
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 200
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 64
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_2_piece0 on xxx:38417 in memory (size: 101.0 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_2_piece0 on xxx:35696 in memory (size: 101.0 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 92
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 0
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 81
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 185
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 214
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 21
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 31
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 67
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 112
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 178
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 208
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 78
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 73
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 131
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_8_piece0 on xxx:38417 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_8_piece0 on xxx:35696 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 61
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 3
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_7_piece0 on xxx:38417 in memory (size: 3.8 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_7_piece0 on xxx:35696 in memory (size: 3.8 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Starting job: sum at DeltaSync.java:448
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Job 5 finished: sum at DeltaSync.java:448, took 0.000044 s
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 36
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 80
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 103
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 108
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 183
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 72
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 54
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 132
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 99
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 19
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 93
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 179
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 215
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 66
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 77
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 151
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 116
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 191
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 17
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 14
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 18
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 125
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 204
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 146
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 50
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 56
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 52
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 101
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 221
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 213
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 181
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 190
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 85
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned shuffle 2
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 156
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 161
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 53
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 197
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 20
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 41
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 44
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 140
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 218
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 188
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 122
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 195
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 167
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 220
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 43
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 199
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 155
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 24
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 219
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 71
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 198
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 23
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 135
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 26
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 141
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 121
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 157
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 13
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 130
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned shuffle 0
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 7
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 138
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 63
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 187
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 32
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 196
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 48
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 206
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 119
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 160
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 90
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 40
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 113
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_0_piece0 on xxx:38417 in memory (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_0_piece0 on xxx:35696 in memory (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 68
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 224
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 28
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 202
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 10
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 139
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 76
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 49
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 137
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 58
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 on xxx:38417 in memory (size: 3.9 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 on xxx:35696 in memory (size: 3.9 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 4
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 211
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 212
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 83
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 203
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 33
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 86
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 82
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_1_piece0 on xxx:38417 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_1_piece0 on xxx:35696 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 95
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 142
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 111
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 98
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 184
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 46
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 129
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 104
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 159
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 59
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 25
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 173
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 79
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 153
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 189
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 51
   21/05/26 18:33:32 INFO spark.SparkContext: Starting job: sum at DeltaSync.java:449
   21/05/26 18:33:32 INFO scheduler.DAGScheduler: Job 6 finished: sum at DeltaSync.java:449, took 0.000035 s
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO spark.SparkContext: Starting job: collect at SparkRDDWriteClient.java:120
   21/05/26 18:33:32 INFO scheduler.DAGScheduler: Job 7 finished: collect at SparkRDDWriteClient.java:120, took 0.000039 s
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants [[==>20210526183328__deltacommit__INFLIGHT]]
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:32 INFO util.CommitUtils: Creating  metadata for UPSERT numWriteStats:0numReplaceFileIds:0
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants [[==>20210526183328__deltacommit__INFLIGHT]]
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:32 INFO client.AbstractHoodieWriteClient: Committing 20210526183328 action deltacommit
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Marking instant complete [==>20210526183328__deltacommit__INFLIGHT]
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Checking for file exists ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit.inflight
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Create new file for toInstant ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Completed [==>20210526183328__deltacommit__INFLIGHT]
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants [[==>20210526183328__deltacommit__REQUESTED], [==>20210526183328__deltacommit__INFLIGHT], [20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:32 INFO table.HoodieTimelineArchiveLog: No Instants to archive
   21/05/26 18:33:32 INFO client.AbstractHoodieWriteClient: Auto cleaning is enabled. Running cleaner now
   21/05/26 18:33:32 INFO client.AbstractHoodieWriteClient: Scheduling cleaning at instant time :20210526183332
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants [[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote view for basePath /user/hd_xyz/yyy/ml_xxx/foo. Server=xxx:37089, Timeout=300
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating InMemory based view for basePath /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO view.AbstractTableFileSystemView: Took 0 ms to read  0 instants, 0 replaced file groups
   21/05/26 18:33:32 INFO util.ClusteringUtils: Found 0 files in pending clustering operations
   21/05/26 18:33:32 INFO view.RemoteHoodieTableFileSystemView: Sending request : (http://xxx:37089/v1/hoodie/view/compactions/pending/?basepath=%2Fuser%2Fhdfs%2Fxyz%2Fpublic%2Fml_xxx%2Ffoo&lastinstantts=20210526183328&timelinehash=3cb19d4eacc8a39b3d4198ed17d5dac7ca1a076cc50020fab31fed29c6ccddb1)
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:33 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO timeline.HoodieActiveTimeline: Loaded instants [[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:33 INFO collection.RocksDBDAO: DELETING RocksDB persisted at /tmp/hoodie_timeline_rocksdb/_user_hdfs_xyz_public_ml_xxx_foo/a138e066-6b6b-4f72-8865-4c30301cbe11
   21/05/26 18:33:33 INFO collection.RocksDBDAO: No column family found. Loading default
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl_open.cc:230] Creating manifest 1 
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/version_set.cc:3406] Recovering from manifest file: MANIFEST-000001
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [default]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/version_set.cc:3610] Recovered from manifest file:/tmp/hoodie_timeline_rocksdb/_user_hdfs_xyz_public_ml_xxx_foo/a138e066-6b6b-4f72-8865-4c30301cbe11/MANIFEST-000001 succeeded,manifest_file_number is 1, next_file_number is 3, last_sequence is 0, log_number is 0,prev_log_number is 0,max_column_family is 0,min_log_number_to_keep is 0
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/version_set.cc:3618] Column family [default] (ID 0), log number is 0
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl_open.cc:1287] DB pointer 0x7f3aaccf1f20
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/version_set.cc:2936] Creating manifest 6
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_view__user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_view__user_hdfs_xyz_public_ml_xxx_foo] (ID 1)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_pending_compaction__user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_pending_compaction__user_hdfs_xyz_public_ml_xxx_foo] (ID 2)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_bootstrap_basefile__user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_bootstrap_basefile__user_hdfs_xyz_public_ml_xxx_foo] (ID 3)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_partitions__user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_partitions__user_hdfs_xyz_public_ml_xxx_foo] (ID 4)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_replaced_fg_user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_replaced_fg_user_hdfs_xyz_public_ml_xxx_foo] (ID 5)
   21/05/26 18:33:33 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.246.4.117:53684) with ID 2
   21/05/26 18:33:33 INFO spark.ExecutorAllocationManager: New executor 2 has registered (new total is 2)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_pending_clustering_fg_user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_pending_clustering_fg_user_hdfs_xyz_public_ml_xxx_foo] (ID 6)
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting replacedFileGroups to ROCKSDB based file-system view at /tmp/hoodie_timeline_rocksdb, Total file-groups=0
   21/05/26 18:33:33 INFO collection.RocksDBDAO: Prefix DELETE (query=part=) on hudi_replaced_fg_user_hdfs_xyz_public_ml_xxx_foo
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting replacedFileGroups to ROCKSDB based file-system view complete
   21/05/26 18:33:33 INFO view.AbstractTableFileSystemView: Took 9 ms to read  0 instants, 0 replaced file groups
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Initializing pending compaction operations. Count=0
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Initializing external data file mapping. Count=0
   21/05/26 18:33:33 INFO util.ClusteringUtils: Found 0 files in pending clustering operations
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting file groups in pending clustering to ROCKSDB based file-system view at /tmp/hoodie_timeline_rocksdb, Total file-groups=0
   21/05/26 18:33:33 INFO collection.RocksDBDAO: Prefix DELETE (query=part=) on hudi_pending_clustering_fg_user_hdfs_xyz_public_ml_xxx_foo
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting replacedFileGroups to ROCKSDB based file-system view complete
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Created ROCKSDB based file-system view at /tmp/hoodie_timeline_rocksdb
   21/05/26 18:33:33 INFO collection.RocksDBDAO: Prefix Search for (query=) on hudi_pending_compaction__user_hdfs_xyz_public_ml_xxx_foo. Total Time Taken (msec)=1. Serialization Time taken(micro)=0, num entries=0
   21/05/26 18:33:33 INFO service.RequestHandler: TimeTakenMillis[Total=791, Refresh=779, handle=11, Check=1], Success=true, Query=basepath=%2Fuser%2Fhdfs%2Fxyz%2Fpublic%2Fml_xxx%2Ffoo&lastinstantts=20210526183328&timelinehash=3cb19d4eacc8a39b3d4198ed17d5dac7ca1a076cc50020fab31fed29c6ccddb1, Host=xxx:37089, synced=false
   21/05/26 18:33:33 INFO storage.BlockManagerMasterEndpoint: Registering block manager xxx:36920 with 912.3 MB RAM, BlockManagerId(2, xxx, 36920, None)
   21/05/26 18:33:33 INFO clean.CleanPlanner: No earliest commit to retain. No need to scan partitions !!
   21/05/26 18:33:33 INFO clean.CleanPlanner: Nothing to clean here. It is already clean
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Cleaner started
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Cleaned failed attempts if any
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:33 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO timeline.HoodieActiveTimeline: Loaded instants [[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:33 INFO client.SparkRDDWriteClient: Successfully synced to metadata table
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Committed 20210526183328
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Scheduling table service COMPACT
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Scheduling compaction at instant time :20210526183333
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:33 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO timeline.HoodieActiveTimeline: Loaded instants [[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:33 INFO compact.SparkScheduleCompactionActionExecutor: Checking if compaction needs to be run on /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO deltastreamer.DeltaSync: Commit 20210526183328 successful!
   21/05/26 18:33:33 INFO rdd.MapPartitionsRDD: Removing RDD 29 from persistence list
   21/05/26 18:33:33 INFO storage.BlockManager: Removing RDD 29
   21/05/26 18:33:34 INFO rdd.MapPartitionsRDD: Removing RDD 37 from persistence list
   21/05/26 18:33:34 INFO storage.BlockManager: Removing RDD 37
   21/05/26 18:33:34 INFO deltastreamer.DeltaSync: Shutting down embedded timeline server
   21/05/26 18:33:34 INFO embedded.EmbeddedTimelineService: Closing Timeline server
   21/05/26 18:33:34 INFO service.TimelineService: Closing Timeline Service
   21/05/26 18:33:34 INFO javalin.Javalin: Stopping Javalin ...
   21/05/26 18:33:34 INFO javalin.Javalin: Javalin has stopped
   21/05/26 18:33:34 INFO view.RocksDbBasedFileSystemView: Closing Rocksdb !!
   21/05/26 18:33:34 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:365] Shutdown: canceling all background work
   21/05/26 18:33:34 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:521] Shutdown complete
   21/05/26 18:33:34 INFO view.RocksDbBasedFileSystemView: Closed Rocksdb !!
   21/05/26 18:33:34 INFO service.TimelineService: Closed Timeline Service
   21/05/26 18:33:34 INFO embedded.EmbeddedTimelineService: Closed Timeline server
   21/05/26 18:33:34 INFO deltastreamer.HoodieDeltaStreamer: Shut down delta streamer
   21/05/26 18:33:34 INFO server.AbstractConnector: Stopped Spark@7a0e94b4{HTTP/1.1,[http/1.1]}{0.0.0.0:0}
   21/05/26 18:33:34 INFO ui.SparkUI: Stopped Spark web UI at http://xxx:32822
   21/05/26 18:33:34 INFO yarn.YarnAllocator: Driver requested a total number of 0 executor(s).
   21/05/26 18:33:34 INFO cluster.YarnClusterSchedulerBackend: Shutting down all executors
   21/05/26 18:33:34 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
   21/05/26 18:33:34 INFO cluster.SchedulerExtensionServices: Stopping SchedulerExtensionServices
   (serviceOption=None,
    services=List(),
    started=false)
   21/05/26 18:33:34 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
   21/05/26 18:33:34 INFO memory.MemoryStore: MemoryStore cleared
   21/05/26 18:33:34 INFO storage.BlockManager: BlockManager stopped
   21/05/26 18:33:34 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
   21/05/26 18:33:34 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
   21/05/26 18:33:34 INFO spark.SparkContext: Successfully stopped SparkContext
   21/05/26 18:33:34 INFO yarn.ApplicationMaster: Final app status: SUCCEEDED, exitCode: 0
   21/05/26 18:33:34 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with SUCCEEDED
   21/05/26 18:33:34 INFO impl.AMRMClientImpl: Waiting for application to be successfully unregistered.
   21/05/26 18:33:34 INFO yarn.ApplicationMaster: Deleting staging directory hdfs://xxx:8020/user/hd_xyz/.sparkStaging/application_1618828995116_0162
   21/05/26 18:33:34 INFO util.ShutdownHookManager: Shutdown hook called
   21/05/26 18:33:34 INFO util.ShutdownHookManager: Deleting directory /data/hadoop/yarn/local/usercache/hdfs/appcache/application_1618828995116_0162/spark-4c7e81b9-e526-4325-abf0-d163828b92b5
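   
   The key line in this log is the workload profile: numInserts=0, numUpdates=0. The deltacommit completes "successfully" but records zero writes, which suggests DeltaStreamer fetched nothing from Kafka before committing an empty batch. A stale checkpoint is one possible cause: auto.offset.reset=earliest only applies when no checkpoint exists yet, and DeltaStreamer otherwise resumes from the offsets stored in the previous commit's metadata. A minimal sketch of forcing a re-read from the beginning of the topic via the --checkpoint option (assuming a single-partition topic named xxx; the Kafka source's checkpoint format is topic,partition:offset):
   
   # all other spark-submit / --hoodie-conf arguments unchanged, with one flag appended:
   --checkpoint "xxx,0:0"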
   





[GitHub] [hudi] n3nash commented on issue #2959: No data stored after migrating to Hudi 0.8.0

Posted by GitBox <gi...@apache.org>.
n3nash commented on issue #2959:
URL: https://github.com/apache/hudi/issues/2959#issuecomment-847396093


   @PavelPetukhov 
   
   I'm guessing there's more in the logs to tell us what is happening even though the status is `SUCCEEDED`. 
   
   1. Can you paste the contents of the `.hoodie` directory, please?
   2. Can you look for any ERRORs/Exceptions in the logs?
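   
   A quick way to collect both, assuming the YARN application id seen in the log (the target base path below is a placeholder):
   
   # dump the aggregated container logs and scan them for problems
   yarn logs -applicationId application_1618828995116_0162 | grep -iE 'error|exception'
   # list the timeline files under the table's .hoodie directory
   hdfs dfs -ls <target-base-path>/.hoodie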





[GitHub] [hudi] PavelPetukhov edited a comment on issue #2959: No data stored after migrating to Hudi 0.8.0

Posted by GitBox <gi...@apache.org>.
PavelPetukhov edited a comment on issue #2959:
URL: https://github.com/apache/hudi/issues/2959#issuecomment-848885930


   The `.hoodie` directory structure is the following:
   hdfs dfs -ls /user/hdfs/raw_data/public/ml_training_data/foo/.hoodie
   Found 7 items
   drwxr-xr-x   - hdfs hadoop          0 2021-05-26 18:33 /user/hdfs/raw_data/public/ml_training_data/foo/.hoodie/.aux
   drwxr-xr-x   - hdfs hadoop          0 2021-05-26 18:33 /user/hdfs/raw_data/public/ml_training_data/foo/.hoodie/.temp
   -rw-r--r--   3 hdfs hadoop       1201 2021-05-26 18:33 /user/hdfs/raw_data/public/ml_training_data/foo/.hoodie/20210526183328.deltacommit
   -rw-r--r--   3 hdfs hadoop        518 2021-05-26 18:33 /user/hdfs/raw_data/public/ml_training_data/foo/.hoodie/20210526183328.deltacommit.inflight
   -rw-r--r--   3 hdfs hadoop          0 2021-05-26 18:33 /user/hdfs/raw_data/public/ml_training_data/foo/.hoodie/20210526183328.deltacommit.requested
   drwxr-xr-x   - hdfs hadoop          0 2021-05-26 18:33 /user/hdfs/raw_data/public/ml_training_data/foo/.hoodie/archived
   -rw-r--r--   3 hdfs hadoop        391 2021-05-26 18:33 /user/hdfs/raw_data/public/ml_training_data/foo/.hoodie/hoodie.properties
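   
   The completed deltacommit file is plain JSON (a serialized HoodieCommitMetadata), so it can be inspected directly: an empty partitionToWriteStats map matches the numWriteStats:0 line in the log, and extraMetadata should carry the committed Kafka offsets under deltastreamer.checkpoint.key. For example:
   
   hdfs dfs -cat /user/hdfs/raw_data/public/ml_training_data/foo/.hoodie/20210526183328.deltacommit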
   
   
   Also, I have removed everything unrelated, so the spark-submit command now looks like this:
   
   /usr/local/spark/bin/spark-submit --conf "spark.yarn.submit.waitAppCompletion=false" \
   --conf "spark.dynamicAllocation.minExecutors=1" \
   --conf "spark.dynamicAllocation.maxExecutors=10" \
   --conf "spark.dynamicAllocation.enabled=true" \
   --conf "spark.dynamicAllocation.shuffleTracking.enabled=true" \
   --conf "spark.shuffle.service.enabled=true" \
   --conf "spark.eventLog.enabled=true" \
   --conf "spark.eventLog.dir=hdfs://xxx/eventLogging" \
   --conf "spark.executor.memoryOverhead=384" \
   --conf "spark.driver.memoryOverhead=384" \
   --conf "spark.driver.extraJavaOptions=-DsparkAappName=xxx -DlogIndex=GOLANG_JSON -DappName=data-lake-extractors-streamer -DlogFacility=stdout" \
   --packages org.apache.spark:spark-avro_2.12:2.4.7 \
   --master yarn \
   --deploy-mode cluster \
   --name xxx \
   --driver-memory 2G \
   --executor-memory 2G \
   --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
   hdfs://xxx/user/hudi/hudi-utilities-bundle_2.12-0.8.0.jar \
   --op UPSERT \
   --table-type MERGE_ON_READ \
   --source-class org.apache.hudi.utilities.sources.AvroKafkaSource \
   --source-ordering-field __null_ts_ms \
   --schemaprovider-class org.apache.hudi.utilities.schema.SchemaRegistryProvider \
   --target-base-path /user/hdfs/raw_data/public/xxx/yyy \
   --target-table xxx \
   --hoodie-conf "hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.CustomKeyGenerator" \
   --hoodie-conf "hoodie.deltastreamer.keygen.timebased.timestamp.type=DATE_STRING" \
   --hoodie-conf "hoodie.deltastreamer.keygen.timebased.output.dateformat=yyyy/MM/dd" \
   --hoodie-conf "hoodie.deltastreamer.keygen.timebased.input.dateformat=yyyy-MM-ddTHH:mm:ssZ,yyyy-MM-ddTHH:mm:ss.SSSZ" \
   --hoodie-conf "hoodie.deltastreamer.keygen.timebased.input.dateformat.list.delimiter.regex=" \
   --hoodie-conf "hoodie.deltastreamer.keygen.timebased.input.timezone=" \
   --hoodie-conf "hoodie.upsert.shuffle.parallelism=2" \
   --hoodie-conf "hoodie.insert.shuffle.parallelism=2" \
   --hoodie-conf "hoodie.delete.shuffle.parallelism=2" \
   --hoodie-conf "hoodie.bulkinsert.shuffle.parallelism=2" \
   --hoodie-conf "hoodie.embed.timeline.server=true" \
   --hoodie-conf "hoodie.filesystem.view.type=EMBEDDED_KV_STORE" \
   --hoodie-conf "hoodie.deltastreamer.schemaprovider.registry.url=http://xxx/subjects/xxx-value/versions/latest" \
   --hoodie-conf "bootstrap.servers=xxx" \
   --hoodie-conf "auto.offset.reset=earliest" \
   --hoodie-conf "group.id=hudi_group" \
   --hoodie-conf "schema.registry.url=http://xxx" \
   --hoodie-conf "hoodie.datasource.write.recordkey.field=id" \
   --hoodie-conf "hoodie.datasource.write.partitionpath.field=date:TIMESTAMP" \
   --hoodie-conf "hoodie.deltastreamer.source.kafka.topic=xxx" \
   





[GitHub] [hudi] PavelPetukhov edited a comment on issue #2959: No data stored after migrating to Hudi 0.8.0

Posted by GitBox <gi...@apache.org>.
PavelPetukhov edited a comment on issue #2959:
URL: https://github.com/apache/hudi/issues/2959#issuecomment-848930327


   Below is our full log:
   
   
   
   
   Log Type: stderr
   Log Upload Time: Wed May 26 18:33:34 +0300 2021
   Log Length: 104910
   21/05/26 18:33:18 INFO util.SignalUtils: Registered signal handler for TERM
   21/05/26 18:33:18 INFO util.SignalUtils: Registered signal handler for HUP
   21/05/26 18:33:18 INFO util.SignalUtils: Registered signal handler for INT
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing view acls to: yarn,hdfs
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing modify acls to: yarn,hdfs
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing view acls groups to: 
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing modify acls groups to: 
   21/05/26 18:33:18 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(yarn, hdfs); groups with view permissions: Set(); users  with modify permissions: Set(yarn, hdfs); groups with modify permissions: Set()
   21/05/26 18:33:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
   21/05/26 18:33:18 INFO yarn.ApplicationMaster: Preparing Local resources
   21/05/26 18:33:19 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
   21/05/26 18:33:19 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1618828995116_0162_000001
   21/05/26 18:33:19 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
   21/05/26 18:33:19 INFO yarn.ApplicationMaster: Waiting for spark context initialization...
   21/05/26 18:33:19 WARN deltastreamer.SchedulerConfGenerator: Job Scheduling Configs will not be in effect as spark.scheduler.mode is not set to FAIR at instantiation time. Continuing without scheduling configs
   21/05/26 18:33:19 INFO spark.SparkContext: Running Spark version 2.4.7
   21/05/26 18:33:19 INFO spark.SparkContext: Submitted application: xxx
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing view acls to: yarn,hdfs
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing modify acls to: yarn,hdfs
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing view acls groups to: 
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing modify acls groups to: 
   21/05/26 18:33:19 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(yarn, hdfs); groups with view permissions: Set(); users  with modify permissions: Set(yarn, hdfs); groups with modify permissions: Set()
   21/05/26 18:33:20 INFO util.Utils: Successfully started service 'sparkDriver' on port 37691.
   21/05/26 18:33:20 INFO spark.SparkEnv: Registering MapOutputTracker
   21/05/26 18:33:20 INFO spark.SparkEnv: Registering BlockManagerMaster
   21/05/26 18:33:20 INFO storage.BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
   21/05/26 18:33:20 INFO storage.BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
   21/05/26 18:33:20 INFO storage.DiskBlockManager: Created local directory at /data/hadoop/yarn/local/usercache/hdfs/appcache/application_1618828995116_0162/blockmgr-9de167db-4756-414e-9126-32cb562e91aa
   21/05/26 18:33:20 INFO memory.MemoryStore: MemoryStore started with capacity 912.3 MB
   21/05/26 18:33:20 INFO spark.SparkEnv: Registering OutputCommitCoordinator
   21/05/26 18:33:20 INFO util.log: Logging initialized @2935ms
   21/05/26 18:33:20 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /jobs, /jobs/json, /jobs/job, /jobs/job/json, /stages, /stages/json, /stages/stage, /stages/stage/json, /stages/pool, /stages/pool/json, /storage, /storage/json, /storage/rdd, /storage/rdd/json, /environment, /environment/json, /executors, /executors/json, /executors/threadDump, /executors/threadDump/json, /static, /, /api, /jobs/job/kill, /stages/stage/kill.
   21/05/26 18:33:20 INFO server.Server: jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
   21/05/26 18:33:20 INFO server.Server: Started @3069ms
   21/05/26 18:33:20 INFO server.AbstractConnector: Started ServerConnector@7a0e94b4{HTTP/1.1,[http/1.1]}{0.0.0.0:32822}
   21/05/26 18:33:20 INFO util.Utils: Successfully started service 'SparkUI' on port 32822.
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@43837fbc{/jobs,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@d91ba30{/jobs/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4854d5d9{/jobs/job,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@672e7ec3{/jobs/job/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@67ee182c{/stages,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@97af315{/stages/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1936a0e0{/stages/stage,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@447ef19e{/stages/stage/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@68e36851{/stages/pool,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@352fe12b{/stages/pool/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3d39f28d{/storage,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@e7806b5{/storage/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7d2a56cb{/storage/rdd,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@37c6c6fc{/storage/rdd/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4599e713{/environment,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@b9a0cbb{/environment/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@24299f0d{/executors,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@25594c52{/executors/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2f728695{/executors/threadDump,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7456a814{/executors/threadDump/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1cef9064{/static,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@16ba2eda{/,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@dac88e2{/api,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@145850ef{/jobs/job/kill,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6d678cf2{/stages/stage/kill,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://xxx:32822
   21/05/26 18:33:20 INFO cluster.YarnClusterScheduler: Created YarnClusterScheduler
   21/05/26 18:33:20 INFO cluster.SchedulerExtensionServices: Starting Yarn extension services with app application_1618828995116_0162 and attemptId Some(appattempt_1618828995116_0162_000001)
   21/05/26 18:33:20 WARN util.Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
   21/05/26 18:33:20 INFO util.Utils: Using initial executors = 1, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
   21/05/26 18:33:20 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 38417.
   21/05/26 18:33:20 INFO netty.NettyBlockTransferService: Server created on xxx:38417
   21/05/26 18:33:20 INFO storage.BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
   21/05/26 18:33:20 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO storage.BlockManagerMasterEndpoint: Registering block manager xxx:38417 with 912.3 MB RAM, BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO storage.BlockManager: external shuffle service port = 7337
   21/05/26 18:33:20 INFO storage.BlockManager: Initialized BlockManager: BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /metrics/json.
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1b3c78ce{/metrics/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:21 INFO scheduler.EventLoggingListener: Logging events to hdfs://xxx:8020/eventLogging/application_1618828995116_0162_1
   21/05/26 18:33:21 WARN util.Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
   21/05/26 18:33:21 INFO util.Utils: Using initial executors = 1, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
   21/05/26 18:33:21 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
   21/05/26 18:33:21 INFO client.RMProxy: Connecting to ResourceManager at xxx/10.246.4.117:8030
   21/05/26 18:33:21 INFO yarn.YarnRMClient: Registering the ApplicationMaster
   21/05/26 18:33:21 INFO yarn.ApplicationMaster: 
   ===============================================================================
   YARN executor launch context:
     env:
       CLASSPATH -> {{PWD}}<CPS>{{PWD}}/__spark_conf__<CPS>{{PWD}}/__spark_libs__/*<CPS>/usr/hdp/2.6.0.3-8/hadoop/conf<CPS>/usr/hdp/2.6.0.3-8/hadoop/*<CPS>/usr/hdp/2.6.0.3-8/hadoop/lib/*<CPS>/usr/hdp/current/hadoop-hdfs-client/*<CPS>/usr/hdp/current/hadoop-hdfs-client/lib/*<CPS>/usr/hdp/current/hadoop-yarn-client/*<CPS>/usr/hdp/current/hadoop-yarn-client/lib/*<CPS>/usr/hdp/current/ext/hadoop/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*<CPS>{{PWD}}/__spark_conf__/__hadoop_conf__
       SPARK_YARN_STAGING_DIR -> hdfs://xxx:8020/user/hd_xyz/.sparkStaging/application_1618828995116_0162
       SPARK_USER -> hdfs
   
     command:
       {{JAVA_HOME}}/bin/java \ 
         -server \ 
         -Xmx2048m \ 
         -Djava.io.tmpdir={{PWD}}/tmp \ 
         '-Dspark.driver.port=37691' \ 
         '-Dspark.ui.port=0' \ 
         -Dspark.yarn.app.container.log.dir=<LOG_DIR> \ 
         -XX:OnOutOfMemoryError='kill %p' \ 
         org.apache.spark.executor.CoarseGrainedExecutorBackend \ 
         --driver-url \ 
         spark://CoarseGrainedScheduler@xxx:37691 \ 
         --executor-id \ 
         <executorId> \ 
         --hostname \ 
         <hostname> \ 
         --cores \ 
         1 \ 
         --app-id \ 
         application_1618828995116_0162 \ 
         --user-class-path \ 
         file:$PWD/__app__.jar \ 
         --user-class-path \ 
         file:$PWD/org.apache.spark_spark-avro_2.12-2.4.7.jar \ 
         --user-class-path \ 
         file:$PWD/org.spark-project.spark_unused-1.0.0.jar \ 
         1><LOG_DIR>/stdout \ 
         2><LOG_DIR>/stderr
   
     resources:
       org.apache.spark_spark-avro_2.12-2.4.7.jar -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/hd_xyz/.sparkStaging/application_1618828995116_0162/org.apache.spark_spark-avro_2.12-2.4.7.jar" } size: 107269 timestamp: 1622043191967 type: FILE visibility: PRIVATE
       __app__.jar -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/jars/hudi/hudi-utilities-bundle_2.12-0.8.0.jar" } size: 40399204 timestamp: 1622022896130 type: FILE visibility: PUBLIC
       __spark_conf__ -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/hd_xyz/.sparkStaging/application_1618828995116_0162/__spark_conf__.zip" } size: 205423 timestamp: 1622043193955 type: ARCHIVE visibility: PRIVATE
       org.spark-project.spark_unused-1.0.0.jar -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/hd_xyz/.sparkStaging/application_1618828995116_0162/org.spark-project.spark_unused-1.0.0.jar" } size: 2777 timestamp: 1622043192905 type: FILE visibility: PRIVATE
       __spark_libs__ -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/hd_xyz/.sparkStaging/application_1618828995116_0162/__spark_libs__2858796966972713370.zip" } size: 242613518 timestamp: 1622043190403 type: ARCHIVE visibility: PRIVATE
   
   ===============================================================================
   21/05/26 18:33:21 WARN util.Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
   21/05/26 18:33:21 INFO util.Utils: Using initial executors = 1, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
   21/05/26 18:33:21 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(spark://YarnAM@xxx:37691)
   21/05/26 18:33:21 INFO yarn.YarnAllocator: Will request 1 executor container(s), each with 1 core(s) and 2432 MB memory (including 384 MB of overhead)
   21/05/26 18:33:21 INFO yarn.YarnAllocator: Submitted 1 unlocalized container requests.
   21/05/26 18:33:21 INFO yarn.ApplicationMaster: Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
   21/05/26 18:33:22 INFO impl.AMRMClientImpl: Received new token for : xxx:45454
   21/05/26 18:33:22 INFO yarn.YarnAllocator: Launching container container_e03_1618828995116_0162_01_000002 on host xxx for executor with ID 1
   21/05/26 18:33:22 INFO yarn.YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them.
   21/05/26 18:33:22 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
   21/05/26 18:33:22 INFO impl.ContainerManagementProtocolProxy: Opening proxy : xxx:45454
   21/05/26 18:33:25 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.246.3.9:49980) with ID 1
   21/05/26 18:33:25 INFO spark.ExecutorAllocationManager: New executor 1 has registered (new total is 1)
   21/05/26 18:33:25 INFO cluster.YarnClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
   21/05/26 18:33:25 INFO cluster.YarnClusterScheduler: YarnClusterScheduler.postStartHook done
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO utilities.UtilHelpers: Adding overridden properties to file properties.
   21/05/26 18:33:25 WARN spark.SparkContext: Using an existing SparkContext; some configuration may not take effect.
   21/05/26 18:33:25 INFO storage.BlockManagerMasterEndpoint: Registering block manager xxx:35696 with 912.3 MB RAM, BlockManagerId(1, xxx, 35696, None)
   21/05/26 18:33:25 INFO deltastreamer.HoodieDeltaStreamer: Creating delta streamer with configs : {hoodie.deltastreamer.keygen.timebased.input.timezone=, hoodie.embed.timeline.server=true, schema.registry.url=http://xxx, hoodie.filesystem.view.type=EMBEDDED_KV_STORE, hoodie.deltastreamer.keygen.timebased.input.dateformat=yyyy-MM-ddTHH:mm:ssZ,yyyy-MM-ddTHH:mm:ss.SSSZ, hoodie.delete.shuffle.parallelism=2, hoodie.bulkinsert.shuffle.parallelism=2, hoodie.deltastreamer.keygen.timebased.output.dateformat=yyyy/MM/dd, group.id=hudi_group_080, auto.offset.reset=earliest, hoodie.insert.shuffle.parallelism=2, hoodie.deltastreamer.keygen.timebased.timestamp.type=DATE_STRING, hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.CustomKeyGenerator, hoodie.deltastreamer.source.kafka.topic=xxx, bootstrap.servers=xxx:9092, hoodie.deltastreamer.keygen.timebased.input.dateformat.list.delimiter.regex=, hoodie.deltastreamer.schemaprovider.registry.url=http://xxx/subjects/xxx-value/versions/latest, hoodie.datasource.write.recordkey.field=id, hoodie.upsert.shuffle.parallelism=2, hoodie.datasource.write.partitionpath.field=date:TIMESTAMP}
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Initializing /user/hd_xyz/yyy/ml_xxx/foo as hoodie table /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Finished initializing Table of type MERGE_ON_READ from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO deltastreamer.DeltaSync: Registering Schema :[{"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}, {"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}]
   21/05/26 18:33:25 INFO deltastreamer.HoodieDeltaStreamer: Delta Streamer running only single round
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:26 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:26 INFO deltastreamer.DeltaSync: Checkpoint to resume from : Optional.empty
   21/05/26 18:33:26 INFO consumer.ConsumerConfig: ConsumerConfig values: 
   	allow.auto.create.topics = true
   	auto.commit.interval.ms = 5000
   	auto.offset.reset = earliest
   	bootstrap.servers = [xxx]
   	check.crcs = true
   	client.dns.lookup = default
   	client.id = 
   	client.rack = 
   	connections.max.idle.ms = 540000
   	default.api.timeout.ms = 60000
   	enable.auto.commit = true
   	exclude.internal.topics = true
   	fetch.max.bytes = 52428800
   	fetch.max.wait.ms = 500
   	fetch.min.bytes = 1
   	group.id = hudi_group_080
   	group.instance.id = null
   	heartbeat.interval.ms = 3000
   	interceptor.classes = []
   	internal.leave.group.on.close = true
   	isolation.level = read_uncommitted
   	key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
   	max.partition.fetch.bytes = 1048576
   	max.poll.interval.ms = 300000
   	max.poll.records = 500
   	metadata.max.age.ms = 300000
   	metric.reporters = []
   	metrics.num.samples = 2
   	metrics.recording.level = INFO
   	metrics.sample.window.ms = 30000
   	partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
   	receive.buffer.bytes = 65536
   	reconnect.backoff.max.ms = 1000
   	reconnect.backoff.ms = 50
   	request.timeout.ms = 30000
   	retry.backoff.ms = 100
   	sasl.client.callback.handler.class = null
   	sasl.jaas.config = null
   	sasl.kerberos.kinit.cmd = /usr/bin/kinit
   	sasl.kerberos.min.time.before.relogin = 60000
   	sasl.kerberos.service.name = null
   	sasl.kerberos.ticket.renew.jitter = 0.05
   	sasl.kerberos.ticket.renew.window.factor = 0.8
   	sasl.login.callback.handler.class = null
   	sasl.login.class = null
   	sasl.login.refresh.buffer.seconds = 300
   	sasl.login.refresh.min.period.seconds = 60
   	sasl.login.refresh.window.factor = 0.8
   	sasl.login.refresh.window.jitter = 0.05
   	sasl.mechanism = GSSAPI
   	security.protocol = PLAINTEXT
   	security.providers = null
   	send.buffer.bytes = 131072
   	session.timeout.ms = 10000
   	ssl.cipher.suites = null
   	ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
   	ssl.endpoint.identification.algorithm = https
   	ssl.key.password = null
   	ssl.keymanager.algorithm = SunX509
   	ssl.keystore.location = null
   	ssl.keystore.password = null
   	ssl.keystore.type = JKS
   	ssl.protocol = TLS
   	ssl.provider = null
   	ssl.secure.random.implementation = null
   	ssl.trustmanager.algorithm = PKIX
   	ssl.truststore.location = null
   	ssl.truststore.password = null
   	ssl.truststore.type = JKS
   	value.deserializer = class io.confluent.kafka.serializers.KafkaAvroDeserializer
   
   21/05/26 18:33:26 INFO serializers.KafkaAvroDeserializerConfig: KafkaAvroDeserializerConfig values: 
   	schema.registry.url = [xxx]
   	max.schemas.per.subject = 1000
   	specific.avro.reader = false
   
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.timestamp.type' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.output.dateformat' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.input.dateformat.list.delimiter.regex' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.input.dateformat' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.datasource.write.partitionpath.field' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.delete.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.datasource.write.recordkey.field' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.upsert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.datasource.write.keygenerator.class' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.source.kafka.topic' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.schemaprovider.registry.url' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.insert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.embed.timeline.server' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.bulkinsert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.input.timezone' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.filesystem.view.type' was supplied but isn't a known config.
   21/05/26 18:33:26 INFO utils.AppInfoParser: Kafka version: 2.4.1
   21/05/26 18:33:26 INFO utils.AppInfoParser: Kafka commitId: c57222ae8cd7866b
   21/05/26 18:33:26 INFO utils.AppInfoParser: Kafka startTimeMs: 1622043206225
   21/05/26 18:33:26 INFO clients.Metadata: [Consumer clientId=consumer-hudi_group_080-1, groupId=hudi_group_080] Cluster ID: 5XoPi9AYT0mbHVQEj6VEaw
   21/05/26 18:33:27 INFO helpers.KafkaOffsetGen: SourceLimit not configured, set numEvents to default value : 5000000
   21/05/26 18:33:27 INFO sources.AvroKafkaSource: About to read 0 from Kafka for topic :xxx
   21/05/26 18:33:27 INFO deltastreamer.DeltaSync: No new data, perform empty commit.
   21/05/26 18:33:27 INFO deltastreamer.DeltaSync: Setting up new Hoodie Write Client
   21/05/26 18:33:27 INFO deltastreamer.DeltaSync: Registering Schema :[{"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}, {"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}]
   21/05/26 18:33:27 INFO embedded.EmbeddedTimelineService: Starting Timeline service !!
   21/05/26 18:33:27 INFO embedded.EmbeddedTimelineService: Overriding hostIp to (xxx) found in spark-conf. It was null
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating View Manager with storage type :EMBEDDED_KV_STORE
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating embedded rocks-db based Table View
   21/05/26 18:33:27 INFO util.log: Logging initialized @9978ms to org.apache.hudi.org.eclipse.jetty.util.log.Slf4jLog
   21/05/26 18:33:27 INFO javalin.Javalin: 
              __                      __ _
             / /____ _ _   __ ____ _ / /(_)____
        __  / // __ `/| | / // __ `// // // __ \
       / /_/ // /_/ / | |/ // /_/ // // // / / /
       \____/ \__,_/  |___/ \__,_//_//_//_/ /_/
   
           https://javalin.io/documentation
   
   21/05/26 18:33:27 INFO javalin.Javalin: Starting Javalin ...
   21/05/26 18:33:27 INFO javalin.Javalin: Listening on http://localhost:37089/
   21/05/26 18:33:27 INFO javalin.Javalin: Javalin started in 179ms \o/
   21/05/26 18:33:27 INFO service.TimelineService: Starting Timeline server on port :37089
   21/05/26 18:33:27 INFO embedded.EmbeddedTimelineService: Started embedded timeline server at xxx:37089
   21/05/26 18:33:27 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:27 INFO client.AbstractHoodieClient: Timeline Server already running. Not restarting the service
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:27 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:28 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:28 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:28 INFO client.AbstractHoodieWriteClient: Generate a new instant time: 20210526183328 action: deltacommit
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Creating a new instant [==>20210526183328__deltacommit__REQUESTED]
   21/05/26 18:33:28 INFO deltastreamer.DeltaSync: Starting commit  : 20210526183328
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:28 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Loaded instants [[==>20210526183328__deltacommit__REQUESTED]]
   21/05/26 18:33:28 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:28 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:28 INFO client.SparkRDDWriteClient: Successfully synced to metadata table
   21/05/26 18:33:28 INFO client.AsyncCleanerService: Auto cleaning is not enabled. Not running cleaner now
   21/05/26 18:33:28 INFO spark.SparkContext: Starting job: countByKey at SparkHoodieBloomIndex.java:114
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Registering RDD 1 (mapToPair at SparkWriteHelper.java:54) as input to shuffle 1
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Registering RDD 5 (countByKey at SparkHoodieBloomIndex.java:114) as input to shuffle 0
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Got job 0 (countByKey at SparkHoodieBloomIndex.java:114) with 2 output partitions
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Final stage: ResultStage 2 (countByKey at SparkHoodieBloomIndex.java:114)
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 1)
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 1)
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 1 (MapPartitionsRDD[5] at countByKey at SparkHoodieBloomIndex.java:114), which has no missing parents
   21/05/26 18:33:28 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 6.2 KB, free 912.3 MB)
   21/05/26 18:33:28 INFO yarn.YarnAllocator: Driver requested a total number of 2 executor(s).
   21/05/26 18:33:28 INFO yarn.YarnAllocator: Will request 1 executor container(s), each with 1 core(s) and 2432 MB memory (including 384 MB of overhead)
   21/05/26 18:33:28 INFO yarn.YarnAllocator: Submitted 1 unlocalized container requests.
   21/05/26 18:33:28 INFO spark.ExecutorAllocationManager: Requesting 1 new executor because tasks are backlogged (new desired total will be 2)
   21/05/26 18:33:28 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.3 KB, free 912.3 MB)
   21/05/26 18:33:28 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx:38417 (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:28 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 1 (MapPartitionsRDD[5] at countByKey at SparkHoodieBloomIndex.java:114) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:28 INFO cluster.YarnClusterScheduler: Adding task set 1.0 with 2 tasks
   21/05/26 18:33:28 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 1.0 (TID 0, xxx, executor 1, partition 0, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:28 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx:35696 (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:29 INFO impl.AMRMClientImpl: Received new token for : xxx:45454
   21/05/26 18:33:29 INFO yarn.YarnAllocator: Launching container container_e03_1618828995116_0162_01_000004 on host xxx for executor with ID 2
   21/05/26 18:33:29 INFO yarn.YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them.
   21/05/26 18:33:29 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
   21/05/26 18:33:29 INFO impl.ContainerManagementProtocolProxy: Opening proxy : xxx:45454
   21/05/26 18:33:29 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to 10.246.3.9:49980
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added rdd_3_0 in memory on xxx:35696 (size: 0.0 B, free: 912.3 MB)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 1.0 (TID 1, xxx, executor 1, partition 1, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added rdd_3_1 in memory on xxx:35696 (size: 0.0 B, free: 912.3 MB)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 1.0 (TID 0) in 1023 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 1.0 (TID 1) in 70 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: ShuffleMapStage 1 (countByKey at SparkHoodieBloomIndex.java:114) finished in 1.177 s
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: looking for newly runnable stages
   21/05/26 18:33:29 INFO cluster.YarnClusterScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool 
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 2)
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: Submitting ResultStage 2 (ShuffledRDD[6] at countByKey at SparkHoodieBloomIndex.java:114), which has no missing parents
   21/05/26 18:33:29 INFO memory.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.8 KB, free 912.3 MB)
   21/05/26 18:33:29 INFO memory.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.2 KB, free 912.3 MB)
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on xxx:38417 (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:29 INFO spark.SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 2 (ShuffledRDD[6] at countByKey at SparkHoodieBloomIndex.java:114) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:29 INFO cluster.YarnClusterScheduler: Adding task set 2.0 with 2 tasks
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 2.0 (TID 2, xxx, executor 1, partition 0, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on xxx:35696 (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:29 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to 10.246.3.9:49980
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 2.0 (TID 3, xxx, executor 1, partition 1, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 2.0 (TID 2) in 85 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 2.0 (TID 3) in 32 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:29 INFO cluster.YarnClusterScheduler: Removed TaskSet 2.0, whose tasks have all completed, from pool 
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: ResultStage 2 (countByKey at SparkHoodieBloomIndex.java:114) finished in 0.126 s
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: Job 0 finished: countByKey at SparkHoodieBloomIndex.java:114, took 1.627903 s
   21/05/26 18:33:29 INFO yarn.YarnAllocator: Driver requested a total number of 1 executor(s).
   21/05/26 18:33:30 INFO spark.SparkContext: Starting job: collect at HoodieSparkEngineContext.java:78
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Got job 1 (collect at HoodieSparkEngineContext.java:78) with 1 output partitions
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Final stage: ResultStage 3 (collect at HoodieSparkEngineContext.java:78)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Parents of final stage: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Missing parents: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ResultStage 3 (MapPartitionsRDD[8] at flatMap at HoodieSparkEngineContext.java:78), which has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_2 stored as values in memory (estimated size 368.5 KB, free 911.9 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 101.0 KB, free 911.8 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on xxx:38417 (size: 101.0 KB, free: 912.2 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 3 (MapPartitionsRDD[8] at flatMap at HoodieSparkEngineContext.java:78) (first 15 tasks are for partitions Vector(0))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 3.0 with 1 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 3.0 (TID 4, xxx, executor 1, partition 0, PROCESS_LOCAL, 7710 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on xxx:35696 (size: 101.0 KB, free: 912.2 MB)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 3.0 (TID 4) in 178 ms on xxx (executor 1) (1/1)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 3.0, whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ResultStage 3 (collect at HoodieSparkEngineContext.java:78) finished in 0.233 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Job 1 finished: collect at HoodieSparkEngineContext.java:78, took 0.236923 s
   21/05/26 18:33:30 INFO spark.SparkContext: Starting job: collect at HoodieSparkEngineContext.java:73
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Got job 2 (collect at HoodieSparkEngineContext.java:73) with 1 output partitions
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Final stage: ResultStage 4 (collect at HoodieSparkEngineContext.java:73)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Parents of final stage: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Missing parents: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ResultStage 4 (MapPartitionsRDD[10] at map at HoodieSparkEngineContext.java:73), which has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_3 stored as values in memory (estimated size 368.3 KB, free 911.5 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 100.9 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_3_piece0 in memory on xxx:38417 (size: 100.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 4 (MapPartitionsRDD[10] at map at HoodieSparkEngineContext.java:73) (first 15 tasks are for partitions Vector(0))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 4.0 with 1 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 4.0 (TID 5, xxx, executor 1, partition 0, PROCESS_LOCAL, 7710 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_3_piece0 in memory on xxx:35696 (size: 100.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 4.0 (TID 5) in 94 ms on xxx (executor 1) (1/1)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 4.0, whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ResultStage 4 (collect at HoodieSparkEngineContext.java:73) finished in 0.167 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Job 2 finished: collect at HoodieSparkEngineContext.java:73, took 0.174163 s
   21/05/26 18:33:30 INFO spark.SparkContext: Starting job: countByKey at SparkHoodieBloomIndex.java:149
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Registering RDD 14 (countByKey at SparkHoodieBloomIndex.java:149) as input to shuffle 2
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Got job 3 (countByKey at SparkHoodieBloomIndex.java:149) with 2 output partitions
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Final stage: ResultStage 7 (countByKey at SparkHoodieBloomIndex.java:149)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 6)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 6)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 6 (MapPartitionsRDD[14] at countByKey at SparkHoodieBloomIndex.java:149), which has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_4 stored as values in memory (estimated size 7.5 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 3.9 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_4_piece0 in memory on xxx:38417 (size: 3.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 6 (MapPartitionsRDD[14] at countByKey at SparkHoodieBloomIndex.java:149) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 6.0 with 2 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 6.0 (TID 6, xxx, executor 1, partition 0, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_4_piece0 in memory on xxx:35696 (size: 3.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 6.0 (TID 7, xxx, executor 1, partition 1, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 6.0 (TID 6) in 60 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 6.0 (TID 7) in 36 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 6.0, whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ShuffleMapStage 6 (countByKey at SparkHoodieBloomIndex.java:149) finished in 0.121 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: looking for newly runnable stages
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 7)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ResultStage 7 (ShuffledRDD[15] at countByKey at SparkHoodieBloomIndex.java:149), which has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_5 stored as values in memory (estimated size 3.8 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 2.2 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_5_piece0 in memory on xxx:38417 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 5 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 7 (ShuffledRDD[15] at countByKey at SparkHoodieBloomIndex.java:149) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 7.0 with 2 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 7.0 (TID 8, xxx, executor 1, partition 0, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_5_piece0 in memory on xxx:35696 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 2 to 10.246.3.9:49980
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 7.0 (TID 9, xxx, executor 1, partition 1, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 7.0 (TID 8) in 47 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 7.0 (TID 9) in 20 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 7.0, whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ResultStage 7 (countByKey at SparkHoodieBloomIndex.java:149) finished in 0.081 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Job 3 finished: countByKey at SparkHoodieBloomIndex.java:149, took 0.219895 s
   21/05/26 18:33:30 INFO bloom.SparkHoodieBloomIndex: InputParallelism: ${2}, IndexParallelism: ${0}
   21/05/26 18:33:30 INFO bloom.BucketizedBloomCheckPartitioner: TotalBuckets 0, min_buckets/partition 1
   21/05/26 18:33:30 INFO rdd.MapPartitionsRDD: Removing RDD 3 from persistence list
   21/05/26 18:33:30 INFO storage.BlockManager: Removing RDD 3
   21/05/26 18:33:31 INFO rdd.MapPartitionsRDD: Removing RDD 22 from persistence list
   21/05/26 18:33:31 INFO storage.BlockManager: Removing RDD 22
   21/05/26 18:33:31 INFO spark.SparkContext: Starting job: countByKey at BaseSparkCommitActionExecutor.java:158
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 16 (mapToPair at SparkHoodieBloomIndex.java:266) as input to shuffle 6
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 23 (mapToPair at SparkHoodieBloomIndex.java:287) as input to shuffle 3
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 22 (flatMapToPair at SparkHoodieBloomIndex.java:274) as input to shuffle 4
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 31 (countByKey at BaseSparkCommitActionExecutor.java:158) as input to shuffle 5
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Got job 4 (countByKey at BaseSparkCommitActionExecutor.java:158) with 2 output partitions
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Final stage: ResultStage 13 (countByKey at BaseSparkCommitActionExecutor.java:158)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 12)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 12)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 10 (MapPartitionsRDD[23] at mapToPair at SparkHoodieBloomIndex.java:287), which has no missing parents
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_6 stored as values in memory (estimated size 5.9 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_6_piece0 stored as bytes in memory (estimated size 3.3 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_6_piece0 in memory on xxx:38417 (size: 3.3 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Created broadcast 6 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 10 (MapPartitionsRDD[23] at mapToPair at SparkHoodieBloomIndex.java:287) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Adding task set 10.0 with 2 tasks
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 10.0 (TID 10, xxx, executor 1, partition 0, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_6_piece0 in memory on xxx:35696 (size: 3.3 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 10.0 (TID 11, xxx, executor 1, partition 1, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 10.0 (TID 10) in 50 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 10.0 (TID 11) in 24 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Removed TaskSet 10.0, whose tasks have all completed, from pool 
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: ShuffleMapStage 10 (mapToPair at SparkHoodieBloomIndex.java:287) finished in 0.092 s
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: looking for newly runnable stages
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: waiting: Set(ShuffleMapStage 12, ResultStage 13)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 12 (MapPartitionsRDD[31] at countByKey at BaseSparkCommitActionExecutor.java:158), which has no missing parents
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_7 stored as values in memory (estimated size 7.1 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_7_piece0 stored as bytes in memory (estimated size 3.8 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_7_piece0 in memory on xxx:38417 (size: 3.8 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Created broadcast 7 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 12 (MapPartitionsRDD[31] at countByKey at BaseSparkCommitActionExecutor.java:158) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Adding task set 12.0 with 2 tasks
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 12.0 (TID 12, xxx, executor 1, partition 0, PROCESS_LOCAL, 7730 bytes)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_7_piece0 in memory on xxx:35696 (size: 3.8 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 3 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 4 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added rdd_29_0 in memory on xxx:35696 (size: 0.0 B, free: 912.1 MB)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 12.0 (TID 13, xxx, executor 1, partition 1, PROCESS_LOCAL, 7730 bytes)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 12.0 (TID 12) in 105 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added rdd_29_1 in memory on xxx:35696 (size: 0.0 B, free: 912.1 MB)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 12.0 (TID 13) in 24 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Removed TaskSet 12.0, whose tasks have all completed, from pool 
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: ShuffleMapStage 12 (countByKey at BaseSparkCommitActionExecutor.java:158) finished in 0.146 s
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: looking for newly runnable stages
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 13)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting ResultStage 13 (ShuffledRDD[32] at countByKey at BaseSparkCommitActionExecutor.java:158), which has no missing parents
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_8 stored as values in memory (estimated size 3.8 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_8_piece0 stored as bytes in memory (estimated size 2.2 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_8_piece0 in memory on xxx:38417 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Created broadcast 8 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 13 (ShuffledRDD[32] at countByKey at BaseSparkCommitActionExecutor.java:158) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Adding task set 13.0 with 2 tasks
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 13.0 (TID 14, xxx, executor 1, partition 0, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_8_piece0 in memory on xxx:35696 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 5 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 13.0 (TID 15, xxx, executor 1, partition 1, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 13.0 (TID 14) in 31 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 13.0 (TID 15) in 12 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Removed TaskSet 13.0, whose tasks have all completed, from pool 
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: ResultStage 13 (countByKey at BaseSparkCommitActionExecutor.java:158) finished in 0.064 s
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Job 4 finished: countByKey at BaseSparkCommitActionExecutor.java:158, took 0.320123 s
   21/05/26 18:33:31 INFO commit.BaseSparkCommitActionExecutor: Workload profile :WorkloadProfile {globalStat=WorkloadStat {numInserts=0, numUpdates=0}, partitionStat={}, operationType=UPSERT}
   21/05/26 18:33:31 INFO timeline.HoodieActiveTimeline: Checking for file exists ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit.requested
   21/05/26 18:33:31 INFO timeline.HoodieActiveTimeline: Create new file for toInstant ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit.inflight
   21/05/26 18:33:31 INFO commit.UpsertPartitioner: AvgRecordSize => 1024
   21/05/26 18:33:31 INFO view.AbstractTableFileSystemView: Took 3 ms to read  0 instants, 0 replaced file groups
   21/05/26 18:33:31 INFO util.ClusteringUtils: Found 0 files in pending clustering operations
   21/05/26 18:33:31 INFO commit.UpsertPartitioner: Total Buckets :0, buckets info => {}, 
   Partition to insert buckets => {}, 
   UpdateLocations mapped to buckets =>{}
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 175
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 62
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 9
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 148
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 105
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 143
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 2
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 55
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 209
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 154
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 147
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 163
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 69
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 34
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 100
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned shuffle 5
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 1
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 193
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 169
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 27
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 16
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 115
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 120
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 106
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 174
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 210
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 96
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 6
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 57
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 133
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 11
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 74
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 107
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 164
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 172
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 176
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 194
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 109
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 37
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 177
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 128
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 182
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 205
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 30
   21/05/26 18:33:31 INFO commit.BaseCommitActionExecutor: Auto commit disabled for 20210526183328
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 102
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 180
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 150
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 186
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 89
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 223
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 47
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 158
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 162
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 88
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 39
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 8
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 29
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 124
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 75
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 165
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 217
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 134
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_5_piece0 on xxx:35696 in memory (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_5_piece0 on xxx:38417 in memory (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 35
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 216
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 22
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 114
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 152
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 42
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 94
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 145
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 126
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 144
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 168
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_3_piece0 on xxx:38417 in memory (size: 100.9 KB, free: 912.2 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_3_piece0 on xxx:35696 in memory (size: 100.9 KB, free: 912.2 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 149
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 38
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 70
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 15
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 118
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 166
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 207
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 170
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 171
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 65
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 5
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 97
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 110
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 222
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 87
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_6_piece0 on xxx:38417 in memory (size: 3.3 KB, free: 912.2 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_6_piece0 on xxx:35696 in memory (size: 3.3 KB, free: 912.2 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 192
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 201
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 117
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 123
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 12
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 60
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 84
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 127
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 91
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 136
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 45
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 200
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 64
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_2_piece0 on xxx:38417 in memory (size: 101.0 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_2_piece0 on xxx:35696 in memory (size: 101.0 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 92
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 0
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 81
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 185
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 214
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 21
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 31
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 67
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 112
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 178
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 208
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 78
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 73
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 131
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_8_piece0 on xxx:38417 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_8_piece0 on xxx:35696 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 61
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 3
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_7_piece0 on xxx:38417 in memory (size: 3.8 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_7_piece0 on xxx:35696 in memory (size: 3.8 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Starting job: sum at DeltaSync.java:448
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Job 5 finished: sum at DeltaSync.java:448, took 0.000044 s
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 36
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 80
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 103
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 108
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 183
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 72
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 54
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 132
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 99
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 19
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 93
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 179
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 215
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 66
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 77
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 151
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 116
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 191
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 17
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 14
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 18
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 125
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 204
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 146
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 50
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 56
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 52
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 101
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 221
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 213
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 181
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 190
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 85
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned shuffle 2
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 156
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 161
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 53
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 197
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 20
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 41
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 44
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 140
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 218
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 188
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 122
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 195
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 167
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 220
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 43
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 199
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 155
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 24
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 219
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 71
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 198
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 23
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 135
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 26
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 141
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 121
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 157
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 13
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 130
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned shuffle 0
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 7
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 138
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 63
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 187
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 32
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 196
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 48
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 206
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 119
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 160
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 90
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 40
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 113
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_0_piece0 on xxx:38417 in memory (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_0_piece0 on xxx:35696 in memory (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 68
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 224
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 28
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 202
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 10
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 139
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 76
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 49
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 137
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 58
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 on xxx:38417 in memory (size: 3.9 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 on xxx:35696 in memory (size: 3.9 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 4
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 211
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 212
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 83
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 203
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 33
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 86
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 82
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_1_piece0 on xxx:38417 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_1_piece0 on xxx:35696 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 95
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 142
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 111
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 98
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 184
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 46
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 129
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 104
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 159
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 59
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 25
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 173
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 79
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 153
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 189
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 51
   21/05/26 18:33:32 INFO spark.SparkContext: Starting job: sum at DeltaSync.java:449
   21/05/26 18:33:32 INFO scheduler.DAGScheduler: Job 6 finished: sum at DeltaSync.java:449, took 0.000035 s
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO spark.SparkContext: Starting job: collect at SparkRDDWriteClient.java:120
   21/05/26 18:33:32 INFO scheduler.DAGScheduler: Job 7 finished: collect at SparkRDDWriteClient.java:120, took 0.000039 s
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants [[==>20210526183328__deltacommit__INFLIGHT]]
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:32 INFO util.CommitUtils: Creating  metadata for UPSERT numWriteStats:0numReplaceFileIds:0
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants [[==>20210526183328__deltacommit__INFLIGHT]]
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:32 INFO client.AbstractHoodieWriteClient: Committing 20210526183328 action deltacommit
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Marking instant complete [==>20210526183328__deltacommit__INFLIGHT]
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Checking for file exists ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit.inflight
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Create new file for toInstant ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Completed [==>20210526183328__deltacommit__INFLIGHT]
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants [[==>20210526183328__deltacommit__REQUESTED], [==>20210526183328__deltacommit__INFLIGHT], [20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:32 INFO table.HoodieTimelineArchiveLog: No Instants to archive
   21/05/26 18:33:32 INFO client.AbstractHoodieWriteClient: Auto cleaning is enabled. Running cleaner now
   21/05/26 18:33:32 INFO client.AbstractHoodieWriteClient: Scheduling cleaning at instant time :20210526183332
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants [[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote view for basePath /user/hd_xyz/yyy/ml_xxx/foo. Server=xxx:37089, Timeout=300
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating InMemory based view for basePath /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO view.AbstractTableFileSystemView: Took 0 ms to read  0 instants, 0 replaced file groups
   21/05/26 18:33:32 INFO util.ClusteringUtils: Found 0 files in pending clustering operations
   21/05/26 18:33:32 INFO view.RemoteHoodieTableFileSystemView: Sending request : (http://xxx:37089/v1/hoodie/view/compactions/pending/?basepath=%2Fuser%2Fhdfs%2Fxyz%2Fpublic%2Fml_xxx%2Ffoo&lastinstantts=20210526183328&timelinehash=3cb19d4eacc8a39b3d4198ed17d5dac7ca1a076cc50020fab31fed29c6ccddb1)
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:33 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO timeline.HoodieActiveTimeline: Loaded instants [[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:33 INFO collection.RocksDBDAO: DELETING RocksDB persisted at /tmp/hoodie_timeline_rocksdb/_user_hdfs_xyz_public_ml_xxx_foo/a138e066-6b6b-4f72-8865-4c30301cbe11
   21/05/26 18:33:33 INFO collection.RocksDBDAO: No column family found. Loading default
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl_open.cc:230] Creating manifest 1 
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/version_set.cc:3406] Recovering from manifest file: MANIFEST-000001
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [default]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/version_set.cc:3610] Recovered from manifest file:/tmp/hoodie_timeline_rocksdb/_user_hdfs_xyz_public_ml_xxx_foo/a138e066-6b6b-4f72-8865-4c30301cbe11/MANIFEST-000001 succeeded,manifest_file_number is 1, next_file_number is 3, last_sequence is 0, log_number is 0,prev_log_number is 0,max_column_family is 0,min_log_number_to_keep is 0
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/version_set.cc:3618] Column family [default] (ID 0), log number is 0
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl_open.cc:1287] DB pointer 0x7f3aaccf1f20
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/version_set.cc:2936] Creating manifest 6
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_view__user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_view__user_hdfs_xyz_public_ml_xxx_foo] (ID 1)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_pending_compaction__user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_pending_compaction__user_hdfs_xyz_public_ml_xxx_foo] (ID 2)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_bootstrap_basefile__user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_bootstrap_basefile__user_hdfs_xyz_public_ml_xxx_foo] (ID 3)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_partitions__user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_partitions__user_hdfs_xyz_public_ml_xxx_foo] (ID 4)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_replaced_fg_user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_replaced_fg_user_hdfs_xyz_public_ml_xxx_foo] (ID 5)
   21/05/26 18:33:33 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.246.4.117:53684) with ID 2
   21/05/26 18:33:33 INFO spark.ExecutorAllocationManager: New executor 2 has registered (new total is 2)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_pending_clustering_fg_user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_pending_clustering_fg_user_hdfs_xyz_public_ml_xxx_foo] (ID 6)
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting replacedFileGroups to ROCKSDB based file-system view at /tmp/hoodie_timeline_rocksdb, Total file-groups=0
   21/05/26 18:33:33 INFO collection.RocksDBDAO: Prefix DELETE (query=part=) on hudi_replaced_fg_user_hdfs_xyz_public_ml_xxx_foo
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting replacedFileGroups to ROCKSDB based file-system view complete
   21/05/26 18:33:33 INFO view.AbstractTableFileSystemView: Took 9 ms to read  0 instants, 0 replaced file groups
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Initializing pending compaction operations. Count=0
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Initializing external data file mapping. Count=0
   21/05/26 18:33:33 INFO util.ClusteringUtils: Found 0 files in pending clustering operations
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting file groups in pending clustering to ROCKSDB based file-system view at /tmp/hoodie_timeline_rocksdb, Total file-groups=0
   21/05/26 18:33:33 INFO collection.RocksDBDAO: Prefix DELETE (query=part=) on hudi_pending_clustering_fg_user_hdfs_xyz_public_ml_xxx_foo
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting replacedFileGroups to ROCKSDB based file-system view complete
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Created ROCKSDB based file-system view at /tmp/hoodie_timeline_rocksdb
   21/05/26 18:33:33 INFO collection.RocksDBDAO: Prefix Search for (query=) on hudi_pending_compaction__user_hdfs_xyz_public_ml_xxx_foo. Total Time Taken (msec)=1. Serialization Time taken(micro)=0, num entries=0
   21/05/26 18:33:33 INFO service.RequestHandler: TimeTakenMillis[Total=791, Refresh=779, handle=11, Check=1], Success=true, Query=basepath=%2Fuser%2Fhdfs%2Fxyz%2Fpublic%2Fml_xxx%2Ffoo&lastinstantts=20210526183328&timelinehash=3cb19d4eacc8a39b3d4198ed17d5dac7ca1a076cc50020fab31fed29c6ccddb1, Host=xxx:37089, synced=false
   21/05/26 18:33:33 INFO storage.BlockManagerMasterEndpoint: Registering block manager xxx:36920 with 912.3 MB RAM, BlockManagerId(2, xxx, 36920, None)
   21/05/26 18:33:33 INFO clean.CleanPlanner: No earliest commit to retain. No need to scan partitions !!
   21/05/26 18:33:33 INFO clean.CleanPlanner: Nothing to clean here. It is already clean
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Cleaner started
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Cleaned failed attempts if any
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:33 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO timeline.HoodieActiveTimeline: Loaded instants [[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:33 INFO client.SparkRDDWriteClient: Successfully synced to metadata table
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Committed 20210526183328
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Scheduling table service COMPACT
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Scheduling compaction at instant time :20210526183333
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:33 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO timeline.HoodieActiveTimeline: Loaded instants [[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:33 INFO compact.SparkScheduleCompactionActionExecutor: Checking if compaction needs to be run on /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO deltastreamer.DeltaSync: Commit 20210526183328 successful!
   21/05/26 18:33:33 INFO rdd.MapPartitionsRDD: Removing RDD 29 from persistence list
   21/05/26 18:33:33 INFO storage.BlockManager: Removing RDD 29
   21/05/26 18:33:34 INFO rdd.MapPartitionsRDD: Removing RDD 37 from persistence list
   21/05/26 18:33:34 INFO storage.BlockManager: Removing RDD 37
   21/05/26 18:33:34 INFO deltastreamer.DeltaSync: Shutting down embedded timeline server
   21/05/26 18:33:34 INFO embedded.EmbeddedTimelineService: Closing Timeline server
   21/05/26 18:33:34 INFO service.TimelineService: Closing Timeline Service
   21/05/26 18:33:34 INFO javalin.Javalin: Stopping Javalin ...
   21/05/26 18:33:34 INFO javalin.Javalin: Javalin has stopped
   21/05/26 18:33:34 INFO view.RocksDbBasedFileSystemView: Closing Rocksdb !!
   21/05/26 18:33:34 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:365] Shutdown: canceling all background work
   21/05/26 18:33:34 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:521] Shutdown complete
   21/05/26 18:33:34 INFO view.RocksDbBasedFileSystemView: Closed Rocksdb !!
   21/05/26 18:33:34 INFO service.TimelineService: Closed Timeline Service
   21/05/26 18:33:34 INFO embedded.EmbeddedTimelineService: Closed Timeline server
   21/05/26 18:33:34 INFO deltastreamer.HoodieDeltaStreamer: Shut down delta streamer
   21/05/26 18:33:34 INFO server.AbstractConnector: Stopped Spark@7a0e94b4{HTTP/1.1,[http/1.1]}{0.0.0.0:0}
   21/05/26 18:33:34 INFO ui.SparkUI: Stopped Spark web UI at http://xxx:32822
   21/05/26 18:33:34 INFO yarn.YarnAllocator: Driver requested a total number of 0 executor(s).
   21/05/26 18:33:34 INFO cluster.YarnClusterSchedulerBackend: Shutting down all executors
   21/05/26 18:33:34 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
   21/05/26 18:33:34 INFO cluster.SchedulerExtensionServices: Stopping SchedulerExtensionServices
   (serviceOption=None,
    services=List(),
    started=false)
   21/05/26 18:33:34 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
   21/05/26 18:33:34 INFO memory.MemoryStore: MemoryStore cleared
   21/05/26 18:33:34 INFO storage.BlockManager: BlockManager stopped
   21/05/26 18:33:34 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
   21/05/26 18:33:34 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
   21/05/26 18:33:34 INFO spark.SparkContext: Successfully stopped SparkContext
   21/05/26 18:33:34 INFO yarn.ApplicationMaster: Final app status: SUCCEEDED, exitCode: 0
   21/05/26 18:33:34 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with SUCCEEDED
   21/05/26 18:33:34 INFO impl.AMRMClientImpl: Waiting for application to be successfully unregistered.
   21/05/26 18:33:34 INFO yarn.ApplicationMaster: Deleting staging directory hdfs://xxx:8020/user/hd_xyz/.sparkStaging/application_1618828995116_0162
   21/05/26 18:33:34 INFO util.ShutdownHookManager: Shutdown hook called
   21/05/26 18:33:34 INFO util.ShutdownHookManager: Deleting directory /data/hadoop/yarn/local/usercache/hdfs/appcache/application_1618828995116_0162/spark-4c7e81b9-e526-4325-abf0-d163828b92b5
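   
   The key diagnostic in the log above is the empty workload profile
   (WorkloadProfile {globalStat=WorkloadStat {numInserts=0, numUpdates=0}, ...})
   together with "Total Buckets :0": deltacommit 20210526183328 completed
   successfully but carried zero records, i.e. the Kafka source handed
   DeltaStreamer nothing to write. Note that DeltaStreamer resumes from the
   checkpoint stored in the last completed commit (typically under the
   deltastreamer.checkpoint.key entry of the commit's extraMetadata) rather
   than from the consumer group's position, so any offset-reset setting has
   no effect once a checkpoint exists. A hedged way to verify this, reusing
   the table path from the directory listing further down in this thread, is
   to dump the completed deltacommit and inspect its write stats and stored
   checkpoint:
   
   hdfs dfs -cat /user/hdfs/raw_data/public/ml_training_data/foo/.hoodie/20210526183328.deltacommit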
   





[GitHub] [hudi] PavelPetukhov removed a comment on issue #2959: No data stored after migrating to Hudi 0.8.0

Posted by GitBox <gi...@apache.org>.
PavelPetukhov removed a comment on issue #2959:
URL: https://github.com/apache/hudi/issues/2959#issuecomment-848891756


   This is our full log
   [spark_log.txt](https://github.com/apache/hudi/files/6548390/spark_log.txt)
   





[GitHub] [hudi] PavelPetukhov commented on issue #2959: No data stored after migrating to Hudi 0.8.0

Posted by GitBox <gi...@apache.org>.
PavelPetukhov commented on issue #2959:
URL: https://github.com/apache/hudi/issues/2959#issuecomment-848891756


   There are no exceptions or errors in the logs; all the warnings I found are listed below:
   
   21/05/26 18:33:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
   21/05/26 18:33:19 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
   21/05/26 18:33:19 WARN deltastreamer.SchedulerConfGenerator: Job Scheduling Configs will not be in effect as spark.scheduler.mode is not set to FAIR at instantiation time. Continuing without scheduling configs
   21/05/26 18:33:20 WARN util.Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
   21/05/26 18:33:21 WARN util.Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
   21/05/26 18:33:21 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
   21/05/26 18:33:25 WARN spark.SparkContext: Using an existing SparkContext; some configuration may not take effect.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.timestamp.type' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.output.dateformat' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.input.dateformat.list.delimiter.regex' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.input.dateformat' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.datasource.write.partitionpath.field' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.delete.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.datasource.write.recordkey.field' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.upsert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.datasource.write.keygenerator.class' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.source.kafka.topic' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.schemaprovider.registry.url' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.insert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.embed.timeline.server' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.bulkinsert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.input.timezone' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.filesystem.view.type' was supplied but isn't a known config.
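   
   These ConsumerConfig warnings are expected noise rather than the cause:
   DeltaStreamer forwards its whole property set to the Kafka consumer, and
   the consumer warns about every key it does not recognize. To rule out the
   source topic being empty or already fully consumed, it may help to compare
   its earliest and latest offsets; a hedged sketch, where the broker address
   and topic name are placeholders for the redacted values:
   
   kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list <broker:9092> --topic <topic> --time -2
   kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list <broker:9092> --topic <topic> --time -1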
   





[GitHub] [hudi] PavelPetukhov commented on issue #2959: No data stored after migrating to Hudi 0.8.0

Posted by GitBox <gi...@apache.org>.
PavelPetukhov commented on issue #2959:
URL: https://github.com/apache/hudi/issues/2959#issuecomment-848885930


   The .hoodie directory structure is the following:
   hdfs dfs -ls /user/hdfs/raw_data/public/ml_training_data/foo/.hoodie
   Found 7 items
   drwxr-xr-x   - hdfs hadoop          0 2021-05-26 18:33 /user/hdfs/raw_data/public/ml_training_data/foo/.hoodie/.aux
   drwxr-xr-x   - hdfs hadoop          0 2021-05-26 18:33 /user/hdfs/raw_data/public/ml_training_data/foo/.hoodie/.temp
   -rw-r--r--   3 hdfs hadoop       1201 2021-05-26 18:33 /user/hdfs/raw_data/public/ml_training_data/foo/.hoodie/20210526183328.deltacommit
   -rw-r--r--   3 hdfs hadoop        518 2021-05-26 18:33 /user/hdfs/raw_data/public/ml_training_data/foo/.hoodie/20210526183328.deltacommit.inflight
   -rw-r--r--   3 hdfs hadoop          0 2021-05-26 18:33 /user/hdfs/raw_data/public/ml_training_data/foo/.hoodie/20210526183328.deltacommit.requested
   drwxr-xr-x   - hdfs hadoop          0 2021-05-26 18:33 /user/hdfs/raw_data/public/ml_training_data/foo/.hoodie/archived
   -rw-r--r--   3 hdfs hadoop        391 2021-05-26 18:33 /user/hdfs/raw_data/public/ml_training_data/foo/.hoodie/hoodie.properties
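   
   Note that the listing contains only timeline files under .hoodie (no
   partition directories, no parquet base files, no log files), which matches
   a deltacommit that wrote zero records. As a hedged sketch, assuming the
   hudi-cli.sh launcher from the Hudi distribution is on the path, the Hudi
   CLI can confirm this from the commit metadata:
   
   hudi-cli.sh
   connect --path /user/hdfs/raw_data/public/ml_training_data/foo
   commits show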





[GitHub] [hudi] PavelPetukhov edited a comment on issue #2959: No data stored after migrating to Hudi 0.8.0

Posted by GitBox <gi...@apache.org>.
PavelPetukhov edited a comment on issue #2959:
URL: https://github.com/apache/hudi/issues/2959#issuecomment-848930327


   """
   
   
   Logged in as: dr.who 
   Application
   About
   Jobs
   Tools
   
   Log Type: stderr
   Log Upload Time: Wed May 26 18:33:34 +0300 2021
   Log Length: 104910
   21/05/26 18:33:18 INFO util.SignalUtils: Registered signal handler for TERM
   21/05/26 18:33:18 INFO util.SignalUtils: Registered signal handler for HUP
   21/05/26 18:33:18 INFO util.SignalUtils: Registered signal handler for INT
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing view acls to: yarn,hdfs
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing modify acls to: yarn,hdfs
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing view acls groups to: 
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing modify acls groups to: 
   21/05/26 18:33:18 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(yarn, hdfs); groups with view permissions: Set(); users  with modify permissions: Set(yarn, hdfs); groups with modify permissions: Set()
   21/05/26 18:33:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
   21/05/26 18:33:18 INFO yarn.ApplicationMaster: Preparing Local resources
   21/05/26 18:33:19 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
   21/05/26 18:33:19 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1618828995116_0162_000001
   21/05/26 18:33:19 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
   21/05/26 18:33:19 INFO yarn.ApplicationMaster: Waiting for spark context initialization...
   21/05/26 18:33:19 WARN deltastreamer.SchedulerConfGenerator: Job Scheduling Configs will not be in effect as spark.scheduler.mode is not set to FAIR at instantiation time. Continuing without scheduling configs
   21/05/26 18:33:19 INFO spark.SparkContext: Running Spark version 2.4.7
   21/05/26 18:33:19 INFO spark.SparkContext: Submitted application: xxx
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing view acls to: yarn,hdfs
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing modify acls to: yarn,hdfs
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing view acls groups to: 
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing modify acls groups to: 
   21/05/26 18:33:19 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(yarn, hdfs); groups with view permissions: Set(); users  with modify permissions: Set(yarn, hdfs); groups with modify permissions: Set()
   21/05/26 18:33:20 INFO util.Utils: Successfully started service 'sparkDriver' on port 37691.
   21/05/26 18:33:20 INFO spark.SparkEnv: Registering MapOutputTracker
   21/05/26 18:33:20 INFO spark.SparkEnv: Registering BlockManagerMaster
   21/05/26 18:33:20 INFO storage.BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
   21/05/26 18:33:20 INFO storage.BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
   21/05/26 18:33:20 INFO storage.DiskBlockManager: Created local directory at /data/hadoop/yarn/local/usercache/hdfs/appcache/application_1618828995116_0162/blockmgr-9de167db-4756-414e-9126-32cb562e91aa
   21/05/26 18:33:20 INFO memory.MemoryStore: MemoryStore started with capacity 912.3 MB
   21/05/26 18:33:20 INFO spark.SparkEnv: Registering OutputCommitCoordinator
   21/05/26 18:33:20 INFO util.log: Logging initialized @2935ms
   21/05/26 18:33:20 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /jobs, /jobs/json, /jobs/job, /jobs/job/json, /stages, /stages/json, /stages/stage, /stages/stage/json, /stages/pool, /stages/pool/json, /storage, /storage/json, /storage/rdd, /storage/rdd/json, /environment, /environment/json, /executors, /executors/json, /executors/threadDump, /executors/threadDump/json, /static, /, /api, /jobs/job/kill, /stages/stage/kill.
   21/05/26 18:33:20 INFO server.Server: jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
   21/05/26 18:33:20 INFO server.Server: Started @3069ms
   21/05/26 18:33:20 INFO server.AbstractConnector: Started ServerConnector@7a0e94b4{HTTP/1.1,[http/1.1]}{0.0.0.0:32822}
   21/05/26 18:33:20 INFO util.Utils: Successfully started service 'SparkUI' on port 32822.
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@43837fbc{/jobs,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@d91ba30{/jobs/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4854d5d9{/jobs/job,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@672e7ec3{/jobs/job/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@67ee182c{/stages,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@97af315{/stages/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1936a0e0{/stages/stage,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@447ef19e{/stages/stage/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@68e36851{/stages/pool,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@352fe12b{/stages/pool/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3d39f28d{/storage,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@e7806b5{/storage/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7d2a56cb{/storage/rdd,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@37c6c6fc{/storage/rdd/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4599e713{/environment,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@b9a0cbb{/environment/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@24299f0d{/executors,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@25594c52{/executors/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2f728695{/executors/threadDump,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7456a814{/executors/threadDump/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1cef9064{/static,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@16ba2eda{/,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@dac88e2{/api,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@145850ef{/jobs/job/kill,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6d678cf2{/stages/stage/kill,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://xxx:32822
   21/05/26 18:33:20 INFO cluster.YarnClusterScheduler: Created YarnClusterScheduler
   21/05/26 18:33:20 INFO cluster.SchedulerExtensionServices: Starting Yarn extension services with app application_1618828995116_0162 and attemptId Some(appattempt_1618828995116_0162_000001)
   21/05/26 18:33:20 WARN util.Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
   21/05/26 18:33:20 INFO util.Utils: Using initial executors = 1, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
   21/05/26 18:33:20 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 38417.
   21/05/26 18:33:20 INFO netty.NettyBlockTransferService: Server created on xxx:38417
   21/05/26 18:33:20 INFO storage.BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
   21/05/26 18:33:20 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO storage.BlockManagerMasterEndpoint: Registering block manager xxx:38417 with 912.3 MB RAM, BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO storage.BlockManager: external shuffle service port = 7337
   21/05/26 18:33:20 INFO storage.BlockManager: Initialized BlockManager: BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /metrics/json.
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1b3c78ce{/metrics/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:21 INFO scheduler.EventLoggingListener: Logging events to hdfs://xxx:8020/eventLogging/application_1618828995116_0162_1
   21/05/26 18:33:21 WARN util.Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
   21/05/26 18:33:21 INFO util.Utils: Using initial executors = 1, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
   21/05/26 18:33:21 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
   21/05/26 18:33:21 INFO client.RMProxy: Connecting to ResourceManager at xxx/10.246.4.117:8030
   21/05/26 18:33:21 INFO yarn.YarnRMClient: Registering the ApplicationMaster
   21/05/26 18:33:21 INFO yarn.ApplicationMaster: 
   ===============================================================================
   YARN executor launch context:
     env:
       CLASSPATH -> {{PWD}}<CPS>{{PWD}}/__spark_conf__<CPS>{{PWD}}/__spark_libs__/*<CPS>/usr/hdp/2.6.0.3-8/hadoop/conf<CPS>/usr/hdp/2.6.0.3-8/hadoop/*<CPS>/usr/hdp/2.6.0.3-8/hadoop/lib/*<CPS>/usr/hdp/current/hadoop-hdfs-client/*<CPS>/usr/hdp/current/hadoop-hdfs-client/lib/*<CPS>/usr/hdp/current/hadoop-yarn-client/*<CPS>/usr/hdp/current/hadoop-yarn-client/lib/*<CPS>/usr/hdp/current/ext/hadoop/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*<CPS>{{PWD}}/__spark_conf__/__hadoop_conf__
       SPARK_YARN_STAGING_DIR -> hdfs://xxx:8020/user/hd_xyz/.sparkStaging/application_1618828995116_0162
       SPARK_USER -> hdfs
   
     command:
       {{JAVA_HOME}}/bin/java \ 
         -server \ 
         -Xmx2048m \ 
         -Djava.io.tmpdir={{PWD}}/tmp \ 
         '-Dspark.driver.port=37691' \ 
         '-Dspark.ui.port=0' \ 
         -Dspark.yarn.app.container.log.dir=<LOG_DIR> \ 
         -XX:OnOutOfMemoryError='kill %p' \ 
         org.apache.spark.executor.CoarseGrainedExecutorBackend \ 
         --driver-url \ 
         spark://CoarseGrainedScheduler@xxx:37691 \ 
         --executor-id \ 
         <executorId> \ 
         --hostname \ 
         <hostname> \ 
         --cores \ 
         1 \ 
         --app-id \ 
         application_1618828995116_0162 \ 
         --user-class-path \ 
         file:$PWD/__app__.jar \ 
         --user-class-path \ 
         file:$PWD/org.apache.spark_spark-avro_2.12-2.4.7.jar \ 
         --user-class-path \ 
         file:$PWD/org.spark-project.spark_unused-1.0.0.jar \ 
         1><LOG_DIR>/stdout \ 
         2><LOG_DIR>/stderr
   
     resources:
       org.apache.spark_spark-avro_2.12-2.4.7.jar -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/hd_xyz/.sparkStaging/application_1618828995116_0162/org.apache.spark_spark-avro_2.12-2.4.7.jar" } size: 107269 timestamp: 1622043191967 type: FILE visibility: PRIVATE
       __app__.jar -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/jars/hudi/hudi-utilities-bundle_2.12-0.8.0.jar" } size: 40399204 timestamp: 1622022896130 type: FILE visibility: PUBLIC
       __spark_conf__ -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/hd_xyz/.sparkStaging/application_1618828995116_0162/__spark_conf__.zip" } size: 205423 timestamp: 1622043193955 type: ARCHIVE visibility: PRIVATE
       org.spark-project.spark_unused-1.0.0.jar -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/hd_xyz/.sparkStaging/application_1618828995116_0162/org.spark-project.spark_unused-1.0.0.jar" } size: 2777 timestamp: 1622043192905 type: FILE visibility: PRIVATE
       __spark_libs__ -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/hd_xyz/.sparkStaging/application_1618828995116_0162/__spark_libs__2858796966972713370.zip" } size: 242613518 timestamp: 1622043190403 type: ARCHIVE visibility: PRIVATE
   
   ===============================================================================
   21/05/26 18:33:21 WARN util.Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
   21/05/26 18:33:21 INFO util.Utils: Using initial executors = 1, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
   21/05/26 18:33:21 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(spark://YarnAM@xxx:37691)
   21/05/26 18:33:21 INFO yarn.YarnAllocator: Will request 1 executor container(s), each with 1 core(s) and 2432 MB memory (including 384 MB of overhead)
   21/05/26 18:33:21 INFO yarn.YarnAllocator: Submitted 1 unlocalized container requests.
   21/05/26 18:33:21 INFO yarn.ApplicationMaster: Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
   21/05/26 18:33:22 INFO impl.AMRMClientImpl: Received new token for : xxx:45454
   21/05/26 18:33:22 INFO yarn.YarnAllocator: Launching container container_e03_1618828995116_0162_01_000002 on host xxx for executor with ID 1
   21/05/26 18:33:22 INFO yarn.YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them.
   21/05/26 18:33:22 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
   21/05/26 18:33:22 INFO impl.ContainerManagementProtocolProxy: Opening proxy : xxx:45454
   21/05/26 18:33:25 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.246.3.9:49980) with ID 1
   21/05/26 18:33:25 INFO spark.ExecutorAllocationManager: New executor 1 has registered (new total is 1)
   21/05/26 18:33:25 INFO cluster.YarnClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
   21/05/26 18:33:25 INFO cluster.YarnClusterScheduler: YarnClusterScheduler.postStartHook done
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO utilities.UtilHelpers: Adding overridden properties to file properties.
   21/05/26 18:33:25 WARN spark.SparkContext: Using an existing SparkContext; some configuration may not take effect.
   21/05/26 18:33:25 INFO storage.BlockManagerMasterEndpoint: Registering block manager xxx:35696 with 912.3 MB RAM, BlockManagerId(1, xxx, 35696, None)
   21/05/26 18:33:25 INFO deltastreamer.HoodieDeltaStreamer: Creating delta streamer with configs : {hoodie.deltastreamer.keygen.timebased.input.timezone=, hoodie.embed.timeline.server=true, schema.registry.url=http://xxx, hoodie.filesystem.view.type=EMBEDDED_KV_STORE, hoodie.deltastreamer.keygen.timebased.input.dateformat=yyyy-MM-ddTHH:mm:ssZ,yyyy-MM-ddTHH:mm:ss.SSSZ, hoodie.delete.shuffle.parallelism=2, hoodie.bulkinsert.shuffle.parallelism=2, hoodie.deltastreamer.keygen.timebased.output.dateformat=yyyy/MM/dd, group.id=hudi_group_080, auto.offset.reset=earliest, hoodie.insert.shuffle.parallelism=2, hoodie.deltastreamer.keygen.timebased.timestamp.type=DATE_STRING, hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.CustomKeyGenerator, hoodie.deltastreamer.source.kafka.topic=xxx, bootstrap.servers=xxx:9092, hoodie.deltastreamer.keygen.timebased.input.dateformat.list.delimiter.regex=, hoodie.deltastreamer.schemaprovider.registry.url=http://xxx/subjects/xxx-value/versions/latest, hoodie.datasource.write.recordkey.field=id, hoodie.upsert.shuffle.parallelism=2, hoodie.datasource.write.partitionpath.field=date:TIMESTAMP}
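   (For context on the key generator settings above: with timestamp.type=DATE_STRING, the 'date' field named in hoodie.datasource.write.partitionpath.field is parsed with one of the input formats and reformatted with the output format yyyy/MM/dd to build the partition path. A minimal illustration of that transformation using plain java.text, not Hudi's actual key generator code; the sample timestamp is made up:

      import java.text.SimpleDateFormat;
      import java.util.Date;

      public class PartitionPathDemo {
          public static void main(String[] args) throws Exception {
              // first of the two configured input formats
              SimpleDateFormat in = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ssZ");
              // configured output format used for the partition path
              SimpleDateFormat out = new SimpleDateFormat("yyyy/MM/dd");
              Date d = in.parse("2021-05-26T18:33:00+0000"); // hypothetical 'date' value
              System.out.println(out.format(d));             // prints 2021/05/26 in the JVM's default zone
          }
      }
   )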
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Initializing /user/hd_xyz/yyy/ml_xxx/foo as hoodie table /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Finished initializing Table of type MERGE_ON_READ from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO deltastreamer.DeltaSync: Registering Schema :[{"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}, {"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}]
   21/05/26 18:33:25 INFO deltastreamer.HoodieDeltaStreamer: Delta Streamer running only single round
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:26 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:26 INFO deltastreamer.DeltaSync: Checkpoint to resume from : Optional.empty
   21/05/26 18:33:26 INFO consumer.ConsumerConfig: ConsumerConfig values: 
   	allow.auto.create.topics = true
   	auto.commit.interval.ms = 5000
   	auto.offset.reset = earliest
   	bootstrap.servers = [xxx]
   	check.crcs = true
   	client.dns.lookup = default
   	client.id = 
   	client.rack = 
   	connections.max.idle.ms = 540000
   	default.api.timeout.ms = 60000
   	enable.auto.commit = true
   	exclude.internal.topics = true
   	fetch.max.bytes = 52428800
   	fetch.max.wait.ms = 500
   	fetch.min.bytes = 1
   	group.id = hudi_group_080
   	group.instance.id = null
   	heartbeat.interval.ms = 3000
   	interceptor.classes = []
   	internal.leave.group.on.close = true
   	isolation.level = read_uncommitted
   	key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
   	max.partition.fetch.bytes = 1048576
   	max.poll.interval.ms = 300000
   	max.poll.records = 500
   	metadata.max.age.ms = 300000
   	metric.reporters = []
   	metrics.num.samples = 2
   	metrics.recording.level = INFO
   	metrics.sample.window.ms = 30000
   	partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
   	receive.buffer.bytes = 65536
   	reconnect.backoff.max.ms = 1000
   	reconnect.backoff.ms = 50
   	request.timeout.ms = 30000
   	retry.backoff.ms = 100
   	sasl.client.callback.handler.class = null
   	sasl.jaas.config = null
   	sasl.kerberos.kinit.cmd = /usr/bin/kinit
   	sasl.kerberos.min.time.before.relogin = 60000
   	sasl.kerberos.service.name = null
   	sasl.kerberos.ticket.renew.jitter = 0.05
   	sasl.kerberos.ticket.renew.window.factor = 0.8
   	sasl.login.callback.handler.class = null
   	sasl.login.class = null
   	sasl.login.refresh.buffer.seconds = 300
   	sasl.login.refresh.min.period.seconds = 60
   	sasl.login.refresh.window.factor = 0.8
   	sasl.login.refresh.window.jitter = 0.05
   	sasl.mechanism = GSSAPI
   	security.protocol = PLAINTEXT
   	security.providers = null
   	send.buffer.bytes = 131072
   	session.timeout.ms = 10000
   	ssl.cipher.suites = null
   	ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
   	ssl.endpoint.identification.algorithm = https
   	ssl.key.password = null
   	ssl.keymanager.algorithm = SunX509
   	ssl.keystore.location = null
   	ssl.keystore.password = null
   	ssl.keystore.type = JKS
   	ssl.protocol = TLS
   	ssl.provider = null
   	ssl.secure.random.implementation = null
   	ssl.trustmanager.algorithm = PKIX
   	ssl.truststore.location = null
   	ssl.truststore.password = null
   	ssl.truststore.type = JKS
   	value.deserializer = class io.confluent.kafka.serializers.KafkaAvroDeserializer
   
   21/05/26 18:33:26 INFO serializers.KafkaAvroDeserializerConfig: KafkaAvroDeserializerConfig values: 
   	schema.registry.url = [xxx]
   	max.schemas.per.subject = 1000
   	specific.avro.reader = false
   
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.timestamp.type' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.output.dateformat' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.input.dateformat.list.delimiter.regex' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.input.dateformat' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.datasource.write.partitionpath.field' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.delete.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.datasource.write.recordkey.field' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.upsert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.datasource.write.keygenerator.class' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.source.kafka.topic' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.schemaprovider.registry.url' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.insert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.embed.timeline.server' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.bulkinsert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.input.timezone' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.filesystem.view.type' was supplied but isn't a known config.
   21/05/26 18:33:26 INFO utils.AppInfoParser: Kafka version: 2.4.1
   21/05/26 18:33:26 INFO utils.AppInfoParser: Kafka commitId: c57222ae8cd7866b
   21/05/26 18:33:26 INFO utils.AppInfoParser: Kafka startTimeMs: 1622043206225
   21/05/26 18:33:26 INFO clients.Metadata: [Consumer clientId=consumer-hudi_group_080-1, groupId=hudi_group_080] Cluster ID: 5XoPi9AYT0mbHVQEj6VEaw
   21/05/26 18:33:27 INFO helpers.KafkaOffsetGen: SourceLimit not configured, set numEvents to default value : 5000000
   21/05/26 18:33:27 INFO sources.AvroKafkaSource: About to read 0 from Kafka for topic :xxx
   21/05/26 18:33:27 INFO deltastreamer.DeltaSync: No new data, perform empty commit.
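   (This looks like the crux: with no Hudi checkpoint to resume from (Optional.empty above), KafkaOffsetGen resolved an offset range of zero records, so the streamer performs an empty commit and never writes data files. Note the consumer config above shows group.id=hudi_group_080 with enable.auto.commit=true; in plain Kafka semantics auto.offset.reset=earliest only applies when the group has no committed offsets, so a group that has already committed offsets at the end of the topic would legitimately see nothing new. A minimal sketch to compare the topic's beginning and end offsets; the broker and topic names are placeholders for the redacted values:

      import java.util.ArrayList;
      import java.util.List;
      import java.util.Map;
      import java.util.Properties;
      import org.apache.kafka.clients.consumer.KafkaConsumer;
      import org.apache.kafka.common.TopicPartition;
      import org.apache.kafka.common.serialization.StringDeserializer;

      public class OffsetCheck {
          public static void main(String[] args) {
              Properties props = new Properties();
              props.put("bootstrap.servers", "xxx:9092");   // placeholder for the redacted broker
              props.put("group.id", "offset_check");        // throwaway group; metadata calls only, no poll
              props.put("key.deserializer", StringDeserializer.class.getName());
              props.put("value.deserializer", StringDeserializer.class.getName());
              try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                  List<TopicPartition> tps = new ArrayList<>();
                  consumer.partitionsFor("xxx").forEach(pi ->   // placeholder topic name
                      tps.add(new TopicPartition(pi.topic(), pi.partition())));
                  Map<TopicPartition, Long> begin = consumer.beginningOffsets(tps);
                  Map<TopicPartition, Long> end = consumer.endOffsets(tps);
                  for (TopicPartition tp : tps) {
                      System.out.println(tp + " begin=" + begin.get(tp) + " end=" + end.get(tp));
                  }
              }
          }
      }

   If end offsets are well ahead of the beginning ones, the topic does hold data and the zero-record read points at checkpoint/group-offset handling rather than an empty topic.)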
   21/05/26 18:33:27 INFO deltastreamer.DeltaSync: Setting up new Hoodie Write Client
   21/05/26 18:33:27 INFO deltastreamer.DeltaSync: Registering Schema :[{"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}, {"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}]
   21/05/26 18:33:27 INFO embedded.EmbeddedTimelineService: Starting Timeline service !!
   21/05/26 18:33:27 INFO embedded.EmbeddedTimelineService: Overriding hostIp to (xxx) found in spark-conf. It was null
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating View Manager with storage type :EMBEDDED_KV_STORE
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating embedded rocks-db based Table View
   21/05/26 18:33:27 INFO util.log: Logging initialized @9978ms to org.apache.hudi.org.eclipse.jetty.util.log.Slf4jLog
   21/05/26 18:33:27 INFO javalin.Javalin: 
              __                      __ _
             / /____ _ _   __ ____ _ / /(_)____
        __  / // __ `/| | / // __ `// // // __ \
       / /_/ // /_/ / | |/ // /_/ // // // / / /
       \____/ \__,_/  |___/ \__,_//_//_//_/ /_/
   
           https://javalin.io/documentation
   
   21/05/26 18:33:27 INFO javalin.Javalin: Starting Javalin ...
   21/05/26 18:33:27 INFO javalin.Javalin: Listening on http://localhost:37089/
   21/05/26 18:33:27 INFO javalin.Javalin: Javalin started in 179ms \o/
   21/05/26 18:33:27 INFO service.TimelineService: Starting Timeline server on port :37089
   21/05/26 18:33:27 INFO embedded.EmbeddedTimelineService: Started embedded timeline server at xxx:37089
   21/05/26 18:33:27 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:27 INFO client.AbstractHoodieClient: Timeline Server already running. Not restarting the service
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:27 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:28 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:28 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:28 INFO client.AbstractHoodieWriteClient: Generate a new instant time: 20210526183328 action: deltacommit
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Creating a new instant [==>20210526183328__deltacommit__REQUESTED]
   21/05/26 18:33:28 INFO deltastreamer.DeltaSync: Starting commit  : 20210526183328
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:28 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Loaded instants [[==>20210526183328__deltacommit__REQUESTED]]
   21/05/26 18:33:28 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:28 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:28 INFO client.SparkRDDWriteClient: Successfully synced to metadata table
   21/05/26 18:33:28 INFO client.AsyncCleanerService: Auto cleaning is not enabled. Not running cleaner now
   21/05/26 18:33:28 INFO spark.SparkContext: Starting job: countByKey at SparkHoodieBloomIndex.java:114
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Registering RDD 1 (mapToPair at SparkWriteHelper.java:54) as input to shuffle 1
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Registering RDD 5 (countByKey at SparkHoodieBloomIndex.java:114) as input to shuffle 0
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Got job 0 (countByKey at SparkHoodieBloomIndex.java:114) with 2 output partitions
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Final stage: ResultStage 2 (countByKey at SparkHoodieBloomIndex.java:114)
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 1)
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 1)
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 1 (MapPartitionsRDD[5] at countByKey at SparkHoodieBloomIndex.java:114), which has no missing parents
   21/05/26 18:33:28 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 6.2 KB, free 912.3 MB)
   21/05/26 18:33:28 INFO yarn.YarnAllocator: Driver requested a total number of 2 executor(s).
   21/05/26 18:33:28 INFO yarn.YarnAllocator: Will request 1 executor container(s), each with 1 core(s) and 2432 MB memory (including 384 MB of overhead)
   21/05/26 18:33:28 INFO yarn.YarnAllocator: Submitted 1 unlocalized container requests.
   21/05/26 18:33:28 INFO spark.ExecutorAllocationManager: Requesting 1 new executor because tasks are backlogged (new desired total will be 2)
   21/05/26 18:33:28 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.3 KB, free 912.3 MB)
   21/05/26 18:33:28 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx:38417 (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:28 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 1 (MapPartitionsRDD[5] at countByKey at SparkHoodieBloomIndex.java:114) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:28 INFO cluster.YarnClusterScheduler: Adding task set 1.0 with 2 tasks
   21/05/26 18:33:28 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 1.0 (TID 0, xxx, executor 1, partition 0, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:28 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx:35696 (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:29 INFO impl.AMRMClientImpl: Received new token for : xxx:45454
   21/05/26 18:33:29 INFO yarn.YarnAllocator: Launching container container_e03_1618828995116_0162_01_000004 on host xxx for executor with ID 2
   21/05/26 18:33:29 INFO yarn.YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them.
   21/05/26 18:33:29 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
   21/05/26 18:33:29 INFO impl.ContainerManagementProtocolProxy: Opening proxy : xxx:45454
   21/05/26 18:33:29 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to 10.246.3.9:49980
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added rdd_3_0 in memory on xxx:35696 (size: 0.0 B, free: 912.3 MB)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 1.0 (TID 1, xxx, executor 1, partition 1, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added rdd_3_1 in memory on xxx:35696 (size: 0.0 B, free: 912.3 MB)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 1.0 (TID 0) in 1023 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 1.0 (TID 1) in 70 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: ShuffleMapStage 1 (countByKey at SparkHoodieBloomIndex.java:114) finished in 1.177 s
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: looking for newly runnable stages
   21/05/26 18:33:29 INFO cluster.YarnClusterScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool 
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 2)
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: Submitting ResultStage 2 (ShuffledRDD[6] at countByKey at SparkHoodieBloomIndex.java:114), which has no missing parents
   21/05/26 18:33:29 INFO memory.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.8 KB, free 912.3 MB)
   21/05/26 18:33:29 INFO memory.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.2 KB, free 912.3 MB)
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on xxx:38417 (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:29 INFO spark.SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 2 (ShuffledRDD[6] at countByKey at SparkHoodieBloomIndex.java:114) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:29 INFO cluster.YarnClusterScheduler: Adding task set 2.0 with 2 tasks
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 2.0 (TID 2, xxx, executor 1, partition 0, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on xxx:35696 (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:29 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to 10.246.3.9:49980
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 2.0 (TID 3, xxx, executor 1, partition 1, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 2.0 (TID 2) in 85 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 2.0 (TID 3) in 32 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:29 INFO cluster.YarnClusterScheduler: Removed TaskSet 2.0, whose tasks have all completed, from pool 
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: ResultStage 2 (countByKey at SparkHoodieBloomIndex.java:114) finished in 0.126 s
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: Job 0 finished: countByKey at SparkHoodieBloomIndex.java:114, took 1.627903 s
   21/05/26 18:33:29 INFO yarn.YarnAllocator: Driver requested a total number of 1 executor(s).
   21/05/26 18:33:30 INFO spark.SparkContext: Starting job: collect at HoodieSparkEngineContext.java:78
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Got job 1 (collect at HoodieSparkEngineContext.java:78) with 1 output partitions
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Final stage: ResultStage 3 (collect at HoodieSparkEngineContext.java:78)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Parents of final stage: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Missing parents: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ResultStage 3 (MapPartitionsRDD[8] at flatMap at HoodieSparkEngineContext.java:78), which has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_2 stored as values in memory (estimated size 368.5 KB, free 911.9 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 101.0 KB, free 911.8 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on xxx:38417 (size: 101.0 KB, free: 912.2 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 3 (MapPartitionsRDD[8] at flatMap at HoodieSparkEngineContext.java:78) (first 15 tasks are for partitions Vector(0))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 3.0 with 1 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 3.0 (TID 4, xxx, executor 1, partition 0, PROCESS_LOCAL, 7710 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on xxx:35696 (size: 101.0 KB, free: 912.2 MB)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 3.0 (TID 4) in 178 ms on xxx (executor 1) (1/1)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 3.0, whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ResultStage 3 (collect at HoodieSparkEngineContext.java:78) finished in 0.233 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Job 1 finished: collect at HoodieSparkEngineContext.java:78, took 0.236923 s
   21/05/26 18:33:30 INFO spark.SparkContext: Starting job: collect at HoodieSparkEngineContext.java:73
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Got job 2 (collect at HoodieSparkEngineContext.java:73) with 1 output partitions
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Final stage: ResultStage 4 (collect at HoodieSparkEngineContext.java:73)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Parents of final stage: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Missing parents: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ResultStage 4 (MapPartitionsRDD[10] at map at HoodieSparkEngineContext.java:73), which has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_3 stored as values in memory (estimated size 368.3 KB, free 911.5 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 100.9 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_3_piece0 in memory on xxx:38417 (size: 100.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 4 (MapPartitionsRDD[10] at map at HoodieSparkEngineContext.java:73) (first 15 tasks are for partitions Vector(0))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 4.0 with 1 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 4.0 (TID 5, xxx, executor 1, partition 0, PROCESS_LOCAL, 7710 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_3_piece0 in memory on xxx:35696 (size: 100.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 4.0 (TID 5) in 94 ms on xxx (executor 1) (1/1)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 4.0, whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ResultStage 4 (collect at HoodieSparkEngineContext.java:73) finished in 0.167 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Job 2 finished: collect at HoodieSparkEngineContext.java:73, took 0.174163 s
   21/05/26 18:33:30 INFO spark.SparkContext: Starting job: countByKey at SparkHoodieBloomIndex.java:149
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Registering RDD 14 (countByKey at SparkHoodieBloomIndex.java:149) as input to shuffle 2
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Got job 3 (countByKey at SparkHoodieBloomIndex.java:149) with 2 output partitions
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Final stage: ResultStage 7 (countByKey at SparkHoodieBloomIndex.java:149)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 6)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 6)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 6 (MapPartitionsRDD[14] at countByKey at SparkHoodieBloomIndex.java:149), which has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_4 stored as values in memory (estimated size 7.5 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 3.9 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_4_piece0 in memory on xxx:38417 (size: 3.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 6 (MapPartitionsRDD[14] at countByKey at SparkHoodieBloomIndex.java:149) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 6.0 with 2 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 6.0 (TID 6, xxx, executor 1, partition 0, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_4_piece0 in memory on xxx:35696 (size: 3.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 6.0 (TID 7, xxx, executor 1, partition 1, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 6.0 (TID 6) in 60 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 6.0 (TID 7) in 36 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 6.0, whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ShuffleMapStage 6 (countByKey at SparkHoodieBloomIndex.java:149) finished in 0.121 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: looking for newly runnable stages
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 7)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ResultStage 7 (ShuffledRDD[15] at countByKey at SparkHoodieBloomIndex.java:149), which has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_5 stored as values in memory (estimated size 3.8 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 2.2 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_5_piece0 in memory on xxx:38417 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 5 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 7 (ShuffledRDD[15] at countByKey at SparkHoodieBloomIndex.java:149) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 7.0 with 2 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 7.0 (TID 8, xxx, executor 1, partition 0, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_5_piece0 in memory on xxx:35696 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 2 to 10.246.3.9:49980
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 7.0 (TID 9, xxx, executor 1, partition 1, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 7.0 (TID 8) in 47 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 7.0 (TID 9) in 20 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 7.0, whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ResultStage 7 (countByKey at SparkHoodieBloomIndex.java:149) finished in 0.081 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Job 3 finished: countByKey at SparkHoodieBloomIndex.java:149, took 0.219895 s
   21/05/26 18:33:30 INFO bloom.SparkHoodieBloomIndex: InputParallelism: ${2}, IndexParallelism: ${0}
   21/05/26 18:33:30 INFO bloom.BucketizedBloomCheckPartitioner: TotalBuckets 0, min_buckets/partition 1
   21/05/26 18:33:30 INFO rdd.MapPartitionsRDD: Removing RDD 3 from persistence list
   21/05/26 18:33:30 INFO storage.BlockManager: Removing RDD 3
   21/05/26 18:33:31 INFO rdd.MapPartitionsRDD: Removing RDD 22 from persistence list
   21/05/26 18:33:31 INFO storage.BlockManager: Removing RDD 22
   21/05/26 18:33:31 INFO spark.SparkContext: Starting job: countByKey at BaseSparkCommitActionExecutor.java:158
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 16 (mapToPair at SparkHoodieBloomIndex.java:266) as input to shuffle 6
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 23 (mapToPair at SparkHoodieBloomIndex.java:287) as input to shuffle 3
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 22 (flatMapToPair at SparkHoodieBloomIndex.java:274) as input to shuffle 4
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 31 (countByKey at BaseSparkCommitActionExecutor.java:158) as input to shuffle 5
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Got job 4 (countByKey at BaseSparkCommitActionExecutor.java:158) with 2 output partitions
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Final stage: ResultStage 13 (countByKey at BaseSparkCommitActionExecutor.java:158)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 12)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 12)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 10 (MapPartitionsRDD[23] at mapToPair at SparkHoodieBloomIndex.java:287), which has no missing parents
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_6 stored as values in memory (estimated size 5.9 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_6_piece0 stored as bytes in memory (estimated size 3.3 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_6_piece0 in memory on xxx:38417 (size: 3.3 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Created broadcast 6 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 10 (MapPartitionsRDD[23] at mapToPair at SparkHoodieBloomIndex.java:287) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Adding task set 10.0 with 2 tasks
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 10.0 (TID 10, xxx, executor 1, partition 0, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_6_piece0 in memory on xxx:35696 (size: 3.3 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 10.0 (TID 11, xxx, executor 1, partition 1, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 10.0 (TID 10) in 50 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 10.0 (TID 11) in 24 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Removed TaskSet 10.0, whose tasks have all completed, from pool 
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: ShuffleMapStage 10 (mapToPair at SparkHoodieBloomIndex.java:287) finished in 0.092 s
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: looking for newly runnable stages
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: waiting: Set(ShuffleMapStage 12, ResultStage 13)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 12 (MapPartitionsRDD[31] at countByKey at BaseSparkCommitActionExecutor.java:158), which has no missing parents
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_7 stored as values in memory (estimated size 7.1 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_7_piece0 stored as bytes in memory (estimated size 3.8 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_7_piece0 in memory on xxx:38417 (size: 3.8 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Created broadcast 7 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 12 (MapPartitionsRDD[31] at countByKey at BaseSparkCommitActionExecutor.java:158) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Adding task set 12.0 with 2 tasks
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 12.0 (TID 12, xxx, executor 1, partition 0, PROCESS_LOCAL, 7730 bytes)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_7_piece0 in memory on xxx:35696 (size: 3.8 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 3 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 4 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added rdd_29_0 in memory on xxx:35696 (size: 0.0 B, free: 912.1 MB)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 12.0 (TID 13, xxx, executor 1, partition 1, PROCESS_LOCAL, 7730 bytes)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 12.0 (TID 12) in 105 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added rdd_29_1 in memory on xxx:35696 (size: 0.0 B, free: 912.1 MB)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 12.0 (TID 13) in 24 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Removed TaskSet 12.0, whose tasks have all completed, from pool 
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: ShuffleMapStage 12 (countByKey at BaseSparkCommitActionExecutor.java:158) finished in 0.146 s
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: looking for newly runnable stages
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 13)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting ResultStage 13 (ShuffledRDD[32] at countByKey at BaseSparkCommitActionExecutor.java:158), which has no missing parents
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_8 stored as values in memory (estimated size 3.8 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_8_piece0 stored as bytes in memory (estimated size 2.2 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_8_piece0 in memory on xxx:38417 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Created broadcast 8 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 13 (ShuffledRDD[32] at countByKey at BaseSparkCommitActionExecutor.java:158) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Adding task set 13.0 with 2 tasks
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 13.0 (TID 14, xxx, executor 1, partition 0, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_8_piece0 in memory on xxx:35696 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 5 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 13.0 (TID 15, xxx, executor 1, partition 1, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 13.0 (TID 14) in 31 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 13.0 (TID 15) in 12 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Removed TaskSet 13.0, whose tasks have all completed, from pool 
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: ResultStage 13 (countByKey at BaseSparkCommitActionExecutor.java:158) finished in 0.064 s
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Job 4 finished: countByKey at BaseSparkCommitActionExecutor.java:158, took 0.320123 s
   21/05/26 18:33:31 INFO commit.BaseSparkCommitActionExecutor: Workload profile :WorkloadProfile {globalStat=WorkloadStat {numInserts=0, numUpdates=0}, partitionStat={}, operationType=UPSERT}
   21/05/26 18:33:31 INFO timeline.HoodieActiveTimeline: Checking for file exists ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit.requested
   21/05/26 18:33:31 INFO timeline.HoodieActiveTimeline: Create new file for toInstant ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit.inflight
   21/05/26 18:33:31 INFO commit.UpsertPartitioner: AvgRecordSize => 1024
   21/05/26 18:33:31 INFO view.AbstractTableFileSystemView: Took 3 ms to read  0 instants, 0 replaced file groups
   21/05/26 18:33:31 INFO util.ClusteringUtils: Found 0 files in pending clustering operations
   21/05/26 18:33:31 INFO commit.UpsertPartitioner: Total Buckets :0, buckets info => {}, 
   Partition to insert buckets => {}, 
   UpdateLocations mapped to buckets =>{}
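   (Consistent with the zero-record read above: the workload profile shows numInserts=0/numUpdates=0 and the upsert partitioner builds zero buckets, so only timeline metadata lands under .hoodie. A quick recursive listing of the table base path confirms that no base or log files were created; the namenode address and base path are the redacted placeholders from this run:

      import java.net.URI;
      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.FileSystem;
      import org.apache.hadoop.fs.LocatedFileStatus;
      import org.apache.hadoop.fs.Path;
      import org.apache.hadoop.fs.RemoteIterator;

      public class ListHudiTable {
          public static void main(String[] args) throws Exception {
              // placeholders taken from the redacted log: namenode and table base path
              FileSystem fs = FileSystem.get(URI.create("hdfs://xxx:8020"), new Configuration());
              RemoteIterator<LocatedFileStatus> it =
                      fs.listFiles(new Path("/user/hd_xyz/yyy/ml_xxx/foo"), true);
              while (it.hasNext()) {
                  // after the empty commit we only expect .hoodie/* entries here
                  System.out.println(it.next().getPath());
              }
          }
      }
   )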
   21/05/26 18:33:31 INFO commit.BaseCommitActionExecutor: Auto commit disabled for 20210526183328
   21/05/26 18:33:31 INFO spark.SparkContext: Starting job: sum at DeltaSync.java:448
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Job 5 finished: sum at DeltaSync.java:448, took 0.000044 s
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 36
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 80
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 103
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 108
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 183
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 72
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 54
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 132
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 99
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 19
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 93
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 179
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 215
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 66
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 77
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 151
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 116
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 191
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 17
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 14
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 18
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 125
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 204
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 146
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 50
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 56
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 52
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 101
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 221
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 213
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 181
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 190
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 85
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned shuffle 2
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 156
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 161
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 53
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 197
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 20
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 41
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 44
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 140
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 218
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 188
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 122
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 195
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 167
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 220
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 43
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 199
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 155
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 24
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 219
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 71
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 198
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 23
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 135
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 26
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 141
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 121
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 157
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 13
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 130
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned shuffle 0
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 7
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 138
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 63
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 187
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 32
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 196
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 48
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 206
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 119
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 160
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 90
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 40
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 113
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_0_piece0 on xxx:38417 in memory (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_0_piece0 on xxx:35696 in memory (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 68
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 224
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 28
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 202
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 10
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 139
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 76
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 49
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 137
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 58
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 on xxx:38417 in memory (size: 3.9 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 on xxx:35696 in memory (size: 3.9 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 4
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 211
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 212
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 83
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 203
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 33
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 86
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 82
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_1_piece0 on xxx:38417 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_1_piece0 on xxx:35696 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 95
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 142
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 111
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 98
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 184
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 46
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 129
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 104
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 159
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 59
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 25
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 173
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 79
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 153
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 189
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 51
   21/05/26 18:33:32 INFO spark.SparkContext: Starting job: sum at DeltaSync.java:449
   21/05/26 18:33:32 INFO scheduler.DAGScheduler: Job 6 finished: sum at DeltaSync.java:449, took 0.000035 s
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO spark.SparkContext: Starting job: collect at SparkRDDWriteClient.java:120
   21/05/26 18:33:32 INFO scheduler.DAGScheduler: Job 7 finished: collect at SparkRDDWriteClient.java:120, took 0.000039 s
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants [[==>20210526183328__deltacommit__INFLIGHT]]
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:32 INFO util.CommitUtils: Creating  metadata for UPSERT numWriteStats:0numReplaceFileIds:0
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants [[==>20210526183328__deltacommit__INFLIGHT]]
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:32 INFO client.AbstractHoodieWriteClient: Committing 20210526183328 action deltacommit
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Marking instant complete [==>20210526183328__deltacommit__INFLIGHT]
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Checking for file exists ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit.inflight
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Create new file for toInstant ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Completed [==>20210526183328__deltacommit__INFLIGHT]
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants [[==>20210526183328__deltacommit__REQUESTED], [==>20210526183328__deltacommit__INFLIGHT], [20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:32 INFO table.HoodieTimelineArchiveLog: No Instants to archive
   21/05/26 18:33:32 INFO client.AbstractHoodieWriteClient: Auto cleaning is enabled. Running cleaner now
   21/05/26 18:33:32 INFO client.AbstractHoodieWriteClient: Scheduling cleaning at instant time :20210526183332
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants [[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote view for basePath /user/hd_xyz/yyy/ml_xxx/foo. Server=xxx:37089, Timeout=300
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating InMemory based view for basePath /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO view.AbstractTableFileSystemView: Took 0 ms to read  0 instants, 0 replaced file groups
   21/05/26 18:33:32 INFO util.ClusteringUtils: Found 0 files in pending clustering operations
   21/05/26 18:33:32 INFO view.RemoteHoodieTableFileSystemView: Sending request : (http://xxx:37089/v1/hoodie/view/compactions/pending/?basepath=%2Fuser%2Fhdfs%2Fxyz%2Fpublic%2Fml_xxx%2Ffoo&lastinstantts=20210526183328&timelinehash=3cb19d4eacc8a39b3d4198ed17d5dac7ca1a076cc50020fab31fed29c6ccddb1)
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:33 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO timeline.HoodieActiveTimeline: Loaded instants [[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:33 INFO collection.RocksDBDAO: DELETING RocksDB persisted at /tmp/hoodie_timeline_rocksdb/_user_hdfs_xyz_public_ml_xxx_foo/a138e066-6b6b-4f72-8865-4c30301cbe11
   21/05/26 18:33:33 INFO collection.RocksDBDAO: No column family found. Loading default
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl_open.cc:230] Creating manifest 1 
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/version_set.cc:3406] Recovering from manifest file: MANIFEST-000001
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [default]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/version_set.cc:3610] Recovered from manifest file:/tmp/hoodie_timeline_rocksdb/_user_hdfs_xyz_public_ml_xxx_foo/a138e066-6b6b-4f72-8865-4c30301cbe11/MANIFEST-000001 succeeded,manifest_file_number is 1, next_file_number is 3, last_sequence is 0, log_number is 0,prev_log_number is 0,max_column_family is 0,min_log_number_to_keep is 0
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/version_set.cc:3618] Column family [default] (ID 0), log number is 0
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl_open.cc:1287] DB pointer 0x7f3aaccf1f20
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/version_set.cc:2936] Creating manifest 6
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_view__user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_view__user_hdfs_xyz_public_ml_xxx_foo] (ID 1)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_pending_compaction__user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_pending_compaction__user_hdfs_xyz_public_ml_xxx_foo] (ID 2)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_bootstrap_basefile__user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_bootstrap_basefile__user_hdfs_xyz_public_ml_xxx_foo] (ID 3)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_partitions__user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_partitions__user_hdfs_xyz_public_ml_xxx_foo] (ID 4)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_replaced_fg_user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_replaced_fg_user_hdfs_xyz_public_ml_xxx_foo] (ID 5)
   21/05/26 18:33:33 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.246.4.117:53684) with ID 2
   21/05/26 18:33:33 INFO spark.ExecutorAllocationManager: New executor 2 has registered (new total is 2)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_pending_clustering_fg_user_hdfs_xyz_public_ml_xxx_foo]:
   
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_pending_clustering_fg_user_hdfs_xyz_public_ml_xxx_foo] (ID 6)
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting replacedFileGroups to ROCKSDB based file-system view at /tmp/hoodie_timeline_rocksdb, Total file-groups=0
   21/05/26 18:33:33 INFO collection.RocksDBDAO: Prefix DELETE (query=part=) on hudi_replaced_fg_user_hdfs_xyz_public_ml_xxx_foo
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting replacedFileGroups to ROCKSDB based file-system view complete
   21/05/26 18:33:33 INFO view.AbstractTableFileSystemView: Took 9 ms to read  0 instants, 0 replaced file groups
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Initializing pending compaction operations. Count=0
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Initializing external data file mapping. Count=0
   21/05/26 18:33:33 INFO util.ClusteringUtils: Found 0 files in pending clustering operations
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting file groups in pending clustering to ROCKSDB based file-system view at /tmp/hoodie_timeline_rocksdb, Total file-groups=0
   21/05/26 18:33:33 INFO collection.RocksDBDAO: Prefix DELETE (query=part=) on hudi_pending_clustering_fg_user_hdfs_xyz_public_ml_xxx_foo
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting replacedFileGroups to ROCKSDB based file-system view complete
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Created ROCKSDB based file-system view at /tmp/hoodie_timeline_rocksdb
   21/05/26 18:33:33 INFO collection.RocksDBDAO: Prefix Search for (query=) on hudi_pending_compaction__user_hdfs_xyz_public_ml_xxx_foo. Total Time Taken (msec)=1. Serialization Time taken(micro)=0, num entries=0
   21/05/26 18:33:33 INFO service.RequestHandler: TimeTakenMillis[Total=791, Refresh=779, handle=11, Check=1], Success=true, Query=basepath=%2Fuser%2Fhdfs%2Fxyz%2Fpublic%2Fml_xxx%2Ffoo&lastinstantts=20210526183328&timelinehash=3cb19d4eacc8a39b3d4198ed17d5dac7ca1a076cc50020fab31fed29c6ccddb1, Host=xxx:37089, synced=false
   21/05/26 18:33:33 INFO storage.BlockManagerMasterEndpoint: Registering block manager xxx:36920 with 912.3 MB RAM, BlockManagerId(2, xxx, 36920, None)
   21/05/26 18:33:33 INFO clean.CleanPlanner: No earliest commit to retain. No need to scan partitions !!
   21/05/26 18:33:33 INFO clean.CleanPlanner: Nothing to clean here. It is already clean
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Cleaner started
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Cleaned failed attempts if any
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:33 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO timeline.HoodieActiveTimeline: Loaded instants [[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:33 INFO client.SparkRDDWriteClient: Successfully synced to metadata table
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Committed 20210526183328
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Scheduling table service COMPACT
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Scheduling compaction at instant time :20210526183333
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:33 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO timeline.HoodieActiveTimeline: Loaded instants [[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:33 INFO compact.SparkScheduleCompactionActionExecutor: Checking if compaction needs to be run on /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO deltastreamer.DeltaSync: Commit 20210526183328 successful!
   21/05/26 18:33:33 INFO rdd.MapPartitionsRDD: Removing RDD 29 from persistence list
   21/05/26 18:33:33 INFO storage.BlockManager: Removing RDD 29
   21/05/26 18:33:34 INFO rdd.MapPartitionsRDD: Removing RDD 37 from persistence list
   21/05/26 18:33:34 INFO storage.BlockManager: Removing RDD 37
   21/05/26 18:33:34 INFO deltastreamer.DeltaSync: Shutting down embedded timeline server
   21/05/26 18:33:34 INFO embedded.EmbeddedTimelineService: Closing Timeline server
   21/05/26 18:33:34 INFO service.TimelineService: Closing Timeline Service
   21/05/26 18:33:34 INFO javalin.Javalin: Stopping Javalin ...
   21/05/26 18:33:34 INFO javalin.Javalin: Javalin has stopped
   21/05/26 18:33:34 INFO view.RocksDbBasedFileSystemView: Closing Rocksdb !!
   21/05/26 18:33:34 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:365] Shutdown: canceling all background work
   21/05/26 18:33:34 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:521] Shutdown complete
   21/05/26 18:33:34 INFO view.RocksDbBasedFileSystemView: Closed Rocksdb !!
   21/05/26 18:33:34 INFO service.TimelineService: Closed Timeline Service
   21/05/26 18:33:34 INFO embedded.EmbeddedTimelineService: Closed Timeline server
   21/05/26 18:33:34 INFO deltastreamer.HoodieDeltaStreamer: Shut down delta streamer
   21/05/26 18:33:34 INFO server.AbstractConnector: Stopped Spark@7a0e94b4{HTTP/1.1,[http/1.1]}{0.0.0.0:0}
   21/05/26 18:33:34 INFO ui.SparkUI: Stopped Spark web UI at http://xxx:32822
   21/05/26 18:33:34 INFO yarn.YarnAllocator: Driver requested a total number of 0 executor(s).
   21/05/26 18:33:34 INFO cluster.YarnClusterSchedulerBackend: Shutting down all executors
   21/05/26 18:33:34 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
   21/05/26 18:33:34 INFO cluster.SchedulerExtensionServices: Stopping SchedulerExtensionServices
   (serviceOption=None,
    services=List(),
    started=false)
   21/05/26 18:33:34 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
   21/05/26 18:33:34 INFO memory.MemoryStore: MemoryStore cleared
   21/05/26 18:33:34 INFO storage.BlockManager: BlockManager stopped
   21/05/26 18:33:34 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
   21/05/26 18:33:34 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
   21/05/26 18:33:34 INFO spark.SparkContext: Successfully stopped SparkContext
   21/05/26 18:33:34 INFO yarn.ApplicationMaster: Final app status: SUCCEEDED, exitCode: 0
   21/05/26 18:33:34 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with SUCCEEDED
   21/05/26 18:33:34 INFO impl.AMRMClientImpl: Waiting for application to be successfully unregistered.
   21/05/26 18:33:34 INFO yarn.ApplicationMaster: Deleting staging directory hdfs://xxx:8020/user/hd_xyz/.sparkStaging/application_1618828995116_0162
   21/05/26 18:33:34 INFO util.ShutdownHookManager: Shutdown hook called
   21/05/26 18:33:34 INFO util.ShutdownHookManager: Deleting directory /data/hadoop/yarn/local/usercache/hdfs/appcache/application_1618828995116_0162/spark-4c7e81b9-e526-4325-abf0-d163828b92b5
   
   
   """



[GitHub] [hudi] PavelPetukhov edited a comment on issue #2959: No data stored after migrating to Hudi 0.8.0

Posted by GitBox <gi...@apache.org>.
PavelPetukhov edited a comment on issue #2959:
URL: https://github.com/apache/hudi/issues/2959#issuecomment-848930327


   This is our full log:
   [spark_log.txt](https://github.com/apache/hudi/files/6548394/spark_log.txt)
   





[GitHub] [hudi] PavelPetukhov edited a comment on issue #2959: No data stored after migrating to Hudi 0.8.0

Posted by GitBox <gi...@apache.org>.
PavelPetukhov edited a comment on issue #2959:
URL: https://github.com/apache/hudi/issues/2959#issuecomment-848930327


   `
   
   Log Type: stderr
   Log Upload Time: Wed May 26 18:33:34 +0300 2021
   Log Length: 104910
   21/05/26 18:33:18 INFO util.SignalUtils: Registered signal handler for TERM
   21/05/26 18:33:18 INFO util.SignalUtils: Registered signal handler for HUP
   21/05/26 18:33:18 INFO util.SignalUtils: Registered signal handler for INT
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing view acls to: yarn,hdfs
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing modify acls to: yarn,hdfs
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing view acls groups to: 
   21/05/26 18:33:18 INFO spark.SecurityManager: Changing modify acls groups to: 
   21/05/26 18:33:18 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(yarn, hdfs); groups with view permissions: Set(); users  with modify permissions: Set(yarn, hdfs); groups with modify permissions: Set()
   21/05/26 18:33:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
   21/05/26 18:33:18 INFO yarn.ApplicationMaster: Preparing Local resources
   21/05/26 18:33:19 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
   21/05/26 18:33:19 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1618828995116_0162_000001
   21/05/26 18:33:19 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
   21/05/26 18:33:19 INFO yarn.ApplicationMaster: Waiting for spark context initialization...
   21/05/26 18:33:19 WARN deltastreamer.SchedulerConfGenerator: Job Scheduling Configs will not be in effect as spark.scheduler.mode is not set to FAIR at instantiation time. Continuing without scheduling configs
   21/05/26 18:33:19 INFO spark.SparkContext: Running Spark version 2.4.7
   21/05/26 18:33:19 INFO spark.SparkContext: Submitted application: xxx
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing view acls to: yarn,hdfs
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing modify acls to: yarn,hdfs
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing view acls groups to: 
   21/05/26 18:33:19 INFO spark.SecurityManager: Changing modify acls groups to: 
   21/05/26 18:33:19 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(yarn, hdfs); groups with view permissions: Set(); users  with modify permissions: Set(yarn, hdfs); groups with modify permissions: Set()
   21/05/26 18:33:20 INFO util.Utils: Successfully started service 'sparkDriver' on port 37691.
   21/05/26 18:33:20 INFO spark.SparkEnv: Registering MapOutputTracker
   21/05/26 18:33:20 INFO spark.SparkEnv: Registering BlockManagerMaster
   21/05/26 18:33:20 INFO storage.BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
   21/05/26 18:33:20 INFO storage.BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
   21/05/26 18:33:20 INFO storage.DiskBlockManager: Created local directory at /data/hadoop/yarn/local/usercache/hdfs/appcache/application_1618828995116_0162/blockmgr-9de167db-4756-414e-9126-32cb562e91aa
   21/05/26 18:33:20 INFO memory.MemoryStore: MemoryStore started with capacity 912.3 MB
   21/05/26 18:33:20 INFO spark.SparkEnv: Registering OutputCommitCoordinator
   21/05/26 18:33:20 INFO util.log: Logging initialized @2935ms
   21/05/26 18:33:20 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /jobs, /jobs/json, /jobs/job, /jobs/job/json, /stages, /stages/json, /stages/stage, /stages/stage/json, /stages/pool, /stages/pool/json, /storage, /storage/json, /storage/rdd, /storage/rdd/json, /environment, /environment/json, /executors, /executors/json, /executors/threadDump, /executors/threadDump/json, /static, /, /api, /jobs/job/kill, /stages/stage/kill.
   21/05/26 18:33:20 INFO server.Server: jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
   21/05/26 18:33:20 INFO server.Server: Started @3069ms
   21/05/26 18:33:20 INFO server.AbstractConnector: Started ServerConnector@7a0e94b4{HTTP/1.1,[http/1.1]}{0.0.0.0:32822}
   21/05/26 18:33:20 INFO util.Utils: Successfully started service 'SparkUI' on port 32822.
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@43837fbc{/jobs,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@d91ba30{/jobs/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4854d5d9{/jobs/job,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@672e7ec3{/jobs/job/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@67ee182c{/stages,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@97af315{/stages/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1936a0e0{/stages/stage,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@447ef19e{/stages/stage/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@68e36851{/stages/pool,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@352fe12b{/stages/pool/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3d39f28d{/storage,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@e7806b5{/storage/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7d2a56cb{/storage/rdd,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@37c6c6fc{/storage/rdd/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4599e713{/environment,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@b9a0cbb{/environment/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@24299f0d{/executors,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@25594c52{/executors/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2f728695{/executors/threadDump,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7456a814{/executors/threadDump/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1cef9064{/static,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@16ba2eda{/,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@dac88e2{/api,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@145850ef{/jobs/job/kill,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6d678cf2{/stages/stage/kill,null,AVAILABLE,@Spark}
   21/05/26 18:33:20 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://xxx:32822
   21/05/26 18:33:20 INFO cluster.YarnClusterScheduler: Created YarnClusterScheduler
   21/05/26 18:33:20 INFO cluster.SchedulerExtensionServices: Starting Yarn extension services with app application_1618828995116_0162 and attemptId Some(appattempt_1618828995116_0162_000001)
   21/05/26 18:33:20 WARN util.Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
   21/05/26 18:33:20 INFO util.Utils: Using initial executors = 1, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
   21/05/26 18:33:20 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 38417.
   21/05/26 18:33:20 INFO netty.NettyBlockTransferService: Server created on xxx:38417
   21/05/26 18:33:20 INFO storage.BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
   21/05/26 18:33:20 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO storage.BlockManagerMasterEndpoint: Registering block manager xxx:38417 with 912.3 MB RAM, BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO storage.BlockManager: external shuffle service port = 7337
   21/05/26 18:33:20 INFO storage.BlockManager: Initialized BlockManager: BlockManagerId(driver, xxx, 38417, None)
   21/05/26 18:33:20 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /metrics/json.
   21/05/26 18:33:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1b3c78ce{/metrics/json,null,AVAILABLE,@Spark}
   21/05/26 18:33:21 INFO scheduler.EventLoggingListener: Logging events to hdfs://xxx:8020/eventLogging/application_1618828995116_0162_1
   21/05/26 18:33:21 WARN util.Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
   21/05/26 18:33:21 INFO util.Utils: Using initial executors = 1, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
   21/05/26 18:33:21 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
   21/05/26 18:33:21 INFO client.RMProxy: Connecting to ResourceManager at xxx/10.246.4.117:8030
   21/05/26 18:33:21 INFO yarn.YarnRMClient: Registering the ApplicationMaster
   21/05/26 18:33:21 INFO yarn.ApplicationMaster: 
   ===============================================================================
   YARN executor launch context:
     env:
       CLASSPATH -> {{PWD}}<CPS>{{PWD}}/__spark_conf__<CPS>{{PWD}}/__spark_libs__/*<CPS>/usr/hdp/2.6.0.3-8/hadoop/conf<CPS>/usr/hdp/2.6.0.3-8/hadoop/*<CPS>/usr/hdp/2.6.0.3-8/hadoop/lib/*<CPS>/usr/hdp/current/hadoop-hdfs-client/*<CPS>/usr/hdp/current/hadoop-hdfs-client/lib/*<CPS>/usr/hdp/current/hadoop-yarn-client/*<CPS>/usr/hdp/current/hadoop-yarn-client/lib/*<CPS>/usr/hdp/current/ext/hadoop/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*<CPS>{{PWD}}/__spark_conf__/__hadoop_conf__
       SPARK_YARN_STAGING_DIR -> hdfs://xxx:8020/user/hd_xyz/.sparkStaging/application_1618828995116_0162
       SPARK_USER -> hdfs
   
     command:
       {{JAVA_HOME}}/bin/java \ 
         -server \ 
         -Xmx2048m \ 
         -Djava.io.tmpdir={{PWD}}/tmp \ 
         '-Dspark.driver.port=37691' \ 
         '-Dspark.ui.port=0' \ 
         -Dspark.yarn.app.container.log.dir=<LOG_DIR> \ 
         -XX:OnOutOfMemoryError='kill %p' \ 
         org.apache.spark.executor.CoarseGrainedExecutorBackend \ 
         --driver-url \ 
         spark://CoarseGrainedScheduler@xxx:37691 \ 
         --executor-id \ 
         <executorId> \ 
         --hostname \ 
         <hostname> \ 
         --cores \ 
         1 \ 
         --app-id \ 
         application_1618828995116_0162 \ 
         --user-class-path \ 
         file:$PWD/__app__.jar \ 
         --user-class-path \ 
         file:$PWD/org.apache.spark_spark-avro_2.12-2.4.7.jar \ 
         --user-class-path \ 
         file:$PWD/org.spark-project.spark_unused-1.0.0.jar \ 
         1><LOG_DIR>/stdout \ 
         2><LOG_DIR>/stderr
   
     resources:
       org.apache.spark_spark-avro_2.12-2.4.7.jar -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/hd_xyz/.sparkStaging/application_1618828995116_0162/org.apache.spark_spark-avro_2.12-2.4.7.jar" } size: 107269 timestamp: 1622043191967 type: FILE visibility: PRIVATE
       __app__.jar -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/jars/hudi/hudi-utilities-bundle_2.12-0.8.0.jar" } size: 40399204 timestamp: 1622022896130 type: FILE visibility: PUBLIC
       __spark_conf__ -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/hd_xyz/.sparkStaging/application_1618828995116_0162/__spark_conf__.zip" } size: 205423 timestamp: 1622043193955 type: ARCHIVE visibility: PRIVATE
       org.spark-project.spark_unused-1.0.0.jar -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/hd_xyz/.sparkStaging/application_1618828995116_0162/org.spark-project.spark_unused-1.0.0.jar" } size: 2777 timestamp: 1622043192905 type: FILE visibility: PRIVATE
       __spark_libs__ -> resource { scheme: "hdfs" host: "xxx" port: 8020 file: "/user/hd_xyz/.sparkStaging/application_1618828995116_0162/__spark_libs__2858796966972713370.zip" } size: 242613518 timestamp: 1622043190403 type: ARCHIVE visibility: PRIVATE
   
   ===============================================================================
   21/05/26 18:33:21 WARN util.Utils: spark.executor.instances less than spark.dynamicAllocation.minExecutors is invalid, ignoring its setting, please update your configs.
   21/05/26 18:33:21 INFO util.Utils: Using initial executors = 1, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
   21/05/26 18:33:21 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(spark://YarnAM@xxx:37691)
   21/05/26 18:33:21 INFO yarn.YarnAllocator: Will request 1 executor container(s), each with 1 core(s) and 2432 MB memory (including 384 MB of overhead)
   21/05/26 18:33:21 INFO yarn.YarnAllocator: Submitted 1 unlocalized container requests.
   21/05/26 18:33:21 INFO yarn.ApplicationMaster: Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
   21/05/26 18:33:22 INFO impl.AMRMClientImpl: Received new token for : xxx:45454
   21/05/26 18:33:22 INFO yarn.YarnAllocator: Launching container container_e03_1618828995116_0162_01_000002 on host xxx for executor with ID 1
   21/05/26 18:33:22 INFO yarn.YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them.
   21/05/26 18:33:22 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
   21/05/26 18:33:22 INFO impl.ContainerManagementProtocolProxy: Opening proxy : xxx:45454
   21/05/26 18:33:25 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.246.3.9:49980) with ID 1
   21/05/26 18:33:25 INFO spark.ExecutorAllocationManager: New executor 1 has registered (new total is 1)
   21/05/26 18:33:25 INFO cluster.YarnClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
   21/05/26 18:33:25 INFO cluster.YarnClusterScheduler: YarnClusterScheduler.postStartHook done
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO utilities.UtilHelpers: Adding overridden properties to file properties.
   21/05/26 18:33:25 WARN spark.SparkContext: Using an existing SparkContext; some configuration may not take effect.
   21/05/26 18:33:25 INFO storage.BlockManagerMasterEndpoint: Registering block manager xxx:35696 with 912.3 MB RAM, BlockManagerId(1, xxx, 35696, None)
   21/05/26 18:33:25 INFO deltastreamer.HoodieDeltaStreamer: Creating delta streamer with configs : {hoodie.deltastreamer.keygen.timebased.input.timezone=, hoodie.embed.timeline.server=true, schema.registry.url=http://xxx, hoodie.filesystem.view.type=EMBEDDED_KV_STORE, hoodie.deltastreamer.keygen.timebased.input.dateformat=yyyy-MM-ddTHH:mm:ssZ,yyyy-MM-ddTHH:mm:ss.SSSZ, hoodie.delete.shuffle.parallelism=2, hoodie.bulkinsert.shuffle.parallelism=2, hoodie.deltastreamer.keygen.timebased.output.dateformat=yyyy/MM/dd, group.id=hudi_group_080, auto.offset.reset=earliest, hoodie.insert.shuffle.parallelism=2, hoodie.deltastreamer.keygen.timebased.timestamp.type=DATE_STRING, hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.CustomKeyGenerator, hoodie.deltastreamer.source.kafka.topic=xxx, bootstrap.servers=xxx:9092, hoodie.deltastreamer.keygen.timebased.input.dateformat.list.delimiter.regex=, hoodie.deltastreamer.schemaprovider.registry.url=http://xxx/subjects/xxx-value/versions/latest, hoodie.datasource.write.recordkey.field=id, hoodie.upsert.shuffle.parallelism=2, hoodie.datasource.write.partitionpath.field=date:TIMESTAMP}
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Initializing /user/hd_xyz/yyy/ml_xxx/foo as hoodie table /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Finished initializing Table of type MERGE_ON_READ from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO deltastreamer.DeltaSync: Registering Schema :[{"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}, {"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}]
   21/05/26 18:33:25 INFO deltastreamer.HoodieDeltaStreamer: Delta Streamer running only single round
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:25 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:25 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:25 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:26 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:26 INFO deltastreamer.DeltaSync: Checkpoint to resume from : Optional.empty
   21/05/26 18:33:26 INFO consumer.ConsumerConfig: ConsumerConfig values: 
   	allow.auto.create.topics = true
   	auto.commit.interval.ms = 5000
   	auto.offset.reset = earliest
   	bootstrap.servers = [xxx]
   	check.crcs = true
   	client.dns.lookup = default
   	client.id = 
   	client.rack = 
   	connections.max.idle.ms = 540000
   	default.api.timeout.ms = 60000
   	enable.auto.commit = true
   	exclude.internal.topics = true
   	fetch.max.bytes = 52428800
   	fetch.max.wait.ms = 500
   	fetch.min.bytes = 1
   	group.id = hudi_group_080
   	group.instance.id = null
   	heartbeat.interval.ms = 3000
   	interceptor.classes = []
   	internal.leave.group.on.close = true
   	isolation.level = read_uncommitted
   	key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer
   	max.partition.fetch.bytes = 1048576
   	max.poll.interval.ms = 300000
   	max.poll.records = 500
   	metadata.max.age.ms = 300000
   	metric.reporters = []
   	metrics.num.samples = 2
   	metrics.recording.level = INFO
   	metrics.sample.window.ms = 30000
   	partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
   	receive.buffer.bytes = 65536
   	reconnect.backoff.max.ms = 1000
   	reconnect.backoff.ms = 50
   	request.timeout.ms = 30000
   	retry.backoff.ms = 100
   	sasl.client.callback.handler.class = null
   	sasl.jaas.config = null
   	sasl.kerberos.kinit.cmd = /usr/bin/kinit
   	sasl.kerberos.min.time.before.relogin = 60000
   	sasl.kerberos.service.name = null
   	sasl.kerberos.ticket.renew.jitter = 0.05
   	sasl.kerberos.ticket.renew.window.factor = 0.8
   	sasl.login.callback.handler.class = null
   	sasl.login.class = null
   	sasl.login.refresh.buffer.seconds = 300
   	sasl.login.refresh.min.period.seconds = 60
   	sasl.login.refresh.window.factor = 0.8
   	sasl.login.refresh.window.jitter = 0.05
   	sasl.mechanism = GSSAPI
   	security.protocol = PLAINTEXT
   	security.providers = null
   	send.buffer.bytes = 131072
   	session.timeout.ms = 10000
   	ssl.cipher.suites = null
   	ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
   	ssl.endpoint.identification.algorithm = https
   	ssl.key.password = null
   	ssl.keymanager.algorithm = SunX509
   	ssl.keystore.location = null
   	ssl.keystore.password = null
   	ssl.keystore.type = JKS
   	ssl.protocol = TLS
   	ssl.provider = null
   	ssl.secure.random.implementation = null
   	ssl.trustmanager.algorithm = PKIX
   	ssl.truststore.location = null
   	ssl.truststore.password = null
   	ssl.truststore.type = JKS
   	value.deserializer = class io.confluent.kafka.serializers.KafkaAvroDeserializer
   
   21/05/26 18:33:26 INFO serializers.KafkaAvroDeserializerConfig: KafkaAvroDeserializerConfig values: 
   	schema.registry.url = [xxx]
   	max.schemas.per.subject = 1000
   	specific.avro.reader = false
   
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.timestamp.type' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.output.dateformat' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.input.dateformat.list.delimiter.regex' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.input.dateformat' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.datasource.write.partitionpath.field' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.delete.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.datasource.write.recordkey.field' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.upsert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.datasource.write.keygenerator.class' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.source.kafka.topic' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.schemaprovider.registry.url' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.insert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.embed.timeline.server' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.bulkinsert.shuffle.parallelism' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.deltastreamer.keygen.timebased.input.timezone' was supplied but isn't a known config.
   21/05/26 18:33:26 WARN consumer.ConsumerConfig: The configuration 'hoodie.filesystem.view.type' was supplied but isn't a known config.
   21/05/26 18:33:26 INFO utils.AppInfoParser: Kafka version: 2.4.1
   21/05/26 18:33:26 INFO utils.AppInfoParser: Kafka commitId: c57222ae8cd7866b
   21/05/26 18:33:26 INFO utils.AppInfoParser: Kafka startTimeMs: 1622043206225
   21/05/26 18:33:26 INFO clients.Metadata: [Consumer clientId=consumer-hudi_group_080-1, groupId=hudi_group_080] Cluster ID: 5XoPi9AYT0mbHVQEj6VEaw
   21/05/26 18:33:27 INFO helpers.KafkaOffsetGen: SourceLimit not configured, set numEvents to default value : 5000000
   21/05/26 18:33:27 INFO sources.AvroKafkaSource: About to read 0 from Kafka for topic :xxx
   21/05/26 18:33:27 INFO deltastreamer.DeltaSync: No new data, perform empty commit.
   21/05/26 18:33:27 INFO deltastreamer.DeltaSync: Setting up new Hoodie Write Client
   21/05/26 18:33:27 INFO deltastreamer.DeltaSync: Registering Schema :[{"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}, {"type":"record","name":"Value","namespace":"mlops911.ml_xxx.public.foo","fields":[{"name":"id","type":"int"},{"name":"date","type":["null",{"type":"string","connect.version":1,"connect.name":"io.debezium.time.ZonedTimestamp"}],"default":null},{"name":"text","type":["null","string"],"default":null},{"name":"__null_ts_ms","type":["null","long"],"default":null},{"name":"__deleted","type":["null","string"],"default":null}],"connect.name":"mlops911.ml_xxx.public.foo.Value"}]
   21/05/26 18:33:27 INFO embedded.EmbeddedTimelineService: Starting Timeline service !!
   21/05/26 18:33:27 INFO embedded.EmbeddedTimelineService: Overriding hostIp to (xxx) found in spark-conf. It was null
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating View Manager with storage type :EMBEDDED_KV_STORE
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating embedded rocks-db based Table View
   21/05/26 18:33:27 INFO util.log: Logging initialized @9978ms to org.apache.hudi.org.eclipse.jetty.util.log.Slf4jLog
   21/05/26 18:33:27 INFO javalin.Javalin: 
              __                      __ _
             / /____ _ _   __ ____ _ / /(_)____
        __  / // __ `/| | / // __ `// // // __ \
       / /_/ // /_/ / | |/ // /_/ // // // / / /
       \____/ \__,_/  |___/ \__,_//_//_//_/ /_/
   
           https://javalin.io/documentation
   
   21/05/26 18:33:27 INFO javalin.Javalin: Starting Javalin ...
   21/05/26 18:33:27 INFO javalin.Javalin: Listening on http://localhost:37089/
   21/05/26 18:33:27 INFO javalin.Javalin: Javalin started in 179ms \o/
   21/05/26 18:33:27 INFO service.TimelineService: Starting Timeline server on port :37089
   21/05/26 18:33:27 INFO embedded.EmbeddedTimelineService: Started embedded timeline server at xxx:37089
   21/05/26 18:33:27 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:27 INFO client.AbstractHoodieClient: Timeline Server already running. Not restarting the service
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:27 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:27 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:27 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:27 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:28 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:28 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Loaded instants []
   21/05/26 18:33:28 INFO client.AbstractHoodieWriteClient: Generate a new instant time: 20210526183328 action: deltacommit
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Creating a new instant [==>20210526183328__deltacommit__REQUESTED]
   21/05/26 18:33:28 INFO deltastreamer.DeltaSync: Starting commit  : 20210526183328
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:28 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:28 INFO timeline.HoodieActiveTimeline: Loaded instants [[==>20210526183328__deltacommit__REQUESTED]]
   21/05/26 18:33:28 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:28 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:28 INFO client.SparkRDDWriteClient: Successfully synced to metadata table
   21/05/26 18:33:28 INFO client.AsyncCleanerService: Auto cleaning is not enabled. Not running cleaner now
   21/05/26 18:33:28 INFO spark.SparkContext: Starting job: countByKey at SparkHoodieBloomIndex.java:114
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Registering RDD 1 (mapToPair at SparkWriteHelper.java:54) as input to shuffle 1
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Registering RDD 5 (countByKey at SparkHoodieBloomIndex.java:114) as input to shuffle 0
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Got job 0 (countByKey at SparkHoodieBloomIndex.java:114) with 2 output partitions
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Final stage: ResultStage 2 (countByKey at SparkHoodieBloomIndex.java:114)
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 1)
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 1)
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 1 (MapPartitionsRDD[5] at countByKey at SparkHoodieBloomIndex.java:114), which has no missing parents
   21/05/26 18:33:28 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 6.2 KB, free 912.3 MB)
   21/05/26 18:33:28 INFO yarn.YarnAllocator: Driver requested a total number of 2 executor(s).
   21/05/26 18:33:28 INFO yarn.YarnAllocator: Will request 1 executor container(s), each with 1 core(s) and 2432 MB memory (including 384 MB of overhead)
   21/05/26 18:33:28 INFO yarn.YarnAllocator: Submitted 1 unlocalized container requests.
   21/05/26 18:33:28 INFO spark.ExecutorAllocationManager: Requesting 1 new executor because tasks are backlogged (new desired total will be 2)
   21/05/26 18:33:28 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.3 KB, free 912.3 MB)
   21/05/26 18:33:28 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx:38417 (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:28 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:28 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 1 (MapPartitionsRDD[5] at countByKey at SparkHoodieBloomIndex.java:114) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:28 INFO cluster.YarnClusterScheduler: Adding task set 1.0 with 2 tasks
   21/05/26 18:33:28 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 1.0 (TID 0, xxx, executor 1, partition 0, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:28 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on xxx:35696 (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:29 INFO impl.AMRMClientImpl: Received new token for : xxx:45454
   21/05/26 18:33:29 INFO yarn.YarnAllocator: Launching container container_e03_1618828995116_0162_01_000004 on host xxx for executor with ID 2
   21/05/26 18:33:29 INFO yarn.YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them.
   21/05/26 18:33:29 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
   21/05/26 18:33:29 INFO impl.ContainerManagementProtocolProxy: Opening proxy : xxx:45454
   21/05/26 18:33:29 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to 10.246.3.9:49980
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added rdd_3_0 in memory on xxx:35696 (size: 0.0 B, free: 912.3 MB)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 1.0 (TID 1, xxx, executor 1, partition 1, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added rdd_3_1 in memory on xxx:35696 (size: 0.0 B, free: 912.3 MB)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 1.0 (TID 0) in 1023 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 1.0 (TID 1) in 70 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: ShuffleMapStage 1 (countByKey at SparkHoodieBloomIndex.java:114) finished in 1.177 s
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: looking for newly runnable stages
   21/05/26 18:33:29 INFO cluster.YarnClusterScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool 
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 2)
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: Submitting ResultStage 2 (ShuffledRDD[6] at countByKey at SparkHoodieBloomIndex.java:114), which has no missing parents
   21/05/26 18:33:29 INFO memory.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.8 KB, free 912.3 MB)
   21/05/26 18:33:29 INFO memory.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.2 KB, free 912.3 MB)
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on xxx:38417 (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:29 INFO spark.SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 2 (ShuffledRDD[6] at countByKey at SparkHoodieBloomIndex.java:114) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:29 INFO cluster.YarnClusterScheduler: Adding task set 2.0 with 2 tasks
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 2.0 (TID 2, xxx, executor 1, partition 0, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:29 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on xxx:35696 (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:29 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to 10.246.3.9:49980
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 2.0 (TID 3, xxx, executor 1, partition 1, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 2.0 (TID 2) in 85 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:29 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 2.0 (TID 3) in 32 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:29 INFO cluster.YarnClusterScheduler: Removed TaskSet 2.0, whose tasks have all completed, from pool 
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: ResultStage 2 (countByKey at SparkHoodieBloomIndex.java:114) finished in 0.126 s
   21/05/26 18:33:29 INFO scheduler.DAGScheduler: Job 0 finished: countByKey at SparkHoodieBloomIndex.java:114, took 1.627903 s
   21/05/26 18:33:29 INFO yarn.YarnAllocator: Driver requested a total number of 1 executor(s).
   21/05/26 18:33:30 INFO spark.SparkContext: Starting job: collect at HoodieSparkEngineContext.java:78
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Got job 1 (collect at HoodieSparkEngineContext.java:78) with 1 output partitions
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Final stage: ResultStage 3 (collect at HoodieSparkEngineContext.java:78)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Parents of final stage: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Missing parents: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ResultStage 3 (MapPartitionsRDD[8] at flatMap at HoodieSparkEngineContext.java:78), which has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_2 stored as values in memory (estimated size 368.5 KB, free 911.9 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 101.0 KB, free 911.8 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on xxx:38417 (size: 101.0 KB, free: 912.2 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 3 (MapPartitionsRDD[8] at flatMap at HoodieSparkEngineContext.java:78) (first 15 tasks are for partitions Vector(0))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 3.0 with 1 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 3.0 (TID 4, xxx, executor 1, partition 0, PROCESS_LOCAL, 7710 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on xxx:35696 (size: 101.0 KB, free: 912.2 MB)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 3.0 (TID 4) in 178 ms on xxx (executor 1) (1/1)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 3.0, whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ResultStage 3 (collect at HoodieSparkEngineContext.java:78) finished in 0.233 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Job 1 finished: collect at HoodieSparkEngineContext.java:78, took 0.236923 s
   21/05/26 18:33:30 INFO spark.SparkContext: Starting job: collect at HoodieSparkEngineContext.java:73
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Got job 2 (collect at HoodieSparkEngineContext.java:73) with 1 output partitions
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Final stage: ResultStage 4 (collect at HoodieSparkEngineContext.java:73)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Parents of final stage: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Missing parents: List()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ResultStage 4 (MapPartitionsRDD[10] at map at HoodieSparkEngineContext.java:73), which has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_3 stored as values in memory (estimated size 368.3 KB, free 911.5 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 100.9 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_3_piece0 in memory on xxx:38417 (size: 100.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 4 (MapPartitionsRDD[10] at map at HoodieSparkEngineContext.java:73) (first 15 tasks are for partitions Vector(0))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 4.0 with 1 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 4.0 (TID 5, xxx, executor 1, partition 0, PROCESS_LOCAL, 7710 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_3_piece0 in memory on xxx:35696 (size: 100.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 4.0 (TID 5) in 94 ms on xxx (executor 1) (1/1)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 4.0, whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ResultStage 4 (collect at HoodieSparkEngineContext.java:73) finished in 0.167 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Job 2 finished: collect at HoodieSparkEngineContext.java:73, took 0.174163 s
   21/05/26 18:33:30 INFO spark.SparkContext: Starting job: countByKey at SparkHoodieBloomIndex.java:149
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Registering RDD 14 (countByKey at SparkHoodieBloomIndex.java:149) as input to shuffle 2
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Got job 3 (countByKey at SparkHoodieBloomIndex.java:149) with 2 output partitions
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Final stage: ResultStage 7 (countByKey at SparkHoodieBloomIndex.java:149)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 6)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 6)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 6 (MapPartitionsRDD[14] at countByKey at SparkHoodieBloomIndex.java:149), which has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_4 stored as values in memory (estimated size 7.5 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 3.9 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_4_piece0 in memory on xxx:38417 (size: 3.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 6 (MapPartitionsRDD[14] at countByKey at SparkHoodieBloomIndex.java:149) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 6.0 with 2 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 6.0 (TID 6, xxx, executor 1, partition 0, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_4_piece0 in memory on xxx:35696 (size: 3.9 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 6.0 (TID 7, xxx, executor 1, partition 1, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 6.0 (TID 6) in 60 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 6.0 (TID 7) in 36 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 6.0, whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ShuffleMapStage 6 (countByKey at SparkHoodieBloomIndex.java:149) finished in 0.121 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: looking for newly runnable stages
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 7)
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting ResultStage 7 (ShuffledRDD[15] at countByKey at SparkHoodieBloomIndex.java:149), which has no missing parents
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_5 stored as values in memory (estimated size 3.8 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO memory.MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 2.2 KB, free 911.4 MB)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_5_piece0 in memory on xxx:38417 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.SparkContext: Created broadcast 5 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 7 (ShuffledRDD[15] at countByKey at SparkHoodieBloomIndex.java:149) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Adding task set 7.0 with 2 tasks
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 7.0 (TID 8, xxx, executor 1, partition 0, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:30 INFO storage.BlockManagerInfo: Added broadcast_5_piece0 in memory on xxx:35696 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:30 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 2 to 10.246.3.9:49980
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 7.0 (TID 9, xxx, executor 1, partition 1, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 7.0 (TID 8) in 47 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:30 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 7.0 (TID 9) in 20 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:30 INFO cluster.YarnClusterScheduler: Removed TaskSet 7.0, whose tasks have all completed, from pool 
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: ResultStage 7 (countByKey at SparkHoodieBloomIndex.java:149) finished in 0.081 s
   21/05/26 18:33:30 INFO scheduler.DAGScheduler: Job 3 finished: countByKey at SparkHoodieBloomIndex.java:149, took 0.219895 s
   21/05/26 18:33:30 INFO bloom.SparkHoodieBloomIndex: InputParallelism: ${2}, IndexParallelism: ${0}
   21/05/26 18:33:30 INFO bloom.BucketizedBloomCheckPartitioner: TotalBuckets 0, min_buckets/partition 1
   21/05/26 18:33:30 INFO rdd.MapPartitionsRDD: Removing RDD 3 from persistence list
   21/05/26 18:33:30 INFO storage.BlockManager: Removing RDD 3
   21/05/26 18:33:31 INFO rdd.MapPartitionsRDD: Removing RDD 22 from persistence list
   21/05/26 18:33:31 INFO storage.BlockManager: Removing RDD 22
   21/05/26 18:33:31 INFO spark.SparkContext: Starting job: countByKey at BaseSparkCommitActionExecutor.java:158
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 16 (mapToPair at SparkHoodieBloomIndex.java:266) as input to shuffle 6
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 23 (mapToPair at SparkHoodieBloomIndex.java:287) as input to shuffle 3
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 22 (flatMapToPair at SparkHoodieBloomIndex.java:274) as input to shuffle 4
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Registering RDD 31 (countByKey at BaseSparkCommitActionExecutor.java:158) as input to shuffle 5
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Got job 4 (countByKey at BaseSparkCommitActionExecutor.java:158) with 2 output partitions
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Final stage: ResultStage 13 (countByKey at BaseSparkCommitActionExecutor.java:158)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 12)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 12)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 10 (MapPartitionsRDD[23] at mapToPair at SparkHoodieBloomIndex.java:287), which has no missing parents
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_6 stored as values in memory (estimated size 5.9 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_6_piece0 stored as bytes in memory (estimated size 3.3 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_6_piece0 in memory on xxx:38417 (size: 3.3 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Created broadcast 6 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 10 (MapPartitionsRDD[23] at mapToPair at SparkHoodieBloomIndex.java:287) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Adding task set 10.0 with 2 tasks
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 10.0 (TID 10, xxx, executor 1, partition 0, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_6_piece0 in memory on xxx:35696 (size: 3.3 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 10.0 (TID 11, xxx, executor 1, partition 1, PROCESS_LOCAL, 7640 bytes)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 10.0 (TID 10) in 50 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 10.0 (TID 11) in 24 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Removed TaskSet 10.0, whose tasks have all completed, from pool 
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: ShuffleMapStage 10 (mapToPair at SparkHoodieBloomIndex.java:287) finished in 0.092 s
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: looking for newly runnable stages
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: waiting: Set(ShuffleMapStage 12, ResultStage 13)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 12 (MapPartitionsRDD[31] at countByKey at BaseSparkCommitActionExecutor.java:158), which has no missing parents
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_7 stored as values in memory (estimated size 7.1 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_7_piece0 stored as bytes in memory (estimated size 3.8 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_7_piece0 in memory on xxx:38417 (size: 3.8 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Created broadcast 7 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 12 (MapPartitionsRDD[31] at countByKey at BaseSparkCommitActionExecutor.java:158) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Adding task set 12.0 with 2 tasks
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 12.0 (TID 12, xxx, executor 1, partition 0, PROCESS_LOCAL, 7730 bytes)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_7_piece0 in memory on xxx:35696 (size: 3.8 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 3 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 4 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added rdd_29_0 in memory on xxx:35696 (size: 0.0 B, free: 912.1 MB)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 12.0 (TID 13, xxx, executor 1, partition 1, PROCESS_LOCAL, 7730 bytes)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 12.0 (TID 12) in 105 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added rdd_29_1 in memory on xxx:35696 (size: 0.0 B, free: 912.1 MB)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 12.0 (TID 13) in 24 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Removed TaskSet 12.0, whose tasks have all completed, from pool 
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: ShuffleMapStage 12 (countByKey at BaseSparkCommitActionExecutor.java:158) finished in 0.146 s
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: looking for newly runnable stages
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: running: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 13)
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: failed: Set()
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting ResultStage 13 (ShuffledRDD[32] at countByKey at BaseSparkCommitActionExecutor.java:158), which has no missing parents
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_8 stored as values in memory (estimated size 3.8 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO memory.MemoryStore: Block broadcast_8_piece0 stored as bytes in memory (estimated size 2.2 KB, free 911.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_8_piece0 in memory on xxx:38417 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Created broadcast 8 from broadcast at DAGScheduler.scala:1184
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from ResultStage 13 (ShuffledRDD[32] at countByKey at BaseSparkCommitActionExecutor.java:158) (first 15 tasks are for partitions Vector(0, 1))
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Adding task set 13.0 with 2 tasks
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 13.0 (TID 14, xxx, executor 1, partition 0, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Added broadcast_8_piece0 in memory on xxx:35696 (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 5 to 10.246.3.9:49980
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 13.0 (TID 15, xxx, executor 1, partition 1, PROCESS_LOCAL, 7651 bytes)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 13.0 (TID 14) in 31 ms on xxx (executor 1) (1/2)
   21/05/26 18:33:31 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 13.0 (TID 15) in 12 ms on xxx (executor 1) (2/2)
   21/05/26 18:33:31 INFO cluster.YarnClusterScheduler: Removed TaskSet 13.0, whose tasks have all completed, from pool 
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: ResultStage 13 (countByKey at BaseSparkCommitActionExecutor.java:158) finished in 0.064 s
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Job 4 finished: countByKey at BaseSparkCommitActionExecutor.java:158, took 0.320123 s
   21/05/26 18:33:31 INFO commit.BaseSparkCommitActionExecutor: Workload profile :WorkloadProfile {globalStat=WorkloadStat {numInserts=0, numUpdates=0}, partitionStat={}, operationType=UPSERT}
   21/05/26 18:33:31 INFO timeline.HoodieActiveTimeline: Checking for file exists ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit.requested
   21/05/26 18:33:31 INFO timeline.HoodieActiveTimeline: Create new file for toInstant ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit.inflight
   21/05/26 18:33:31 INFO commit.UpsertPartitioner: AvgRecordSize => 1024
   21/05/26 18:33:31 INFO view.AbstractTableFileSystemView: Took 3 ms to read  0 instants, 0 replaced file groups
   21/05/26 18:33:31 INFO util.ClusteringUtils: Found 0 files in pending clustering operations
   21/05/26 18:33:31 INFO commit.UpsertPartitioner: Total Buckets :0, buckets info => {}, 
   Partition to insert buckets => {}, 
   UpdateLocations mapped to buckets =>{}
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 175
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 62
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 9
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 148
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 105
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 143
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 2
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 55
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 209
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 154
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 147
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 163
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 69
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 34
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 100
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned shuffle 5
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 1
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 193
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 169
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 27
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 16
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 115
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 120
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 106
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 174
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 210
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 96
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 6
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 57
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 133
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 11
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 74
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 107
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 164
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 172
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 176
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 194
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 109
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 37
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 177
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 128
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 182
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 205
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 30
   21/05/26 18:33:31 INFO commit.BaseCommitActionExecutor: Auto commit disabled for 20210526183328
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 102
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 180
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 150
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 186
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 89
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 223
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 47
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 158
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 162
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 88
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 39
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 8
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 29
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 124
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 75
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 165
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 217
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 134
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_5_piece0 on xxx:35696 in memory (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_5_piece0 on xxx:38417 in memory (size: 2.2 KB, free: 912.1 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 35
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 216
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 22
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 114
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 152
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 42
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 94
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 145
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 126
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 144
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 168
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_3_piece0 on xxx:38417 in memory (size: 100.9 KB, free: 912.2 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_3_piece0 on xxx:35696 in memory (size: 100.9 KB, free: 912.2 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 149
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 38
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 70
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 15
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 118
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 166
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 207
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 170
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 171
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 65
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 5
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 97
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 110
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 222
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 87
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_6_piece0 on xxx:38417 in memory (size: 3.3 KB, free: 912.2 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_6_piece0 on xxx:35696 in memory (size: 3.3 KB, free: 912.2 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 192
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 201
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 117
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 123
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 12
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 60
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 84
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 127
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 91
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 136
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 45
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 200
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 64
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_2_piece0 on xxx:38417 in memory (size: 101.0 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_2_piece0 on xxx:35696 in memory (size: 101.0 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 92
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 0
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 81
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 185
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 214
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 21
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 31
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 67
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 112
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 178
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 208
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 78
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 73
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 131
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_8_piece0 on xxx:38417 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_8_piece0 on xxx:35696 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 61
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 3
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_7_piece0 on xxx:38417 in memory (size: 3.8 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO storage.BlockManagerInfo: Removed broadcast_7_piece0 on xxx:35696 in memory (size: 3.8 KB, free: 912.3 MB)
   21/05/26 18:33:31 INFO spark.SparkContext: Starting job: sum at DeltaSync.java:448
   21/05/26 18:33:31 INFO scheduler.DAGScheduler: Job 5 finished: sum at DeltaSync.java:448, took 0.000044 s
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 36
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 80
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 103
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 108
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 183
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 72
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 54
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 132
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 99
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 19
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 93
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 179
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 215
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 66
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 77
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 151
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 116
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 191
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 17
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 14
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 18
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 125
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 204
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 146
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 50
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 56
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 52
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 101
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 221
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 213
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 181
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 190
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 85
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned shuffle 2
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 156
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 161
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 53
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 197
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 20
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 41
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 44
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 140
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 218
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 188
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 122
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 195
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 167
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 220
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 43
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 199
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 155
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 24
   21/05/26 18:33:31 INFO spark.ContextCleaner: Cleaned accumulator 219
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 71
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 198
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 23
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 135
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 26
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 141
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 121
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 157
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 13
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 130
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned shuffle 0
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 7
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 138
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 63
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 187
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 32
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 196
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 48
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 206
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 119
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 160
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 90
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 40
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 113
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_0_piece0 on xxx:38417 in memory (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_0_piece0 on xxx:35696 in memory (size: 3.3 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 68
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 224
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 28
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 202
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 10
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 139
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 76
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 49
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 137
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 58
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 on xxx:38417 in memory (size: 3.9 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_4_piece0 on xxx:35696 in memory (size: 3.9 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 4
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 211
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 212
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 83
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 203
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 33
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 86
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 82
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_1_piece0 on xxx:38417 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO storage.BlockManagerInfo: Removed broadcast_1_piece0 on xxx:35696 in memory (size: 2.2 KB, free: 912.3 MB)
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 95
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 142
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 111
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 98
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 184
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 46
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 129
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 104
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 159
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 59
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 25
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 173
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 79
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 153
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 189
   21/05/26 18:33:32 INFO spark.ContextCleaner: Cleaned accumulator 51
   21/05/26 18:33:32 INFO spark.SparkContext: Starting job: sum at DeltaSync.java:449
   21/05/26 18:33:32 INFO scheduler.DAGScheduler: Job 6 finished: sum at DeltaSync.java:449, took 0.000035 s
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO spark.SparkContext: Starting job: collect at SparkRDDWriteClient.java:120
   21/05/26 18:33:32 INFO scheduler.DAGScheduler: Job 7 finished: collect at SparkRDDWriteClient.java:120, took 0.000039 s
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants [[==>20210526183328__deltacommit__INFLIGHT]]
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:32 INFO util.CommitUtils: Creating  metadata for UPSERT numWriteStats:0numReplaceFileIds:0
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants [[==>20210526183328__deltacommit__INFLIGHT]]
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:32 INFO client.AbstractHoodieWriteClient: Committing 20210526183328 action deltacommit
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Marking instant complete [==>20210526183328__deltacommit__INFLIGHT]
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Checking for file exists ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit.inflight
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Create new file for toInstant ?/user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Completed [==>20210526183328__deltacommit__INFLIGHT]
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants [[==>20210526183328__deltacommit__REQUESTED], [==>20210526183328__deltacommit__INFLIGHT], [20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:32 INFO table.HoodieTimelineArchiveLog: No Instants to archive
   21/05/26 18:33:32 INFO client.AbstractHoodieWriteClient: Auto cleaning is enabled. Running cleaner now
   21/05/26 18:33:32 INFO client.AbstractHoodieWriteClient: Scheduling cleaning at instant time :20210526183332
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:32 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO timeline.HoodieActiveTimeline: Loaded instants [[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating remote view for basePath /user/hd_xyz/yyy/ml_xxx/foo. Server=xxx:37089, Timeout=300
   21/05/26 18:33:32 INFO view.FileSystemViewManager: Creating InMemory based view for basePath /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:32 INFO view.AbstractTableFileSystemView: Took 0 ms to read  0 instants, 0 replaced file groups
   21/05/26 18:33:32 INFO util.ClusteringUtils: Found 0 files in pending clustering operations
   21/05/26 18:33:32 INFO view.RemoteHoodieTableFileSystemView: Sending request : (http://xxx:37089/v1/hoodie/view/compactions/pending/?basepath=%2Fuser%2Fhdfs%2Fxyz%2Fpublic%2Fml_xxx%2Ffoo&lastinstantts=20210526183328&timelinehash=3cb19d4eacc8a39b3d4198ed17d5dac7ca1a076cc50020fab31fed29c6ccddb1)
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:33 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO timeline.HoodieActiveTimeline: Loaded instants [[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:33 INFO collection.RocksDBDAO: DELETING RocksDB persisted at /tmp/hoodie_timeline_rocksdb/_user_hdfs_xyz_public_ml_xxx_foo/a138e066-6b6b-4f72-8865-4c30301cbe11
   21/05/26 18:33:33 INFO collection.RocksDBDAO: No column family found. Loading default
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl_open.cc:230] Creating manifest 1
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/version_set.cc:3406] Recovering from manifest file: MANIFEST-000001
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [default]:
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/version_set.cc:3610] Recovered from manifest file:/tmp/hoodie_timeline_rocksdb/_user_hdfs_xyz_public_ml_xxx_foo/a138e066-6b6b-4f72-8865-4c30301cbe11/MANIFEST-000001 succeeded,manifest_file_number is 1, next_file_number is 3, last_sequence is 0, log_number is 0,prev_log_number is 0,max_column_family is 0,min_log_number_to_keep is 0
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/version_set.cc:3618] Column family [default] (ID 0), log number is 0
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl_open.cc:1287] DB pointer 0x7f3aaccf1f20
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/version_set.cc:2936] Creating manifest 6
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_view__user_hdfs_xyz_public_ml_xxx_foo]:
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_view__user_hdfs_xyz_public_ml_xxx_foo] (ID 1)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_pending_compaction__user_hdfs_xyz_public_ml_xxx_foo]:
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_pending_compaction__user_hdfs_xyz_public_ml_xxx_foo] (ID 2)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_bootstrap_basefile__user_hdfs_xyz_public_ml_xxx_foo]:
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_bootstrap_basefile__user_hdfs_xyz_public_ml_xxx_foo] (ID 3)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_partitions__user_hdfs_xyz_public_ml_xxx_foo]:
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_partitions__user_hdfs_xyz_public_ml_xxx_foo] (ID 4)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_replaced_fg_user_hdfs_xyz_public_ml_xxx_foo]:
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_replaced_fg_user_hdfs_xyz_public_ml_xxx_foo] (ID 5)
   21/05/26 18:33:33 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.246.4.117:53684) with ID 2
   21/05/26 18:33:33 INFO spark.ExecutorAllocationManager: New executor 2 has registered (new total is 2)
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/column_family.cc:475] --------------- Options for column family [hudi_pending_clustering_fg_user_hdfs_xyz_public_ml_xxx_foo]:
   21/05/26 18:33:33 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:1546] Created column family [hudi_pending_clustering_fg_user_hdfs_xyz_public_ml_xxx_foo] (ID 6)
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting replacedFileGroups to ROCKSDB based file-system view at /tmp/hoodie_timeline_rocksdb, Total file-groups=0
   21/05/26 18:33:33 INFO collection.RocksDBDAO: Prefix DELETE (query=part=) on hudi_replaced_fg_user_hdfs_xyz_public_ml_xxx_foo
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting replacedFileGroups to ROCKSDB based file-system view complete
   21/05/26 18:33:33 INFO view.AbstractTableFileSystemView: Took 9 ms to read  0 instants, 0 replaced file groups
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Initializing pending compaction operations. Count=0
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Initializing external data file mapping. Count=0
   21/05/26 18:33:33 INFO util.ClusteringUtils: Found 0 files in pending clustering operations
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting file groups in pending clustering to ROCKSDB based file-system view at /tmp/hoodie_timeline_rocksdb, Total file-groups=0
   21/05/26 18:33:33 INFO collection.RocksDBDAO: Prefix DELETE (query=part=) on hudi_pending_clustering_fg_user_hdfs_xyz_public_ml_xxx_foo
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Resetting replacedFileGroups to ROCKSDB based file-system view complete
   21/05/26 18:33:33 INFO view.RocksDbBasedFileSystemView: Created ROCKSDB based file-system view at /tmp/hoodie_timeline_rocksdb
   21/05/26 18:33:33 INFO collection.RocksDBDAO: Prefix Search for (query=) on hudi_pending_compaction__user_hdfs_xyz_public_ml_xxx_foo. Total Time Taken (msec)=1. Serialization Time taken(micro)=0, num entries=0
   21/05/26 18:33:33 INFO service.RequestHandler: TimeTakenMillis[Total=791, Refresh=779, handle=11, Check=1], Success=true, Query=basepath=%2Fuser%2Fhdfs%2Fxyz%2Fpublic%2Fml_xxx%2Ffoo&lastinstantts=20210526183328&timelinehash=3cb19d4eacc8a39b3d4198ed17d5dac7ca1a076cc50020fab31fed29c6ccddb1, Host=xxx:37089, synced=false
   21/05/26 18:33:33 INFO storage.BlockManagerMasterEndpoint: Registering block manager xxx:36920 with 912.3 MB RAM, BlockManagerId(2, xxx, 36920, None)
   21/05/26 18:33:33 INFO clean.CleanPlanner: No earliest commit to retain. No need to scan partitions !!
   21/05/26 18:33:33 INFO clean.CleanPlanner: Nothing to clean here. It is already clean
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Cleaner started
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Cleaned failed attempts if any
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:33 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO timeline.HoodieActiveTimeline: Loaded instants [[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:33 INFO client.SparkRDDWriteClient: Successfully synced to metadata table
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Committed 20210526183328
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Scheduling table service COMPACT
   21/05/26 18:33:33 INFO client.AbstractHoodieWriteClient: Scheduling compaction at instant time :20210526183333
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading HoodieTableMetaClient from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://xxx:8020], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, __spark_hadoop_conf__.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_-862249120_1, ugi=hdfs (auth:SIMPLE)]]]
   21/05/26 18:33:33 INFO table.HoodieTableConfig: Loading table properties from /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/hoodie.properties
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=PARQUET) from /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO table.HoodieTableMetaClient: Loading Active commit timeline for /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO timeline.HoodieActiveTimeline: Loaded instants [[20210526183328__deltacommit__COMPLETED]]
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating View Manager with storage type :REMOTE_FIRST
   21/05/26 18:33:33 INFO view.FileSystemViewManager: Creating remote first table view
   21/05/26 18:33:33 INFO compact.SparkScheduleCompactionActionExecutor: Checking if compaction needs to be run on /user/hd_xyz/yyy/ml_xxx/foo
   21/05/26 18:33:33 INFO deltastreamer.DeltaSync: Commit 20210526183328 successful!
   21/05/26 18:33:33 INFO rdd.MapPartitionsRDD: Removing RDD 29 from persistence list
   21/05/26 18:33:33 INFO storage.BlockManager: Removing RDD 29
   21/05/26 18:33:34 INFO rdd.MapPartitionsRDD: Removing RDD 37 from persistence list
   21/05/26 18:33:34 INFO storage.BlockManager: Removing RDD 37
   21/05/26 18:33:34 INFO deltastreamer.DeltaSync: Shutting down embedded timeline server
   21/05/26 18:33:34 INFO embedded.EmbeddedTimelineService: Closing Timeline server
   21/05/26 18:33:34 INFO service.TimelineService: Closing Timeline Service
   21/05/26 18:33:34 INFO javalin.Javalin: Stopping Javalin ...
   21/05/26 18:33:34 INFO javalin.Javalin: Javalin has stopped
   21/05/26 18:33:34 INFO view.RocksDbBasedFileSystemView: Closing Rocksdb !!
   21/05/26 18:33:34 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:365] Shutdown: canceling all background work
   21/05/26 18:33:34 INFO collection.RocksDBDAO: From Rocks DB : [db/db_impl.cc:521] Shutdown complete
   21/05/26 18:33:34 INFO view.RocksDbBasedFileSystemView: Closed Rocksdb !!
   21/05/26 18:33:34 INFO service.TimelineService: Closed Timeline Service
   21/05/26 18:33:34 INFO embedded.EmbeddedTimelineService: Closed Timeline server
   21/05/26 18:33:34 INFO deltastreamer.HoodieDeltaStreamer: Shut down delta streamer
   21/05/26 18:33:34 INFO server.AbstractConnector: Stopped Spark@7a0e94b4{HTTP/1.1,[http/1.1]}{0.0.0.0:0}
   21/05/26 18:33:34 INFO ui.SparkUI: Stopped Spark web UI at http://xxx:32822
   21/05/26 18:33:34 INFO yarn.YarnAllocator: Driver requested a total number of 0 executor(s).
   21/05/26 18:33:34 INFO cluster.YarnClusterSchedulerBackend: Shutting down all executors
   21/05/26 18:33:34 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
   21/05/26 18:33:34 INFO cluster.SchedulerExtensionServices: Stopping SchedulerExtensionServices
   (serviceOption=None,
    services=List(),
    started=false)
   21/05/26 18:33:34 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
   21/05/26 18:33:34 INFO memory.MemoryStore: MemoryStore cleared
   21/05/26 18:33:34 INFO storage.BlockManager: BlockManager stopped
   21/05/26 18:33:34 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
   21/05/26 18:33:34 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
   21/05/26 18:33:34 INFO spark.SparkContext: Successfully stopped SparkContext
   21/05/26 18:33:34 INFO yarn.ApplicationMaster: Final app status: SUCCEEDED, exitCode: 0
   21/05/26 18:33:34 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with SUCCEEDED
   21/05/26 18:33:34 INFO impl.AMRMClientImpl: Waiting for application to be successfully unregistered.
   21/05/26 18:33:34 INFO yarn.ApplicationMaster: Deleting staging directory hdfs://xxx:8020/user/hd_xyz/.sparkStaging/application_1618828995116_0162
   21/05/26 18:33:34 INFO util.ShutdownHookManager: Shutdown hook called
   21/05/26 18:33:34 INFO util.ShutdownHookManager: Deleting directory /data/hadoop/yarn/local/usercache/hdfs/appcache/application_1618828995116_0162/spark-4c7e81b9-e526-4325-abf0-d163828b92b5
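   Note on the log above: Job 6 ("sum at DeltaSync.java:449") finished in 0.000035 s and the commit metadata was created with numWriteStats:0, so the deltacommit completed but wrote zero records. A minimal way to confirm this from the timeline (a sketch, assuming the table base path shown in the log) is to read the completed deltacommit file, which should be plain JSON in this Hudi version:
   
   # list the timeline files under the table base path (path taken from the log above)
   hdfs dfs -ls /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/
   # an empty partitionToWriteStats map in the JSON means no records were written
   hdfs dfs -cat /user/hd_xyz/yyy/ml_xxx/foo/.hoodie/20210526183328.deltacommit
   
   Since an empty input batch usually points at the source, it is also worth checking that the Kafka consumer group still has unconsumed data; the broker address and group id below are placeholders, not values from this setup:
   
   # compare the group's current offsets against the topic's log-end offsets
   kafka-consumer-groups.sh --bootstrap-server <broker:9092> --describe --group <your.group.id>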


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] vinothchandar closed issue #2959: No data stored after migrating to Hudi 0.8.0

Posted by GitBox <gi...@apache.org>.
vinothchandar closed issue #2959:
URL: https://github.com/apache/hudi/issues/2959


   





[GitHub] [hudi] PavelPetukhov edited a comment on issue #2959: No data stored after migrating to Hudi 0.8.0

Posted by GitBox <gi...@apache.org>.
PavelPetukhov edited a comment on issue #2959:
URL: https://github.com/apache/hudi/issues/2959#issuecomment-848930327


   @n3nash 
   
   This is our full log:
   [spark_log.txt](https://github.com/apache/hudi/files/6548394/spark_log.txt)
   

