Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/02/14 13:28:40 UTC

[GitHub] [hudi] nsivabalan opened a new issue #4814: [SUPPORT] NPE thrown while archiving data table when metadata is enabled

nsivabalan opened a new issue #4814:
URL: https://github.com/apache/hudi/issues/4814


   **Describe the problem you faced**
   
   My job fails with an NPE during archival of the data table when the metadata table is enabled.
   
   Hi all, I am facing an issue regarding the table type after bumping to version 0.10.1.
   I have specified the COW table type, but in the logs I can also see the MERGE_ON_READ table type. Is that for the metadata table?
   The jobs are intermittently failing with a NullPointerException.
   Can someone please help me with this error? The stack trace is attached, and the write options used are below (a minimal write sketch follows them).
   
   HoodieWriteConfig.TBL_NAME.key -> options.tableName,
   DataSourceWriteOptions.TABLE_TYPE.key -> DataSourceWriteOptions.TABLE_TYPE.defaultValue(),
   DataSourceWriteOptions.RECORDKEY_FIELD.key -> options.primaryKey,
   DataSourceWriteOptions.PRECOMBINE_FIELD.key -> RECKO_UPDATED_TIME,
   DataSourceWriteOptions.HIVE_PARTITION_EXTRACTOR_CLASS.key -> classOf[MultiPartKeysValueExtractor].getName,
   DataSourceWriteOptions.HIVE_STYLE_PARTITIONING.key -> "true",
   DataSourceWriteOptions.PARTITIONPATH_FIELD.key -> options.partitionCols.mkString(",")
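   
   For context, below is a minimal sketch of how options like these are typically passed to the Hudi Spark datasource. The table path, column names, and the `df` DataFrame are placeholders for illustration, not values taken from this report.
   
   ```
   import org.apache.spark.sql.SaveMode
   import org.apache.hudi.DataSourceWriteOptions
   import org.apache.hudi.config.HoodieWriteConfig
   
   // Hypothetical values standing in for options.tableName, options.primaryKey, etc.
   val hudiOptions: Map[String, String] = Map(
     HoodieWriteConfig.TBL_NAME.key -> "validated_data",
     DataSourceWriteOptions.TABLE_TYPE.key -> DataSourceWriteOptions.TABLE_TYPE.defaultValue(), // COPY_ON_WRITE
     DataSourceWriteOptions.RECORDKEY_FIELD.key -> "record_id",
     DataSourceWriteOptions.PRECOMBINE_FIELD.key -> "updated_time",
     DataSourceWriteOptions.PARTITIONPATH_FIELD.key -> "org_id,tenant_id",
     DataSourceWriteOptions.HIVE_STYLE_PARTITIONING.key -> "true"
   )
   
   // df is an existing DataFrame holding the batch to write.
   df.write
     .format("hudi")
     .options(hudiOptions)
     .mode(SaveMode.Append)
     .save("s3://bucket/path/to/table")
   ```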
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1.
   2.
   3.
   4.
   
   **Expected behavior**
   
   A clear and concise description of what you expected to happen.
   
   **Environment Description**
   
   * Hudi version :
   
   * Spark version :
   
   * Hive version :
   
   * Hadoop version :
   
   * Storage (HDFS/S3/GCS..) :
   
   * Running on Docker? (yes/no) :
   
   
   **Additional context**
   
   Add any other context about the problem here.
   
   **Stacktrace**
   
   ```
   2022-02-13 13:44:57,475 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Loading HoodieTableMetaClient from s3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20/
   2022-02-13 13:44:57,487 [main] INFO  org.apache.hudi.common.table.HoodieTableConfig  - Loading table properties from s3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20/.hoodie/hoodie.properties
   2022-02-13 13:44:57,487 [main] INFO  com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem  - Opening 's3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20/.hoodie/hoodie.properties' for reading
   2022-02-13 13:44:57,505 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from s3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20/
   2022-02-13 13:44:57,505 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Loading Active commit timeline for s3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20/
   2022-02-13 13:44:57,562 [main] INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline  - Loaded instants upto : Option{val=[==>20220213134444484__commit__REQUESTED]}
   2022-02-13 13:44:57,562 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Loading HoodieTableMetaClient from s3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20/
   2022-02-13 13:44:57,574 [main] INFO  org.apache.hudi.common.table.HoodieTableConfig  - Loading table properties from s3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20/.hoodie/hoodie.properties
   2022-02-13 13:44:57,574 [main] INFO  com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem  - Opening 's3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20/.hoodie/hoodie.properties' for reading
   2022-02-13 13:44:57,592 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from s3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20/
   2022-02-13 13:44:57,592 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Loading HoodieTableMetaClient from s3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20//.hoodie/metadata
   2022-02-13 13:44:57,615 [main] INFO  org.apache.hudi.common.table.HoodieTableConfig  - Loading table properties from s3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20/.hoodie/metadata/.hoodie/hoodie.properties
   2022-02-13 13:44:57,615 [main] INFO  com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem  - Opening 's3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20/.hoodie/metadata/.hoodie/hoodie.properties' for reading
   2022-02-13 13:44:57,650 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from s3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20//.hoodie/metadata
   2022-02-13 13:44:57,650 [main] INFO  org.apache.hudi.common.table.view.FileSystemViewManager  - Creating View Manager with storage type :REMOTE_FIRST
   2022-02-13 13:44:57,650 [main] INFO  org.apache.hudi.common.table.view.FileSystemViewManager  - Creating remote first table view
   2022-02-13 13:44:57,657 [dispatcher-BlockManagerMaster] INFO  org.apache.spark.storage.BlockManagerInfo  - Removed broadcast_55_piece0 on ip-10-9-157-234.ap-south-1.compute.internal:33735 in memory (size: 351.9 KiB, free: 693.4 MiB)
   2022-02-13 13:44:57,658 [dispatcher-BlockManagerMaster] INFO  org.apache.spark.storage.BlockManagerInfo  - Removed broadcast_55_piece0 on ip-10-9-130-208.ap-south-1.compute.internal:39427 in memory (size: 351.9 KiB, free: 2004.5 MiB)
   2022-02-13 13:44:57,658 [dispatcher-BlockManagerMaster] INFO  org.apache.spark.storage.BlockManagerInfo  - Removed broadcast_55_piece0 on ip-10-9-130-208.ap-south-1.compute.internal:37557 in memory (size: 351.9 KiB, free: 2004.5 MiB)
   2022-02-13 13:44:57,714 [main] INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline  - Loaded instants upto : Option{val=[==>20220213134444484__commit__REQUESTED]}
   2022-02-13 13:44:57,715 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Loading HoodieTableMetaClient from s3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20/
   2022-02-13 13:44:57,728 [main] INFO  org.apache.hudi.common.table.HoodieTableConfig  - Loading table properties from s3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20/.hoodie/hoodie.properties
   2022-02-13 13:44:57,728 [main] INFO  com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem  - Opening 's3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20/.hoodie/hoodie.properties' for reading
   2022-02-13 13:44:57,745 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from s3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20/
   2022-02-13 13:44:57,745 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Loading HoodieTableMetaClient from s3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20//.hoodie/metadata
   2022-02-13 13:44:57,766 [main] INFO  org.apache.hudi.common.table.HoodieTableConfig  - Loading table properties from s3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20/.hoodie/metadata/.hoodie/hoodie.properties
   2022-02-13 13:44:57,766 [main] INFO  com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem  - Opening 's3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20/.hoodie/metadata/.hoodie/hoodie.properties' for reading
   2022-02-13 13:44:57,785 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Finished Loading Table of type MERGE_ON_READ(version=1, baseFileFormat=HFILE) from s3://recko-ego/datalake_streamline_concurrency_1/structured_zone/validated_data/__recko__org_id=ego/__recko__tenant_id=ego/__recko__source=EXPECTED/__recko__policy_id=691508e5-9af8-4254-8211-3effed593c20//.hoodie/metadata
   2022-02-13 13:44:57,841 [main] INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline  - Loaded instants upto : Option{val=[20220213134350974001__commit__COMPLETED]}
   2022-02-13 13:44:57,841 [main] INFO  org.apache.hudi.table.HoodieTimelineArchiveLog  - Limiting archiving of instants to latest compaction on metadata table at 20220213134350974001
   2022-02-13 13:44:57,842 [main] INFO  org.apache.hudi.client.heartbeat.HoodieHeartbeatClient  - Stopping heartbeat for instant 20220213134342038
   2022-02-13 13:44:57,842 [main] INFO  org.apache.hudi.client.heartbeat.HoodieHeartbeatClient  - Stopped heartbeat for instant 20220213134342038
   2022-02-13 13:44:57,907 [main] INFO  org.apache.hudi.client.heartbeat.HeartbeatUtils  - Deleted the heartbeat for instant 20220213134342038
   2022-02-13 13:44:57,907 [main] INFO  org.apache.hudi.client.heartbeat.HoodieHeartbeatClient  - Deleted heartbeat file for instant 20220213134342038
   2022-02-13 13:44:57,907 [main] INFO  org.apache.hudi.client.transaction.TransactionManager  - Transaction ending with transaction owner Option{val=[==>20220213134342038__commit__INFLIGHT]}
   2022-02-13 13:44:57,907 [main] INFO  org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider  - RELEASING lock atZkBasePath = /hudi, lock key = 691508e5-9af8-4254-8211-3effed593c20
   2022-02-13 13:44:57,908 [main] INFO  org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider  - RELEASED lock atZkBasePath = /hudi, lock key = 691508e5-9af8-4254-8211-3effed593c20
   2022-02-13 13:44:57,908 [main] INFO  org.apache.hudi.client.transaction.TransactionManager  - Transaction ended with transaction owner Option{val=[==>20220213134342038__commit__INFLIGHT]}
   2022-02-13 13:44:57,910 [main] INFO  com.recko.spark.jobs.v1.ingestion.DataValidationJob  - Current State :: ================ FAILED
   2022-02-13 13:44:57,912 [main] INFO  com.recko.spark.jobs.v1.ingestion.listeners.TaskStatusLogger  - TimeStamp : 2022-02-13 13:44:57 =========== TaskId : 420a79fd-b810-47f7-ad28-dfcd7ea03c10 ===== State : FAILED
   2022-02-13 13:44:57,936 [main] INFO  com.recko.spark.utils.db.RDSUtil$  - Executing procedure :: INSERT INTO scepter.task_metrics(id, task_id, task_type, stage_num, batch_id, policy_id, file_size, start_time, end_time, time_taken, estimated_eta, other_attributes, org_id, tenant_id) VALUES (  '4cf1da7d-2cf9-4d72-8be2-a679797c5a88', '420a79fd-b810-47f7-ad28-dfcd7ea03c10', 'DATA_VALIDATION', '1', '1b5fbfd4-e793-4b7a-b896-f6b73e033000', '691508e5-9af8-4254-8211-3effed593c20', 1, '2022-02-13 13:43:24.041', '2022-02-13 13:44:57.913', 93872, 129663, '', 'ego', 'ego');
   2022-02-13 13:44:57,938 [main] INFO  com.recko.spark.utils.db.RDSUtil$  - Execution complete, TimeTaken :: 25
   2022-02-13 13:44:57,938 [main] INFO  org.apache.kafka.clients.producer.ProducerConfig  - ProducerConfig values: 
   	acks = 1
   	batch.size = 16384
   	bootstrap.servers = [ego-kafka.reckostaging.com:9094]
   	buffer.memory = 33554432
   	client.dns.lookup = use_all_dns_ips
   	client.id = producer-5
   	compression.type = none
   	connections.max.idle.ms = 540000
   	delivery.timeout.ms = 120000
   	enable.idempotence = false
   	interceptor.classes = []
   	internal.auto.downgrade.txn.commit = false
   	key.serializer = class org.apache.kafka.common.serialization.StringSerializer
   	linger.ms = 0
   	max.block.ms = 60000
   	max.in.flight.requests.per.connection = 5
   	max.request.size = 1048576
   	metadata.max.age.ms = 300000
   	metadata.max.idle.ms = 300000
   	metric.reporters = []
   	metrics.num.samples = 2
   	metrics.recording.level = INFO
   	metrics.sample.window.ms = 30000
   	partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
   	receive.buffer.bytes = 32768
   	reconnect.backoff.max.ms = 1000
   	reconnect.backoff.ms = 50
   	request.timeout.ms = 30000
   	retries = 2147483647
   	retry.backoff.ms = 100
   	sasl.client.callback.handler.class = null
   	sasl.jaas.config = [hidden]
   	sasl.kerberos.kinit.cmd = /usr/bin/kinit
   	sasl.kerberos.min.time.before.relogin = 60000
   	sasl.kerberos.service.name = null
   	sasl.kerberos.ticket.renew.jitter = 0.05
   	sasl.kerberos.ticket.renew.window.factor = 0.8
   	sasl.login.callback.handler.class = null
   	sasl.login.class = null
   	sasl.login.refresh.buffer.seconds = 300
   	sasl.login.refresh.min.period.seconds = 60
   	sasl.login.refresh.window.factor = 0.8
   	sasl.login.refresh.window.jitter = 0.05
   	sasl.mechanism = PLAIN
   	security.protocol = SASL_PLAINTEXT
   	security.providers = null
   	send.buffer.bytes = 131072
   	ssl.cipher.suites = null
   	ssl.enabled.protocols = [TLSv1.2]
   	ssl.endpoint.identification.algorithm = https
   	ssl.engine.factory.class = null
   	ssl.key.password = null
   	ssl.keymanager.algorithm = SunX509
   	ssl.keystore.location = null
   	ssl.keystore.password = null
   	ssl.keystore.type = JKS
   	ssl.protocol = TLSv1.2
   	ssl.provider = null
   	ssl.secure.random.implementation = null
   	ssl.trustmanager.algorithm = PKIX
   	ssl.truststore.location = null
   	ssl.truststore.password = null
   	ssl.truststore.type = JKS
   	transaction.timeout.ms = 60000
   	transactional.id = null
   	value.serializer = class org.apache.kafka.common.serialization.StringSerializer
   
   2022-02-13 13:44:57,941 [main] INFO  org.apache.kafka.common.security.authenticator.AbstractLogin  - Successfully logged in.
   2022-02-13 13:44:57,942 [main] INFO  org.apache.kafka.common.utils.AppInfoParser  - Kafka version: 2.6.0
   2022-02-13 13:44:57,942 [main] INFO  org.apache.kafka.common.utils.AppInfoParser  - Kafka commitId: 62abe01bee039651
   2022-02-13 13:44:57,942 [main] INFO  org.apache.kafka.common.utils.AppInfoParser  - Kafka startTimeMs: 1644759897942
   2022-02-13 13:44:57,953 [kafka-producer-network-thread | producer-5] INFO  org.apache.kafka.clients.Metadata  - [Producer clientId=producer-5] Cluster ID: 8XeF52xMSRmjjqZb4A0Ceg
   2022-02-13 13:44:57,963 [main] INFO  com.recko.spark.utils.tasktracker.KafkaTaskTrackerService  - recordMeta offset = 6114
   2022-02-13 13:44:57,963 [main] INFO  org.apache.kafka.clients.producer.KafkaProducer  - [Producer clientId=producer-5] Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms.
   2022-02-13 13:44:57,985 [main] INFO  com.recko.spark.utils.S3Util$  - cleaning up : s3://recko-ego/datalake_streamline_concurrency_1/journal_duplicates/420a79fd-b810-47f7-ad28-dfcd7ea03c10/
   2022-02-13 13:44:58,047 [main] INFO  com.recko.spark.services.ingester.RollbackService$  - Query :: ALTER TABLE ego_ext.raw_journal DROP IF EXISTS PARTITION (org_id = 'ego', tenant_id = 'ego', data_source = 'EXPECTED', batch_id = '1b5fbfd4-e793-4b7a-b896-f6b73e033000', task_id = '420a79fd-b810-47f7-ad28-dfcd7ea03c10' );
   2022-02-13 13:44:58,102 [main] INFO  org.apache.spark.sql.hive.HiveUtils  - Initializing HiveMetastoreConnection version 3.1.2-amzn-5 using Spark classes.
   2022-02-13 13:44:58,202 [main] INFO  org.apache.hadoop.hive.conf.HiveConf  - Found configuration file file:/mnt/yarn/usercache/hadoop/appcache/application_1642451881812_7589/container_1642451881812_7589_01_000001/hive-site.xml
   2022-02-13 13:44:58,397 [main] INFO  SessionState  - Hive Session ID = 24800b67-fd35-4f59-b8c4-43956e2fdce1
   2022-02-13 13:44:58,495 [main] INFO  org.apache.hadoop.hive.ql.session.SessionState  - Created HDFS directory: /tmp/hive/hadoop/24800b67-fd35-4f59-b8c4-43956e2fdce1
   2022-02-13 13:44:58,502 [main] INFO  org.apache.hadoop.hive.ql.session.SessionState  - Created local directory: /tmp/yarn/24800b67-fd35-4f59-b8c4-43956e2fdce1
   2022-02-13 13:44:58,503 [main] INFO  org.apache.hadoop.hive.ql.session.SessionState  - Created HDFS directory: /tmp/hive/hadoop/24800b67-fd35-4f59-b8c4-43956e2fdce1/_tmp_space.db
   2022-02-13 13:44:58,517 [main] INFO  org.apache.spark.sql.hive.client.HiveClientImpl  - Warehouse location for Hive client (version 3.1.2) is hdfs:///user/spark/warehouse
   2022-02-13 13:44:58,663 [dispatcher-CoarseGrainedScheduler] INFO  org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnDriverEndpoint  - Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.9.157.234:45834) with ID 3,  ResourceProfileId 0
   2022-02-13 13:44:58,664 [spark-listener-group-executorManagement] INFO  org.apache.spark.scheduler.dynalloc.ExecutorMonitor  - New executor 3 has registered (new total is 4)
   2022-02-13 13:44:58,767 [dispatcher-BlockManagerMaster] INFO  org.apache.spark.storage.BlockManagerMasterEndpoint  - Registering block manager ip-10-9-157-234.ap-south-1.compute.internal:38329 with 2004.6 MiB RAM, BlockManagerId(3, ip-10-9-157-234.ap-south-1.compute.internal, 38329, None)
   2022-02-13 13:45:00,299 [block-manager-storage-async-thread-pool-165] INFO  org.apache.spark.storage.BlockManager  - Removing RDD 94
   2022-02-13 13:45:00,315 [block-manager-storage-async-thread-pool-172] INFO  org.apache.spark.storage.BlockManager  - Removing RDD 114
   2022-02-13 13:45:00,319 [block-manager-storage-async-thread-pool-5] INFO  org.apache.spark.storage.BlockManager  - Removing RDD 137
   2022-02-13 13:45:00,322 [block-manager-storage-async-thread-pool-17] INFO  org.apache.spark.storage.BlockManager  - Removing RDD 122
   2022-02-13 13:45:02,507 [main] INFO  org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController  - Created SQLStdHiveAccessController for session context : HiveAuthzSessionContext [sessionString=24800b67-fd35-4f59-b8c4-43956e2fdce1, clientType=HIVECLI]
   2022-02-13 13:45:02,509 [main] WARN  org.apache.hadoop.hive.ql.session.SessionState  - METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.
   2022-02-13 13:45:03,707 [main] INFO  org.sparkproject.jetty.server.AbstractConnector  - Stopped Spark@60f662bd{HTTP/1.1, (http/1.1)}{0.0.0.0:4040}
   2022-02-13 13:45:03,710 [main] INFO  org.apache.spark.ui.SparkUI  - Stopped Spark web UI at http://ip-10-9-157-234.ap-south-1.compute.internal:4040
   2022-02-13 13:45:03,714 [YARN application state monitor] INFO  org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend  - Interrupting monitor thread
   2022-02-13 13:45:03,716 [main] INFO  org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend  - Shutting down all executors
   2022-02-13 13:45:03,716 [dispatcher-CoarseGrainedScheduler] INFO  org.apache.spark.scheduler.cluster.YarnSchedulerBackend$YarnDriverEndpoint  - Asking each executor to shut down
   2022-02-13 13:45:03,721 [main] INFO  org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend  - YARN client scheduler backend Stopped
   2022-02-13 13:45:03,750 [dispatcher-event-loop-2] INFO  org.apache.spark.MapOutputTrackerMasterEndpoint  - MapOutputTrackerMasterEndpoint stopped!
   2022-02-13 13:45:03,764 [main] INFO  org.apache.spark.storage.memory.MemoryStore  - MemoryStore cleared
   2022-02-13 13:45:03,764 [main] INFO  org.apache.spark.storage.BlockManager  - BlockManager stopped
   2022-02-13 13:45:03,772 [main] INFO  org.apache.spark.storage.BlockManagerMaster  - BlockManagerMaster stopped
   2022-02-13 13:45:03,778 [dispatcher-event-loop-3] INFO  org.apache.spark.scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint  - OutputCommitCoordinator stopped!
   2022-02-13 13:45:03,784 [main] INFO  org.apache.spark.SparkContext  - Successfully stopped SparkContext
   
   <<< Invocation of Spark command completed <<<
   
   Hadoop Job IDs executed by Spark: job_1642451881812_7593
   
   java.lang.reflect.InvocationTargetException
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:412)
   	at org.apache.oozie.action.hadoop.LauncherAM.access$400(LauncherAM.java:54)
   	at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:225)
   	at java.security.AccessController.doPrivileged(Native Method)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
   	at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:219)
   	at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:155)
   	at java.security.AccessController.doPrivileged(Native Method)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
   	at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:143)
   Caused by: java.lang.NullPointerException
   	at org.apache.hudi.table.HoodieTimelineArchiveLog.lambda$getInstantsToArchive$9(HoodieTimelineArchiveLog.java:230)
   	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:269)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.stream.SliceOps$1$1.accept(SliceOps.java:204)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.ArrayList$ArrayListSpliterator.tryAdvance(ArrayList.java:1361)
   	at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
   	at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:499)
   	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:486)
   	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
   	at java.util.stream.StreamSpliterators$WrappingSpliterator.forEachRemaining(StreamSpliterators.java:313)
   	at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:743)
   	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
   	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
   	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
   	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
   	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
   	at org.apache.hudi.table.HoodieTimelineArchiveLog.archiveIfRequired(HoodieTimelineArchiveLog.java:124)
   	at org.apache.hudi.client.AbstractHoodieWriteClient.archive(AbstractHoodieWriteClient.java:766)
   	at org.apache.hudi.client.AbstractHoodieWriteClient.postCommit(AbstractHoodieWriteClient.java:459)
   	at org.apache.hudi.client.AbstractHoodieWriteClient.commitStats(AbstractHoodieWriteClient.java:198)
   	at org.apache.hudi.client.SparkRDDWriteClient.commit(SparkRDDWriteClient.java:125)
   	at org.apache.hudi.HoodieSparkSqlWriter$.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:635)
   	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:286)
   	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:164)
   	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:194)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:232)
   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
   	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:229)
   	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:190)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:134)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:133)
   	at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:989)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:110)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:135)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:135)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:253)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:134)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:68)
   	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:989)
   	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:438)
   	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:415)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:293)
   	at com.recko.spark.io.lake.HudiIO.writeDataFrame(HudiIO.scala:66)
   	at com.recko.spark.utils.datalake.StructuredZoneUtils$.putValidationData(StructuredZoneUtils.scala:135)
   	at com.recko.spark.utils.datalake.LakeUtils$.writeIntoLake(LakeUtils.scala:10)
   	at com.recko.spark.jobs.v1.ingestion.IngestionWrapper.postExecute(IngestionWrapper.scala:184)
   	at com.recko.spark.jobs.v1.ingestion.DataValidationJob.execute(DataValidationJob.scala:63)
   	at com.recko.spark.jobs.v1.triggers.IngestionJob$.$anonfun$main$1(IngestionJob.scala:30)
   	at com.recko.spark.executors.Executor.jobWrapper(Executor.scala:71)
   	at com.recko.spark.jobs.v1.triggers.IngestionJob$.main(IngestionJob.scala:23)
   	at com.recko.spark.jobs.v1.triggers.IngestionJob.main(IngestionJob.scala)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
   	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:959)
   	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
   	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
   	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
   	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1047)
   	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1056)
   	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   	at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:186)
   	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:93)
   	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:107)
   	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:60)
   	... 16 more
   Failing Oozie Launcher, null
   java.lang.NullPointerException
   	at org.apache.hudi.table.HoodieTimelineArchiveLog.lambda$getInstantsToArchive$9(HoodieTimelineArchiveLog.java:230)
   	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:269)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.stream.SliceOps$1$1.accept(SliceOps.java:204)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.ArrayList$ArrayListSpliterator.tryAdvance(ArrayList.java:1361)
   	at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
   	at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:499)
   	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:486)
   	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
   	at java.util.stream.StreamSpliterators$WrappingSpliterator.forEachRemaining(StreamSpliterators.java:313)
   	at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:743)
   	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
   	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
   	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
   	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
   	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
   	at org.apache.hudi.table.HoodieTimelineArchiveLog.archiveIfRequired(HoodieTimelineArchiveLog.java:124)
   	at org.apache.hudi.client.AbstractHoodieWriteClient.archive(AbstractHoodieWriteClient.java:766)
   	at org.apache.hudi.client.AbstractHoodieWriteClient.postCommit(AbstractHoodieWriteClient.java:459)
   	at org.apache.hudi.client.AbstractHoodieWriteClient.commitStats(AbstractHoodieWriteClient.java:198)
   	at org.apache.hudi.client.SparkRDDWriteClient.commit(SparkRDDWriteClient.java:125)
   	at org.apache.hudi.HoodieSparkSqlWriter$.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:635)
   	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:286)
   	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:164)
   	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:194)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:232)
   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
   	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:229)
   	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:190)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:134)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:133)
   	at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:989)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:110)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:135)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:135)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:253)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:134)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:68)
   	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:989)
   	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:438)
   	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:415)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:293)
   	at com.recko.spark.io.lake.HudiIO.writeDataFrame(HudiIO.scala:66)
   	at com.recko.spark.utils.datalake.StructuredZoneUtils$.putValidationData(StructuredZoneUtils.scala:135)
   	at com.recko.spark.utils.datalake.LakeUtils$.writeIntoLake(LakeUtils.scala:10)
   	at com.recko.spark.jobs.v1.ingestion.IngestionWrapper.postExecute(IngestionWrapper.scala:184)
   	at com.recko.spark.jobs.v1.ingestion.DataValidationJob.execute(DataValidationJob.scala:63)
   	at com.recko.spark.jobs.v1.triggers.IngestionJob$.$anonfun$main$1(IngestionJob.scala:30)
   	at com.recko.spark.executors.Executor.jobWrapper(Executor.scala:71)
   	at com.recko.spark.jobs.v1.triggers.IngestionJob$.main(IngestionJob.scala:23)
   	at com.recko.spark.jobs.v1.triggers.IngestionJob.main(IngestionJob.scala)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
   	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:959)
   	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
   	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
   	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
   	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1047)
   	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1056)
   	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   	at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:186)
   	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:93)
   	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:107)
   	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:60)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:412)
   	at org.apache.oozie.action.hadoop.LauncherAM.access$400(LauncherAM.java:54)
   	at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:225)
   	at java.security.AccessController.doPrivileged(Native Method)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
   	at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:219)
   	at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:155)
   	at java.security.AccessController.doPrivileged(Native Method)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
   	at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:143)
   Oozie Launcher, uploading action data to HDFS sequence file: hdfs://qa-emr.reckostaging.com:8020/user/hadoop/oozie-oozi/0005119-220117203901194-oozie-oozi-W/IngestionJob--spark/action-data.seq
   2022-02-13 13:45:03,825 [main] INFO  org.apache.hadoop.io.compress.CodecPool  - Got brand-new compressor [.deflate]
   Stopping AM
   2022-02-13 13:45:03,837 [main] INFO  org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl  - Waiting for application to be successfully unregistered.
   Callback notification attempts left 0
   Callback notification trying http://ip-10-9-153-3.ap-south-1.compute.internal:11000/oozie/callback?id=0005119-220117203901194-oozie-oozi-W@IngestionJob&status=FAILED
   Callback notification to http://ip-10-9-153-3.ap-south-1.compute.internal:11000/oozie/callback?id=0005119-220117203901194-oozie-oozi-W@IngestionJob&status=FAILED succeeded
   Callback notification succeeded
   
   ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] megha2411 edited a comment on issue #4814: [SUPPORT] NPE thrown while archiving data table when metadata is enabled

Posted by GitBox <gi...@apache.org>.
megha2411 edited a comment on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1039112003


   **To Reproduce**
   
   Steps to reproduce the behavior:
   This comes up intermittently. I tried processing 30 files and one of them failed with this error. It did not fail on the 2nd run, but it failed again on the 5th run, so I am not sure of the exact steps to reproduce it.
   
   **Expected behavior:**
   IMO it should not fail with a NullPointerException.
   
   **Environment Description**
   
   Hudi version : 0.10.1
   
   Spark version : 3.1.2
   
   Hive version : Hive 3.1.2
   
   Hadoop version : 
   
   Storage (HDFS/S3/GCS..) : s3
   
   Running on Docker? (yes/no) : no
   
   Additional context
   We run our application on EMR 6.4.0
   Concurrency options (a string-typed sketch of these settings follows the list):
         "hoodie.write.lock.provider" -> "org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider",
         "hoodie.write.lock.zookeeper.url" -> Conf.hiveConf.host,
         "hoodie.write.lock.zookeeper.port" -> Conf.hudiConf.zookeeperConf.port,
         "hoodie.write.lock.zookeeper.base_path" -> Conf.hudiConf.zookeeperConf.basePath,
         "hoodie.write.concurrency.mode" -> "optimistic_concurrency_control",
         "hoodie.cleaner.policy.failed.writes" -> "LAZY",
         "hoodie.write.lock.zookeeper.lock_key" -> options.tableName,
         "hoodie.write.lock.wait_time_ms" -> 5000,
         "hoodie.write.lock.num_retries" -> 5,
         "hoodie.write.lock.client.wait_time_ms" -> 5000,
         "hoodie.write.lock.client.num_retries" -> 5,
         "hoodie.write.lock.wait_time_ms_between_retry" -> 12000,
         "hoodie.metadata.enable" -> true
   





[GitHub] [hudi] megha2411 edited a comment on issue #4814: [SUPPORT] NPE thrown while archiving data table when metadata is enabled

Posted by GitBox <gi...@apache.org>.
megha2411 edited a comment on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1039270588


   I tried reproducing the issue with the metadata table disabled, and I am still getting this error.
   Trace with metadata disabled:
   Hadoop Job IDs executed by Spark: job_1642451881812_8703
   
   java.lang.reflect.InvocationTargetException
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:412)
   	at org.apache.oozie.action.hadoop.LauncherAM.access$400(LauncherAM.java:54)
   	at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:225)
   	at java.security.AccessController.doPrivileged(Native Method)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
   	at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:219)
   	at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:155)
   	at java.security.AccessController.doPrivileged(Native Method)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
   	at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:143)
   Caused by: java.lang.NullPointerException
   	at org.apache.hudi.table.HoodieTimelineArchiveLog.lambda$getInstantsToArchive$9(HoodieTimelineArchiveLog.java:230)
   	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:269)
   	at java.util.stream.SliceOps$1$1.accept(SliceOps.java:204)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.ArrayList$ArrayListSpliterator.tryAdvance(ArrayList.java:1361)
   	at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
   	at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:499)
   	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:486)
   	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
   	at java.util.stream.StreamSpliterators$WrappingSpliterator.forEachRemaining(StreamSpliterators.java:313)
   	at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:743)
   	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
   	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
   	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
   	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
   	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
   	at org.apache.hudi.table.HoodieTimelineArchiveLog.archiveIfRequired(HoodieTimelineArchiveLog.java:124)
   	at org.apache.hudi.client.AbstractHoodieWriteClient.archive(AbstractHoodieWriteClient.java:766)
   	at org.apache.hudi.client.AbstractHoodieWriteClient.postCommit(AbstractHoodieWriteClient.java:459)
   	at org.apache.hudi.client.AbstractHoodieWriteClient.commitStats(AbstractHoodieWriteClient.java:198)
   	at org.apache.hudi.client.SparkRDDWriteClient.commit(SparkRDDWriteClient.java:125)
   	at org.apache.hudi.HoodieSparkSqlWriter$.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:635)
   	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:286)
   	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:164)
   	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:194)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:232)
   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
   	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:229)
   	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:190)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:134)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:133)
   	at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:989)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:110)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:135)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:135)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:253)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:134)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:68)
   	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:989)
   	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:438)
   	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:415)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:293)
   	at com.recko.spark.io.lake.HudiIO.writeDataFrame(HudiIO.scala:67)
   	at com.recko.spark.utils.datalake.StructuredZoneUtils$.putValidationData(StructuredZoneUtils.scala:136)
   	at com.recko.spark.utils.datalake.LakeUtils$.writeIntoLake(LakeUtils.scala:10)
   	at com.recko.spark.jobs.v1.ingestion.IngestionWrapper.postExecute(IngestionWrapper.scala:184)
   	at com.recko.spark.jobs.v1.ingestion.DataValidationJob.execute(DataValidationJob.scala:63)
   	at com.recko.spark.jobs.v1.triggers.IngestionJob$.$anonfun$main$1(IngestionJob.scala:30)
   	at com.recko.spark.executors.Executor.jobWrapper(Executor.scala:71)
   	at com.recko.spark.jobs.v1.triggers.IngestionJob$.main(IngestionJob.scala:23)
   	at com.recko.spark.jobs.v1.triggers.IngestionJob.main(IngestionJob.scala)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
   	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:959)
   	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
   	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
   	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
   	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1047)
   	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1056)
   	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   	at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:186)
   	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:93)
   	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:107)
   	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:60)
   	... 16 more
   Failing Oozie Launcher, null
   java.lang.NullPointerException
   	at org.apache.hudi.table.HoodieTimelineArchiveLog.lambda$getInstantsToArchive$9(HoodieTimelineArchiveLog.java:230)
   	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:269)
   	at java.util.stream.SliceOps$1$1.accept(SliceOps.java:204)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.ArrayList$ArrayListSpliterator.tryAdvance(ArrayList.java:1361)
   	at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
   	at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:499)
   	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:486)
   	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
   	at java.util.stream.StreamSpliterators$WrappingSpliterator.forEachRemaining(StreamSpliterators.java:313)
   	at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:743)
   	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
   	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
   	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
   	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
   	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
   	at org.apache.hudi.table.HoodieTimelineArchiveLog.archiveIfRequired(HoodieTimelineArchiveLog.java:124)
   	at org.apache.hudi.client.AbstractHoodieWriteClient.archive(AbstractHoodieWriteClient.java:766)
   	at org.apache.hudi.client.AbstractHoodieWriteClient.postCommit(AbstractHoodieWriteClient.java:459)
   	at org.apache.hudi.client.AbstractHoodieWriteClient.commitStats(AbstractHoodieWriteClient.java:198)
   	at org.apache.hudi.client.SparkRDDWriteClient.commit(SparkRDDWriteClient.java:125)
   	at org.apache.hudi.HoodieSparkSqlWriter$.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:635)
   	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:286)
   	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:164)
   	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:194)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:232)
   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
   	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:229)
   	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:190)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:134)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:133)
   	at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:989)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:110)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:135)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:135)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:253)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:134)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:68)
   	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:989)
   	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:438)
   	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:415)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:293)
   	at com.recko.spark.io.lake.HudiIO.writeDataFrame(HudiIO.scala:67)
   	at com.recko.spark.utils.datalake.StructuredZoneUtils$.putValidationData(StructuredZoneUtils.scala:136)
   	at com.recko.spark.utils.datalake.LakeUtils$.writeIntoLake(LakeUtils.scala:10)
   	at com.recko.spark.jobs.v1.ingestion.IngestionWrapper.postExecute(IngestionWrapper.scala:184)
   	at com.recko.spark.jobs.v1.ingestion.DataValidationJob.execute(DataValidationJob.scala:63)
   	at com.recko.spark.jobs.v1.triggers.IngestionJob$.$anonfun$main$1(IngestionJob.scala:30)
   	at com.recko.spark.executors.Executor.jobWrapper(Executor.scala:71)
   	at com.recko.spark.jobs.v1.triggers.IngestionJob$.main(IngestionJob.scala:23)
   	at com.recko.spark.jobs.v1.triggers.IngestionJob.main(IngestionJob.scala)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
   	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:959)
   	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
   	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
   	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
   	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1047)
   	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1056)
   	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   	at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:186)
   	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:93)
   	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:107)
   	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:60)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:412)
   	at org.apache.oozie.action.hadoop.LauncherAM.access$400(LauncherAM.java:54)
   	at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:225)
   	at java.security.AccessController.doPrivileged(Native Method)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
   	at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:219)
   	at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:155)
   	at java.security.AccessController.doPrivileged(Native Method)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
   	at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:143)





[GitHub] [hudi] nsivabalan commented on issue #4814: [SUPPORT] NPE thrown while archiving data table with 0.10.1

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1047293152


   hey @megha2411: if you can enable debug logs and share them with us, that would be helpful in the meantime.
   And I understand multi-writer is a mandatory requirement for you. But if you have a staging pipeline without multi-writer, do you know whether you hit the issue there as well? That would help us rule out whether it is a multi-writer issue or something that happens even with a single writer.
   
   





[GitHub] [hudi] NiteshSK edited a comment on issue #4814: [SUPPORT] NPE thrown while archiving data table with 0.10.1

Posted by GitBox <gi...@apache.org>.
NiteshSK edited a comment on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1041127602


   @nsivabalan 
   We are running on Hudi version : 0.10.1
   After upgrading to Hudi 0.10.1, this issue happens intermittently, not with every run.
   We haven't evaluated it with multi-writer disabled (as our use case requires multi-writer capability).
   cc : @megha2411





[GitHub] [hudi] NiteshSK edited a comment on issue #4814: [SUPPORT] NPE thrown while archiving data table with 0.10.1

Posted by GitBox <gi...@apache.org>.
NiteshSK edited a comment on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1041127602


   @nsivabalan 
   We are running on Hudi version : 0.10.1
   After upgrading to Hudi 0.10.1, this issue happens intermittently, not with every run.
   We haven't evaluated it with multi-writer disabled (as our use case requires multi-writer capability).





[GitHub] [hudi] megha2411 edited a comment on issue #4814: [SUPPORT] NPE thrown while archiving data table when metadata is enabled

Posted by GitBox <gi...@apache.org>.
megha2411 edited a comment on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1039112003


   **To Reproduce**
   
   Steps to reproduce the behavior:
   This is coming up intermittently. I tried processing 30 files and 1 failed with this error. It did not fail in the 2nd run, but failed again in the 5th run, so I am not sure of the steps to reproduce it.
   
   **Expected behavior:**
   IMO it should not fail with a NullPointerException.
   
   **Environment Description**
   
   Hudi version : 0.10.1
   
   Spark version : 3.1.2
   
   Hive version : Hive 3.1.2
   
   Hadoop version : 
   
   Storage (HDFS/S3/GCS..) : s3
   
   Running on Docker? (yes/no) : no
   
   Additional context
   
   Attaching the sbt build configuration for additional context. We run our application on EMR 6.4.0.
   Concurrency options
         "hoodie.write.lock.provider" -> "org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider",
         "hoodie.write.lock.zookeeper.url"  -> Conf.hiveConf.host,
         "hoodie.write.lock.zookeeper.port" ->  Conf.hudiConf.zookeeperConf.port,
         "hoodie.write.lock.zookeeper.base_path" -> Conf.hudiConf.zookeeperConf.basePath,
         "hoodie.write.concurrency.mode" -> "optimistic_concurrency_control",
         "hoodie.cleaner.policy.failed.writes" -> "LAZY",
         "hoodie.write.lock.zookeeper.lock_key" -> options.tableName,
         "hoodie.write.lock.wait_time_ms" -> 5000,
         "hoodie.write.lock.num_retries" -> 5,
         "hoodie.write.lock.client.wait_time_ms" ->5000,
         "hoodie.write.lock.client.num_retries" -> 5s,
         "hoodie.write.lock.wait_time_ms_between_retry" -> 12000,
         "hoodie.metadata.enable" -> true
   





[GitHub] [hudi] megha2411 commented on issue #4814: [SUPPORT] NPE thrown while archiving data table when metadata is enabled

Posted by GitBox <gi...@apache.org>.
megha2411 commented on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1039270588


   I tried reproducing the issue with metadata disabled. I am still getting this error.





[GitHub] [hudi] nsivabalan closed issue #4814: [SUPPORT] NPE thrown while archiving data table with 0.10.1

Posted by GitBox <gi...@apache.org>.
nsivabalan closed issue #4814:
URL: https://github.com/apache/hudi/issues/4814


   





[GitHub] [hudi] megha2411 edited a comment on issue #4814: [SUPPORT] NPE thrown while archiving data table when metadata is enabled

Posted by GitBox <gi...@apache.org>.
megha2411 edited a comment on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1039112003


   **To Reproduce**
   
   Steps to reproduce the behavior:
   This is coming up intermittently. I tried processing 30 files and 1 failed with this error. It did not fail in the 2nd run, but failed again in the 5th run, so I am not sure of the steps to reproduce it.
   
   **Expected behavior:**
   IMO it should not fail with a NullPointerException.
   
   **Environment Description**
   
   Hudi version : 0.10.1
   
   Spark version : 3.1.2
   
   Hive version : Hive 3.1.2
   
   Hadoop version : 
   
   Storage (HDFS/S3/GCS..) : s3
   
   Running on Docker? (yes/no) : no
   
   Additional context
   We run our application on EMR 6.4.0
   Concurrency options
         "hoodie.write.lock.provider" -> "org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider",
         "hoodie.write.lock.zookeeper.url"  -> Conf.hiveConf.host,
         "hoodie.write.lock.zookeeper.port" ->  Conf.hudiConf.zookeeperConf.port,
         "hoodie.write.lock.zookeeper.base_path" -> Conf.hudiConf.zookeeperConf.basePath,
         "hoodie.write.concurrency.mode" -> "optimistic_concurrency_control",
         "hoodie.cleaner.policy.failed.writes" -> "LAZY",
         "hoodie.write.lock.zookeeper.lock_key" -> options.tableName,
         "hoodie.write.lock.wait_time_ms" -> 5000,
         "hoodie.write.lock.num_retries" -> 5,
         "hoodie.write.lock.client.wait_time_ms" ->5000,
         "hoodie.write.lock.client.num_retries" -> 5s,
         "hoodie.write.lock.wait_time_ms_between_retry" -> 12000,
         "hoodie.metadata.enable" -> true
   





[GitHub] [hudi] megha2411 commented on issue #4814: [SUPPORT] NPE thrown while archiving data table when metadata is enabled

Posted by GitBox <gi...@apache.org>.
megha2411 commented on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1039190472


   After disabling metadata I haven't faced this issue yet. Also, FYI, I started using this 0.10 version just 2 days back.





[GitHub] [hudi] nsivabalan commented on issue #4814: [SUPPORT] NPE thrown while archiving data table with 0.10.1

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1042546427


   @xushiyan: hey, can you triage this?
   One possible reason I can think of is:
   when concurrent writers are enabled, maybe there is some overstepping during archival and so Hudi runs into the NPE (two writers fetch a similar list of instants to archive; one of them goes ahead and archives, while the other hits the NPE while trying to archive).
   
   I mean, I am just speculating. I will let you triage this. And do you know if there is a way to disable archiving with 0.10.1? Then we could suggest disabling it in all writers but one and see if the issue goes away.
   





[GitHub] [hudi] nsivabalan commented on issue #4814: [SUPPORT] NPE thrown while archiving data table when metadata is enabled

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1040776155


   @megha2411: would you mind posting the contents of .hoodie? It does not contain any user data, just the timeline files pertaining to the Hudi table.
   Either you can zip and share it here, or you can do "ls" and paste the output here. But do ensure you sort it based on file modification time when you perform the "ls".
   
   And if you can enable debug logs for Hudi and share with us what you see when you encounter the issue again, that would help us triage the issue as well.
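   
   A small sketch of one way to produce that listing sorted by file modification time, using the Hadoop FileSystem API; the table path below is a placeholder:
   
   ```
   import org.apache.hadoop.conf.Configuration
   import org.apache.hadoop.fs.{FileSystem, Path}
   
   object ListHoodieTimeline extends App {
     // Placeholder path; point this at your table's .hoodie directory.
     val hoodiePath = "s3://your-bucket/path/to/table/.hoodie"
     val fs = FileSystem.get(new java.net.URI(hoodiePath), new Configuration())
   
     // List timeline files, oldest first by modification time.
     fs.listStatus(new Path(hoodiePath))
       .filter(_.isFile)
       .sortBy(_.getModificationTime)
       .foreach(s => println(s"${s.getModificationTime}\t${s.getPath.getName}"))
   }
   ```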
   





[GitHub] [hudi] megha2411 commented on issue #4814: [SUPPORT] NPE thrown while archiving data table with 0.10.1

Posted by GitBox <gi...@apache.org>.
megha2411 commented on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1060877946


   Hi @nsivabalan, we have handled this issue with retries for now.
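   
   A rough sketch of a generic retry wrapper along these lines; the writer call and names below are placeholders rather than our actual code:
   
   ```
   import scala.util.{Failure, Success, Try}
   
   object RetryUtil {
     // Re-run a write block up to maxAttempts times; the NPE surfaces from
     // post-commit archival, so re-running the whole write has been sufficient.
     def withRetries[T](maxAttempts: Int)(block: => T): T = {
       def attempt(n: Int): T = Try(block) match {
         case Success(result) => result
         case Failure(e) if n < maxAttempts =>
           println(s"Attempt $n failed with ${e.getClass.getSimpleName}; retrying")
           attempt(n + 1)
         case Failure(e) => throw e
       }
       attempt(1)
     }
   }
   
   // Usage, e.g.:
   // RetryUtil.withRetries(3) {
   //   df.write.format("hudi").options(hudiOptions).mode("append").save(basePath)
   // }
   ```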





[GitHub] [hudi] megha2411 commented on issue #4814: [SUPPORT] NPE thrown while archiving data table with 0.10.1

Posted by GitBox <gi...@apache.org>.
megha2411 commented on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1041192387


   @nsivabalan I was not able to get the .hoodie contents for the previous stack trace, so I reproduced the issue with metadata disabled. Below is the trace along with the .hoodie contents.
   
   <<< Invocation of Spark command completed <<<
   
   Hadoop Job IDs executed by Spark: job_1644866841988_0880
   
   java.lang.reflect.InvocationTargetException
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:412)
   	at org.apache.oozie.action.hadoop.LauncherAM.access$400(LauncherAM.java:54)
   	at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:225)
   	at java.security.AccessController.doPrivileged(Native Method)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
   	at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:219)
   	at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:155)
   	at java.security.AccessController.doPrivileged(Native Method)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
   	at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:143)
   Caused by: java.lang.NullPointerException
   	at org.apache.hudi.table.HoodieTimelineArchiveLog.lambda$getInstantsToArchive$9(HoodieTimelineArchiveLog.java:230)
   	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:269)
   	at java.util.stream.SliceOps$1$1.accept(SliceOps.java:204)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.ArrayList$ArrayListSpliterator.tryAdvance(ArrayList.java:1361)
   	at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
   	at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:499)
   	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:486)
   	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
   	at java.util.stream.StreamSpliterators$WrappingSpliterator.forEachRemaining(StreamSpliterators.java:313)
   	at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:743)
   	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
   	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
   	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
   	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
   	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
   	at org.apache.hudi.table.HoodieTimelineArchiveLog.archiveIfRequired(HoodieTimelineArchiveLog.java:124)
   	at org.apache.hudi.client.AbstractHoodieWriteClient.archive(AbstractHoodieWriteClient.java:766)
   	at org.apache.hudi.client.AbstractHoodieWriteClient.postCommit(AbstractHoodieWriteClient.java:459)
   	at org.apache.hudi.client.AbstractHoodieWriteClient.commitStats(AbstractHoodieWriteClient.java:198)
   	at org.apache.hudi.client.SparkRDDWriteClient.commit(SparkRDDWriteClient.java:125)
   	at org.apache.hudi.HoodieSparkSqlWriter$.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:635)
   	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:286)
   	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:164)
   	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:194)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:232)
   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
   	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:229)
   	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:190)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:134)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:133)
   	at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:989)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:110)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:135)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:135)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:253)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:134)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:68)
   	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:989)
   	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:438)
   	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:415)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:293)
   	at com.recko.spark.io.lake.HudiIO.writeDataFrame(HudiIO.scala:67)
   	at com.recko.spark.utils.datalake.StructuredZoneUtils$.putValidationData(StructuredZoneUtils.scala:135)
   	at com.recko.spark.utils.datalake.LakeUtils$.writeIntoLake(LakeUtils.scala:10)
   	at com.recko.spark.jobs.v1.ingestion.IngestionWrapper.postExecute(IngestionWrapper.scala:184)
   	at com.recko.spark.jobs.v1.ingestion.DataValidationJob.execute(DataValidationJob.scala:63)
   	at com.recko.spark.jobs.v1.triggers.IngestionJob$.$anonfun$main$1(IngestionJob.scala:30)
   	at com.recko.spark.executors.Executor.jobWrapper(Executor.scala:71)
   	at com.recko.spark.jobs.v1.triggers.IngestionJob$.main(IngestionJob.scala:23)
   	at com.recko.spark.jobs.v1.triggers.IngestionJob.main(IngestionJob.scala)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
   	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:959)
   	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
   	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
   	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
   	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1047)
   	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1056)
   	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   	at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:186)
   	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:93)
   	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:107)
   	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:60)
   	... 16 more
   Failing Oozie Launcher, null
   java.lang.NullPointerException
   	at org.apache.hudi.table.HoodieTimelineArchiveLog.lambda$getInstantsToArchive$9(HoodieTimelineArchiveLog.java:230)
   	at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:269)
   	at java.util.stream.SliceOps$1$1.accept(SliceOps.java:204)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
   	at java.util.ArrayList$ArrayListSpliterator.tryAdvance(ArrayList.java:1361)
   	at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
   	at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:499)
   	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:486)
   	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
   	at java.util.stream.StreamSpliterators$WrappingSpliterator.forEachRemaining(StreamSpliterators.java:313)
   	at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:743)
   	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
   	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
   	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
   	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
   	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
   	at org.apache.hudi.table.HoodieTimelineArchiveLog.archiveIfRequired(HoodieTimelineArchiveLog.java:124)
   	at org.apache.hudi.client.AbstractHoodieWriteClient.archive(AbstractHoodieWriteClient.java:766)
   	at org.apache.hudi.client.AbstractHoodieWriteClient.postCommit(AbstractHoodieWriteClient.java:459)
   	at org.apache.hudi.client.AbstractHoodieWriteClient.commitStats(AbstractHoodieWriteClient.java:198)
   	at org.apache.hudi.client.SparkRDDWriteClient.commit(SparkRDDWriteClient.java:125)
   	at org.apache.hudi.HoodieSparkSqlWriter$.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:635)
   	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:286)
   	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:164)
   	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:194)
   	at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:232)
   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
   	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:229)
   	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:190)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:134)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:133)
   	at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:989)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:110)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:135)
   	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
   	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:135)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:253)
   	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:134)
   	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:68)
   	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:989)
   	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:438)
   	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:415)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:293)
   	at com.recko.spark.io.lake.HudiIO.writeDataFrame(HudiIO.scala:67)
   	at com.recko.spark.utils.datalake.StructuredZoneUtils$.putValidationData(StructuredZoneUtils.scala:135)
   	at com.recko.spark.utils.datalake.LakeUtils$.writeIntoLake(LakeUtils.scala:10)
   	at com.recko.spark.jobs.v1.ingestion.IngestionWrapper.postExecute(IngestionWrapper.scala:184)
   	at com.recko.spark.jobs.v1.ingestion.DataValidationJob.execute(DataValidationJob.scala:63)
   	at com.recko.spark.jobs.v1.triggers.IngestionJob$.$anonfun$main$1(IngestionJob.scala:30)
   	at com.recko.spark.executors.Executor.jobWrapper(Executor.scala:71)
   	at com.recko.spark.jobs.v1.triggers.IngestionJob$.main(IngestionJob.scala:23)
   	at com.recko.spark.jobs.v1.triggers.IngestionJob.main(IngestionJob.scala)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
   	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:959)
   	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
   	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
   	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
   	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1047)
   	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1056)
   	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   	at org.apache.oozie.action.hadoop.SparkMain.runSpark(SparkMain.java:186)
   	at org.apache.oozie.action.hadoop.SparkMain.run(SparkMain.java:93)
   	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:107)
   	at org.apache.oozie.action.hadoop.SparkMain.main(SparkMain.java:60)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:412)
   	at org.apache.oozie.action.hadoop.LauncherAM.access$400(LauncherAM.java:54)
   	at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:225)
   	at java.security.AccessController.doPrivileged(Native Method)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
   	at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:219)
   	at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:155)
   	at java.security.AccessController.doPrivileged(Native Method)
   	at javax.security.auth.Subject.doAs(Subject.java:422)
   	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
   	at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:143)
   
   
   [Hoodie Contents.csv](https://github.com/apache/hudi/files/8077650/Hoodie.Contents.csv)
   
   **#partition metadata
   #Wed Feb 16 06:34:30 UTC 2022
   commitTime=20220216063402201
   partitionDepth=1**
   
   
   We were using Hudi 0.8.0 before upgrading to 0.10.1, as 0.8.0 gave a lot of NPE errors. At the moment Hudi is not in prod yet; we have been testing concurrency with the 0.10.1 version only. Once it is stable in our staging env, we'll move to prod.
   These NPE errors go away after a retry.
   The error occurs with metadata both enabled and disabled, but intermittently. Traces are pasted above for both cases.
   Disabling multi-writer is not an option for us, as we have tenant (clientId)-wise partitioning.
   
   





[GitHub] [hudi] nsivabalan commented on issue #4814: [SUPPORT] NPE thrown while archiving data table when metadata is enabled

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1040778429


   Also, may I know if you are facing this issue with all the tables after you upgraded to 0.10.1, or is it just a one-off case?
   And once the job fails, if you retry, do you still hit the same exception, or does it proceed without issues?
   Also, I see you have enabled multi-writer. Do you have any tables where multi-writer is not enabled, and do you see the exception in those tables as well?
   
   





[GitHub] [hudi] megha2411 edited a comment on issue #4814: [SUPPORT] NPE thrown while archiving data table when metadata is enabled

Posted by GitBox <gi...@apache.org>.
megha2411 edited a comment on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1039112003


   **To Reproduce**
   
   Steps to reproduce the behavior:
   This is coming up intermittently. I tried processing 30 files and 1 failed with this error. It did not fail in the 2nd run, but failed again in the 5th run, so I am not sure of the steps to reproduce it.
   
   **Expected behavior:**
   IMO it should not fail with a NullPointerException.
   
   **Environment Description**
   
   Hudi version : 0.10.1
   
   Spark version : 3.1.2
   
   Hive version : Hive 3.1.2
   
   Hadoop version : 
   
   Storage (HDFS/S3/GCS..) : s3
   
   Running on Docker? (yes/no) : no
   
   Additional context
   
   Attaching the sbt build configuration for additional context. We run our application on EMR 6.4.0.
   Concurrency options
         "hoodie.write.lock.provider" -> "org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider",
         "hoodie.write.lock.zookeeper.url"  -> Conf.hiveConf.host,
         "hoodie.write.lock.zookeeper.port" ->  Conf.hudiConf.zookeeperConf.port,
         "hoodie.write.lock.zookeeper.base_path" -> Conf.hudiConf.zookeeperConf.basePath,
         "hoodie.write.concurrency.mode" -> "optimistic_concurrency_control",
         "hoodie.cleaner.policy.failed.writes" -> "LAZY",
         "hoodie.write.lock.zookeeper.lock_key" -> options.tableName,
         "hoodie.write.lock.wait_time_ms" -> Conf.hudiConf.lockWaitTimeMs,
         "hoodie.write.lock.num_retries" -> Conf.hudiConf.lockNumRetries,
         "hoodie.write.lock.client.wait_time_ms" -> Conf.hudiConf.lockClientWaitTimeMs,
         "hoodie.write.lock.client.num_retries" -> Conf.hudiConf.lockClientNumRetries,
         "hoodie.write.lock.wait_time_ms_between_retry" -> Conf.hudiConf.lockWaitTimeMsBetweenRetry,
         "hoodie.metadata.enable" -> Conf.hudiConf.enableMetadata.toString





[GitHub] [hudi] nsivabalan commented on issue #4814: [SUPPORT] NPE thrown while archiving data table with 0.10.1

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1048885247


   hey @megha2411:
   Here is one possible approach to unblock you until we find the root cause.
   In 0.10.1, we don't have an easy way to disable archival. So what we can do is
   increase the archival configs to a very large number for all writers except one,
   and leave one writer with your actual archival configs.
   
   Writer 1:
   your regular archival configs or defaults
   
   All other writers:
   https://hudi.apache.org/docs/configurations#hoodiekeepmincommits
   https://hudi.apache.org/docs/configurations#hoodiekeepmaxcommits
   
   You can set the values for these to something like 100000 (see the sketch below).
   
   By this, we ensure writes happen concurrently via all writers, but archival happens only via one writer. Hope that helps.
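   
   A minimal sketch of what that could look like, in the same Scala option-map style used earlier in this thread; the values are only examples, and hoodie.keep.max.commits must stay greater than hoodie.keep.min.commits:
   
   ```
   // For every writer except the one designated to perform archival: push the
   // archival window out so far that archival effectively never triggers.
   val deferArchivalOptions = Map(
     "hoodie.keep.min.commits" -> "100000",
     "hoodie.keep.max.commits" -> "100001"
   )
   
   // Hypothetical usage: merge into the existing Hudi write options.
   // df.write.format("hudi").options(hudiOptions ++ deferArchivalOptions)
   //   .mode("append").save(basePath)
   ```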
   
   
   
   
   





[GitHub] [hudi] nsivabalan commented on issue #4814: [SUPPORT] NPE thrown while archiving data table with 0.10.1

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1057506999


   @megha2411: any updates on your end?





[GitHub] [hudi] NiteshSK edited a comment on issue #4814: [SUPPORT] NPE thrown while archiving data table with 0.10.1

Posted by GitBox <gi...@apache.org>.
NiteshSK edited a comment on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1041127602


   @nsivabalan 
   We are running on Hudi version : 0.10.1
   After upgrading to Hudi 0.10.1, this issue happens intermittently, not with every run.
   We haven't evaluated it with multi-writer disabled (as our use case requires multi-writer capability).
   cc : @megha2411





[GitHub] [hudi] NiteshSK commented on issue #4814: [SUPPORT] NPE thrown while archiving data table with 0.10.1

Posted by GitBox <gi...@apache.org>.
NiteshSK commented on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1041127602


   @nsivabalan 
   We are running on Hudi version : 0.10.1
   After upgrading to Hudi 0.10.1, this issue happens intermittently, not with every run.
   





[GitHub] [hudi] nsivabalan edited a comment on issue #4814: [SUPPORT] NPE thrown while archiving data table with 0.10.1

Posted by GitBox <gi...@apache.org>.
nsivabalan edited a comment on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1042546427


   @xushiyan: hey, can you help triage this?
   One possible reason I can think of is:
   when concurrent writers are enabled, maybe there is some overstepping during archival and so Hudi runs into the NPE (two writers fetch a similar list of instants to archive; one of them goes ahead and archives, while the other hits the NPE while trying to archive).
   
   I mean, I am just speculating. I will let you triage this. And do you know if there is a way to disable archiving with 0.10.1? Then we could suggest disabling it in all writers but one and see if the issue goes away.
   





[GitHub] [hudi] megha2411 edited a comment on issue #4814: [SUPPORT] NPE thrown while archiving data table when metadata is enabled

Posted by GitBox <gi...@apache.org>.
megha2411 edited a comment on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1039112003









[GitHub] [hudi] nsivabalan commented on issue #4814: [SUPPORT] NPE thrown while archiving data table with 0.10.1

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1061359061


   Thanks! Will close the issue for now, but will file a tracking JIRA:
   https://issues.apache.org/jira/browse/HUDI-3578
   





[GitHub] [hudi] megha2411 commented on issue #4814: [SUPPORT] NPE thrown while archiving data table when metadata is enabled

Posted by GitBox <gi...@apache.org>.
megha2411 commented on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1039112003


   **To Reproduce**
   
   Steps to reproduce the behavior:
   This is coming up intermittently. I tried processing 30 files and 1 failed with this error. It did not fail in the 2nd run, but failed again in the 5th run, so I am not sure of the steps to reproduce it.
   
   **Expected behavior:**
   IMO it should not fail with a NullPointerException.
   
   **Environment Description**
   
   Hudi version : 0.10.1
   
   Spark version : 3.1.2
   
   Hive version : Hive 3.1.2
   
   Hadoop version : 
   
   Storage (HDFS/S3/GCS..) : s3
   
   Running on Docker? (yes/no) : no
   
   Additional context
   
   Attaching the sbt build configuration for additional context. We run our application on EMR 6.4.0.





[GitHub] [hudi] megha2411 commented on issue #4814: [SUPPORT] NPE thrown while archiving data table when metadata is enabled

Posted by GitBox <gi...@apache.org>.
megha2411 commented on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1039270588


   I tried reproducing the issue with metadata disabled. I am still getting this error.





[GitHub] [hudi] nsivabalan commented on issue #4814: [SUPPORT] NPE thrown while archiving data table when metadata is enabled

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1040779886


   Oh, btw, is it 0.10.1 or 0.10.0? In one of your earlier messages you mentioned 0.10; just wanted to double-check. Also, I assume you upgraded from 0.10.0 to 0.10.1, or was it from a prior version?
   





[GitHub] [hudi] nsivabalan commented on issue #4814: [SUPPORT] NPE thrown while archiving data table when metadata is enabled

Posted by GitBox <gi...@apache.org>.
nsivabalan commented on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1039171153


   CC @yihua, who is looking into all metadata-related bugs.
   @megha2411: if you have some pipelines where metadata is disabled, do you also see similar issues, or is it happening only when metadata is enabled?
   





[GitHub] [hudi] NiteshSK edited a comment on issue #4814: [SUPPORT] NPE thrown while archiving data table with 0.10.1

Posted by GitBox <gi...@apache.org>.
NiteshSK edited a comment on issue #4814:
URL: https://github.com/apache/hudi/issues/4814#issuecomment-1041127602


   @nsivabalan 
   We are running on Hudi version : 0.10.1
   After upgrading to Hudi 0.10.1, this issue happens intermittently, not with every run.
   We didn't evaluate it with multi-writer disabled (as our use case requires multi-writer capability).
   cc : @megha2411

