Posted to issues@ozone.apache.org by "Soumitra Sulav (Jira)" <ji...@apache.org> on 2022/04/14 14:59:00 UTC
[jira] [Created] (HDDS-6584) [spark] Spark-HWC Error log with AcidUtils
Soumitra Sulav created HDDS-6584:
------------------------------------
Summary: [spark] Spark-HWC Error log with AcidUtils
Key: HDDS-6584
URL: https://issues.apache.org/jira/browse/HDDS-6584
Project: Apache Ozone
Issue Type: Bug
Components: build
Affects Versions: 1.3.0
Reporter: Soumitra Sulav
Attachments: spark-hwc-aicderror-debug.log, spark-hwc-aicderror-info.log
AcidUtils ERROR messages are logged when Spark HiveWarehouseConnector (HWC) reads tables stored on OzoneFileSystem.
The job does not abort, but this might lead to issues in ACID scenarios.
*Test:* TPC-DS queries are run via spark-hwc on the Ozone filesystem.
Details of the table under query:
{code:java}
|# Detailed Table Information|
|Database |tpcds_src
|Table |store_sales
|Owner |hrt_qa
|Created Time |Fri Mar 04 14:48:15 UTC 2022
|Last Access |Thu Jan 01 00:00:00 UTC 1970
|Created By |Spark 2.2 or prior
|Type |EXTERNAL
|Provider |hive
|Table Properties |[numFilesErasureCoded=0, bucketing_version=2, transient_lastDdlTime=1646405295]
|Statistics |388445409 bytes
|Location |o3fs://hivetest.ozonestage.ozone1/user/hrt_qa/tpcds/tests/data/store_sales
|Serde Library |org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
|InputFormat |org.apache.hadoop.mapred.TextInputFormat
|OutputFormat |org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
|Storage Properties |[serialization.format=|, field.delim=|]
|Partition Provider |Catalog {code}
*Info Logs :*
{code:java}
# spark-shell \
  --jars /opt/cloudera/parcels/CDH/lib/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.7.1.7.1000-114.jar \
  --master yarn \
  --deploy-mode client \
  --conf spark.sql.broadcastTimeout=1000 \
  --conf spark.datasource.hive.warehouse.read.mode=DIRECT_READER_V2 \
  --conf spark.sql.extensions=com.hortonworks.spark.sql.rule.Extensions \
  --conf spark.driver.memory=15g \
  --conf spark.network.timeout=1000s \
  --conf spark.sql.crossJoin.enabled=true \
  --conf spark.eventLog.enabled=false \
  --conf spark.sql.hive.hiveserver2.jdbc.url.principal=hive/quasar-whkave-8.quasar-whkave.root.hwx.site@ROOT.HWX.SITE \
  --conf spark.executor.memory=2g \
  --conf spark.kryo.registrator=com.qubole.spark.hiveacid.util.HiveAcidKyroRegistrator \
  --conf spark.driver.log.persistToDfs.enabled=false \
  --conf spark.security.credentials.hiveserver2.enabled=true \
  --name "PySparkShellT"
{code}
{code:java}
scala> spark.sql("SELECT * FROM ( SELECT i_category, i_class, i_brand, s_store_name, s_company_name, d_moy, sum(ss_sales_price) sum_sales, avg(sum(ss_sales_price)) OVER (PARTITION BY i_category, i_brand, s_store_name, s_company_name) avg_monthly_sales FROM item, store_sales, date_dim, store WHERE ss_item_sk = i_item_sk AND ss_sold_date_sk = d_date_sk AND ss_store_sk = s_store_sk AND d_year IN (1999) AND ((i_category IN ('Books', 'Electronics', 'Sports') AND i_class IN ('computers', 'stereo', 'football')) OR (i_category IN ('Men', 'Jewelry', 'Women') AND i_class IN ('shirts', 'birdal', 'dresses'))) GROUP BY i_category, i_class, i_brand, s_store_name, s_company_name, d_moy) tmp1 WHERE CASE WHEN (avg_monthly_sales <> 0) THEN (abs(sum_sales - avg_monthly_sales) / avg_monthly_sales) ELSE NULL END > 0.1 ORDER BY sum_sales - avg_monthly_sales, s_store_name LIMIT 100").show()
22/03/07 12:45:09 WARN conf.HiveConf: HiveConf of name hive.masking.algo does not exist
Hive Session ID = 500e8d1c-0481-4dd8-96ce-7040f9ebea0f
22/03/07 12:45:10 INFO rule.HWCSwitchRule: using DIRECT_READER_V2 extension for reading
22/03/07 12:45:10 WARN conf.HiveConf: HiveConf of name hive.masking.algo does not exist
22/03/07 12:45:10 INFO rule.HWCSwitchRule: using DIRECT_READER_V2 extension for reading
22/03/07 12:45:10 WARN conf.HiveConf: HiveConf of name hive.masking.algo does not exist
22/03/07 12:45:11 INFO rule.HWCSwitchRule: using DIRECT_READER_V2 extension for reading
22/03/07 12:45:11 WARN conf.HiveConf: HiveConf of name hive.masking.algo does not exist
22/03/07 12:45:11 INFO rule.HWCSwitchRule: using DIRECT_READER_V2 extension for reading
22/03/07 12:45:11 WARN conf.HiveConf: HiveConf of name hive.masking.algo does not exist
22/03/07 12:45:12 WARN conf.HiveConf: HiveConf of name hive.masking.algo does not exist
22/03/07 12:45:12 WARN conf.HiveConf: HiveConf of name hive.masking.algo does not exist
22/03/07 12:45:13 WARN conf.HiveConf: HiveConf of name hive.masking.algo does not exist
22/03/07 12:45:13 WARN conf.HiveConf: HiveConf of name hive.masking.algo does not exist
22/03/07 12:45:13 WARN conf.HiveConf: HiveConf of name hive.masking.algo does not exist
22/03/07 12:45:13 WARN conf.HiveConf: HiveConf of name hive.masking.algo does not exist
22/03/07 12:45:13 WARN conf.HiveConf: HiveConf of name hive.masking.algo does not exist
22/03/07 12:45:14 WARN conf.HiveConf: HiveConf of name hive.masking.algo does not exist
22/03/07 12:45:14 WARN conf.HiveConf: HiveConf of name hive.masking.algo does not exist
22/03/07 12:45:15 ERROR io.AcidUtils: Failed to get files with ID; using regular API: Only supported for DFS; got class org.apache.hadoop.fs.ozone.OzoneFileSystem
22/03/07 12:45:15 WARN impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-xceiverclientmetrics.properties,hadoop-metrics2.properties
22/03/07 12:45:16 ERROR io.AcidUtils: Failed to get files with ID; using regular API: Only supported for DFS; got class org.apache.hadoop.fs.ozone.OzoneFileSystem
22/03/07 12:45:16 ERROR io.AcidUtils: Failed to get files with ID; using regular API: Only supported for DFS; got class org.apache.hadoop.fs.ozone.OzoneFileSystem
22/03/07 12:45:16 ERROR io.AcidUtils: Failed to get files with ID; using regular API: Only supported for DFS; got class org.apache.hadoop.fs.ozone.OzoneFileSystem
22/03/07 12:45:16 ERROR io.AcidUtils: Failed to get files with ID; using regular API: Only supported for DFS; got class org.apache.hadoop.fs.ozone.OzoneFileSystem
22/03/07 12:45:17 ERROR io.AcidUtils: Failed to get files with ID; using regular API: Only supported for DFS; got class org.apache.hadoop.fs.ozone.OzoneFileSystem {code}
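The wording of the ERROR line ("Failed to get files with ID; using regular API: Only supported for DFS") suggests a try-then-fall-back pattern: a file-ID-based listing is attempted first, it is rejected for any filesystem that is not DFS, and the regular listing is used instead. The sketch below models that pattern only. It is not the Hive AcidUtils source, and SimpleFs, FileIdFs, and FileIdListingFallback are hypothetical stand-ins rather than Hadoop or Hive classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a generic filesystem; not the Hadoop FileSystem API. */
interface SimpleFs {
    List<String> listStatus(String dir);
}

/** Hypothetical stand-in for a DFS-like filesystem that can also list files with IDs. */
interface FileIdFs extends SimpleFs {
    List<String> listStatusWithFileIds(String dir);
}

final class FileIdListingFallback {
    /** Messages that would be logged at ERROR level, kept in memory for inspection. */
    static final List<String> errors = new ArrayList<>();

    /** Assumed guard: the file-ID listing is refused for non-DFS filesystems. */
    static List<String> listWithIds(SimpleFs fs, String dir) {
        if (!(fs instanceof FileIdFs)) {
            throw new UnsupportedOperationException(
                "Only supported for DFS; got " + fs.getClass());
        }
        return ((FileIdFs) fs).listStatusWithFileIds(dir);
    }

    /** Try the file-ID listing first; on failure, record the error and use the regular API. */
    static List<String> listFiles(SimpleFs fs, String dir) {
        try {
            return listWithIds(fs, dir);
        } catch (UnsupportedOperationException e) {
            errors.add("Failed to get files with ID; using regular API: " + e.getMessage());
            return fs.listStatus(dir);
        }
    }

    public static void main(String[] args) {
        // A filesystem without file-ID support, playing the role of OzoneFileSystem here.
        SimpleFs ozoneLike = dir -> Collections.singletonList(dir + "/part-00000");
        System.out.println(listFiles(ozoneLike, "/store_sales"));
        System.out.println(errors);
    }
}
```

In this model the caller still receives a complete listing after the fall-back, which would be consistent with the observation that the job does not abort. Whether the real code path behaves the same way for ACID tables on Ozone is the open question of this report.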
Debug log attached: [^spark-hwc-aicderror-debug.log]