Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2022/10/20 12:49:00 UTC

[jira] [Commented] (SPARK-39203) Fix remote table location based on database location

    [ https://issues.apache.org/jira/browse/SPARK-39203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17621089#comment-17621089 ] 

Apache Spark commented on SPARK-39203:
--------------------------------------

User 'cloud-fan' has created a pull request for this issue:
https://github.com/apache/spark/pull/38321

> Fix remote table location based on database location
> ----------------------------------------------------
>
>                 Key: SPARK-39203
>                 URL: https://issues.apache.org/jira/browse/SPARK-39203
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0, 2.3.0, 2.4.0, 3.0.0, 3.1.0, 3.1.1, 3.2.0, 3.3.0, 3.4.0
>            Reporter: Yuming Wang
>            Assignee: Yuming Wang
>            Priority: Major
>             Fix For: 3.4.0
>
>
> We have HDFS and Hive on cluster A, and Spark on cluster B, which needs to read data from cluster A. The table location that Spark reports is incorrect because it lacks the viewfs://clusterA prefix of the database location:
> {noformat}
> spark-sql> desc formatted  default.test_table;
> fas_acct_id         	decimal(18,0)
> fas_acct_cd         	string
> cmpny_cd            	string
> entity_id           	string
> cre_date            	date
> cre_user            	string
> upd_date            	timestamp
> upd_user            	string
> # Detailed Table Information
> Database            	default
> Table               	test_table
> Type                	EXTERNAL
> Provider            	parquet
> Statistics          	25310025737 bytes
> Location            	/user/hive/warehouse/test_table
> Serde Library       	org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
> InputFormat         	org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
> OutputFormat        	org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
> Storage Properties  	[compression=snappy]
> spark-sql> desc database default;
> Namespace Name	default
> Comment
> Location	viewfs://clusterA/user/hive/warehouse/
> Owner	hive_dba
> {noformat}
> The correct table location should be viewfs://clusterA/user/hive/warehouse/test_table.
>  
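For illustration only, the expected behaviour described in the issue can be sketched as follows. This is not the patch from the pull request; it is a minimal sketch that assumes the fix amounts to borrowing the scheme and authority of the database location when the table location has none. The function name qualify_table_location is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def qualify_table_location(table_location: str, database_location: str) -> str:
    """Return the table location, qualified with the database's scheme and authority.

    Hypothetical helper: a table location that already carries a scheme
    is returned unchanged.
    """
    table = urlsplit(table_location)
    if table.scheme:
        # Already fully qualified, so there is nothing to borrow.
        return table_location
    database = urlsplit(database_location)
    # Keep the table's own path; take only scheme and authority from the database.
    return urlunsplit((database.scheme, database.netloc, table.path, "", ""))


# Values taken from the DESC output quoted in the issue.
print(qualify_table_location(
    "/user/hive/warehouse/test_table",
    "viewfs://clusterA/user/hive/warehouse/",
))
```

With the values from the issue, the call yields the location that the reporter states is correct, viewfs://clusterA/user/hive/warehouse/test_table.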