Posted to issues@spark.apache.org by "Yuming Wang (Jira)" <ji...@apache.org> on 2022/05/31 10:11:00 UTC

[jira] [Updated] (SPARK-39203) Fix remote table location based on database location

     [ https://issues.apache.org/jira/browse/SPARK-39203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuming Wang updated SPARK-39203:
--------------------------------
    Affects Version/s: 3.2.0
                       3.1.1
                       3.1.0
                       3.0.0
                       2.4.0
                       2.3.0
                       2.2.0
                       3.3.0

> Fix remote table location based on database location
> ----------------------------------------------------
>
>                 Key: SPARK-39203
>                 URL: https://issues.apache.org/jira/browse/SPARK-39203
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0, 2.3.0, 2.4.0, 3.0.0, 3.1.0, 3.1.1, 3.2.0, 3.3.0, 3.4.0
>            Reporter: Yuming Wang
>            Assignee: Yuming Wang
>            Priority: Major
>             Fix For: 3.4.0
>
>
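> We have HDFS and Hive on cluster A and Spark on cluster B, and Spark needs to read data from cluster A. The table location that Spark reports is incorrect because it lacks the viewfs://clusterA scheme and authority of the database location: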
> {noformat}
> spark-sql> desc formatted  default.test_table;
> fas_acct_id         	decimal(18,0)
> fas_acct_cd         	string
> cmpny_cd            	string
> entity_id           	string
> cre_date            	date
> cre_user            	string
> upd_date            	timestamp
> upd_user            	string
> # Detailed Table Information
> Database            	default
> Table               	test_table
> Type                	EXTERNAL
> Provider            	parquet
> Statistics          	25310025737 bytes
> Location            	/user/hive/warehouse/test_table
> Serde Library       	org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
> InputFormat         	org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
> OutputFormat        	org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
> Storage Properties  	[compression=snappy]
> spark-sql> desc database default;
> Namespace Name	default
> Comment
> Location	viewfs://clusterA/user/hive/warehouse/
> Owner	hive_dba
> {noformat}
> The correct table location should be viewfs://clusterA/user/hive/warehouse/test_table.
>  
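
To illustrate the expected result, here is a minimal sketch of how a table location with no scheme could be qualified against the database location. It is not the change made in Spark for this issue. The helper name qualify_table_location is hypothetical, and the sketch assumes that a table path with neither scheme nor authority should simply borrow both from the database location.

```python
from urllib.parse import urlsplit, urlunsplit


def qualify_table_location(table_location: str, database_location: str) -> str:
    """Hypothetical helper (not Spark code): borrow the scheme and authority
    of the database location when the table location has neither."""
    table = urlsplit(table_location)
    if table.scheme or table.netloc:
        # Already fully qualified, so return it unchanged.
        return table_location
    database = urlsplit(database_location)
    # Keep the table's own path; take only scheme and authority from the database.
    return urlunsplit((database.scheme, database.netloc, table.path, "", ""))


# Values taken from the report above.
print(qualify_table_location(
    "/user/hive/warehouse/test_table",
    "viewfs://clusterA/user/hive/warehouse/",
))
# prints viewfs://clusterA/user/hive/warehouse/test_table
```

A location that already carries a scheme is returned as-is, so tables stored outside the database's file system would not be rewritten under this assumption.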