Posted to issues@spark.apache.org by "Bao Yunz (JIRA)" <ji...@apache.org> on 2018/12/19 12:13:00 UTC

[jira] [Updated] (SPARK-26407) Select result is incorrect when add a directory named with k=v to the table path of external non-partition table

     [ https://issues.apache.org/jira/browse/SPARK-26407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bao Yunz updated SPARK-26407:
-----------------------------
    Description: 
Scene 1

Create an external non-partitioned table whose location directory contains a subdirectory named, for example, "part=1". Running DESC on the table then shows "part" as a table column. Inserting data that has the same columns as the target table throws an exception saying that the target table has a different number of columns from the inserted data.

Scene 2

Create an external non-partitioned table whose location path is empty. After several insert operations, add a directory named "part=1" under the table location directory. Subsequent insert and select operations then use "tablePath/part=1" as the scan path, so the select returns a wrong result.

It seems that Spark's existing logic treats this kind of table as a partitioned table. However, running SHOW PARTITIONS on it throws an exception saying that the table is not partitioned, which is confusing. We believe the expected behavior is that, for a non-partitioned table, a directory under tablePath should not change the table's basic properties.

  was:
Scene 1

Create an external non-partitioned table whose location path contains a directory named, for example, "part=1". Running DESC on the table then shows "part" as a table column. Inserting data that has the same columns as the target table throws an exception saying that the target table has a different number of columns from the inserted data.

Scene 2

Create an external non-partitioned table whose location path is empty. After several insert operations, add a directory named "part=1" under the table location directory. Subsequent insert and select operations then use "tablePath/part=1" as the scan path, so the select returns a wrong result.

It seems that Spark's existing logic treats this kind of table as a partitioned table. However, running SHOW PARTITIONS on it throws an exception saying that the table is not partitioned, which is confusing. We believe the expected behavior is that, for a non-partitioned table, a directory under tablePath should not change the table's basic properties.
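
To make the suspected mechanism concrete, here is a minimal sketch of how a directory named "k=v" under a table path can be read as a partition column. It is not Spark's implementation, only a simplified illustration of Hive-style directory-name parsing. The function names, the path "/warehouse/ext_tbl", and the data columns "id" and "name" are all hypothetical.

```python
from pathlib import PurePosixPath


def partition_pairs(table_path, directory):
    """Return (column, value) pairs for every k=v segment below table_path."""
    relative = PurePosixPath(directory).relative_to(PurePosixPath(table_path))
    pairs = []
    for segment in relative.parts:
        key, sep, value = segment.partition("=")
        # Only segments of the form key=value count as partition directories.
        if sep and key and value:
            pairs.append((key, value))
    return pairs


def inferred_columns(data_columns, table_path, directories):
    """Append any column names found in k=v directory names to the data columns."""
    extra = []
    for directory in directories:
        for key, _ in partition_pairs(table_path, directory):
            if key not in data_columns and key not in extra:
                extra.append(key)
    return list(data_columns) + extra


# Hypothetical table with two data columns and one stray "part=1" directory.
table_path = "/warehouse/ext_tbl"
columns = inferred_columns(["id", "name"], table_path, [table_path + "/part=1"])
print(columns)
```

Under these assumptions the stray directory adds a third column, "part", to a table declared with two. That matches the symptom in Scene 1, where an insert with the declared columns fails on a column-count mismatch.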


> Select result is incorrect when add a directory named with k=v to the table path of external non-partition table
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-26407
>                 URL: https://issues.apache.org/jira/browse/SPARK-26407
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.2, 2.4.0
>            Reporter: Bao Yunz
>            Priority: Major
>              Labels: usability


