Posted to issues@spark.apache.org by "Sushanta Sen (JIRA)" <ji...@apache.org> on 2019/03/08 10:58:00 UTC

[jira] [Commented] (SPARK-24291) Data source table is not displaying records when files are uploaded to table location

    [ https://issues.apache.org/jira/browse/SPARK-24291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16787786#comment-16787786 ] 

Sushanta Sen commented on SPARK-24291:
--------------------------------------

Yes, after a refresh it fetches the data. But why does this happen for tables created with 'USING'?
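
For anyone following along, here is a minimal sketch of the refresh workaround mentioned above. It assumes the table name os_orc from the description, and it assumes the refresh in question is Spark SQL's REFRESH TABLE statement; this comment does not say which refresh command was actually run.

```sql
-- Assumption: os_orc is the data source table from the issue description.
-- REFRESH TABLE invalidates the cached data and metadata that the current
-- session holds for the table.
REFRESH TABLE os_orc;

-- Re-run the query in the same session to check whether the newly
-- uploaded file is now included in the result.
SELECT * FROM os_orc;
```

Both statements are run from the same spark-sql session that created the table, so no restart of the session is involved.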

> Data source table is not displaying records when files are uploaded to table location
> -------------------------------------------------------------------------------------
>
>                 Key: SPARK-24291
>                 URL: https://issues.apache.org/jira/browse/SPARK-24291
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0
>         Environment: OS: SUSE11
> Spark Version: 2.3
>            Reporter: Sushanta Sen
>            Priority: Major
>
> Precondition:
> One .orc file already exists in the /tmp/orcdata/ location.
> Steps:
>  1. Launch spark-sql.
>  2. spark-sql> CREATE TABLE os_orc (name string, version string, other string) USING ORC OPTIONS (path '/tmp/orcdata/');
>  3. spark-sql> select * from os_orc;
> Spark 2.3.0 Apache
> Time taken: 2.538 seconds, Fetched 1 row(s)
>  4. pc1:/opt/# ./hadoop dfs -ls /tmp/orcdata
> Found 1 items
> -rw-r--r-- 3 spark hadoop 475 2018-05-09 18:21 /tmp/orcdata/part-00000-d488121b-e9fd-4269-a6ea-842c631722ee-c000.snappy.orc
> pc1:/opt/# ./hadoop fs -copyFromLocal /opt/OS/loaddata/orcdata/part-00001-d488121b-e9fd-4269-a6ea-842c631722ee-c000.snappy.orc /tmp/orcdata/data2.orc
> pc1:/opt/# ./hadoop dfs -ls /tmp/orcdata
> Found 2 items
> -rw-r--r-- 3 spark hadoop 475 2018-05-15 14:59 /tmp/orcdata/data2.orc
> -rw-r--r-- 3 spark hadoop 475 2018-05-09 18:21 /tmp/orcdata/part-00000-d488121b-e9fd-4269-a6ea-842c631722ee-c000.snappy.orc
>  5. Execute the select command on the table os_orc again.
> spark-sql> select * from os_orc;
> Spark 2.3.0 Apache
> Time taken: 1.528 seconds, Fetched 1 row(s)
> Actual Result: The select command does not display all the records that exist in the data source table location.
> Expected Result: All the records in the table location should be fetched and displayed for the data source table.
> NB:
>  1. On exiting and relaunching the spark-sql session, the select command fetches the correct number of records.
>  2. This issue applies to all data source tables created with 'USING'.
> I came across this use case in Spark 2.2.1 while trying to reproduce an observation from a customer site.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
