Posted to issues@spark.apache.org by "Sushanta Sen (JIRA)" <ji...@apache.org> on 2019/03/08 10:58:00 UTC
[jira] [Commented] (SPARK-24291) Data source table is not
displaying records when files are uploaded to table location
[ https://issues.apache.org/jira/browse/SPARK-24291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16787786#comment-16787786 ]
Sushanta Sen commented on SPARK-24291:
--------------------------------------
Yes, after a refresh it fetches the data. But why does this happen when tables are created with 'USING'?
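One plausible explanation, offered as a sketch rather than a confirmed diagnosis: Spark SQL caches table metadata, including the file listing, for data source tables within a session, and REFRESH TABLE (for example, REFRESH TABLE os_orc;) invalidates that cache. The toy model below is not Spark code. ToySession, select_files and refresh are hypothetical names that only mimic the observed behaviour: a file copied in from outside the session stays invisible until the cached listing is dropped.

```python
# Toy model only: this is NOT Spark's implementation. It mimics a session that
# caches the list of files under a table's path the first time it is read.
import os
import tempfile


class ToySession:
    """Hypothetical stand-in for a SQL session with a per-table listing cache."""

    def __init__(self):
        self._listing_cache = {}  # table path -> file names seen at first read

    def select_files(self, path):
        # Reuse the cached listing if present; otherwise list the directory once.
        if path not in self._listing_cache:
            self._listing_cache[path] = sorted(os.listdir(path))
        return self._listing_cache[path]

    def refresh(self, path):
        # Plays the role of REFRESH TABLE: drop the cached listing for this path.
        self._listing_cache.pop(path, None)


def demo():
    with tempfile.TemporaryDirectory() as table_dir:
        # One file exists before the "table" is first read.
        open(os.path.join(table_dir, "part-00000.orc"), "w").close()
        session = ToySession()
        first = list(session.select_files(table_dir))

        # A second file arrives from outside the session (like copyFromLocal).
        open(os.path.join(table_dir, "data2.orc"), "w").close()
        stale = list(session.select_files(table_dir))

        # After the refresh, the next read lists the directory again.
        session.refresh(table_dir)
        fresh = list(session.select_files(table_dir))
        return first, stale, fresh
```

In this model the second read still returns only the original file, and the new file shows up only after refresh is called, which matches the behaviour reported in the issue. Relaunching the session has the same effect because the cache starts out empty.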
> Data source table is not displaying records when files are uploaded to table location
> -------------------------------------------------------------------------------------
>
> Key: SPARK-24291
> URL: https://issues.apache.org/jira/browse/SPARK-24291
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.3.0
> Environment: OS: SUSE11
> Spark Version: 2.3
> Reporter: Sushanta Sen
> Priority: Major
>
> Precondition:
> 1. One .orc file already exists in the /tmp/orcdata/ location
> # Launch Spark-sql
> # spark-sql> CREATE TABLE os_orc (name string, version string, other string) USING ORC OPTIONS (path '/tmp/orcdata/');
> # spark-sql> select * from os_orc;
> Spark 2.3.0 Apache
> Time taken: 2.538 seconds, Fetched 1 row(s)
> # pc1:/opt/# ./hadoop dfs -ls /tmp/orcdata
> Found 1 items
> -rw-r--r-- 3 spark hadoop 475 2018-05-09 18:21 /tmp/orcdata/part-00000-d488121b-e9fd-4269-a6ea-842c631722ee-c000.snappy.orc
> pc1:/opt/# ./hadoop fs -copyFromLocal /opt/OS/loaddata/orcdata/part-00001-d488121b-e9fd-4269-a6ea-842c631722ee-c000.snappy.orc /tmp/orcdata/data2.orc
> pc1:/opt/# ./hadoop dfs -ls /tmp/orcdata
> Found 2 items
> -rw-r--r-- 3 spark hadoop 475 2018-05-15 14:59 /tmp/orcdata/data2.orc
> -rw-r--r-- 3 spark hadoop 475 2018-05-09 18:21 /tmp/orcdata/part-00000-d488121b-e9fd-4269-a6ea-842c631722ee-c000.snappy.orc
> pc1:/opt/#
> 5. Again execute the select command on the table os_orc
> spark-sql> select * from os_orc;
> Spark 2.3.0 Apache
> Time taken: 1.528 seconds, Fetched 1 row(s)
> Actual Result: The select command does not display all the records that exist in the data source table location.
> Expected Result: All the records in the location should be fetched and displayed for the data source table.
> NB:
> 1. On exiting and relaunching the spark-sql session, the select command fetches the correct number of records.
> 2. This issue applies to all data source tables created with 'USING'.
> I came across this use case in Spark 2.2.1 when I tried to reproduce an observation from a customer site.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)