Posted to issues@spark.apache.org by "Yuming Wang (JIRA)" <ji...@apache.org> on 2018/11/04 08:38:00 UTC

[jira] [Updated] (SPARK-25936) InsertIntoDataSourceCommand does not use Cached Data

     [ https://issues.apache.org/jira/browse/SPARK-25936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuming Wang updated SPARK-25936:
--------------------------------
    Description: 
How to reproduce this issue:
{code:scala}
spark.sql("""
  CREATE TABLE jdbcTable
  USING org.apache.spark.sql.jdbc
  OPTIONS (
    url "jdbc:mysql://localhost:3306/test",
    dbtable "test.InsertIntoDataSourceCommand",
    user "hive",
    password "hive"
  )""")

spark.range(2).createTempView("test_view")
spark.catalog.cacheTable("test_view")
spark.sql("INSERT INTO TABLE jdbcTable SELECT * FROM test_view").explain
{code}

{noformat}
== Physical Plan ==                                                             
Execute InsertIntoDataSourceCommand
   +- InsertIntoDataSourceCommand
         +- Project
            +- SubqueryAlias
               +- Range (0, 2, step=1, splits=Some(8))

{noformat}

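The plan above scans the Range directly and ignores the cached data. The expected plan reads from the cache (InMemoryTableScan over InMemoryRelation):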
{noformat}
== Physical Plan ==                                                             
Execute InsertIntoDataSourceCommand InsertIntoDataSourceCommand Relation[id#8L] JDBCRelation(test.InsertIntoDataSourceCommand) [numPartitions=1], false, [id]
+- *(1) InMemoryTableScan [id#0L]
      +- InMemoryRelation [id#0L], StorageLevel(disk, memory, deserialized, 1 replicas)
            +- *(1) Range (0, 2, step=1, splits=8)
{noformat}
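
A minimal sketch, not part of the original report, of how cache usage could be checked programmatically instead of by reading the explain output. It assumes a local SparkSession run as a script or pasted into spark-shell, and it relies on queryExecution.withCachedData, which is a developer-facing API rather than a stable public one. The helper name countCachedRelations is hypothetical. It inspects only the SELECT side of the statement, so no JDBC database is required.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.execution.columnar.InMemoryRelation

// Hypothetical helper: counts the InMemoryRelation nodes that the cache
// manager substituted into the logical plan of `df`.
def countCachedRelations(df: DataFrame): Int =
  df.queryExecution.withCachedData.collect {
    case relation: InMemoryRelation => relation
  }.size

// Assumption: a throwaway local session. In spark-shell the existing
// `spark` value can be used instead.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("SPARK-25936-cache-check")
  .getOrCreate()

// Same setup as the reproduction; createOrReplaceTempView is used here
// only so that the snippet can be re-run in the same session.
spark.range(2).createOrReplaceTempView("test_view")
spark.catalog.cacheTable("test_view")

// A plain SELECT over the cached view. A non-zero count means the cached
// relation is picked up for this query.
val cachedInSelect = countCachedRelations(spark.sql("SELECT * FROM test_view"))
println(s"cached relations in SELECT plan: $cachedInSelect")
```

If the plain SELECT reports a cached relation while the INSERT plan above still shows a bare Range, that points to the command's child query not going through the cache substitution, which is the behavior this issue describes.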


  was:
How to reproduce this issue:
{code:scala}
spark.sql("""
  CREATE TABLE jdbcTable
  USING org.apache.spark.sql.jdbc
  OPTIONS (
    url "jdbc:mysql://localhost:3306/test",
    dbtable "test.InsertIntoDataSourceCommand",
    user "hive",
    password "hive"
  )""")

spark.range(2).createTempView("test_view")
spark.catalog.cacheTable("test_view")
spark.sql("INSERT INTO TABLE jdbcTable SELECT * FROM test_view").explain
{code}

{noformat}
== Physical Plan ==                                                             
Execute InsertIntoDataSourceCommand
   +- InsertIntoDataSourceCommand
         +- Project
            +- SubqueryAlias
               +- Range (0, 2, step=1, splits=Some(8))

{noformat}





> InsertIntoDataSourceCommand does not use Cached Data
> ----------------------------------------------------
>
>                 Key: SPARK-25936
>                 URL: https://issues.apache.org/jira/browse/SPARK-25936
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Yuming Wang
>            Priority: Major
>
> How to reproduce this issue:
> {code:scala}
> spark.sql("""
>   CREATE TABLE jdbcTable
>   USING org.apache.spark.sql.jdbc
>   OPTIONS (
>     url "jdbc:mysql://localhost:3306/test",
>     dbtable "test.InsertIntoDataSourceCommand",
>     user "hive",
>     password "hive"
>   )""")
> spark.range(2).createTempView("test_view")
> spark.catalog.cacheTable("test_view")
> spark.sql("INSERT INTO TABLE jdbcTable SELECT * FROM test_view").explain
> {code}
> {noformat}
> == Physical Plan ==                                                             
> Execute InsertIntoDataSourceCommand
>    +- InsertIntoDataSourceCommand
>          +- Project
>             +- SubqueryAlias
>                +- Range (0, 2, step=1, splits=Some(8))
> {noformat}
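> The plan above scans the Range directly and ignores the cached data. The expected plan reads from the cache (InMemoryTableScan over InMemoryRelation):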
> {noformat}
> == Physical Plan ==                                                             
> Execute InsertIntoDataSourceCommand InsertIntoDataSourceCommand Relation[id#8L] JDBCRelation(test.InsertIntoDataSourceCommand) [numPartitions=1], false, [id]
> +- *(1) InMemoryTableScan [id#0L]
>       +- InMemoryRelation [id#0L], StorageLevel(disk, memory, deserialized, 1 replicas)
>             +- *(1) Range (0, 2, step=1, splits=8)
> {noformat}


