Posted to issues@spark.apache.org by "Reynold Xin (JIRA)" <ji...@apache.org> on 2015/07/16 19:59:05 UTC

[jira] [Closed] (SPARK-6217) insertInto doesn't work in PySpark

     [ https://issues.apache.org/jira/browse/SPARK-6217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reynold Xin closed SPARK-6217.
------------------------------
    Assignee: Wenchen Fan

> insertInto doesn't work in PySpark
> ----------------------------------
>
>                 Key: SPARK-6217
>                 URL: https://issues.apache.org/jira/browse/SPARK-6217
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 1.3.0
>         Environment: Mac OS X Yosemite 10.10.2
> Python 2.7.9
> Spark 1.3.0
>            Reporter: Charles Cloud
>            Assignee: Wenchen Fan
>
> The following code, run in an IPython shell, throws an error:
> {code:none}
> In [1]: from pyspark import SparkContext, HiveContext
> In [2]: sc = SparkContext('local[*]', 'test')
> Spark assembly has been built with Hive, including Datanucleus jars on classpath
> In [3]: sql = HiveContext(sc)
> In [4]: import pandas as pd
> In [5]: df = pd.DataFrame({'a': [1.0, 2.0, 3.0], 'b': [1, 2, 3], 'c': list('abc')})
> In [6]: df2 = pd.DataFrame({'a': [2.0, 3.0, 4.0], 'b': [4, 5, 6], 'c': list('def')})
> In [7]: sdf = sql.createDataFrame(df)
> In [8]: sdf2 = sql.createDataFrame(df2)
> In [9]: sql.registerDataFrameAsTable(sdf, 'sdf')
> In [10]: sql.registerDataFrameAsTable(sdf2, 'sdf2')
> In [11]: sql.cacheTable('sdf')
> In [12]: sql.cacheTable('sdf2')
> In [13]: sdf2.insertInto('sdf')  # throws an error
> {code}
> Here's the Java traceback:
> {code:none}
> Py4JJavaError: An error occurred while calling o270.insertInto.
> : java.lang.AssertionError: assertion failed: No plan for InsertIntoTable (LogicalRDD [a#0,b#1L,c#2], MapPartitionsRDD[13] at mapPartitions at SQLContext.scala:1167), Map(), false
>  InMemoryRelation [a#6,b#7L,c#8], true, 10000, StorageLevel(true, true, false, true, 1), (PhysicalRDD [a#6,b#7L,c#8], MapPartitionsRDD[41] at mapPartitions at SQLContext.scala:1167), Some(sdf2)
>         at scala.Predef$.assert(Predef.scala:179)
>         at org.apache.spark.sql.catalyst.planning.QueryPlanner.apply(QueryPlanner.scala:59)
>         at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:1085)
>         at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:1083)
>         at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:1089)
>         at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:1089)
>         at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:1092)
>         at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:1092)
>         at org.apache.spark.sql.DataFrame.insertInto(DataFrame.scala:1134)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:483)
>         at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
>         at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
>         at py4j.Gateway.invoke(Gateway.java:259)
>         at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
>         at py4j.commands.CallCommand.execute(CallCommand.java:79)
>         at py4j.GatewayConnection.run(GatewayConnection.java:207)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> I'd be ecstatic if this turned out to be my own fault and I'm simply using the API incorrectly.
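> A possible workaround, sketched against the Spark 1.3 DataFrame API (the table name {{sdf_hive}} is only an illustrative choice, and I haven't verified this path on the affected build): persist the target as a metastore-backed table with {{saveAsTable}} and insert into that, instead of inserting into a cached temporary table registered via {{registerDataFrameAsTable}}.
> {code:none}
> # Sketch only (assumed workaround, not confirmed): append sdf2 into a
> # metastore-backed copy of sdf rather than the cached temporary table.
> # saveAsTable and insertInto are DataFrame methods in the 1.3 API.
> from pyspark import SparkContext
> from pyspark.sql import HiveContext
>
> sc = SparkContext('local[*]', 'insertinto-workaround')
> sql = HiveContext(sc)
>
> sdf = sql.createDataFrame([(1.0, 1, 'a'), (2.0, 2, 'b'), (3.0, 3, 'c')], ['a', 'b', 'c'])
> sdf2 = sql.createDataFrame([(2.0, 4, 'd'), (3.0, 5, 'e'), (4.0, 6, 'f')], ['a', 'b', 'c'])
>
> sdf.saveAsTable('sdf_hive')    # persists sdf as a Hive table in the metastore
> sdf2.insertInto('sdf_hive')    # appends sdf2's rows into that Hive table
> sql.table('sdf_hive').show()   # read back the combined rows
> {code}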



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org