Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2017/06/26 19:57:01 UTC
[jira] [Assigned] (SPARK-21216) Streaming DataFrames fail to join with Hive tables
[ https://issues.apache.org/jira/browse/SPARK-21216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-21216:
------------------------------------
Assignee: Apache Spark (was: Burak Yavuz)
> Streaming DataFrames fail to join with Hive tables
> --------------------------------------------------
>
> Key: SPARK-21216
> URL: https://issues.apache.org/jira/browse/SPARK-21216
> Project: Spark
> Issue Type: Bug
> Components: Structured Streaming
> Affects Versions: 2.1.1
> Reporter: Burak Yavuz
> Assignee: Apache Spark
>
> The following code will throw a cryptic exception:
> {code}
> import org.apache.spark.sql.execution.streaming.MemoryStream
> import testImplicits._
>
> implicit val _sqlContext = spark.sqlContext
>
> Seq((1, "one"), (2, "two"), (4, "four")).toDF("number", "word").createOrReplaceTempView("t1")
>
> // Make a table and ensure it will be broadcast.
> sql("""CREATE TABLE smallTable(word string, number int)
>        |ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
>        |STORED AS TEXTFILE
>     """.stripMargin)
> sql(
>   """INSERT INTO smallTable
>     |SELECT word, number from t1
>   """.stripMargin)
>
> val inputData = MemoryStream[Int]
> val joined = inputData.toDS().toDF()
>   .join(spark.table("smallTable"), $"value" === $"number")
>
> val sq = joined.writeStream
>   .format("memory")
>   .queryName("t2")
>   .start()
> try {
>   inputData.addData(1, 2)
>   sq.processAllAvailable()
> } finally {
>   sq.stop()
> }
> {code}
> If a Hive-enabled session is created, the planner used by `IncrementalExecution` does not take the Hive scan strategies into account, so the streaming query cannot plan the scan of the Hive table.
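Since the failure is in planning the Hive table scan inside the streaming query, one plausible workaround (not from the ticket; a sketch under the assumption that cached tables are planned as in-memory relations rather than Hive scans) is to materialize the static side before joining. This reuses `spark`, `inputData`, and `smallTable` from the reproduction above:

```scala
// Hypothetical workaround sketch, untested against the affected build:
// cache the Hive table so the streaming planner sees an InMemoryRelation
// instead of a Hive table scan.
val small = spark.table("smallTable").cache()
small.count()  // force materialization into the cache

// Same stream-static join as in the reproduction, against the cached side.
val joined = inputData.toDS().toDF()
  .join(small, $"value" === $"number")
```

Whether this avoids the exception depends on the cache actually being populated before the streaming batch is planned, which is why the `count()` is issued eagerly.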
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org