
Posted to issues@spark.apache.org by "Ihor Bobak (Jira)" <ji...@apache.org> on 2020/07/17 15:13:00 UTC

[jira] [Created] (SPARK-32347) BROADCAST hint makes a weird message that "column can't be resolved" (it was OK in Spark 2.4)

Ihor Bobak created SPARK-32347:
----------------------------------

             Summary: BROADCAST hint makes a weird message that "column can't be resolved" (it was OK in Spark 2.4)
                 Key: SPARK-32347
                 URL: https://issues.apache.org/jira/browse/SPARK-32347
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.0.0
         Environment: Spark 3.0.0, Jupyter notebook, Spark launched in local[4] mode; it fails the same way on a Standalone cluster.
            Reporter: Ihor Bobak
             Fix For: 3.0.1
         Attachments: 2020-07-17 17_46_32-Window.png, 2020-07-17 17_49_27-Window.png, 2020-07-17 17_52_51-Window.png

The bug is easy to reproduce: run the code below in both Spark 2.4.3 and Spark 3.0.0.

In Spark 3.0.0 the query fails with a misleading error saying that a column can't be resolved, although the SQL statement appears to be valid; in Spark 2.4.3 the same code runs without error.
{code:python}
import pandas as pd

pdf_sales = pd.DataFrame([(1, 10), (2, 20)], columns=["BuyerID", "Qty"])
pdf_buyers = pd.DataFrame([(1, "John"), (2, "Jack")], columns=["BuyerID", "BuyerName"])

df_sales = spark.createDataFrame(pdf_sales)
df_buyers = spark.createDataFrame(pdf_buyers)

df_sales.createOrReplaceTempView("df_sales")
df_buyers.createOrReplaceTempView("df_buyers")

spark.sql("""
    with b as (
        select /*+ BROADCAST(df_buyers) */
            BuyerID, BuyerName 
        from df_buyers
    )
    select 
        b.BuyerID,
        b.BuyerName,
        s.Qty
    from df_sales s
        inner join b on s.BuyerID = b.BuyerID
""").toPandas()
{code}
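
To check whether the hint itself triggers the error, the same query can be run with and without it. The following is a minimal sketch only: `build_query` and `run_variant` are hypothetical helpers, not part of the original reproduction, and they assume the `df_sales` and `df_buyers` temp views have already been registered as above. The exception type is not assumed, so a broad `except` is used.

```python
# Hypothetical helpers (not from the original report): build the reported
# query with or without the BROADCAST hint so both variants can be compared.
HINT = "/*+ BROADCAST(df_buyers) */"


def build_query(with_hint: bool) -> str:
    """Return the reported SQL, optionally including the BROADCAST hint."""
    hint = f" {HINT}" if with_hint else ""
    return f"""
    with b as (
        select{hint}
            BuyerID, BuyerName
        from df_buyers
    )
    select
        b.BuyerID,
        b.BuyerName,
        s.Qty
    from df_sales s
        inner join b on s.BuyerID = b.BuyerID
    """


def run_variant(spark, with_hint: bool):
    """Run one variant on an existing SparkSession.

    Returns (True, None) on success, or (False, error_text) on failure.
    """
    try:
        spark.sql(build_query(with_hint)).collect()
        return True, None
    except Exception as exc:  # broad on purpose: the exact exception class is not assumed
        return False, str(exc)
```

In the notebook, calling `run_variant(spark, False)` and then `run_variant(spark, True)` on each Spark version would show whether only the hinted variant fails.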


