Posted to issues@spark.apache.org by "L. C. Hsieh (Jira)" <ji...@apache.org> on 2019/09/30 03:05:00 UTC

[jira] [Resolved] (SPARK-29186) SubqueryAlias name value is null in Spark 2.4.3 Logical plan.

     [ https://issues.apache.org/jira/browse/SPARK-29186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

L. C. Hsieh resolved SPARK-29186.
---------------------------------
    Resolution: Resolved

> SubqueryAlias name value is null in Spark 2.4.3 Logical plan.
> -------------------------------------------------------------
>
>                 Key: SPARK-29186
>                 URL: https://issues.apache.org/jira/browse/SPARK-29186
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.3
>         Environment: Tried on AWS Glue with Spark 2.4.3
> and on Windows 10 with Spark 2.4.4;
> the same issue occurs on both.
>            Reporter: Tarun Khaneja
>            Assignee: L. C. Hsieh
>            Priority: Minor
>         Attachments: image-2019-09-25-12-17-53-552.png, image-2019-09-25-12-21-52-136.png
>
>
> I am writing a program to analyze SQL queries, so I am using Spark's logical plan.
> Below is the code I am using:
>     
> {code:java}
> import org.apache.spark.{SparkConf, SparkContext}
> import org.apache.spark.sql.{SQLContext, SparkSession}
> import org.slf4j.LoggerFactory
>
> object QueryAnalyzer {
>   val LOG = LoggerFactory.getLogger(this.getClass)
>
>   // Spark conf
>   val conf = new SparkConf().setMaster("local[2]").setAppName("LocalEdlExecutor")
>
>   // Spark context
>   val sc = new SparkContext(conf)
>
>   // SQL context
>   val sqlContext = new SQLContext(sc)
>
>   // Spark session
>   val sparkSession = SparkSession
>     .builder()
>     .appName("Spark User Data")
>     .config("spark.app.name", "LocalEdl")
>     .getOrCreate()
>
>   def main(args: Array[String]): Unit = {
>     var inputDfColumns = Map[String, List[String]]()
>
>     val dfSession = sparkSession.read.format("csv")
>       .option("header", "true")
>       .option("inferschema", "true")
>       .option("delimiter", ",")
>       .option("decoding", "utf8")
>       .option("multiline", true)
>
>     val oDF = dfSession.load("C:\\Users\\tarun.khaneja\\data\\order.csv")
>     println("sample data in oDF====>")
>     oDF.show()
>
>     val cusDF = dfSession.load("C:\\Users\\tarun.khaneja\\data\\customer.csv")
>     println("sample data in cusDF====>")
>     cusDF.show()
>
>     oDF.createOrReplaceTempView("orderTempView")
>     cusDF.createOrReplaceTempView("customerTempView")
>
>     // get input columns from both dataframes
>     inputDfColumns += ("orderTempView" -> oDF.columns.toList)
>     inputDfColumns += ("customerTempView" -> cusDF.columns.toList)
>
>     val res = sqlContext.sql(
>       """select OID, max(MID+CID) as MID_new,
>         |       ROW_NUMBER() OVER (ORDER BY CID) as rn
>         |from (select OID_1 as OID, CID_1 as CID, OID_1+CID_1 as MID
>         |      from (select min(ot.OrderID) as OID_1, ct.CustomerID as CID_1
>         |            from orderTempView as ot inner join customerTempView as ct
>         |            on ot.CustomerID = ct.CustomerID
>         |            group by CID_1))
>         |group by OID, CID""".stripMargin)
>     res.show(false)
>
>     val analyzedPlan = res.queryExecution.analyzed
>     println(analyzedPlan.prettyJson)
>   }
> }
> {code}
>  
>  Now the problem is: with *Spark 2.2.1* I get the JSON shown below, where SubqueryAlias carries the alias name of each table used in the query, which is exactly the information I need.
>   !image-2019-09-25-12-17-53-552.png!
>    
> But with Spark 2.4, the SubqueryAlias name comes back as null, as shown in the JSON screenshot below:
>  
> !image-2019-09-25-12-21-52-136.png!
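>
> As a quick cross-check, the plain-text rendering of the same analyzed plan can be printed as well (continuing from the {{res}} DataFrame above), to see whether the alias survives outside the JSON output:
> {code:java}
> // Text rendering of the analyzed plan; SubqueryAlias nodes are listed
> // here, so the alias can be compared against the null JSON field.
> println(res.queryExecution.analyzed.treeString)
> {code}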
>  
> So I am not sure whether this is a bug in Spark 2.4, because of which I am getting a null name in SubqueryAlias,
> or, if it is not a bug, how I can get the relation between the alias name and the real table name.
>  Any idea on this? A sketch of the mapping I am trying to extract follows below.
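>
> Here is a minimal sketch of that extraction, traversing the analyzed plan directly instead of its JSON. It continues from the {{res}} DataFrame above; since the exact type of the SubqueryAlias {{name}} field differs between Spark versions, I go through {{toString}}:
> {code:java}
> import org.apache.spark.sql.catalyst.plans.logical.SubqueryAlias
>
> // Walk the analyzed plan and pair every subquery alias with the kind
> // of node it wraps (for a temp view, the plan of that view).
> val aliasToRelation = res.queryExecution.analyzed.collect {
>   case s: SubqueryAlias => s.name.toString -> s.child.nodeName
> }
> aliasToRelation.foreach { case (alias, relation) =>
>   println(s"$alias -> $relation")
> }
> {code}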



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org