Posted to dev@spark.apache.org by Xiao Li <ga...@gmail.com> on 2015/10/19 08:38:57 UTC

Spark SQL: what does an exclamation mark mean in the plan?

Hi, all,

After turning on tracing, I saw a strange exclamation mark in
the intermediate plans. This happened in the Catalyst analyzer.

Join Inner, Some((col1#0 = col1#6))
 Project [col1#0,col2#1,col3#2,col2_alias#24,col3#2 AS col3_alias#13]
  Project [col1#0,col2#1,col3#2,col2#1 AS col2_alias#24]
   LogicalRDD [col1#0,col2#1,col3#2], MapPartitionsRDD[1] at createDataFrame at SimpleApp.scala:32
 Aggregate [col1#6], [col1#6,count(col1#6) AS count(col1)#5L]
  *!Project [col1#6,col2#7,col3#8,col2_alias#24,col3#8 AS col3_alias#4]*
   Project [col1#6,col2#7,col3#8,col2#7 AS col2_alias#3]
    LogicalRDD [col1#6,col2#7,col3#8], MapPartitionsRDD[1] at createDataFrame at SimpleApp.scala:32

Could anybody give me a hint as to why there is a ! (exclamation mark) before
the node name (Project)? This ! mark does not disappear in the subsequent
query plan.

Thank you!

Xiao Li

Re: Spark SQL: what does an exclamation mark mean in the plan?

Posted by Xiao Li <ga...@gmail.com>.

Hi, Michael,

Thank you again! I just found the functions that generate the ! mark:

  /**
   * A prefix string used when printing the plan.
   *
   * We use "!" to indicate an invalid plan, and "'" to indicate an unresolved plan.
   */
  protected def statePrefix = if (missingInput.nonEmpty && children.nonEmpty) "!" else ""

  override def simpleString: String = statePrefix + super.simpleString
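
To make the check concrete, here is a minimal, self-contained sketch of the same logic applied to the right-hand branch of the plan from my first mail. It does not use Spark's classes: Attr, Node, and StatePrefixDemo are hypothetical stand-ins, and missingInput is simplified to "referenced attributes that no child outputs", which is my assumption about what the real method boils down to. The outer Project refers to col2_alias#24, but its child only produces col2_alias#3, so the outer node gets the "!" prefix.

```scala
// Hypothetical stand-in for a Catalyst attribute: a name plus an expression id.
final case class Attr(name: String, id: Int)

// Hypothetical stand-in for a logical plan node.
// references: attributes the node's expressions read.
// output:     attributes the node exposes to its parent.
final case class Node(
    name: String,
    references: Set[Attr],
    output: Set[Attr],
    children: Seq[Node] = Nil) {

  // All attributes made available by the children.
  def inputSet: Set[Attr] = children.flatMap(_.output).toSet

  // Simplified assumption: referenced attributes that no child provides.
  def missingInput: Set[Attr] = references -- inputSet

  // Same condition as the statePrefix quoted above.
  def statePrefix: String =
    if (missingInput.nonEmpty && children.nonEmpty) "!" else ""

  def simpleString: String = statePrefix + name
}

object StatePrefixDemo {
  val col1 = Attr("col1", 6)
  val col2 = Attr("col2", 7)
  val col3 = Attr("col3", 8)
  val alias3 = Attr("col2_alias", 3)     // produced by the inner Project
  val alias24 = Attr("col2_alias", 24)   // referenced by the outer Project
  val col3Alias = Attr("col3_alias", 4)  // produced by the outer Project

  // Leaf node: reads nothing, so it is never marked, whatever it outputs.
  val rdd = Node("LogicalRDD", Set.empty, Set(col1, col2, col3))

  // Inner Project: reads only what the leaf provides.
  val inner = Node(
    "Project", Set(col1, col2, col3), Set(col1, col2, col3, alias3), Seq(rdd))

  // Outer Project: reads col2_alias#24, which the inner Project does not output.
  val outer = Node(
    "Project",
    Set(col1, col2, col3, alias24),
    Set(col1, col2, col3, alias24, col3Alias),
    Seq(inner))

  def main(args: Array[String]): Unit = {
    println(inner.simpleString) // Project
    println(outer.simpleString) // !Project
    println(outer.missingInput) // the single missing attribute, col2_alias#24
  }
}
```

In this sketch the mark stays on the node for as long as the reference is dangling, which matches what I saw: the ! did not go away in the later plans.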


Xiao Li

2015-10-19 11:16 GMT-07:00 Michael Armbrust <mi...@databricks.com>:

> It means that there is an invalid attribute reference (i.e. a #n where the
> attribute is missing from the child operator).

Re: Spark SQL: what does an exclamation mark mean in the plan?

Posted by Michael Armbrust <mi...@databricks.com>.

It means that there is an invalid attribute reference (i.e. a #n where the
attribute is missing from the child operator).