Posted to dev@spark.apache.org by dragonly <li...@gmail.com> on 2017/01/10 11:21:59 UTC

[SQL][CodeGen] Is there a way to set break point and debug the generated code?

I have recently been hacking on Spark SQL, adding some new UDTs and
functions as well as some new Expression classes. I ran into a problem with
the return type of the nullSafeEval method. In one of the new Expression
classes I want to return an array of my UDT, so my code looks like `return
new GenericArrayData(Array[udt](the array))`, and the dataType of the new
Expression class is `ArrayType(new MyUDT(), containsNull = false)`. In the
end I get a Java object type conversion error.

So I tried to step through the code to see where the conversion happens,
only to find that, after some generated code had executed, I stepped into the
GenericArrayData.getAs[T](ordinal: Int) method and found that the ordinal was
always 0. So here's the problem: Spark SQL is getting the 0th element out of
the GenericArrayData and treating it as a MyUDT, even though I told it to
treat the output of the Expression class as an ArrayType of MyUDT.

It's unclear to me where this ordinal variable comes from and why it is
always 0. Is there a way to debug into the generated code?

PS: just reading the code generation part without being able to jump back
and forth is really not cool :/



--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/SQL-CodeGen-Is-there-a-way-to-set-break-point-and-debug-the-generated-code-tp20535.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org


Re: [SQL][CodeGen] Is there a way to set break point and debug the generated code?

Posted by Reynold Xin <rx...@databricks.com>.
It's unfortunately difficult to debug -- that's one downside of codegen.
You can dump all the code via "explain codegen" though. That's typically
enough for me to debug.
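
For reference, a minimal sketch of dumping the generated code, assuming
Spark 2.x with a local SparkSession. The object name DumpCodegen, the view
name `t`, and the toy query are placeholders made up for illustration; they
are not from this thread. It shows two routes that should print the same
kind of output: the debugCodegen() helper from
org.apache.spark.sql.execution.debug, and the EXPLAIN CODEGEN SQL statement.

```scala
import org.apache.spark.sql.SparkSession
// Brings the debugCodegen() extension method on Dataset into scope.
import org.apache.spark.sql.execution.debug._

// Hypothetical driver object; substitute your own query for the toy one.
object DumpCodegen {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("dump-codegen")
      .getOrCreate()

    // A trivial query whose plan goes through whole-stage codegen.
    val df = spark.range(3).selectExpr("id + 1 AS y")

    // Route 1: print the generated Java source for each codegen subtree.
    df.debugCodegen()

    // Route 2: the SQL form, run against a temporary view.
    spark.range(3).createOrReplaceTempView("t")
    spark.sql("EXPLAIN CODEGEN SELECT id + 1 AS y FROM t")
      .collect()
      .foreach(row => println(row.getString(0)))

    spark.stop()
  }
}
```

Replacing the toy query with one that calls the new Expression should make
the generated source for that expression appear in the dump, which can then
be read next to the stack trace of the conversion error.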

