Posted to dev@spark.apache.org by salexln <sa...@gmail.com> on 2015/12/27 09:20:43 UTC

what is the best way to debug spark / mllib?

Hi guys,

I'm debugging my code in mllib/clustering, but I'm not sure I'm doing it the
best way:
I build my changes in mllib using "build/mvn -DskipTests package" and then
invoke my code using
"./bin/spark-shell"

My three main issues:
1) After each change, the build (build/mvn -DskipTests package) takes ~15
mins
2) I cannot set breakpoints
3) If I add println or logInfo, I do not see it in the console.

What is the best way to debug it?




--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/what-is-the-best-way-to-debug-spark-mllib-tp15809.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: what is the best way to debug spark / mllib?

Posted by "Fathi Salmi, Meisam" <me...@gmail.com>.
If you are modifying only mllib, you can use the "-am" and "-pl" options
with mvn to cut the build time even more.
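
A minimal sketch of such a module-scoped build, assuming it is run from the
root of the Spark checkout and that the module directory is named "mllib"
(the MODULE variable is only illustrative). "-pl" limits the build to the
listed module and "-am" ("also make") also builds the modules it depends on.
Whether spark-shell then picks up the rebuilt classes depends on how your
checkout assembles its classpath, so treat this as a starting point:

```shell
# Run from the root of the Spark source tree.
MODULE="mllib"   # assumed module directory; adjust if yours differs

if [ -x build/mvn ]; then
  # Build only the selected module plus the modules it depends on.
  build/mvn -pl "${MODULE}" -am -DskipTests package
else
  echo "build/mvn not found; run this from the Spark source root"
fi
```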

Thanks,
Meisam

On 12/27/2015 11:45 AM, salexln wrote:
> Thanks for the response. I have several more questions:
>
> *1) You should run the zinc incremental compiler*
> I run "./build/zinc-0.3.9/bin/zinc -scala-home $SCALA_HOME -nailed -start"
> but the compilation time of
> "build/mvn -DskipTests package" is still about 9 mins. Is this normal?
>
> *2) If you want breakpoints, that should likely be done in local mode*
> What do you mean by local mode? I've downloaded the latest version from
> GitHub and then made my changes on it.
>
> *3) Adjust the log4j.properties settings and you can start to see the
> logInfo output*
> I've copied log4j.properties from log4j.properties.template, and it is set to:
> log4j.rootCategory=INFO, console
>
> But I still do not see the logInfo output in the console.


Re: what is the best way to debug spark / mllib?

Posted by Ted Yu <yu...@gmail.com>.
For #1, 9 minutes seems to be normal. Here is the duration of a recent build
on the master branch:

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 10:44 min
[INFO] Finished at: 2015-12-25T09:53:28-08:00

For #2, please take a look at
https://spark.apache.org/docs/latest/submitting-applications.html
Look for 'Run application locally'.
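
As a rough sketch of what local mode with a debugger could look like: with a
"local[N]" master, the driver and the tasks run in a single JVM, so a
breakpoint set in mllib code can be hit from an IDE attached to that JVM. The
port number and the variable names below are assumptions for illustration,
not something from this thread:

```shell
# Run from the root of the Spark source tree.
DEBUG_PORT=5005   # assumed free port for the remote debugger
JDWP_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=${DEBUG_PORT}"

if [ -x bin/spark-shell ]; then
  # local[2] runs Spark in-process with two worker threads; the JDWP agent
  # lets an IDE attach a remote debugger to the driver JVM.
  ./bin/spark-shell --master "local[2]" --driver-java-options "${JDWP_OPTS}"
else
  echo "bin/spark-shell not found; run this from the Spark source root"
fi
```

Once the shell is up, attach a "remote JVM debug" configuration in your IDE
to localhost on the chosen port and set breakpoints in the mllib sources.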

Cheers

On Sun, Dec 27, 2015 at 8:45 AM, salexln <sa...@gmail.com> wrote:

> Thanks for the response. I have several more questions:
>
> *1) You should run the zinc incremental compiler*
> I run "./build/zinc-0.3.9/bin/zinc -scala-home $SCALA_HOME -nailed -start"
> but the compilation time of
> "build/mvn -DskipTests package" is still about 9 mins. Is this normal?
>
> *2) If you want breakpoints, that should likely be done in local mode*
> What do you mean by local mode? I've downloaded the latest version from
> GitHub and then made my changes on it.
>
> *3) Adjust the log4j.properties settings and you can start to see the
> logInfo output*
> I've copied log4j.properties from log4j.properties.template, and it is set to:
> log4j.rootCategory=INFO, console
>
> But I still do not see the logInfo output in the console.

Re: what is the best way to debug spark / mllib?

Posted by salexln <sa...@gmail.com>.
Thanks for the response. I have several more questions:

*1) You should run the zinc incremental compiler*
I run "./build/zinc-0.3.9/bin/zinc -scala-home $SCALA_HOME -nailed -start"
but the compilation time of
"build/mvn -DskipTests package" is still about 9 mins. Is this normal?

*2) If you want breakpoints, that should likely be done in local mode*
What do you mean by local mode? I've downloaded the latest version from
GitHub and then made my changes on it.

*3) Adjust the log4j.properties settings and you can start to see the
logInfo output*
I've copied log4j.properties from log4j.properties.template, and it is set to:
log4j.rootCategory=INFO, console

But I still do not see the logInfo output in the console.





--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/what-is-the-best-way-to-debug-spark-mllib-tp15809p15815.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.



Re: what is the best way to debug spark / mllib?

Posted by Stephen Boesch <ja...@gmail.com>.
1) You should run the zinc incremental compiler
2) If you want breakpoints, that should likely be done in local mode
3) Adjust the log4j.properties settings and you can start to see the logInfo
output
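
For point 3, one possible way to adjust the settings is sketched below. It
assumes the command is run from the root of the Spark checkout and that the
code being debugged lives in the org.apache.spark.mllib.clustering package;
the DEBUG level and the variable names are illustrative choices, not a
recommendation from this thread:

```shell
# Run from the root of the Spark source tree.
CONF_DIR="conf"
LOGGER_LINE="log4j.logger.org.apache.spark.mllib.clustering=DEBUG"

if [ -f "${CONF_DIR}/log4j.properties.template" ]; then
  # Create log4j.properties from the template if it does not exist yet.
  [ -f "${CONF_DIR}/log4j.properties" ] || \
    cp "${CONF_DIR}/log4j.properties.template" "${CONF_DIR}/log4j.properties"
  # Add a package-level logger once, so only the clustering code gets noisier.
  grep -qxF "${LOGGER_LINE}" "${CONF_DIR}/log4j.properties" || \
    echo "${LOGGER_LINE}" >> "${CONF_DIR}/log4j.properties"
else
  echo "${CONF_DIR}/log4j.properties.template not found; run this from the Spark source root"
fi
```

Restart spark-shell afterwards so the new configuration is loaded.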

2015-12-27 0:20 GMT-08:00 salexln <sa...@gmail.com>:

> Hi guys,
>
> I'm debugging my code in mllib/clustering, but I'm not sure I'm doing it the
> best way:
> I build my changes in mllib using "build/mvn -DskipTests package" and then
> invoke my code using
> "./bin/spark-shell"
>
> My three main issues:
> 1) After each change, the build (build/mvn -DskipTests package) takes ~15
> mins
> 2) I cannot set breakpoints
> 3) If I add println or logInfo, I do not see it in the console.
>
> What is the best way to debug it?