Posted to user@spark.apache.org by Arpan Bhandari <ar...@gmail.com> on 2021/01/29 07:28:10 UTC
Spark SQL query
Hi,
Is there a way to trace a Spark SQL query after it has already run? That is,
a query has already been submitted by someone and I need to trace back what
query was actually submitted.
I would appreciate any help on this.
--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org
Re: Spark SQL query
Posted by Arpan Bhandari <ar...@gmail.com>.
Sachit,
It seems I have to do some sort of analysis from the plan to get the query.
Appreciate all your help on this.
Thanks,
Arpan Bhandari
Re: Spark SQL query
Posted by Sachit Murarka <co...@gmail.com>.
Application-wise, it won't show the query as such.
You can try to correlate it with the explain plan output using some filters
or attributes.
Or else, if you do not have too many queries in the history, just take those
queries, find the plan for each of them, and match it with the one shown in
the UI.
I know that's a tedious task, but I don't think there is another way.
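A rough sketch of narrowing the matching down per application, assuming event logging is enabled and you have an uncompressed local copy of one event log. It reads the application ID from the SparkListenerApplicationStart event and pairs it with the description of each SparkListenerSQLExecutionStart event. The function name is made up, and the field layout ("App ID", "description") is assumed from how Spark event logs usually look, so check it against a real log first. As far as I know, for queries issued from spark-shell the description is a call site rather than the SQL text, so this gives the application-to-execution mapping, not the query itself.

```shell
# Hypothetical helper: print "<application id><TAB><SQL execution description>"
# for every SQL execution recorded in one Spark event log file.
# Assumes one JSON event per line and no escaped quotes in the description.
sql_executions_by_app() {
  local event_log="$1"
  local app_id
  # The application start event carries the application ID under "App ID".
  app_id=$(grep -F 'SparkListenerApplicationStart' "$event_log" |
    grep -oE '"App ID" *: *"[^"]*"' | cut -d'"' -f4)
  # Each SQL execution start event carries a short "description" field.
  grep -F 'SparkListenerSQLExecutionStart' "$event_log" |
    grep -oE '"description" *: *"[^"]*"' | cut -d'"' -f4 |
    while IFS= read -r description; do
      printf '%s\t%s\n' "$app_id" "$description"
    done
}
```

To try it, copy a log out of HDFS with hdfs dfs -get and pass the local file name to sql_executions_by_app.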
Thanks
Sachit
On Mon, 1 Feb 2021, 22:32 Arpan Bhandari, <ar...@gmail.com> wrote:
> Sachit,
>
> That is showing all the queries that got executed, but how it would get
> mapped to specific application Id it was associated with ?
>
> Thanks,
> Arpan Bhandari
Re: Spark SQL query
Posted by Arpan Bhandari <ar...@gmail.com>.
Sachit,
That shows all the queries that were executed, but how would each one be
mapped to the specific application ID it was associated with?
Thanks,
Arpan Bhandari
Re: Spark SQL query
Posted by Sachit Murarka <co...@gmail.com>.
Hi Arpan,
In the Spark shell, when you type
:history
does it still not show the query?
Thanks
Sachit
On Mon, 1 Feb 2021, 21:13 Arpan Bhandari, <ar...@gmail.com> wrote:
> Hey Sachit,
>
> It shows the query plan, which is difficult to diagnose out and depict the
> actual query.
>
>
> Thanks,
> Arpan Bhandari
Re: Spark SQL query
Posted by Arpan Bhandari <ar...@gmail.com>.
Hey Sachit,
It shows the query plan, from which it is difficult to work out the actual
query.
Thanks,
Arpan Bhandari
Re: Spark SQL query
Posted by Sachit Murarka <co...@gmail.com>.
Hi Arpan,
Launch the Spark shell and type ":history" in it; you will see the queries
that were executed.
In the Spark UI, under the SQL tab, you can see the query plan when you click
on the details button (though it won't show you the complete query). But by
looking at the plan you can work out your query.
Hope this helps!
Kind Regards,
Sachit Murarka
On Fri, Jan 29, 2021 at 9:33 PM Arpan Bhandari <ar...@gmail.com> wrote:
> Hi Sachit,
>
> Yes it was executed using spark shell, history is already enabled. already
> checked sql tab but it is not showing the query. My spark version is 2.4.5
>
> Thanks,
> Arpan Bhandari
Re: Spark SQL query
Posted by Arpan Bhandari <ar...@gmail.com>.
Hi Sachit,
Yes, it was executed using the Spark shell, and history is already enabled. I
have already checked the SQL tab, but it does not show the query. My Spark
version is 2.4.5.
Thanks,
Arpan Bhandari
Re: Spark SQL query
Posted by Sachit Murarka <co...@gmail.com>.
Hi Arpan,
Was it executed using the Spark shell?
If yes, type :history
Do you have the history server enabled?
If yes, go to the History UI and open the SQL tab.
Thanks
Sachit
On Fri, 29 Jan 2021, 19:19 Arpan Bhandari, <ar...@gmail.com> wrote:
> Hi ,
>
> Is there a way to track back spark sql after it has been already run i.e.
> query has been already submitted by a person and i have to back trace what
> query actually got submitted.
>
>
> Appreciate any help on this.
Re: Spark SQL query
Posted by Mich Talebzadeh <mi...@gmail.com>.
One thing I suggest you do is open another thread for this feature request,
"Having functionality in Spark to allow queries to be gathered and analyzed",
and see how the forum responds to it.
HTH
LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.
On Wed, 3 Feb 2021 at 11:17, Arpan Bhandari <ar...@gmail.com> wrote:
> Yes Mich,
>
> Mapping the spark sql query that got executed corresponding to an
> application Id on yarn would greatly help in analyzing and debugging the
> query for any potential problems.
>
>
> Thanks,
> Arpan Bhandari
Re: Spark SQL query
Posted by Arpan Bhandari <ar...@gmail.com>.
Yes Mich,
Mapping each Spark SQL query that was executed to its corresponding
application ID on YARN would greatly help in analyzing and debugging the
query for any potential problems.
Thanks,
Arpan Bhandari
Re: Spark SQL query
Posted by Mich Talebzadeh <mi...@gmail.com>.
I gather what you are after is a code sniffer for Spark that provides some
form of GUI to get the code that applications run against Spark.
I don't think Spark has this type of plug-in, although it would be
potentially useful. Some RDBMSs provide this, usually keeping the captured
code on some form of persistent storage or in a database. I have not come
across it in Spark.
HTH
On Wed, 3 Feb 2021 at 05:10, Arpan Bhandari <ar...@gmail.com> wrote:
> Mich,
>
> The directory is already there and event logs are getting generated, I have
> checked them it contains the query plan but not the actual query.
>
>
> Thanks,
> Arpan Bhandari
Re: Spark SQL query
Posted by Arpan Bhandari <ar...@gmail.com>.
Mich,
The directory is already there and event logs are being generated. I have
checked them: they contain the query plan but not the actual query.
Thanks,
Arpan Bhandari
Re: Spark SQL query
Posted by Mich Talebzadeh <mi...@gmail.com>.
Create a directory in HDFS:
hdfs dfs -mkdir /spark_event_logs
Modify the file $SPARK_HOME/conf/spark-defaults.conf and add these two lines:
spark.eventLog.enabled=true
# do not use quotes below
spark.eventLog.dir=hdfs://rhes75:9000/spark_event_logs
Then run a job and check it:
hdfs dfs -ls /spark_event_logs
-rw-rw---- 3 hduser supergroup 33795834 2021-02-02 19:48 /spark_event_logs/yarn-1612295234284
That should have all the info you need.
Make sure the directory hdfs://<NAME_NODE>:9000/spark_event_logs is writable
by Spark.
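To see what the event log actually records for SQL, here is a small sketch that filters the SQL execution start events out of a log streamed on stdin. The function name is hypothetical, and the pattern assumes the usual one-JSON-event-per-line layout with "executionId" immediately followed by "description"; if your Spark version writes the fields differently, or the log is compressed, it will need adjusting.

```shell
# Hypothetical helper: read a Spark event log on stdin and print the
# executionId and description of every SQL execution start event.
# Assumes uncompressed logs and no escaped quotes inside the description.
list_sql_executions() {
  grep -F 'SparkListenerSQLExecutionStart' |
    grep -oE '"executionId":[0-9]+,"description":"[^"]*"'
}
```

It can be fed straight from HDFS, for example: hdfs dfs -cat /spark_event_logs/yarn-1612295234284 | list_sql_executions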
HTH
On Tue, 2 Feb 2021 at 15:59, Arpan Bhandari <ar...@gmail.com> wrote:
> Yes i can see the jobs on 8088 and also on the spark history url. spark
> history server is showing up the plan details on the sql tab but not giving
> the query.
>
>
> Thanks,
> Arpan Bhandari
Re: Spark SQL query
Posted by Arpan Bhandari <ar...@gmail.com>.
Yes, I can see the jobs on port 8088 and also at the Spark history URL. The
Spark history server shows the plan details on the SQL tab but does not give
the query.
Thanks,
Arpan Bhandari
Re: Spark SQL query
Posted by Mich Talebzadeh <mi...@gmail.com>.
OK.
On the host starting the job, on port 8088, do you have access to all
applications, as shown in the attached file? If you look at the history, can
you see the jobs?
Also try the "History" link next to "Tracking URL".
HTH
On Tue, 2 Feb 2021 at 14:47, Arpan Bhandari <ar...@gmail.com> wrote:
> Hi Mich,
>
> I do see the .scala_history directory, but it contains all the queries
> which
> got executed uptill now, but if i have to map a specific query to an
> application Id in yarn that would not correlate, hence this method alone
> won't suffice
>
> Thanks,
> Arpan Bhandari
Re: Spark SQL query
Posted by Arpan Bhandari <ar...@gmail.com>.
Hi Mich,
I do see the .scala_history file, but it contains all the queries that have
been executed up to now. If I have to map a specific query to an application
ID in YARN, it would not correlate, so this method alone won't suffice.
Thanks,
Arpan Bhandari
Re: Spark SQL query
Posted by Mich Talebzadeh <mi...@gmail.com>.
Hi Arpan,
I believe all applications, including Spark and Scala, create a hidden
history file.
You can go to the home directory:
cd
# see list of all hidden files
ls -a | egrep '^\.'
If you are using Scala, do you see a .scala_history file?
.scala_history
HTH
On Tue, 2 Feb 2021 at 10:16, Arpan Bhandari <ar...@gmail.com> wrote:
> Hi Mich,
>
> Repeated the steps as suggested, but still there is no such folder created
> in the home directory. Do we need to enable some property so that it
> creates
> one.
>
>
> Thanks,
> Arpan Bhandari
Re: Spark SQL query
Posted by Arpan Bhandari <ar...@gmail.com>.
Hi Mich,
I repeated the steps as suggested, but there is still no such file created in
the home directory. Do we need to enable some property so that it creates
one?
Thanks,
Arpan Bhandari
Re: Spark SQL query
Posted by Mich Talebzadeh <mi...@gmail.com>.
Hi Arpan,
Log in as any user that has execution rights for Spark. Start spark-shell,
run some simple commands, then exit. Go to the home directory of that user
and look for this hidden file:
${HOME}/.spark_history
It will be there.
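The file name may differ between setups (elsewhere in this thread it turned out to be .scala_history rather than .spark_history), so a quick check for both can save a round trip. This is only a sketch; the function name is made up, and it looks in the current user's home directory unless another directory is passed in.

```shell
# Hypothetical helper: print whichever REPL history files exist in a
# home directory (defaults to the current user's $HOME).
list_repl_history_files() {
  local home_dir="${1:-$HOME}"
  local name
  for name in .spark_history .scala_history; do
    if [ -f "$home_dir/$name" ]; then
      printf '%s\n' "$home_dir/$name"
    fi
  done
}
```

Run list_repl_history_files as the user who started spark-shell; no output means neither file is present in that home directory.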
HTH,
On Mon, 1 Feb 2021 at 15:44, Arpan Bhandari <ar...@gmail.com> wrote:
> Hey Mich,
>
> Thanks for the suggestions, but i don't see any such folder created on the
> edge node.
>
>
> Thanks,
> Arpan Bhandari
Re: Spark SQL query
Posted by Arpan Bhandari <ar...@gmail.com>.
Hey Mich,
Thanks for the suggestions, but I don't see any such file created on the
edge node.
Thanks,
Arpan Bhandari
Re: Spark SQL query
Posted by Mich Talebzadeh <mi...@gmail.com>.
Hi Arpan,
I presume you are interested in what the client was doing.
If you have access to the edge node (where the Spark code is submitted), look
for the following file:
${HOME}/.spark_history
Example:
-rw-r--r--. 1 hduser hadoop 111997 Jun 2 2018 .spark_history
Just use shell tools (cat, grep, etc.) to have a look.
Or put it in HDFS somewhere:
hdfs dfs -put .spark_history /misc/spark_history ## Spark cannot read a hidden file
# and read it as a text file through a Spark RDD in spark-shell
scala> val historyRDD = spark.sparkContext.textFile("/misc/spark_history")
historyRDD: org.apache.spark.rdd.RDD[String] = /misc/spark_history MapPartitionsRDD[11] at textFile at <console>:23
# print it out
scala> historyRDD.collect().foreach(println)
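For the "shell tools" route, a minimal sketch that pulls only the spark.sql(...) calls out of the history file, with their line numbers. The function name is hypothetical, and it assumes the queries were issued through spark.sql in the shell; anything built with the DataFrame API will not match.

```shell
# Hypothetical helper: list lines of a REPL history file that contain a
# spark.sql( call, prefixed with their line numbers.
find_sql_in_history() {
  local history_file="${1:-$HOME/.spark_history}"
  # -n prints line numbers; -F treats the pattern as a fixed string
  grep -nF 'spark.sql(' "$history_file"
}
```

Pass a different path as the first argument if the history file lives elsewhere.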
HTH
On Fri, 29 Jan 2021 at 13:49, Arpan Bhandari <ar...@gmail.com> wrote:
> Hi ,
>
> Is there a way to track back spark sql after it has been already run i.e.
> query has been already submitted by a person and i have to back trace what
> query actually got submitted.
>
>
> Appreciate any help on this.