Posted to user@drill.apache.org by Divya Gehlot <di...@gmail.com> on 2018/07/10 07:24:30 UTC

Best Practice to check Drillbit status(Cluster mode)

Hi,
I would like to know the best practice for checking Drillbit status in
cluster mode.
I have run into a scenario where the Drillbit processes appear to be running
fine, but the Drill Web UI shows some of the Drillbits as down.
When I did an RCA (root cause analysis), I found that the Drillbit process
had hung for some reason.
For now, the alert system I have implemented checks:


> drill/bin/drillbit.sh status


Is there a better way to catch a hung Drillbit process?
I would appreciate advice from Drill community users.

Thanks,
Divya
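
A possible complement to the process-level check above is an active probe:
send each Drillbit a trivial query and treat a missing or late answer as a
sign of a hang. The following is a minimal sketch, assuming Drill's REST
endpoint (POST /query.json) is reachable on the Web UI port (8047 by default)
without authentication. The function name and the `opener` parameter are
illustrative, and whether a hung Drillbit still answers HTTP requests will
depend on how it hangs.

```python
import json
import urllib.request


def probe_drillbit(host, port=8047, timeout=10.0, opener=urllib.request.urlopen):
    """Return True if the Drillbit on `host` answers a trivial query in time.

    Assumption: the REST API is at http://host:port/query.json and needs no
    login. `opener` is injectable only so the function can be exercised
    without a live cluster.
    """
    body = json.dumps(
        {"queryType": "SQL", "query": "SELECT * FROM sys.version"}
    ).encode("utf-8")
    request = urllib.request.Request(
        f"http://{host}:{port}/query.json",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    try:
        with opener(request, timeout=timeout) as response:
            response.read()  # wait for the full answer, not just the headers
            return response.status == 200
    except OSError:
        # URLError, HTTPError, refused connections and socket timeouts are
        # all OSError subclasses; any of them counts as "not responding".
        return False
```

An alerting job could call probe_drillbit() for every host it expects to be
up and raise an alert for each host that returns False.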

Re: Best Practice to check Drillbit status(Cluster mode)

Posted by Abhishek Girish <ag...@apache.org>.
I think logs may be the only way to figure it out at present. You
could set up a watch on your logs to be informed of such events. For
notifications, I would say file an enhancement JIRA - if it gathers enough
attention, perhaps someone would volunteer to work on it or comment on it.
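
A watch on the logs could be sketched roughly as follows: remember how far
the log file has been read, pick up only the newly appended lines on each
run, and keep those that match a set of patterns. This is a minimal sketch
and not a Drill feature. The function names are made up, the patterns are
placeholders to be replaced with whatever your own Drillbit logs show around
a disconnect, and log rotation is not handled.

```python
import re


def read_new_lines(path, offset):
    """Return (lines appended after byte `offset`, new byte offset)."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read()
        new_offset = handle.tell()
    return data.decode("utf-8", errors="replace").splitlines(), new_offset


def matching_lines(lines, patterns):
    """Return the lines that match at least one regular expression."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return [line for line in lines if any(rx.search(line) for rx in compiled)]
```

A scheduled job would keep the returned offset between runs, call
read_new_lines() again with it, and send a notification whenever
matching_lines() returns anything.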

On Mon, Jul 16, 2018 at 2:08 AM Divya Gehlot <di...@gmail.com>
wrote:

> Hi,
> Thanks, Abhishek!
> I would like to be notified about an orphaned Drillbit process when it
> gets disconnected from the other running Drillbits for some reason. In my
> case it was definitely not caused by an unclean shutdown, as those Drillbits
> had been running for months.
> I know I can check the logs and kill the orphaned process, which is what I
> did in my case, but I would like a notification when a Drillbit goes down.
>
>
> Thanks,
> Divya

Re: Best Practice to check Drillbit status(Cluster mode)

Posted by Divya Gehlot <di...@gmail.com>.
Hi,
Thanks, Abhishek!
I would like to be notified about an orphaned Drillbit process when it
gets disconnected from the other running Drillbits for some reason. In my
case it was definitely not caused by an unclean shutdown, as those Drillbits
had been running for months.
I know I can check the logs and kill the orphaned process, which is what I
did in my case, but I would like a notification when a Drillbit goes down.


Thanks,
Divya
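
The notification described here combines two observations per node: whether
the Drillbit process is alive locally, and whether the node is still
registered with the cluster. A minimal sketch of that decision follows. The
function name and message wording are illustrative, and both inputs are
assumed to be collected elsewhere (for example a local process check and a
sys.drillbits query).

```python
def orphan_alerts(nodes):
    """Build alert messages for nodes whose Drillbit needs attention.

    `nodes` maps hostname -> (process_running, registered). Both flags are
    supplied by the caller; how they are gathered is outside this sketch.
    """
    alerts = []
    for host, (process_running, registered) in sorted(nodes.items()):
        if process_running and not registered:
            # Alive locally but gone from the cluster: the orphan case.
            alerts.append(f"{host}: process is running but not registered in the cluster")
        elif not process_running:
            alerts.append(f"{host}: process is not running")
    return alerts
```

Nodes that are both running and registered produce no message, so an empty
list means there is nothing to report.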

On Fri, 13 Jul 2018 at 04:15, Abhishek Girish <ag...@apache.org> wrote:

> Hey Divya,
>
> It would depend on the situation, afaik. The sys.drillbits table contains a
> list of all running Drillbits. If one of the Drillbits has issues and
> cannot stay connected to the cluster, I would assume it would be
> unregistered and may not show up in the output of sys.drillbits. If it's an
> intermittent issue and the Drillbit process maintains its heartbeat
> connection, it may still show up in the output.
>
> If you take a look at the logs, you might be able to figure out what is
> causing the issue. There may be orphan Drillbit processes that were
> left behind by a previous unclean shutdown. Can you clean up all
> Drillbit processes (using 'ps -ef | grep -i drillbit' and then a kill -9)
> on the nodes where you suspect issues and restart the Drillbits?
>
> -Abhishek

Re: Best Practice to check Drillbit status(Cluster mode)

Posted by Abhishek Girish <ag...@apache.org>.
Hey Divya,

It would depend on the situation, afaik. The sys.drillbits table contains a
list of all running Drillbits. If one of the Drillbits has issues and
cannot stay connected to the cluster, I would assume it would be
unregistered and may not show up in the output of sys.drillbits. If it's an
intermittent issue and the Drillbit process maintains its heartbeat
connection, it may still show up in the output.

If you take a look at the logs, you might be able to figure out what is
causing the issue. There may be orphan Drillbit processes that were
left behind by a previous unclean shutdown. Can you clean up all
Drillbit processes (using 'ps -ef | grep -i drillbit' and then a kill -9)
on the nodes where you suspect issues and restart the Drillbits?

-Abhishek
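
Because a Drillbit that drops out of the cluster may simply vanish from
sys.drillbits rather than appear with a bad state, a monitor probably needs
its own list of expected hosts to compare against. The sketch below shows
that comparison only. The function name is illustrative, and it assumes the
expected names are written the same way Drill reports them (for example
fully qualified versus short hostnames).

```python
def missing_drillbits(expected_hosts, registered_hosts):
    """Return expected hosts that are absent from the registered list.

    `registered_hosts` would be the hostname column of a sys.drillbits
    result; `expected_hosts` is maintained by the operator.
    """
    expected = {host.strip().lower() for host in expected_hosts}
    registered = {host.strip().lower() for host in registered_hosts}
    return sorted(expected - registered)
```

Any host returned here is a candidate for the log inspection and cleanup
described above.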

On Tue, Jul 10, 2018 at 7:16 PM Divya Gehlot <di...@gmail.com>
wrote:

> Hi,
> select * from sys.drillbits;
> What does the above query show if a Drillbit process hangs?
>
>
> Thanks

Re: Best Practice to check Drillbit status(Cluster mode)

Posted by Divya Gehlot <di...@gmail.com>.
Hi,
select * from sys.drillbits;
What does the above query show if a Drillbit process hangs?


Thanks

On Tue, 10 Jul 2018 at 15:36, Khurram Faraaz <kf...@mapr.com> wrote:

> You can run the query below and look at the *state* column in the result.
> Online Drillbits will be marked as ONLINE.
>
> select * from sys.drillbits;
>
> - Khurram

Re: Best Practice to check Drillbit status(Cluster mode)

Posted by Khurram Faraaz <kf...@mapr.com>.
You can run the query below and look at the *state* column in the result.
Online Drillbits will be marked as ONLINE.

select * from sys.drillbits;

- Khurram
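
For automated checks, the state column can be inspected programmatically.
The following is a minimal sketch that takes an already decoded query result
and lists the Drillbits whose state is anything other than ONLINE. It assumes
the result is a dict with a "rows" list of dicts carrying "hostname" and
"state" keys (the shape returned by Drill's REST API); the function name is
illustrative.

```python
def drillbits_not_online(result):
    """Return (hostname, state) pairs for Drillbits that are not ONLINE.

    Rows without a state value are reported as UNKNOWN rather than skipped,
    so a missing column does not silently look healthy.
    """
    problems = []
    for row in result.get("rows", []):
        state = str(row.get("state", "UNKNOWN")).strip().upper()
        if state != "ONLINE":
            problems.append((row.get("hostname", "?"), state))
    return problems
```

An empty list means every Drillbit returned by the query reported ONLINE. It
says nothing about Drillbits that are missing from the result altogether.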

On Tue, Jul 10, 2018 at 12:24 AM, Divya Gehlot <di...@gmail.com>
wrote:

> Hi,
> I would like to know the best practice for checking Drillbit status in
> cluster mode.
> I have run into a scenario where the Drillbit processes appear to be running
> fine, but the Drill Web UI shows some of the Drillbits as down.
> When I did an RCA (root cause analysis), I found that the Drillbit process
> had hung for some reason.
> For now, the alert system I have implemented checks:
>
>
> > drill/bin/drillbit.sh status
>
>
> Is there a better way to catch a hung Drillbit process?
> I would appreciate advice from Drill community users.
>
> Thanks,
> Divya
>