You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hbase.apache.org by Demai Ni <ni...@gmail.com> on 2014/07/30 00:48:51 UTC
meaning for AgeOfLastAppliedOp in Replication MetricsSink
hi,
A quick question to clarify this AgeOfLastAppliedOp in MetricsSink.java. I
assume it is used as an indicator about how long for a Sink OP to be
applied; but instead, it is more like to show how long since the last Sink
OP applied
/**
* Set the age of the last applied operation
*
* @param timestamp The timestamp of the last operation applied.
* @return the age that was set
*/
public long setAgeOfLastAppliedOp(long timestamp) {
lastTimestampForAge = timestamp;
long age = System.currentTimeMillis() - lastTimestampForAge;
rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age);
return age;
}
In the following scenario:
1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is
set for example 100ms;
2) and then NO new Sink op occur.
3) when a refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of
return the 100ms, the AgeOfLastAppliedOp become 1hour + 100ms, which
doesn't make sense, right?
should we put a check for (lastTimestampForAge != timestamp) before refresh
the age?
Thanks
Demai
Re: meaning for AgeOfLastAppliedOp in Replication MetricsSink
Posted by Demai Ni <ni...@gmail.com>.
Lars,
But I see you're looking at 0.98. There indeed we are called
> refreshAgeOfLastAppliedOp every time we call getStats(), which would
> increase that metric even when there is nothing to ship. That looks like a
> bug.
>
yeah. that is what I am thinking. Let's have a quick fix on it. I will
provide a patch later today. (was working on HBASE-9531 when the problem
surfaces).
Demai
On Wed, Jul 30, 2014 at 10:41 AM, lars hofhansl <la...@apache.org> wrote:
> I was also confusing ageOfLastShippedEdit with ageOfLastAppliedEdit. I had
> fixed ageOfLastShippedEdit on the source in 0.94.
> Looks like 0.94 is doing the right thing with ageOfLastAppliedEdit, but
> 0.98+ is not.
>
> -- Lars
>
> ------------------------------
> *From:* lars hofhansl <la...@apache.org>
> *To:* Demai Ni <ni...@gmail.com>; "dev@hbase.apache.org" <
> dev@hbase.apache.org>
> *Sent:* Wednesday, July 30, 2014 10:30 AM
>
> *Subject:* Re: meaning for AgeOfLastAppliedOp in Replication MetricsSink
>
> > When the 'the time an edit entered the system' doesn't change (in the
> case
> of no Sink Op enter for a period of time), the age will keep growing
> since current time moving forward, which gives a false impression that
> an edit sitting in the queue for very long time. isn't it?
>
>
> I think until the item is shipped it should be counted as waiting. I.e.
> the time this reports is the time between an edit entered the system and
> the time it finally gets shipped to the replication
> sink.refreshAgeOfLastAppliedOp() should only be called when something is
> actually being shipped, not periodically (I just fixed that in 0.94
> HBASE-11143).
>
>
>
> But I see you're looking at 0.98. There indeed we are called
> refreshAgeOfLastAppliedOp every time we call getStats(), which would
> increase that metric even when there is nothing to ship. That looks like a
> bug.
>
> -- Lars
>
>
>
> ________________________________
>
> From: Demai Ni <ni...@gmail.com>
> To: "dev@hbase.apache.org" <de...@hbase.apache.org>; lars hofhansl <
> larsh@apache.org>
> Sent: Wednesday, July 30, 2014 8:49 AM
> Subject: Re: meaning for AgeOfLastAppliedOp in Replication MetricsSink
>
>
>
> Lars, thanks for your input.
>
>
> This metric indicates the time an edit sat in the "replication queue"
> before it got replicated.
> yeah, I am with you on this.
>
>
> With that definition it is doing the right thing: Reporting current time -
> the time an edit entered the system (it's WAL time)
> >
> When the 'the time an edit entered the system' doesn't change (in the
> case of no Sink Op enter for a period of time), the age will keep growing
> since current time moving forward, which gives a false impression that an
> edit sitting in the queue for very long time. isn't it?
>
>
>
>
> On Tue, Jul 29, 2014 at 10:54 PM, lars hofhansl <la...@apache.org> wrote:
>
> This metric indicates the time an edit sat in the "replication queue"
> before it got replicated.
> >With that definition it is doing the right thing: Reporting current time
> - the time an edit entered the system (it's WAL time)
> >
> >
> >-- Lars
> >
> >
> >
> >________________________________
> > From: Demai Ni <ni...@gmail.com>
> >To: "dev@hbase.apache.org" <de...@hbase.apache.org>
> >Sent: Tuesday, July 29, 2014 3:48 PM
> >Subject: meaning for AgeOfLastAppliedOp in Replication MetricsSink
> >
> >
> >
> >hi,
> >
> >A quick question to clarify this AgeOfLastAppliedOp in MetricsSink.java.
> I
> >assume it is used as an indicator about how long for a Sink OP to be
> >applied; but instead, it is more like to show how long since the last Sink
> >OP applied
> >
> > /**
> > * Set the age of the last applied operation
> > *
> > * @param timestamp The timestamp of the last operation applied.
> > * @return the age that was set
> > */
> > public long setAgeOfLastAppliedOp(long timestamp) {
> > lastTimestampForAge = timestamp;
> > long age = System.currentTimeMillis() - lastTimestampForAge;
> > rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age);
> > return age;
> > }
> >
> >In the following scenario:
> >1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is
> >set for example 100ms;
> >2) and then NO new Sink op occur.
> >3) when a refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of
> >return the 100ms, the AgeOfLastAppliedOp become 1hour + 100ms, which
> >doesn't make sense, right?
> >
> >should we put a check for (lastTimestampForAge != timestamp) before
> refresh
> >the age?
> >
> >Thanks
> >
> >Demai
>
>
>
Re: meaning for AgeOfLastAppliedOp in Replication MetricsSink
Posted by lars hofhansl <la...@apache.org>.
I was also confusing ageOfLastShippedEdit with ageOfLastAppliedEdit. I had fixed ageOfLastShippedEdit on the source in 0.94.
Looks like 0.94 is doing the right thing with ageOfLastAppliedEdit, but 0.98+ is not.
-- Lars
________________________________
From: lars hofhansl <la...@apache.org>
To: Demai Ni <ni...@gmail.com>; "dev@hbase.apache.org" <de...@hbase.apache.org>
Sent: Wednesday, July 30, 2014 10:30 AM
Subject: Re: meaning for AgeOfLastAppliedOp in Replication MetricsSink
> When the 'the time an edit entered the system' doesn't change (in the case
of no Sink Op enter for a period of time), the age will keep growing
since current time moving forward, which gives a false impression that
an edit sitting in the queue for very long time. isn't it?
I think until the item is shipped it should be counted as waiting. I.e. the time this reports is the time between an edit entered the system and the time it finally gets shipped to the replication sink.refreshAgeOfLastAppliedOp() should only be called when something is actually being shipped, not periodically (I just fixed that in 0.94 HBASE-11143).
But I see you're looking at 0.98. There indeed we are called refreshAgeOfLastAppliedOp every time we call getStats(), which would increase that metric even when there is nothing to ship. That looks like a bug.
-- Lars
________________________________
From: Demai Ni <ni...@gmail.com>
To: "dev@hbase.apache.org" <de...@hbase.apache.org>; lars hofhansl <la...@apache.org>
Sent: Wednesday, July 30, 2014 8:49 AM
Subject: Re: meaning for AgeOfLastAppliedOp in Replication MetricsSink
Lars, thanks for your input.
This metric indicates the time an edit sat in the "replication queue" before it got replicated.
yeah, I am with you on this.
With that definition it is doing the right thing: Reporting current time - the time an edit entered the system (it's WAL time)
>
When the 'the time an edit entered the system' doesn't change (in the case of no Sink Op enter for a period of time), the age will keep growing since current time moving forward, which gives a false impression that an edit sitting in the queue for very long time. isn't it?
On Tue, Jul 29, 2014 at 10:54 PM, lars hofhansl <la...@apache.org> wrote:
This metric indicates the time an edit sat in the "replication queue" before it got replicated.
>With that definition it is doing the right thing: Reporting current time - the time an edit entered the system (it's WAL time)
>
>
>-- Lars
>
>
>
>________________________________
> From: Demai Ni <ni...@gmail.com>
>To: "dev@hbase.apache.org" <de...@hbase.apache.org>
>Sent: Tuesday, July 29, 2014 3:48 PM
>Subject: meaning for AgeOfLastAppliedOp in Replication MetricsSink
>
>
>
>hi,
>
>A quick question to clarify this AgeOfLastAppliedOp in MetricsSink.java. I
>assume it is used as an indicator about how long for a Sink OP to be
>applied; but instead, it is more like to show how long since the last Sink
>OP applied
>
> /**
> * Set the age of the last applied operation
> *
> * @param timestamp The timestamp of the last operation applied.
> * @return the age that was set
> */
> public long setAgeOfLastAppliedOp(long timestamp) {
> lastTimestampForAge = timestamp;
> long age = System.currentTimeMillis() - lastTimestampForAge;
> rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age);
> return age;
> }
>
>In the following scenario:
>1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is
>set for example 100ms;
>2) and then NO new Sink op occur.
>3) when a refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of
>return the 100ms, the AgeOfLastAppliedOp become 1hour + 100ms, which
>doesn't make sense, right?
>
>should we put a check for (lastTimestampForAge != timestamp) before refresh
>the age?
>
>Thanks
>
>Demai
Re: meaning for AgeOfLastAppliedOp in Replication MetricsSink
Posted by lars hofhansl <la...@apache.org>.
> When the 'the time an edit entered the system' doesn't change (in the case
of no Sink Op enter for a period of time), the age will keep growing
since current time moving forward, which gives a false impression that
an edit sitting in the queue for very long time. isn't it?
I think until the item is shipped it should be counted as waiting. I.e. the time this reports is the time between an edit entered the system and the time it finally gets shipped to the replication sink.refreshAgeOfLastAppliedOp() should only be called when something is actually being shipped, not periodically (I just fixed that in 0.94 HBASE-11143).
But I see you're looking at 0.98. There indeed we are called refreshAgeOfLastAppliedOp every time we call getStats(), which would increase that metric even when there is nothing to ship. That looks like a bug.
-- Lars
________________________________
From: Demai Ni <ni...@gmail.com>
To: "dev@hbase.apache.org" <de...@hbase.apache.org>; lars hofhansl <la...@apache.org>
Sent: Wednesday, July 30, 2014 8:49 AM
Subject: Re: meaning for AgeOfLastAppliedOp in Replication MetricsSink
Lars, thanks for your input.
This metric indicates the time an edit sat in the "replication queue" before it got replicated.
yeah, I am with you on this.
With that definition it is doing the right thing: Reporting current time - the time an edit entered the system (it's WAL time)
>
When the 'the time an edit entered the system' doesn't change (in the case of no Sink Op enter for a period of time), the age will keep growing since current time moving forward, which gives a false impression that an edit sitting in the queue for very long time. isn't it?
On Tue, Jul 29, 2014 at 10:54 PM, lars hofhansl <la...@apache.org> wrote:
This metric indicates the time an edit sat in the "replication queue" before it got replicated.
>With that definition it is doing the right thing: Reporting current time - the time an edit entered the system (it's WAL time)
>
>
>-- Lars
>
>
>
>________________________________
> From: Demai Ni <ni...@gmail.com>
>To: "dev@hbase.apache.org" <de...@hbase.apache.org>
>Sent: Tuesday, July 29, 2014 3:48 PM
>Subject: meaning for AgeOfLastAppliedOp in Replication MetricsSink
>
>
>
>hi,
>
>A quick question to clarify this AgeOfLastAppliedOp in MetricsSink.java. I
>assume it is used as an indicator about how long for a Sink OP to be
>applied; but instead, it is more like to show how long since the last Sink
>OP applied
>
> /**
> * Set the age of the last applied operation
> *
> * @param timestamp The timestamp of the last operation applied.
> * @return the age that was set
> */
> public long setAgeOfLastAppliedOp(long timestamp) {
> lastTimestampForAge = timestamp;
> long age = System.currentTimeMillis() - lastTimestampForAge;
> rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age);
> return age;
> }
>
>In the following scenario:
>1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is
>set for example 100ms;
>2) and then NO new Sink op occur.
>3) when a refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of
>return the 100ms, the AgeOfLastAppliedOp become 1hour + 100ms, which
>doesn't make sense, right?
>
>should we put a check for (lastTimestampForAge != timestamp) before refresh
>the age?
>
>Thanks
>
>Demai
Re: meaning for AgeOfLastAppliedOp in Replication MetricsSink
Posted by Demai Ni <ni...@gmail.com>.
Lars, thanks for your input.
> This metric indicates the time an edit sat in the "replication queue"
> before it got replicated.
yeah, I am with you on this.
> With that definition it is doing the right thing: Reporting current time -
> the time an edit entered the system (it's WAL time)
>
When the 'the time an edit entered the system' doesn't change (in the case
of no Sink Op enter for a period of time), the age will keep growing since
current time moving forward, which gives a false impression that an edit
sitting in the queue for very long time. isn't it?
On Tue, Jul 29, 2014 at 10:54 PM, lars hofhansl <la...@apache.org> wrote:
> This metric indicates the time an edit sat in the "replication queue"
> before it got replicated.
> With that definition it is doing the right thing: Reporting current time -
> the time an edit entered the system (it's WAL time)
>
>
> -- Lars
>
>
>
> ________________________________
> From: Demai Ni <ni...@gmail.com>
> To: "dev@hbase.apache.org" <de...@hbase.apache.org>
> Sent: Tuesday, July 29, 2014 3:48 PM
> Subject: meaning for AgeOfLastAppliedOp in Replication MetricsSink
>
>
> hi,
>
> A quick question to clarify this AgeOfLastAppliedOp in MetricsSink.java. I
> assume it is used as an indicator about how long for a Sink OP to be
> applied; but instead, it is more like to show how long since the last Sink
> OP applied
>
> /**
> * Set the age of the last applied operation
> *
> * @param timestamp The timestamp of the last operation applied.
> * @return the age that was set
> */
> public long setAgeOfLastAppliedOp(long timestamp) {
> lastTimestampForAge = timestamp;
> long age = System.currentTimeMillis() - lastTimestampForAge;
> rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age);
> return age;
> }
>
> In the following scenario:
> 1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is
> set for example 100ms;
> 2) and then NO new Sink op occur.
> 3) when a refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of
> return the 100ms, the AgeOfLastAppliedOp become 1hour + 100ms, which
> doesn't make sense, right?
>
> should we put a check for (lastTimestampForAge != timestamp) before refresh
> the age?
>
> Thanks
>
> Demai
>
Re: meaning for AgeOfLastAppliedOp in Replication MetricsSink
Posted by lars hofhansl <la...@apache.org>.
This metric indicates the time an edit sat in the "replication queue" before it got replicated.
With that definition it is doing the right thing: Reporting current time - the time an edit entered the system (it's WAL time)
-- Lars
________________________________
From: Demai Ni <ni...@gmail.com>
To: "dev@hbase.apache.org" <de...@hbase.apache.org>
Sent: Tuesday, July 29, 2014 3:48 PM
Subject: meaning for AgeOfLastAppliedOp in Replication MetricsSink
hi,
A quick question to clarify this AgeOfLastAppliedOp in MetricsSink.java. I
assume it is used as an indicator about how long for a Sink OP to be
applied; but instead, it is more like to show how long since the last Sink
OP applied
/**
* Set the age of the last applied operation
*
* @param timestamp The timestamp of the last operation applied.
* @return the age that was set
*/
public long setAgeOfLastAppliedOp(long timestamp) {
lastTimestampForAge = timestamp;
long age = System.currentTimeMillis() - lastTimestampForAge;
rms.setGauge(SINK_AGE_OF_LAST_APPLIED_OP, age);
return age;
}
In the following scenario:
1) at 7:00am a sink op is applied, and the SINK_AGE_OF_LAST_APPLIED_OP is
set for example 100ms;
2) and then NO new Sink op occur.
3) when a refreshAgeOfLastAppliedOp() is invoked at 8:00am. Instead of
return the 100ms, the AgeOfLastAppliedOp become 1hour + 100ms, which
doesn't make sense, right?
should we put a check for (lastTimestampForAge != timestamp) before refresh
the age?
Thanks
Demai