Posted to dev@accumulo.apache.org by Bill Havanki <bh...@clouderagovt.com> on 2014/03/18 14:18:32 UTC

Review Request 19352: ACCUMULO-2488 - refined balance check for concurrent randomwalk

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19352/
-----------------------------------------------------------

Review request for accumulo and Mike Drob.


Bugs: ACCUMULO-2488
    https://issues.apache.org/jira/browse/ACCUMULO-2488


Repository: accumulo


Description
-------

The Concurrent randomwalk test used to consider servers unbalanced if any server's tablet count differed from the cluster average by more than a fifth of the average or by one, whichever was larger. This would cause failures under typical balancings from the default balancer.

This commit changes the criterion for an unbalanced server to be double the standard deviation from the cluster average.
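
For illustration only, the two criteria described above can be sketched as follows. This is not the code from CheckBalance.java: the class and method names are hypothetical, and the use of the population standard deviation (dividing by N rather than N - 1) is an assumption, since the description does not say which form the patch uses.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the old and new balance criteria; not the actual patch.
final class BalanceCheckSketch {

  // Mean tablet count across all servers.
  static double average(List<Long> counts) {
    double total = 0.0;
    for (Long c : counts) {
      total += c.doubleValue();
    }
    return total / counts.size();
  }

  // Population standard deviation (assumption: divides by N, not N - 1).
  static double standardDeviation(List<Long> counts, double average) {
    double sumOfSquares = 0.0;
    for (Long c : counts) {
      double diff = c.doubleValue() - average;
      sumOfSquares += diff * diff;
    }
    return Math.sqrt(sumOfSquares / counts.size());
  }

  // Old criterion: off by more than a fifth of the average or by more than one,
  // whichever allowance is larger.
  static boolean unbalancedOld(long count, double average) {
    double allowed = Math.max(average / 5.0, 1.0);
    return Math.abs(count - average) > allowed;
  }

  // New criterion: off by more than twice the standard deviation.
  static boolean unbalancedNew(long count, double average, double sd) {
    return Math.abs(count - average) > 2.0 * sd;
  }

  public static void main(String[] args) {
    // Made-up tablet counts for five servers.
    List<Long> counts = Arrays.asList(4L, 5L, 6L, 7L, 8L);
    double avg = average(counts);
    double sd = standardDeviation(counts, avg);
    for (Long c : counts) {
      System.out.println(c + ": old=" + unbalancedOld(c, avg)
          + " new=" + unbalancedNew(c, avg, sd));
    }
  }
}
```

With a modest spread such as the made-up counts in main, the old check flags the servers at either end while the new check does not. A single server far from the rest is still flagged by both.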


Diffs
-----

  src/server/src/main/java/org/apache/accumulo/server/test/randomwalk/concurrent/CheckBalance.java d00e2b4 

Diff: https://reviews.apache.org/r/19352/diff/


Testing
-------

Ran Concurrent randomwalk three times on 7-node cluster.


Thanks,

Bill Havanki


Re: Review Request 19352: ACCUMULO-2488 - refined balance check for concurrent randomwalk

Posted by Eric Newton <er...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19352/#review37551
-----------------------------------------------------------

Ship it!



- Eric Newton




Re: Review Request 19352: ACCUMULO-2488 - refined balance check for concurrent randomwalk

Posted by Bill Havanki <bh...@clouderagovt.com>.

> On March 18, 2014, 12:21 p.m., Mike Drob wrote:
> > src/server/src/main/java/org/apache/accumulo/server/test/randomwalk/concurrent/CheckBalance.java, line 64
> > <https://reviews.apache.org/r/19352/diff/1/?file=526225#file526225line64>
> >
> >     This comment does not seem accurate.
> 
> Bill Havanki wrote:
>     True! s/even/balanced/

D'oh, this wasn't in my latest diff. I will fix the comment on commit.


- Bill


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19352/#review37560
-----------------------------------------------------------




Re: Review Request 19352: ACCUMULO-2488 - refined balance check for concurrent randomwalk

Posted by Bill Havanki <bh...@clouderagovt.com>.

> On March 18, 2014, 12:21 p.m., Mike Drob wrote:
> > src/server/src/main/java/org/apache/accumulo/server/test/randomwalk/concurrent/CheckBalance.java, lines 55-61
> > <https://reviews.apache.org/r/19352/diff/1/?file=526225#file526225line55>
> >
> >     This could all be pushed into the sd method, since I don't think total and average are used anywhere else.

average is used on line 71. Still, I can rework this a bit to skip creating that Long array.


> On March 18, 2014, 12:21 p.m., Mike Drob wrote:
> > src/server/src/main/java/org/apache/accumulo/server/test/randomwalk/concurrent/CheckBalance.java, line 64
> > <https://reviews.apache.org/r/19352/diff/1/?file=526225#file526225line64>
> >
> >     This comment does not seem accurate.

True! s/even/balanced/


- Bill


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19352/#review37560
-----------------------------------------------------------




Re: Review Request 19352: ACCUMULO-2488 - refined balance check for concurrent randomwalk

Posted by Mike Drob <md...@mdrob.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19352/#review37560
-----------------------------------------------------------



src/server/src/main/java/org/apache/accumulo/server/test/randomwalk/concurrent/CheckBalance.java
<https://reviews.apache.org/r/19352/#comment69160>

    This could all be pushed into the sd method, since I don't think total and average are used anywhere else.



src/server/src/main/java/org/apache/accumulo/server/test/randomwalk/concurrent/CheckBalance.java
<https://reviews.apache.org/r/19352/#comment69159>

    This comment does not seem accurate.


- Mike Drob




Re: Review Request 19352: ACCUMULO-2488 - refined balance check for concurrent randomwalk

Posted by Mike Drob <md...@mdrob.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19352/#review37564
-----------------------------------------------------------



src/server/src/main/java/org/apache/accumulo/server/test/randomwalk/concurrent/CheckBalance.java
<https://reviews.apache.org/r/19352/#comment69163>

    Minor nit: can use s.doubleValue()


- Mike Drob




Re: Review Request 19352: ACCUMULO-2488 - refined balance check for concurrent randomwalk

Posted by Bill Havanki <bh...@clouderagovt.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19352/
-----------------------------------------------------------

(Updated March 18, 2014, 2:43 p.m.)


Review request for accumulo and Mike Drob.


Changes
-------

Fixed Mike's nit, which I do like.


Bugs: ACCUMULO-2488
    https://issues.apache.org/jira/browse/ACCUMULO-2488


Repository: accumulo


Description
-------

The Concurrent randomwalk test used to consider servers unbalanced if any server's tablet count differed from the cluster average by more than a fifth of the average or by one, whichever was larger. This would cause failures under typical balancings from the default balancer.

This commit changes the criterion for an unbalanced server to be double the standard deviation from the cluster average.


Diffs (updated)
-----

  src/server/src/main/java/org/apache/accumulo/server/test/randomwalk/concurrent/CheckBalance.java d00e2b4 

Diff: https://reviews.apache.org/r/19352/diff/


Testing
-------

Ran Concurrent randomwalk three times on 7-node cluster.


Thanks,

Bill Havanki


Re: Review Request 19352: ACCUMULO-2488 - refined balance check for concurrent randomwalk

Posted by Bill Havanki <bh...@clouderagovt.com>.

> On March 18, 2014, 1:29 p.m., kturner wrote:
> > src/server/src/main/java/org/apache/accumulo/server/test/randomwalk/concurrent/CheckBalance.java, line 100
> > <https://reviews.apache.org/r/19352/diff/2/?file=526277#file526277line100>
> >
> >     could use o.a.a.core.util.Stat

I just looked at Stat. I don't think it calculates standard deviation correctly, actually.


- Bill


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19352/#review37568
-----------------------------------------------------------




Re: Review Request 19352: ACCUMULO-2488 - refined balance check for concurrent randomwalk

Posted by ke...@deenlo.com.

> On March 18, 2014, 5:29 p.m., kturner wrote:
> > src/server/src/main/java/org/apache/accumulo/server/test/randomwalk/concurrent/CheckBalance.java, line 100
> > <https://reviews.apache.org/r/19352/diff/2/?file=526277#file526277line100>
> >
> >     could use o.a.a.core.util.Stat
> 
> Bill Havanki wrote:
>     I just looked at Stat. I don't think it calculates standard deviation correctly, actually.
> 
> Bill Havanki wrote:
>     For future reference: See ACCUMULO-2494 for discussion. The calculation in Stat is OK but not great. I'll leave this one as is.

Could use the commons math function you referenced in the 2494 code review instead of Stat.


- kturner


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19352/#review37568
-----------------------------------------------------------




Re: Review Request 19352: ACCUMULO-2488 - refined balance check for concurrent randomwalk

Posted by Bill Havanki <bh...@clouderagovt.com>.

> On March 18, 2014, 1:29 p.m., kturner wrote:
> > src/server/src/main/java/org/apache/accumulo/server/test/randomwalk/concurrent/CheckBalance.java, line 100
> > <https://reviews.apache.org/r/19352/diff/2/?file=526277#file526277line100>
> >
> >     could use o.a.a.core.util.Stat
> 
> Bill Havanki wrote:
>     I just looked at Stat. I don't think it calculates standard deviation correctly, actually.

For future reference: See ACCUMULO-2494 for discussion. The calculation in Stat is OK but not great. I'll leave this one as is.


- Bill


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19352/#review37568
-----------------------------------------------------------




Re: Review Request 19352: ACCUMULO-2488 - refined balance check for concurrent randomwalk

Posted by ke...@deenlo.com.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19352/#review37568
-----------------------------------------------------------



src/server/src/main/java/org/apache/accumulo/server/test/randomwalk/concurrent/CheckBalance.java
<https://reviews.apache.org/r/19352/#comment69164>

    could use o.a.a.core.util.Stat


- kturner




Re: Review Request 19352: ACCUMULO-2488 - refined balance check for concurrent randomwalk

Posted by Bill Havanki <bh...@clouderagovt.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19352/
-----------------------------------------------------------

(Updated March 18, 2014, 12:54 p.m.)


Review request for accumulo and Mike Drob.


Changes
-------

From Mike's review.


Bugs: ACCUMULO-2488
    https://issues.apache.org/jira/browse/ACCUMULO-2488


Repository: accumulo


Description
-------

The Concurrent randomwalk test used to consider servers unbalanced if any server's tablet count differed from the cluster average by more than a fifth of the average or by one, whichever was larger. This would cause failures under typical balancings from the default balancer.

This commit changes the criterion for an unbalanced server to be double the standard deviation from the cluster average.


Diffs (updated)
-----

  src/server/src/main/java/org/apache/accumulo/server/test/randomwalk/concurrent/CheckBalance.java d00e2b4 

Diff: https://reviews.apache.org/r/19352/diff/


Testing
-------

Ran Concurrent randomwalk three times on 7-node cluster.


Thanks,

Bill Havanki