You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by Yash Sharma <ya...@gmail.com> on 2014/05/27 22:24:04 UTC

Review Request 21941: Drill-842: Correlation coefficient calculation function

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21941/
-----------------------------------------------------------

Review request for drill, Aditya Kishore, Mehant Baid, and Timothy Chen.


Repository: drill-git


Description
-------

Added implementation for Correlation coefficient calculation function.


Diffs
-----

  exec/java-exec/src/main/codegen/config.fmpp 5c0d03b 
  exec/java-exec/src/main/codegen/data/CorrelationTypes.tdd PRE-CREATION 
  exec/java-exec/src/main/codegen/templates/CorrelationTypeFunctions.java PRE-CREATION 
  exec/java-exec/src/test/java/org/apache/drill/exec/fn/impl/TestAggregateFunction.java ffb372d 
  exec/java-exec/src/test/resources/functions/test_covariance.json PRE-CREATION 

Diff: https://reviews.apache.org/r/21941/diff/


Testing
-------

Yes.
$mvn test -Dtest=TestAggregateFunction#testCovarianceCorrelation


Thanks,

Yash Sharma


Re: Review Request 21941: Drill-842: Correlation coefficient calculation function

Posted by Ted Dunning <td...@apache.org>.

> On June 11, 2014, 7:31 a.m., Mehant Baid wrote:
> > exec/java-exec/src/main/codegen/templates/CorrelationTypeFunctions.java, line 107
> > <https://reviews.apache.org/r/21941/diff/1/?file=595305#file595305line107>
> >
> >     which method/ formula are you using to compute the correlation coefficient? Do you think if it might suffer from catastrophic cancellation problem?

This appears to use Welford's method and should be numerically quite stable.

It will be difficult to do better in a single pass.


- Ted


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21941/#review45346
-----------------------------------------------------------


On May 27, 2014, 8:24 p.m., Yash Sharma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/21941/
> -----------------------------------------------------------
> 
> (Updated May 27, 2014, 8:24 p.m.)
> 
> 
> Review request for drill, Aditya Kishore, Mehant Baid, and Timothy Chen.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> Added implementation for Correlation coefficient calculation function.
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/codegen/config.fmpp 5c0d03b 
>   exec/java-exec/src/main/codegen/data/CorrelationTypes.tdd PRE-CREATION 
>   exec/java-exec/src/main/codegen/templates/CorrelationTypeFunctions.java PRE-CREATION 
>   exec/java-exec/src/test/java/org/apache/drill/exec/fn/impl/TestAggregateFunction.java ffb372d 
>   exec/java-exec/src/test/resources/functions/test_covariance.json PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/21941/diff/
> 
> 
> Testing
> -------
> 
> Yes.
> $mvn test -Dtest=TestAggregateFunction#testCovarianceCorrelation
> 
> 
> Thanks,
> 
> Yash Sharma
> 
>


Re: Review Request 21941: Drill-842: Correlation coefficient calculation function

Posted by Yash Sharma <ya...@gmail.com>.

> On June 11, 2014, 7:31 a.m., Mehant Baid wrote:
> > exec/java-exec/src/main/codegen/templates/CorrelationTypeFunctions.java, line 27
> > <https://reviews.apache.org/r/21941/diff/1/?file=595305#file595305line27>
> >
> >     Correct the comment to indicate the right function names

Fixed!


> On June 11, 2014, 7:31 a.m., Mehant Baid wrote:
> > exec/java-exec/src/main/codegen/templates/CorrelationTypeFunctions.java, line 52
> > <https://reviews.apache.org/r/21941/diff/1/?file=595305#file595305line52>
> >
> >     You can probably get rid of the 'if' here. Since you always have the aliasname.

Fixed!


> On June 11, 2014, 7:31 a.m., Mehant Baid wrote:
> > exec/java-exec/src/main/codegen/templates/CorrelationTypeFunctions.java, line 107
> > <https://reviews.apache.org/r/21941/diff/1/?file=595305#file595305line107>
> >
> >     which method/ formula are you using to compute the correlation coefficient? Do you think if it might suffer from catastrophic cancellation problem?
> 
> Ted Dunning wrote:
>     This appears to use Welford's method and should be numerically quite stable.
>     
>     It will be difficult to do better in a single pass.

It uses Welford's online incremental algo and is much less prone to loss of precision due to catastrophic cancellation - wiki.


- Yash


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21941/#review45346
-----------------------------------------------------------


On June 14, 2014, 7:21 p.m., Yash Sharma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/21941/
> -----------------------------------------------------------
> 
> (Updated June 14, 2014, 7:21 p.m.)
> 
> 
> Review request for drill, Aditya Kishore, Mehant Baid, and Timothy Chen.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> Added implementation for Correlation coefficient calculation function.
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/codegen/config.fmpp 5c0d03b 
>   exec/java-exec/src/main/codegen/data/CorrelationTypes.tdd PRE-CREATION 
>   exec/java-exec/src/main/codegen/data/CovarTypes.tdd PRE-CREATION 
>   exec/java-exec/src/main/codegen/templates/CorrelationTypeFunctions.java PRE-CREATION 
>   exec/java-exec/src/main/codegen/templates/CovarTypeFunctions.java PRE-CREATION 
>   exec/java-exec/src/test/java/org/apache/drill/exec/fn/impl/TestAggregateFunction.java ffb372d 
>   exec/java-exec/src/test/resources/functions/test_covariance.json PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/21941/diff/
> 
> 
> Testing
> -------
> 
> Yes.
> $mvn test -Dtest=TestAggregateFunction#testCovarianceCorrelation
> 
> 
> Thanks,
> 
> Yash Sharma
> 
>


Re: Review Request 21941: Drill-842: Correlation coefficient calculation function

Posted by Mehant Baid <ba...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21941/#review45346
-----------------------------------------------------------



exec/java-exec/src/main/codegen/templates/CorrelationTypeFunctions.java
<https://reviews.apache.org/r/21941/#comment80158>

    Correct the comment to indicate the right function names



exec/java-exec/src/main/codegen/templates/CorrelationTypeFunctions.java
<https://reviews.apache.org/r/21941/#comment80159>

    You can probably get rid of the 'if' here. Since you always have the aliasname.



exec/java-exec/src/main/codegen/templates/CorrelationTypeFunctions.java
<https://reviews.apache.org/r/21941/#comment80160>

    which method/ formula are you using to compute the correlation coefficient? Do you think if it might suffer from catastrophic cancellation problem?


- Mehant Baid


On May 27, 2014, 8:24 p.m., Yash Sharma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/21941/
> -----------------------------------------------------------
> 
> (Updated May 27, 2014, 8:24 p.m.)
> 
> 
> Review request for drill, Aditya Kishore, Mehant Baid, and Timothy Chen.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> Added implementation for Correlation coefficient calculation function.
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/codegen/config.fmpp 5c0d03b 
>   exec/java-exec/src/main/codegen/data/CorrelationTypes.tdd PRE-CREATION 
>   exec/java-exec/src/main/codegen/templates/CorrelationTypeFunctions.java PRE-CREATION 
>   exec/java-exec/src/test/java/org/apache/drill/exec/fn/impl/TestAggregateFunction.java ffb372d 
>   exec/java-exec/src/test/resources/functions/test_covariance.json PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/21941/diff/
> 
> 
> Testing
> -------
> 
> Yes.
> $mvn test -Dtest=TestAggregateFunction#testCovarianceCorrelation
> 
> 
> Thanks,
> 
> Yash Sharma
> 
>


Re: Review Request 21941: Drill-842: Correlation coefficient calculation function

Posted by Mehant Baid <ba...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21941/#review45922
-----------------------------------------------------------

Ship it!


Ship It!

- Mehant Baid


On June 14, 2014, 7:21 p.m., Yash Sharma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/21941/
> -----------------------------------------------------------
> 
> (Updated June 14, 2014, 7:21 p.m.)
> 
> 
> Review request for drill, Aditya Kishore, Mehant Baid, and Timothy Chen.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> Added implementation for Correlation coefficient calculation function.
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/codegen/config.fmpp 5c0d03b 
>   exec/java-exec/src/main/codegen/data/CorrelationTypes.tdd PRE-CREATION 
>   exec/java-exec/src/main/codegen/data/CovarTypes.tdd PRE-CREATION 
>   exec/java-exec/src/main/codegen/templates/CorrelationTypeFunctions.java PRE-CREATION 
>   exec/java-exec/src/main/codegen/templates/CovarTypeFunctions.java PRE-CREATION 
>   exec/java-exec/src/test/java/org/apache/drill/exec/fn/impl/TestAggregateFunction.java ffb372d 
>   exec/java-exec/src/test/resources/functions/test_covariance.json PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/21941/diff/
> 
> 
> Testing
> -------
> 
> Yes.
> $mvn test -Dtest=TestAggregateFunction#testCovarianceCorrelation
> 
> 
> Thanks,
> 
> Yash Sharma
> 
>


Re: Review Request 21941: Drill-842: Correlation coefficient calculation function

Posted by Yash Sharma <ya...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21941/
-----------------------------------------------------------

(Updated June 14, 2014, 7:21 p.m.)


Review request for drill, Aditya Kishore, Mehant Baid, and Timothy Chen.


Changes
-------

Implemented review comments!


Repository: drill-git


Description
-------

Added implementation for Correlation coefficient calculation function.


Diffs (updated)
-----

  exec/java-exec/src/main/codegen/config.fmpp 5c0d03b 
  exec/java-exec/src/main/codegen/data/CorrelationTypes.tdd PRE-CREATION 
  exec/java-exec/src/main/codegen/data/CovarTypes.tdd PRE-CREATION 
  exec/java-exec/src/main/codegen/templates/CorrelationTypeFunctions.java PRE-CREATION 
  exec/java-exec/src/main/codegen/templates/CovarTypeFunctions.java PRE-CREATION 
  exec/java-exec/src/test/java/org/apache/drill/exec/fn/impl/TestAggregateFunction.java ffb372d 
  exec/java-exec/src/test/resources/functions/test_covariance.json PRE-CREATION 

Diff: https://reviews.apache.org/r/21941/diff/


Testing
-------

Yes.
$mvn test -Dtest=TestAggregateFunction#testCovarianceCorrelation


Thanks,

Yash Sharma