Posted to dev@mahout.apache.org by Jeff Eastman <jd...@windwardsolutions.com> on 2011/12/05 23:29:56 UTC
Major Issues for 0.6
Here's the list of Major issues marked for 0.6. I can commit to the 3
which I own by end of this month.
MAHOUT-910 (Improvement) <https://issues.apache.org/jira/browse/MAHOUT-910>
  Summary: Improve sampling in SamplingCandidateItemStrategy, optimize intersection computations
  Assignee: Sean Owen    Reporter: Sean Owen
  Priority: Major    Status: Patch Available    Resolution: Unresolved
  Created: 12/2/11 16:46    Updated: 12/5/11 22:00

MAHOUT-897 (New Feature) <https://issues.apache.org/jira/browse/MAHOUT-897>
  Summary: New implementation for LDA: Collapsed Variational Bayes (0th derivative approximation), with map-side model caching
  Assignee: Jake Mannix    Reporter: Jake Mannix
  Priority: Major    Status: Patch Available    Resolution: Unresolved
  Created: 11/27/11 7:34    Updated: 12/3/11 5:16

MAHOUT-846 (Improvement) <https://issues.apache.org/jira/browse/MAHOUT-846>
  Summary: Improve Scalability of Gaussian Cluster For Wide Vectors
  Assignee: Jeff Eastman    Reporter: Jeff Eastman
  Priority: Major    Status: Open    Resolution: Unresolved
  Created: 10/19/11 17:06    Updated: 10/20/11 3:35

MAHOUT-843 (New Feature) <https://issues.apache.org/jira/browse/MAHOUT-843>
  Summary: Top Down Clustering
  Assignee: Jeff Eastman    Reporter: Paritosh Ranjan
  Priority: Major    Status: Patch Available    Resolution: Unresolved
  Created: 10/15/11 19:27    Updated: 12/5/11 19:56

MAHOUT-825 (Bug) <https://issues.apache.org/jira/browse/MAHOUT-825>
  Summary: Canopies grouping records outside t1
  Assignee: Jeff Eastman    Reporter: Paritosh Ranjan
  Priority: Major    Status: Patch Available    Resolution: Unresolved
  Created: 10/3/11 8:10    Updated: 12/3/11 18:29

MAHOUT-794 (Bug) <https://issues.apache.org/jira/browse/MAHOUT-794>
  Summary: Eigencuts produces unexpected results, part 2
  Assignee: Shannon Quinn    Reporter: Sean Owen
  Priority: Major    Status: Open    Resolution: Unresolved
  Created: 8/21/11 19:37    Updated: 8/21/11 19:37

MAHOUT-772 (Improvement) <https://issues.apache.org/jira/browse/MAHOUT-772>
  Summary: Refactor Matrix/Vector implementation with linear operators
  Assignee: Unassigned    Reporter: Jonathan Traupman
  Priority: Major    Status: Open    Resolution: Unresolved
  Created: 7/25/11 6:13    Updated: 8/18/11 17:41

MAHOUT-627 (Task) <https://issues.apache.org/jira/browse/MAHOUT-627>
  Summary: Baum-Welch Algorithm on Map-Reduce for Parallel Hidden Markov Model Training.
  Assignee: Grant Ingersoll    Reporter: Dhruv Kumar
  Priority: Major    Status: Patch Available    Resolution: Unresolved
  Created: 3/17/11 15:59    Updated: 11/8/11 19:16

MAHOUT-598 (Bug) <https://issues.apache.org/jira/browse/MAHOUT-598>
  Summary: Downstream steps in the seq2sparse job flow looking in wrong location for output from previous steps when running in Elastic MapReduce (EMR) cluster
  Assignee: Robin Anil    Reporter: Timothy Potter
  Priority: Major    Status: Open    Resolution: Unresolved
  Created: 1/27/11 16:20    Updated: 8/21/11 20:01

MAHOUT-524 (Bug) <https://issues.apache.org/jira/browse/MAHOUT-524>
  Summary: DisplaySpectralKMeans example fails
  Assignee: Shannon Quinn    Reporter: Jeff Eastman
  Priority: Major    Status: Open    Resolution: Unresolved
  Created: 10/12/10 3:31    Updated: 11/22/11 18:04

MAHOUT-399 (Bug) <https://issues.apache.org/jira/browse/MAHOUT-399>
  Summary: LDA on Mahout 0.3 does not converge to correct solution for overlapping pyramids toy problem.
  Assignee: Jake Mannix    Reporter: Michael Lazarus
  Priority: Major    Status: Patch Available    Resolution: Unresolved
  Created: 5/24/10 17:17    Updated: 12/2/11 22:32
Re: Major Issues for 0.6
Posted by Dmitriy Lyubimov <dl...@gmail.com>.
If you are going to release after NY, I may have a chance to sneak 817 into
that release after all.
-d
Re: Major Issues for 0.6
Posted by Ted Dunning <te...@gmail.com>.
Wow.
Good luck on that.
On Mon, Dec 5, 2011 at 3:05 PM, Shannon Quinn <sq...@gatech.edu> wrote:
> I can definitely have 524 done by the end of the month...currently
> finishing up a very large grant proposal that's due at midnight tonight
> which involves Mahout in a 3-year research project. Fingers crossed!
Re: Major Issues for 0.6
Posted by Isabel Drost <is...@apache.org>.
On 06.12.2011 Shannon Quinn wrote:
> I can definitely have 524 done by the end of the month...currently
> finishing up a very large grant proposal that's due at midnight tonight
> which involves Mahout in a 3-year research project. Fingers crossed!
Not sure whether it helps but keeping my fingers crossed as well - just in case.
Isabel
Re: Major Issues for 0.6
Posted by Shannon Quinn <sq...@gatech.edu>.
I can definitely have 524 done by the end of the month...currently
finishing up a very large grant proposal that's due at midnight tonight
which involves Mahout in a 3-year research project. Fingers crossed!
Re: Major Issues for 0.6
Posted by Grant Ingersoll <gs...@apache.org>.
I'll take a crack at it, just good to have a 2nd set of eyes.
On Dec 6, 2011, at 4:53 PM, Sebastian Schelter wrote:
> Our expert is currently doing an internship at Google. He'll be back in
> January.
> On 06.12.2011 at 22:24, "Isabel Drost" <is...@apache.org> wrote:
>
>> On 06.12.2011 Ted Dunning wrote:
>>> Isabel?
>>>
>>> Do you have an HMM expert handy?
>>
>> Can only defer to Sebastian who is working at a research lab that might
>> have -
>> or at least has contacts to one.
>>
>>
>> Isabel
>>
Re: Major Issues for 0.6
Posted by Sebastian Schelter <ss...@googlemail.com>.
Our expert is currently doing an internship at Google. He'll be back in
January.
On 06.12.2011 at 22:24, "Isabel Drost" <is...@apache.org> wrote:
> On 06.12.2011 Ted Dunning wrote:
> > Isabel?
> >
> > Do you have an HMM expert handy?
>
> Can only defer to Sebastian who is working at a research lab that might
> have -
> or at least has contacts to one.
>
>
> Isabel
>
Re: Major Issues for 0.6
Posted by Isabel Drost <is...@apache.org>.
On 06.12.2011 Ted Dunning wrote:
> Isabel?
>
> Do you have an HMM expert handy?
Can only defer to Sebastian who is working at a research lab that might have -
or at least has contacts to one.
Isabel
Re: Major Issues for 0.6
Posted by Ted Dunning <te...@gmail.com>.
Isabel?
Do you have an HMM expert handy?
On Mon, Dec 5, 2011 at 3:44 PM, Grant Ingersoll <gs...@apache.org> wrote:
> I haven't opened an issue yet, but I also have concerns about the new
> bayes classifier in terms of quality of results.
>
> I'm hoping to get to M-627 soon, but wouldn't mind some help, as I am not
> an HMM expert (yet).
Re: Major Issues for 0.6
Posted by Grant Ingersoll <gs...@apache.org>.
I haven't opened an issue yet, but I also have concerns about the new bayes classifier in terms of quality of results.
I'm hoping to get to M-627 soon, but wouldn't mind some help, as I am not an HMM expert (yet).
--------------------------------------------
Grant Ingersoll
http://www.lucidimagination.com