Posted to dev@mahout.apache.org by "Grant Ingersoll (JIRA)" <ji...@apache.org> on 2009/06/24 06:52:07 UTC
[jira] Created: (MAHOUT-139) Make use of Vector Iterator capabilities where appropriate
Make use of Vector Iterator capabilities where appropriate
----------------------------------------------------------
Key: MAHOUT-139
URL: https://issues.apache.org/jira/browse/MAHOUT-139
Project: Mahout
Issue Type: Improvement
Affects Versions: 0.2
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Fix For: 0.2
There are a bunch of places where we loop over the size of the vector when we should be taking advantage of the sparseness, or at least be agnostic about it and use an iterator.
This patch addresses these issues in the Vector implementations and in the DistanceMeasure implementations.
It also adds iterateNonZero() and iterateAll() and drops the Iterable portion of Vector, since it wasn't clear what it was iterating over.
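To illustrate the difference between looping over the size of a vector and iterating only its non-zero entries, here is a minimal sketch. It is not the code from the patch: ToySparseVector, Element, sumBySize, sumNonZero and count are hypothetical names invented for this example, and only the method names iterateNonZero() and iterateAll() are taken from the issue text. The real Mahout signatures may differ.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.TreeMap;

public class SparseIterationSketch {

    /** Hypothetical (index, value) pair returned by the iterators. */
    static final class Element {
        final int index;
        final double value;

        Element(int index, double value) {
            this.index = index;
            this.value = value;
        }
    }

    /** Toy sparse vector (an assumption for illustration): only non-zero entries are stored. */
    static final class ToySparseVector {
        private final int cardinality;
        private final TreeMap<Integer, Double> values = new TreeMap<Integer, Double>();

        ToySparseVector(int cardinality) {
            this.cardinality = cardinality;
        }

        int size() {
            return cardinality;
        }

        void set(int index, double value) {
            if (index < 0 || index >= cardinality) {
                throw new IndexOutOfBoundsException("index " + index);
            }
            if (value == 0.0) {
                values.remove(index);   // never store explicit zeros
            } else {
                values.put(index, value);
            }
        }

        double get(int index) {
            Double v = values.get(index);
            return v == null ? 0.0 : v;
        }

        /** Visits only the stored (non-zero) entries, in index order. */
        Iterator<Element> iterateNonZero() {
            final Iterator<Map.Entry<Integer, Double>> it = values.entrySet().iterator();
            return new Iterator<Element>() {
                public boolean hasNext() {
                    return it.hasNext();
                }

                public Element next() {
                    Map.Entry<Integer, Double> e = it.next();
                    return new Element(e.getKey(), e.getValue());
                }
            };
        }

        /** Visits every index from 0 to size() - 1, zeros included. */
        Iterator<Element> iterateAll() {
            return new Iterator<Element>() {
                private int i = 0;

                public boolean hasNext() {
                    return i < cardinality;
                }

                public Element next() {
                    if (i >= cardinality) {
                        throw new NoSuchElementException();
                    }
                    Element e = new Element(i, get(i));
                    i++;
                    return e;
                }
            };
        }
    }

    /** The pattern the issue wants to avoid: one lookup per index, regardless of sparsity. */
    static double sumBySize(ToySparseVector v) {
        double sum = 0.0;
        for (int i = 0; i < v.size(); i++) {
            sum += v.get(i);
        }
        return sum;
    }

    /** The sparsity-aware alternative: work is proportional to the number of stored entries. */
    static double sumNonZero(ToySparseVector v) {
        double sum = 0.0;
        for (Iterator<Element> it = v.iterateNonZero(); it.hasNext();) {
            sum += it.next().value;
        }
        return sum;
    }

    /** Counts how many elements an iterator visits. */
    static int count(Iterator<Element> it) {
        int n = 0;
        while (it.hasNext()) {
            it.next();
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        ToySparseVector v = new ToySparseVector(1000);
        v.set(3, 1.5);
        v.set(500, 2.5);
        System.out.println("sum by size:     " + sumBySize(v));
        System.out.println("sum by non-zero: " + sumNonZero(v));
        System.out.println("visited (all):      " + count(v.iterateAll()));
        System.out.println("visited (non-zero): " + count(v.iterateNonZero()));
    }
}
```

Both sums agree, but the size-based loop touches all 1000 indices while the non-zero iterator touches only the two stored entries. Having two explicitly named methods also makes clear which kind of traversal a caller is asking for.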
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAHOUT-139) Make use of Vector Iterator capabilities where appropriate
Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grant Ingersoll updated MAHOUT-139:
-----------------------------------
Attachment: MAHOUT-139.patch
Draft of a patch that makes a whole lot of conversions to use an appropriate Iterator.
Drops Vector extends Iterable and instead provides two methods:
iterateAll()
iterateNonZero()
Iterators are now implemented by DenseVect and SparseVect instead of AbstractVector, to try to take advantage of class-specific data structures.
Also updates the DistanceMeasures where appropriate.
All tests passed in core.
The profiling view looks a lot healthier too, as the primary bottlenecks are now in code that actually does the work, rather than in the data structures and accessors.
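As a rough illustration of why a distance measure benefits from non-zero iteration, the sketch below computes a squared Euclidean distance two ways. It is not taken from the patch and does not use Mahout's DistanceMeasure classes: the sparse vectors are modeled as plain java.util.Map instances (index to value), and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class SparseDistanceSketch {

    /** Size-based loop: cost is proportional to the vector length. */
    static double squaredEuclideanDense(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /**
     * Non-zero iteration: cost is proportional to the number of stored entries.
     * Indices missing from a map are treated as 0.0.
     */
    static double squaredEuclideanSparse(Map<Integer, Double> a, Map<Integer, Double> b) {
        double sum = 0.0;
        // Every index that is non-zero in a (whether or not it is set in b).
        for (Map.Entry<Integer, Double> e : a.entrySet()) {
            double other = b.getOrDefault(e.getKey(), 0.0);
            double d = e.getValue() - other;
            sum += d * d;
        }
        // Indices that are non-zero only in b; the shared ones were handled above.
        for (Map.Entry<Integer, Double> e : b.entrySet()) {
            if (!a.containsKey(e.getKey())) {
                sum += e.getValue() * e.getValue();
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<Integer, Double> a = new HashMap<Integer, Double>();
        a.put(0, 1.0);
        a.put(5, 2.0);
        Map<Integer, Double> b = new HashMap<Integer, Double>();
        b.put(5, 4.0);
        b.put(7, 3.0);

        double[] denseA = new double[10];
        double[] denseB = new double[10];
        denseA[0] = 1.0;
        denseA[5] = 2.0;
        denseB[5] = 4.0;
        denseB[7] = 3.0;

        System.out.println(squaredEuclideanSparse(a, b));          // 14.0
        System.out.println(squaredEuclideanDense(denseA, denseB)); // 14.0
    }
}
```

Indices where both vectors are zero contribute nothing to the sum, so skipping them changes the cost but not the result.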
[jira] Resolved: (MAHOUT-139) Make use of Vector Iterator capabilities where appropriate
Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grant Ingersoll resolved MAHOUT-139.
------------------------------------
Resolution: Fixed
[jira] Commented: (MAHOUT-139) Make use of Vector Iterator capabilities where appropriate
Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723753#action_12723753 ]
Grant Ingersoll commented on MAHOUT-139:
----------------------------------------
Committed revision 788186.
[jira] Commented: (MAHOUT-139) Make use of Vector Iterator capabilities where appropriate
Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723667#action_12723667 ]
Grant Ingersoll commented on MAHOUT-139:
----------------------------------------
I'd like to commit this soon. My preliminary tests are pretty positive in terms of the performance gains to be had by being smarter about iteration, but it would be helpful to have some feedback.