Posted to dev@mahout.apache.org by Dmitriy Lyubimov <dl...@gmail.com> on 2015/01/14 02:22:52 UTC

Inconsistent SparseMatrix.iterator

Hi,

I posted a question before about what it means to iterate over a matrix, and
was told that it means going over the matrix rows (0...m-1).

That makes it possible, for example, to write code like the following:

for ( (row, index) <- mxA.zipWithIndex ) row += something(index)

However, the actual code in SparseMatrix iterates over the hash of existing
rows. That breaks this pattern, for SparseMatrix only, in two ways:

(1) Not all rows are visited, only those that have non-zeros. In
particular, if we created a matrix via like(), then nothing happens at
all.

(2) Rows are not visited in index order.

I would like to fix problems (1) and (2) in this iterator, but I am not sure
whether the current behavior is important elsewhere.
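
To make the two problems concrete, here is a minimal sketch in plain Scala.
It does not use Mahout at all: ToySparseMatrix, set, existingRowIndices and
like are hypothetical names for a toy structure that keeps its rows in a hash
map, which is only assumed to approximate the behavior described above.

```scala
import scala.collection.mutable

// Toy stand-in for a hash-backed sparse matrix; NOT Mahout's SparseMatrix.
final class ToySparseMatrix(val nrow: Int, val ncol: Int) {
  // Only rows that have been written to are materialized.
  private val rows = mutable.HashMap.empty[Int, Array[Double]]

  def set(r: Int, c: Int, v: Double): Unit = {
    val row = rows.getOrElseUpdate(r, new Array[Double](ncol))
    row(c) = v
  }

  // Mimics the behavior under discussion: walks the hash of existing rows.
  def existingRowIndices: Iterator[Int] = rows.keysIterator

  // Same shape, no data (the situation after a like()-style call).
  def like: ToySparseMatrix = new ToySparseMatrix(nrow, ncol)
}

object IterationDemo {
  def main(args: Array[String]): Unit = {
    val a = new ToySparseMatrix(5, 3)
    a.set(3, 0, 1.0)
    a.set(1, 2, 2.0)

    // Only rows 1 and 3 are visited, in whatever order the hash yields them.
    println(s"visited rows: ${a.existingRowIndices.toList}")

    // An empty matrix of the same shape visits nothing at all.
    println(s"rows visited after like: ${a.like.existingRowIndices.size}")
  }
}
```

In this toy model, a loop that zips the iterator with an index would pair
row 3 (or row 1) with index 0, which is exactly the mismatch described in (1)
and (2).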

Re: Inconsistent SparseMatrix.iterator

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Right now even ColumnMatrix.toString() doesn't work, because iterateAll
assumes row slicing while a column matrix actually produces column slices.
As a result it not only prints transposed information but also runs into
index-out-of-bounds problems along the way.

Given these conceptual inconsistencies, I think it is actually worth
deprecating ColumnMatrix altogether. Modifying the transpose view will flip a
row matrix into a column matrix if needed, at minimum cost and with all the
necessary internal optimizations retained.
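
One way to picture the transpose-view idea, as a rough sketch only:
ToyRowMatrix and ToyTransposedView below are hypothetical classes, not
Mahout's. The view swaps the indices on each access, so a row slice of the
view is a column of the underlying row matrix and no data is copied.

```scala
// Hypothetical row-major matrix; a stand-in, not Mahout's API.
final class ToyRowMatrix(val nrow: Int, val ncol: Int) {
  private val data = Array.ofDim[Double](nrow, ncol)
  def get(r: Int, c: Int): Double = data(r)(c)
  def set(r: Int, c: Int, v: Double): Unit = { data(r)(c) = v }
}

// Transposed view: swaps the indices on every access and copies nothing.
final class ToyTransposedView(underlying: ToyRowMatrix) {
  def nrow: Int = underlying.ncol
  def ncol: Int = underlying.nrow
  def get(r: Int, c: Int): Double = underlying.get(c, r)
  def set(r: Int, c: Int, v: Double): Unit = underlying.set(c, r, v)

  // Row r of the view is column r of the underlying matrix.
  def row(r: Int): IndexedSeq[Double] = (0 until ncol).map(c => get(r, c))

  // Row-wise iteration uses the view's own dimensions, so it stays in bounds.
  def rows: Iterator[IndexedSeq[Double]] = (0 until nrow).iterator.map(r => row(r))
}

object TransposeDemo {
  def main(args: Array[String]): Unit = {
    val a = new ToyRowMatrix(2, 3)
    a.set(0, 2, 7.0)
    val t = new ToyTransposedView(a)
    println(t.nrow -> t.ncol) // prints (3,2)
    println(t.row(2))         // prints Vector(7.0, 0.0)
  }
}
```

Because the view reports its own row and column counts, code that assumes
row slicing keeps working on it, which is the property the column-oriented
class is said to violate.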

On Wed, Jan 14, 2015 at 12:06 PM, Dmitriy Lyubimov <dl...@gmail.com>
wrote:

> I take it no objections then.

Re: Inconsistent SparseMatrix.iterator

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
I take it no objections then.

On Tue, Jan 13, 2015 at 5:44 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

> I guess I still very much want to rename iterator() in SparseMatrix to
> something like iterateNonEmpty(). That would fix it everywhere. Thoughts?

Re: Inconsistent SparseMatrix.iterator

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
I guess I still very much want to rename iterator() in SparseMatrix to
something like iterateNonEmpty(). That would fix it everywhere. Thoughts?

On Tue, Jan 13, 2015 at 5:37 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

> ok so
>
> for (row <- mxA.iterateAll)
>
> works correctly even for SparseMatrix. Now I _only_ have to fix all the
> spots that iterate without iterateAll, which is pretty much everywhere.
> Must be fun.

Re: Inconsistent SparseMatrix.iterator

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
ok so

for (row <- mxA.iterateAll)

works correctly even for SparseMatrix. Now I _only_ have to fix all the
spots that iterate without iterateAll, which is pretty much everywhere.
Must be fun.
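
For illustration, the distinction between the two kinds of iteration can be
sketched in plain Scala. The method names iterateAll and iterateNonEmpty
follow this thread, but ToySparse and everything inside it are assumptions
made for the sketch, not Mahout's implementation.

```scala
import scala.collection.mutable

// Toy model only: names follow the thread, the implementation is assumed.
final class ToySparse(val nrow: Int, val ncol: Int) {
  private val rows = mutable.HashMap.empty[Int, Array[Double]]

  // Returns the backing row, creating it on first access so writes are kept.
  def row(r: Int): Array[Double] = {
    require(r >= 0 && r < nrow, s"row $r out of bounds")
    rows.getOrElseUpdate(r, new Array[Double](ncol))
  }

  // Visits only rows that already exist, in hash order.
  def iterateNonEmpty: Iterator[(Int, Array[Double])] = rows.iterator

  // Visits every row 0 until nrow in index order, whether it has data or not.
  def iterateAll: Iterator[(Int, Array[Double])] =
    (0 until nrow).iterator.map(r => r -> row(r))
}

object IterateDemo {
  def main(args: Array[String]): Unit = {
    val a = new ToySparse(4, 2)
    a.row(2)(0) = 5.0

    println(a.iterateNonEmpty.map(_._1).toList) // prints List(2)
    println(a.iterateAll.map(_._1).toList)      // prints List(0, 1, 2, 3)
  }
}
```

Note that in this toy version iterateAll materializes every row it touches,
so it trades sparsity for the guarantee that a row-by-index update reaches
all rows in order. A real implementation would likely need to handle that
cost differently.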



On Tue, Jan 13, 2015 at 5:31 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

> Or is that meant to be the behavior of iterateAll() only? It seems to hold
> for all iterators except SparseMatrix's.

Re: Inconsistent SparseMatrix.iterator

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Or is that meant to be the behavior of iterateAll() only? It seems to hold
for all iterators except SparseMatrix's.
