You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mahout.apache.org by Dmitriy Lyubimov <dl...@gmail.com> on 2015/03/03 00:35:57 UTC

Broken non-zero iterator in VectorView?

it looks like assigning 0s in a view of SequentialAccessSparseVector
doesn't work, as it internally using setQuick() which tirms the length
of non-zero elements (?)
which causes invalidation of the iterator state.

in particular, this simple test fails:

val svec = new SequentialAccessSparseVector(30)

svec(1) = -0.5
svec(3) = 0.5
println(svec)

svec(1 until svec.length) ::= ( _ => 0)
println(svec)

svec.sum shouldBe 0

This produces  output

{1:-0.5,3:0.5}
{3:0.5}

The reason seems to be in the way vector view handles non-zero iterator:

public final class NonZeroIterator extends AbstractIterator<Element> {

  private final Iterator<Element> it;

  private NonZeroIterator() {
    it = vector.nonZeroes().iterator();
  }

  @Override
  protected Element computeNext() {
    while (it.hasNext()) {
      Element el = it.next();
      if (isInView(el.index()) && el.get() != 0) {
        Element decorated = vector.getElement(el.index());
        return new DecoratorElement(decorated);
      }
    }
    return endOfData();
  }

}


In particular, the problem lies in the line

        Element decorated = vector.getElement(el.index());

This creates yet another element which is AbstractVector.LocalElement
instead of iterator's element which would not cause iterator state
invalidation during assignment.

The question is why it tries to create yet another element instead of
decorating iterator's element itself in the Vector View???

I would just replace this line with simply

Element decorated = el

but i guess it might break something? what it is it might break?

Re: Broken non-zero iterator in VectorView?

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
tests happily pass for everything, except i did not try the legacy

really see no reason for special contracts on views. Again, we don't
guarantee non-reusability of elements in iterators outside the scope
of the closure they are passed in.

On Mon, Mar 2, 2015 at 4:33 PM, Pat Ferrel <pa...@occamsmachete.com> wrote:
> I vaguely remember the NonZeroIterator being optimized just as we were switching to Scala. Something Sebastian was working on but no idea if it was related to this.
>
>
> On Mar 2, 2015, at 3:51 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>
> actually the test error is something else but i think vector view
> iterator implementation is still wrong. I will scan if that produces
> any more errors elsewhere.
>
> On Mon, Mar 2, 2015 at 3:43 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>> this test is failing after i remove non-reusable elements in views,
>> but we should be able to remove that claimi think based on general
>> vector contract i don't think view contract should be any special,
>> this would defeat inheritance concept ...
>>
>> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.306
>> sec <<< FAILURE! - in org.apache.mahout.math.MatricesTest
>> testViewDenseSparseReporting(org.apache.mahout.math.MatricesTest)
>> Time elapsed: 0.016 sec  <<< FAILURE!
>> java.lang.AssertionError: null
>>        at __randomizedtesting.SeedInfo.seed([BB9E332570D97135:8BBE9BBA065B4C3A]:0)
>>        at org.junit.Assert.fail(Assert.java:86)
>>        at org.junit.Assert.assertTrue(Assert.java:41)
>>        at org.junit.Assert.assertTrue(Assert.java:52)
>>        at org.apache.mahout.math.MatricesTest.testViewDenseSparseReporting(MatricesTest.java:69)
>>
>> Running org.apache.mahout.math.VectorBinaryAggregateCostTest
>> T
>>
>> On Mon, Mar 2, 2015 at 3:41 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>>> it looks like an attempt to eliminate reusable elements in vector
>>> view's iterators but why? Vector contract already implies element
>>> reusability inside iterators, so why special treatment inside vector
>>> views?
>>>
>>> umph.
>>>
>>> On Mon, Mar 2, 2015 at 3:35 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>>>> it looks like assigning 0s in a view of SequentialAccessSparseVector
>>>> doesn't work, as it internally using setQuick() which tirms the length
>>>> of non-zero elements (?)
>>>> which causes invalidation of the iterator state.
>>>>
>>>> in particular, this simple test fails:
>>>>
>>>> val svec = new SequentialAccessSparseVector(30)
>>>>
>>>> svec(1) = -0.5
>>>> svec(3) = 0.5
>>>> println(svec)
>>>>
>>>> svec(1 until svec.length) ::= ( _ => 0)
>>>> println(svec)
>>>>
>>>> svec.sum shouldBe 0
>>>>
>>>> This produces  output
>>>>
>>>> {1:-0.5,3:0.5}
>>>> {3:0.5}
>>>>
>>>> The reason seems to be in the way vector view handles non-zero iterator:
>>>>
>>>> public final class NonZeroIterator extends AbstractIterator<Element> {
>>>>
>>>>  private final Iterator<Element> it;
>>>>
>>>>  private NonZeroIterator() {
>>>>    it = vector.nonZeroes().iterator();
>>>>  }
>>>>
>>>>  @Override
>>>>  protected Element computeNext() {
>>>>    while (it.hasNext()) {
>>>>      Element el = it.next();
>>>>      if (isInView(el.index()) && el.get() != 0) {
>>>>        Element decorated = vector.getElement(el.index());
>>>>        return new DecoratorElement(decorated);
>>>>      }
>>>>    }
>>>>    return endOfData();
>>>>  }
>>>>
>>>> }
>>>>
>>>>
>>>> In particular, the problem lies in the line
>>>>
>>>>        Element decorated = vector.getElement(el.index());
>>>>
>>>> This creates yet another element which is AbstractVector.LocalElement
>>>> instead of iterator's element which would not cause iterator state
>>>> invalidation during assignment.
>>>>
>>>> The question is why it tries to create yet another element instead of
>>>> decorating iterator's element itself in the Vector View???
>>>>
>>>> I would just replace this line with simply
>>>>
>>>> Element decorated = el
>>>>
>>>> but i guess it might break something? what it is it might break?
>

Re: Broken non-zero iterator in VectorView?

Posted by Pat Ferrel <pa...@occamsmachete.com>.
I vaguely remember the NonZeroIterator being optimized just as we were switching to Scala. Something Sebastian was working on but no idea if it was related to this.


On Mar 2, 2015, at 3:51 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

actually the test error is something else but i think vector view
iterator implementation is still wrong. I will scan if that produces
any more errors elsewhere.

On Mon, Mar 2, 2015 at 3:43 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> this test is failing after i remove non-reusable elements in views,
> but we should be able to remove that claimi think based on general
> vector contract i don't think view contract should be any special,
> this would defeat inheritance concept ...
> 
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.306
> sec <<< FAILURE! - in org.apache.mahout.math.MatricesTest
> testViewDenseSparseReporting(org.apache.mahout.math.MatricesTest)
> Time elapsed: 0.016 sec  <<< FAILURE!
> java.lang.AssertionError: null
>        at __randomizedtesting.SeedInfo.seed([BB9E332570D97135:8BBE9BBA065B4C3A]:0)
>        at org.junit.Assert.fail(Assert.java:86)
>        at org.junit.Assert.assertTrue(Assert.java:41)
>        at org.junit.Assert.assertTrue(Assert.java:52)
>        at org.apache.mahout.math.MatricesTest.testViewDenseSparseReporting(MatricesTest.java:69)
> 
> Running org.apache.mahout.math.VectorBinaryAggregateCostTest
> T
> 
> On Mon, Mar 2, 2015 at 3:41 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>> it looks like an attempt to eliminate reusable elements in vector
>> view's iterators but why? Vector contract already implies element
>> reusability inside iterators, so why special treatment inside vector
>> views?
>> 
>> umph.
>> 
>> On Mon, Mar 2, 2015 at 3:35 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>>> it looks like assigning 0s in a view of SequentialAccessSparseVector
>>> doesn't work, as it internally using setQuick() which tirms the length
>>> of non-zero elements (?)
>>> which causes invalidation of the iterator state.
>>> 
>>> in particular, this simple test fails:
>>> 
>>> val svec = new SequentialAccessSparseVector(30)
>>> 
>>> svec(1) = -0.5
>>> svec(3) = 0.5
>>> println(svec)
>>> 
>>> svec(1 until svec.length) ::= ( _ => 0)
>>> println(svec)
>>> 
>>> svec.sum shouldBe 0
>>> 
>>> This produces  output
>>> 
>>> {1:-0.5,3:0.5}
>>> {3:0.5}
>>> 
>>> The reason seems to be in the way vector view handles non-zero iterator:
>>> 
>>> public final class NonZeroIterator extends AbstractIterator<Element> {
>>> 
>>>  private final Iterator<Element> it;
>>> 
>>>  private NonZeroIterator() {
>>>    it = vector.nonZeroes().iterator();
>>>  }
>>> 
>>>  @Override
>>>  protected Element computeNext() {
>>>    while (it.hasNext()) {
>>>      Element el = it.next();
>>>      if (isInView(el.index()) && el.get() != 0) {
>>>        Element decorated = vector.getElement(el.index());
>>>        return new DecoratorElement(decorated);
>>>      }
>>>    }
>>>    return endOfData();
>>>  }
>>> 
>>> }
>>> 
>>> 
>>> In particular, the problem lies in the line
>>> 
>>>        Element decorated = vector.getElement(el.index());
>>> 
>>> This creates yet another element which is AbstractVector.LocalElement
>>> instead of iterator's element which would not cause iterator state
>>> invalidation during assignment.
>>> 
>>> The question is why it tries to create yet another element instead of
>>> decorating iterator's element itself in the Vector View???
>>> 
>>> I would just replace this line with simply
>>> 
>>> Element decorated = el
>>> 
>>> but i guess it might break something? what it is it might break?


Re: Broken non-zero iterator in VectorView?

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
actually the test error is something else but i think vector view
iterator implementation is still wrong. I will scan if that produces
any more errors elsewhere.

On Mon, Mar 2, 2015 at 3:43 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> this test is failing after i remove non-reusable elements in views,
> but we should be able to remove that claimi think based on general
> vector contract i don't think view contract should be any special,
> this would defeat inheritance concept ...
>
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.306
> sec <<< FAILURE! - in org.apache.mahout.math.MatricesTest
> testViewDenseSparseReporting(org.apache.mahout.math.MatricesTest)
> Time elapsed: 0.016 sec  <<< FAILURE!
> java.lang.AssertionError: null
>         at __randomizedtesting.SeedInfo.seed([BB9E332570D97135:8BBE9BBA065B4C3A]:0)
>         at org.junit.Assert.fail(Assert.java:86)
>         at org.junit.Assert.assertTrue(Assert.java:41)
>         at org.junit.Assert.assertTrue(Assert.java:52)
>         at org.apache.mahout.math.MatricesTest.testViewDenseSparseReporting(MatricesTest.java:69)
>
> Running org.apache.mahout.math.VectorBinaryAggregateCostTest
> T
>
> On Mon, Mar 2, 2015 at 3:41 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>> it looks like an attempt to eliminate reusable elements in vector
>> view's iterators but why? Vector contract already implies element
>> reusability inside iterators, so why special treatment inside vector
>> views?
>>
>> umph.
>>
>> On Mon, Mar 2, 2015 at 3:35 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>>> it looks like assigning 0s in a view of SequentialAccessSparseVector
>>> doesn't work, as it internally using setQuick() which tirms the length
>>> of non-zero elements (?)
>>> which causes invalidation of the iterator state.
>>>
>>> in particular, this simple test fails:
>>>
>>> val svec = new SequentialAccessSparseVector(30)
>>>
>>> svec(1) = -0.5
>>> svec(3) = 0.5
>>> println(svec)
>>>
>>> svec(1 until svec.length) ::= ( _ => 0)
>>> println(svec)
>>>
>>> svec.sum shouldBe 0
>>>
>>> This produces  output
>>>
>>> {1:-0.5,3:0.5}
>>> {3:0.5}
>>>
>>> The reason seems to be in the way vector view handles non-zero iterator:
>>>
>>> public final class NonZeroIterator extends AbstractIterator<Element> {
>>>
>>>   private final Iterator<Element> it;
>>>
>>>   private NonZeroIterator() {
>>>     it = vector.nonZeroes().iterator();
>>>   }
>>>
>>>   @Override
>>>   protected Element computeNext() {
>>>     while (it.hasNext()) {
>>>       Element el = it.next();
>>>       if (isInView(el.index()) && el.get() != 0) {
>>>         Element decorated = vector.getElement(el.index());
>>>         return new DecoratorElement(decorated);
>>>       }
>>>     }
>>>     return endOfData();
>>>   }
>>>
>>> }
>>>
>>>
>>> In particular, the problem lies in the line
>>>
>>>         Element decorated = vector.getElement(el.index());
>>>
>>> This creates yet another element which is AbstractVector.LocalElement
>>> instead of iterator's element which would not cause iterator state
>>> invalidation during assignment.
>>>
>>> The question is why it tries to create yet another element instead of
>>> decorating iterator's element itself in the Vector View???
>>>
>>> I would just replace this line with simply
>>>
>>> Element decorated = el
>>>
>>> but i guess it might break something? what it is it might break?

Re: Broken non-zero iterator in VectorView?

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
this test is failing after i remove non-reusable elements in views,
but we should be able to remove that claimi think based on general
vector contract i don't think view contract should be any special,
this would defeat inheritance concept ...

Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.306
sec <<< FAILURE! - in org.apache.mahout.math.MatricesTest
testViewDenseSparseReporting(org.apache.mahout.math.MatricesTest)
Time elapsed: 0.016 sec  <<< FAILURE!
java.lang.AssertionError: null
        at __randomizedtesting.SeedInfo.seed([BB9E332570D97135:8BBE9BBA065B4C3A]:0)
        at org.junit.Assert.fail(Assert.java:86)
        at org.junit.Assert.assertTrue(Assert.java:41)
        at org.junit.Assert.assertTrue(Assert.java:52)
        at org.apache.mahout.math.MatricesTest.testViewDenseSparseReporting(MatricesTest.java:69)

Running org.apache.mahout.math.VectorBinaryAggregateCostTest
T

On Mon, Mar 2, 2015 at 3:41 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> it looks like an attempt to eliminate reusable elements in vector
> view's iterators but why? Vector contract already implies element
> reusability inside iterators, so why special treatment inside vector
> views?
>
> umph.
>
> On Mon, Mar 2, 2015 at 3:35 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>> it looks like assigning 0s in a view of SequentialAccessSparseVector
>> doesn't work, as it internally using setQuick() which tirms the length
>> of non-zero elements (?)
>> which causes invalidation of the iterator state.
>>
>> in particular, this simple test fails:
>>
>> val svec = new SequentialAccessSparseVector(30)
>>
>> svec(1) = -0.5
>> svec(3) = 0.5
>> println(svec)
>>
>> svec(1 until svec.length) ::= ( _ => 0)
>> println(svec)
>>
>> svec.sum shouldBe 0
>>
>> This produces  output
>>
>> {1:-0.5,3:0.5}
>> {3:0.5}
>>
>> The reason seems to be in the way vector view handles non-zero iterator:
>>
>> public final class NonZeroIterator extends AbstractIterator<Element> {
>>
>>   private final Iterator<Element> it;
>>
>>   private NonZeroIterator() {
>>     it = vector.nonZeroes().iterator();
>>   }
>>
>>   @Override
>>   protected Element computeNext() {
>>     while (it.hasNext()) {
>>       Element el = it.next();
>>       if (isInView(el.index()) && el.get() != 0) {
>>         Element decorated = vector.getElement(el.index());
>>         return new DecoratorElement(decorated);
>>       }
>>     }
>>     return endOfData();
>>   }
>>
>> }
>>
>>
>> In particular, the problem lies in the line
>>
>>         Element decorated = vector.getElement(el.index());
>>
>> This creates yet another element which is AbstractVector.LocalElement
>> instead of iterator's element which would not cause iterator state
>> invalidation during assignment.
>>
>> The question is why it tries to create yet another element instead of
>> decorating iterator's element itself in the Vector View???
>>
>> I would just replace this line with simply
>>
>> Element decorated = el
>>
>> but i guess it might break something? what it is it might break?

Re: Broken non-zero iterator in VectorView?

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
it looks like an attempt to eliminate reusable elements in vector
view's iterators but why? Vector contract already implies element
reusability inside iterators, so why special treatment inside vector
views?

umph.

On Mon, Mar 2, 2015 at 3:35 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> it looks like assigning 0s in a view of SequentialAccessSparseVector
> doesn't work, as it internally using setQuick() which tirms the length
> of non-zero elements (?)
> which causes invalidation of the iterator state.
>
> in particular, this simple test fails:
>
> val svec = new SequentialAccessSparseVector(30)
>
> svec(1) = -0.5
> svec(3) = 0.5
> println(svec)
>
> svec(1 until svec.length) ::= ( _ => 0)
> println(svec)
>
> svec.sum shouldBe 0
>
> This produces  output
>
> {1:-0.5,3:0.5}
> {3:0.5}
>
> The reason seems to be in the way vector view handles non-zero iterator:
>
> public final class NonZeroIterator extends AbstractIterator<Element> {
>
>   private final Iterator<Element> it;
>
>   private NonZeroIterator() {
>     it = vector.nonZeroes().iterator();
>   }
>
>   @Override
>   protected Element computeNext() {
>     while (it.hasNext()) {
>       Element el = it.next();
>       if (isInView(el.index()) && el.get() != 0) {
>         Element decorated = vector.getElement(el.index());
>         return new DecoratorElement(decorated);
>       }
>     }
>     return endOfData();
>   }
>
> }
>
>
> In particular, the problem lies in the line
>
>         Element decorated = vector.getElement(el.index());
>
> This creates yet another element which is AbstractVector.LocalElement
> instead of iterator's element which would not cause iterator state
> invalidation during assignment.
>
> The question is why it tries to create yet another element instead of
> decorating iterator's element itself in the Vector View???
>
> I would just replace this line with simply
>
> Element decorated = el
>
> but i guess it might break something? what it is it might break?