You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@mahout.apache.org by Lance Norskog <go...@gmail.com> on 2010/11/17 08:47:01 UTC
About arithmetic on Vectors
The Mahout Vector implementations of arithmetic have what I think is a bug.
class AbstractVector
addTo(Vector v) {
Iterator<Element> it = iterateNonZero();
while (it.hasNext()) {
Element e = it.next();
int index = e.index();
v.setQuick(index, v.getQuick(index) + e.get());
}
}
Because "this" walks only its non-zero elements, matching columns in the
other vector are ignored. That is:
[1, 2, 0, 4].addTo([1, 1, 1, 1]) = [2, 3, 0, 5]
public Vector plus(Vector) also does this at around line 371:
Vector result = like().assign(this);
Iterator<Element> iter = x.iterateNonZero();
while (iter.hasNext()) {
Element e = iter.next();
int index = e.index();
result.setQuick(index, this.getQuick(index) + e.get());
}
return result;
All of the Vector subclasses that store data (Dense, RandomAccess,
SequentialAccess) don't override these two methods.
The unit tests don't catch this mistake- they need a wider range of test
data. A lot of code uses these two methods, and are getting bogus
results. Because Maven is weird for me, I can't run the test suites on a
fixed version.
Lance Norskog
.
Re: About arithmetic on Vectors
Posted by Lance Norskog <go...@gmail.com>.
Argh! Yes, I get it now.
On Thu, Nov 18, 2010 at 7:16 AM, Ted Dunning <te...@gmail.com> wrote:
> Also, if you have suggested test values, we can easily add them to the
> vector test prototype.
>
> On Thu, Nov 18, 2010 at 6:21 AM, Grant Ingersoll <gs...@apache.org>wrote:
>
>> > All of the Vector subclasses that store data (Dense, RandomAccess,
>> SequentialAccess) don't override these two methods.
>> > The unit tests don't catch this mistake- they need a wider range of test
>> data. A lot of code uses these two methods, and are getting bogus results.
>> Because Maven is weird for me, I can't run the test suites on a fixed
>> version.
>>
>> What commands are you running?
>
--
Lance Norskog
goksron@gmail.com
Re: About arithmetic on Vectors
Posted by Ted Dunning <te...@gmail.com>.
Also, if you have suggested test values, we can easily add them to the
vector test prototype.
On Thu, Nov 18, 2010 at 6:21 AM, Grant Ingersoll <gs...@apache.org>wrote:
> > All of the Vector subclasses that store data (Dense, RandomAccess,
> SequentialAccess) don't override these two methods.
> > The unit tests don't catch this mistake- they need a wider range of test
> data. A lot of code uses these two methods, and are getting bogus results.
> Because Maven is weird for me, I can't run the test suites on a fixed
> version.
>
> What commands are you running?
Re: About arithmetic on Vectors
Posted by Grant Ingersoll <gs...@apache.org>.
On Nov 17, 2010, at 2:47 AM, Lance Norskog wrote:
> The Mahout Vector implementations of arithmetic have what I think is a bug.
>
> class AbstractVector
> addTo(Vector v) {
> Iterator<Element> it = iterateNonZero();
> while (it.hasNext()) {
> Element e = it.next();
> int index = e.index();
> v.setQuick(index, v.getQuick(index) + e.get());
> }
> }
>
> Because "this" walks only its non-zero elements, matching columns in the other vector are ignored. That is:
>
> [1, 2, 0, 4].addTo([1, 1, 1, 1]) = [2, 3, 0, 5]
I think the above code is correct, it is getting the non-zero values from the "this" vector and adding them to the vector passed in. In other words, it's adding this vector to the vector V, whereas I think you are implying it is the other way around.
The above should actually be:
> [1, 2, 0, 4].addTo([1, 1, 1, 1]) = [2, 3, 1, 5]
As the Javadoc says:
/** Add the elements to the other vector and results are stored in that vector. */
I added the following test:
@Test
public void testAddTo() throws Exception {
Vector v = new DenseVector(4);
Vector w = new DenseVector(4);
v.setQuick(0, 1);
v.setQuick(1, 2);
v.setQuick(2, 0);
v.setQuick(3, 4);
w.setQuick(0, 1);
w.setQuick(1, 1);
w.setQuick(2, 1);
w.setQuick(3, 1);
v.addTo(w);
Vector gold = new DenseVector(new double[]{2, 3, 1, 5});
assertTrue(w.equals(gold));
assertFalse(v.equals(gold));
}
I also clarified the docs.
>
>
> All of the Vector subclasses that store data (Dense, RandomAccess, SequentialAccess) don't override these two methods.
> The unit tests don't catch this mistake- they need a wider range of test data. A lot of code uses these two methods, and are getting bogus results. Because Maven is weird for me, I can't run the test suites on a fixed version.
What commands are you running?