Posted to user@mahout.apache.org by Nico Higgs <el...@gmail.com> on 2009/07/29 22:56:20 UTC

Taste and MySQL with GroupLens dataset

Hi Sean and everybody!

I've downloaded Mahout from SVN and followed the FAQ to try Taste with the
1M-rating GroupLens test dataset. First I tried the GroupLensRecommender
(which uses a FileDataModel) and everything went OK.

Then I decided to try the data loaded into MySQL with the SlopeOne
recommender. After inserting ratings.dat into the taste_preferences table
(1000029 rows) and the first run (three and a half hours to generate the
4922072 rows for the slope-one diffs), I tried to get a recommendation, but
I'm getting this error in the getDiffs method of AbstractJDBCDiffStorage
(with userID=1 and howMany=5):

Caused by: java.lang.ArrayIndexOutOfBoundsException: 53
    at org.apache.mahout.cf.taste.impl.recommender.slopeone.jdbc.AbstractJDBCDiffStorage.getDiffs(AbstractJDBCDiffStorage.java:175)
    at org.apache.mahout.cf.taste.impl.recommender.slopeone.SlopeOneRecommender.doEstimatePreference(SlopeOneRecommender.java:136)
    at org.apache.mahout.cf.taste.impl.recommender.slopeone.SlopeOneRecommender.access$100(SlopeOneRecommender.java:50)
    at org.apache.mahout.cf.taste.impl.recommender.slopeone.SlopeOneRecommender$Estimator.estimate(SlopeOneRecommender.java:219)
    at org.apache.mahout.cf.taste.impl.recommender.slopeone.SlopeOneRecommender$Estimator.estimate(SlopeOneRecommender.java:209)


Looking at the code, it seems that the set containing the results of the
diffSQL query holds different or more data than the user's preferences. Is
this possible?

Thanks and regards!

PS: I'm just getting started with Taste, so I don't know if I can be of much
more help. Nevertheless, I will keep investigating the problem and tell you
if I find something.

Re: Taste and MySQL with GroupLens dataset

Posted by Sean Owen <sr...@gmail.com>.
It is not your error. You should be able to use integers. That line should
call getObject() and cast to Comparable instead of assuming a String. I will
make this fix and commit soon but you can do so locally too.
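
To illustrate the point about not assuming a String, here is a minimal,
hypothetical sketch in plain Java. It is not Mahout code, and the class and
method names are invented for illustration. It assumes the Java side holds
item IDs as Long while the same ID comes back from the result set as a
String, and shows that the two neither compare equal nor sort in the same
order.

```java
import java.util.Arrays;

public class IdTypeSketch {

    // Hypothetical helper: type-sensitive equality, the kind most ID lookups rely on.
    static boolean sameId(Object a, Object b) {
        return a.equals(b);
    }

    // Returns a sorted copy of the IDs using their natural ordering.
    static <T extends Comparable<T>> T[] sortedCopy(T[] ids) {
        T[] copy = Arrays.copyOf(ids, ids.length);
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // The same key, once as a Long (assumed in-memory form) and once as
        // the String that a getString() call would return.
        System.out.println(sameId(53L, "53")); // false: a Long never equals a String
        System.out.println(sameId(53L, 53L));  // true

        // Strings sort lexicographically, numbers numerically.
        System.out.println(Arrays.toString(sortedCopy(new String[] {"9", "53", "100"}))); // [100, 53, 9]
        System.out.println(Arrays.toString(sortedCopy(new Long[] {100L, 53L, 9L})));      // [9, 53, 100]
    }
}
```

Reading the column with getObject() keeps whatever type the driver maps the
column to, so IDs read from the database can match the type used elsewhere.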

On Jul 30, 2009 12:54 AM, "Nico Higgs" <el...@gmail.com> wrote:

> Ouch! I'm stupid!
>
> Yes, I'm using integers in MySQL (long types in Java) for the IDs because I
> thought it would perform better. A "small" thing I forgot to mention
> before!
>
> That decision about the IDs forced me to make a small change in
> AbstractJDBCDataModel, in the getUser method (removing String idString =
> id.toString() at line 226 and passing the id to buildUser at the bottom),
> and in AbstractJDBCDiffStorage, where I replaced
>
> String nextResultItemID = rs.getString(3);
> with
> Comparable<?> nextResultItemID = rs.getString(3);
>
> instead of
> Comparable<?> nextResultItemID = (Comparable<?>) rs.getObject(3);
> !!!!
>
> When you mentioned the integers I found my error quickly. Now I've fixed
> the cast problem and everything is working fine again!
>
> Sorry for the mistake and for wasting your time, Sean! :(
>
> Kind regards


Re: Taste and MySQL with GroupLens dataset

Posted by Nico Higgs <el...@gmail.com>.
Ouch! I'm stupid!

Yes, I'm using integers in MySQL (long types in Java) for the IDs because I
thought it would perform better. A "small" thing I forgot to mention
before!

That decision about the IDs forced me to make a small change in
AbstractJDBCDataModel, in the getUser method (removing String idString =
id.toString() at line 226 and passing the id to buildUser at the bottom),
and in AbstractJDBCDiffStorage, where I replaced

String nextResultItemID = rs.getString(3);
with
Comparable<?> nextResultItemID = rs.getString(3);

instead of
Comparable<?> nextResultItemID = (Comparable<?>) rs.getObject(3);
!!!!

When you mentioned the integers I found my error quickly. Now I've fixed the
cast problem and everything is working fine again!

Sorry for the mistake and for wasting your time, Sean! :(

Kind regards

On Wed, Jul 29, 2009 at 7:47 PM, Sean Owen <sr...@gmail.com> wrote:

> Interesting, that is a complicated bit of code.
>
> By any chance, are you using ints as keys? That could suggest an
> explanation.
>
> I don't suppose the data might be changing underneath you here?
>

Re: Taste and MySQL with GroupLens dataset

Posted by Sean Owen <sr...@gmail.com>.
Interesting, that is a complicated bit of code.

By any chance, are you using ints as keys? That could suggest an explanation.

I don't suppose the data might be changing underneath you here?
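
As a rough illustration of why key types could matter here, the following is
a hypothetical sketch, not the actual getDiffs code. It assumes a loop that
walks the user's item IDs in step with the item IDs from a result set and
advances an array index until the two match. The names MergeWalkSketch and
countMatched are invented for this example. If the result-set IDs arrive as a
different type than the user's IDs, equals() never succeeds and the index
runs off the end of the array.

```java
import java.util.Arrays;
import java.util.List;

public class MergeWalkSketch {

    /**
     * Counts how many result IDs are matched while walking the user's item IDs.
     * Assumes every result ID also appears, in the same order, among the
     * user's item IDs. That assumption is what breaks when the types differ.
     */
    static int countMatched(Object[] userItemIDs, List<?> resultItemIDs) {
        int i = 0;
        int matched = 0;
        for (Object resultID : resultItemIDs) {
            // Skip user items that have no row in the result set.
            while (!userItemIDs[i].equals(resultID)) {
                i++; // no bounds check: overruns if resultID never matches
            }
            matched++;
            i++;
        }
        return matched;
    }

    public static void main(String[] args) {
        Object[] userItemIDs = {1L, 2L, 3L};

        // Same type on both sides: the walk finds both result IDs.
        System.out.println(countMatched(userItemIDs, Arrays.asList(1L, 3L)));

        // Result IDs read as Strings: nothing ever matches, so the index overruns.
        try {
            countMatched(userItemIDs, Arrays.asList("1", "3"));
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("index overrun: " + e.getMessage());
        }
    }
}
```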
