Posted to user@mahout.apache.org by Laya Patwa <la...@iitr.ernet.in> on 2009/07/15 11:23:58 UTC

problems with taste.

Hi!
I am a student and right now I am working on a project named CoEUD. My task
is to build a recommender system and I am using the taste recommender
library that comes with mahout.

I checked out the source from Subversion and built it using Maven, under
Cygwin on Windows and also on Mac OS. I tried the GroupLens demo that is given
in the Taste documentation and it worked. Then I tried the example for the
user-based recommender, but I am running into the following problems:
1) I have 3 different data files in CSV format, with
values of userID, itemID and preference. Recommendations are generated
for 1 data file and not for the others. There are no errors. The program
runs and generates an empty recommendation list for the other data files.
2) Another peculiar thing is that when I make a copy of the
data file for which the user-based recommender example works and use
the copy, the recommendation list generated is empty.

I am stuck on these 2 problems and can't figure out why
recommendations are not generated for all the data files. Can you please
help me out?

I am using eclipse.

Cheers,
Laya

Re: problems with taste.

Posted by Grant Ingersoll <gs...@apache.org>.
The mailing list often strips attachments.  It would be better to post  
them somewhere for download.  I'm surprised that the one attachment  
even went through.

On Jul 15, 2009, at 8:48 AM, Laya Patwa wrote:

> There is a problem in the mailing system I guess. I am sending 1  
> file in 1 mail. This has the code file.
>
> On Wed, Jul 15, 2009 at 2:15 PM, Sean Owen <sr...@gmail.com> wrote:
> There is still only one text file attached. But anyway I believe  
> Thomas has
> identified the problem.
>
> On Jul 15, 2009 1:03 PM, "Laya Patwa" <la...@iitr.ernet.in> wrote:
>
> You did not get the code? But I sent it. Anyways please find 3 files
> attached with this mail containing the code and the 2 data files.
> My apologies for the mistake.
>
> On Wed, Jul 15, 2009 at 1:57 PM, Sean Owen <sr...@gmail.com> wrote:  
> > >
> Thomas is right, you have...
>
> <recommender.txt>

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search


Re: problems with taste.

Posted by Sean Owen <sr...@gmail.com>.
I get no recommendations from your code and from either data file you
sent. (Neither ends in .csv though you reference that in the emails
and code, so I wonder if you are using different data.)

The reasons there are no recommendations are as we predicted earlier.
I confirmed this by debugging.

Why don't you take a look at the assessment I posted last, and adjust
your data accordingly, as a next step? That is your problem.

On Wed, Jul 15, 2009 at 3:30 PM, Laya Patwa<la...@iitr.ernet.in> wrote:
> OK... but i am getting recommendations for this data file!
> And i also tried the other data file after removing some of the items but
> the recommendation list is still empty. Can you please have a look at it
> that you have the code and the data now.

Re: problems with taste.

Posted by Laya Patwa <la...@iitr.ernet.in>.
The data that I sent you is just for testing purposes. The actual dataset is
much larger, but it is similar to this small sample.

On Thu, Jul 16, 2009 at 12:56 PM, Sean Owen <sr...@gmail.com> wrote:

> Yes, in one of your data sets, I noticed that all the preference
> values were "1". This indicates to me that you really don't have a
> notion of the strength of the preference between users and items.
> There is an association, or there is none. I call this, somewhat
> wrongly, a "boolean" preference.
>
> In this case, you can use faster and lighter versions of the
> components you are currently using, which are specialized for this
> situation.
>
> To try this, first use a copy of your data file which omits the final
> ",1" on every line. You don't need it.
> Instead of using PearsonCorrelationSimilarity, try
> BooleanLogLikelihoodSimilarity.
> Remove the PreferenceInferrer (these don't work so well anyway in my
> experience)
> Then use BooleanUserGenericUserBasedRecommender as your recommender
> implementation.
>
> For such a small data set, it is already extremely fast. But if you
> had a great deal more data, you would see a big difference.
>
> You may even find this approach, which ignores preference data, gives
> better results.
>
>
> On Thu, Jul 16, 2009 at 11:16 AM, Laya Patwa<la...@iitr.ernet.in>
> wrote:
> > Thank you so much guys for discussing the problem of mine. I am getting the
> > recommendations now!
> > You mentioned something about improving the performance in one of the mails.
>

Re: problems with taste.

Posted by Sean Owen <sr...@gmail.com>.
Yes, in one of your data sets, I noticed that all the preference
values were "1". This indicates to me that you really don't have a
notion of the strength of the preference between users and items.
There is an association, or there is none. I call this, somewhat
wrongly, a "boolean" preference.

In this case, you can use faster and lighter versions of the
components you are currently using, which are specialized for this
situation.

To try this, first use a copy of your data file which omits the final
",1" on every line. You don't need it.
Instead of using PearsonCorrelationSimilarity, try
BooleanLogLikelihoodSimilarity.
Remove the PreferenceInferrer (these don't work so well anyway, in my experience).
Then use BooleanUserGenericUserBasedRecommender as your recommender
implementation.

For such a small data set, it is already extremely fast. But if you
had a great deal more data, you would see a big difference.

You may even find this approach, which ignores preference data, gives
better results.
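
The first step above (dropping the final ",1" from every line) could be sketched roughly as follows. This is only an illustration in plain Python, not part of Taste: the function names strip_preference and load_associations and the sample rows are made up for the example, and the input is assumed to be simple "userID,itemID,preference" lines without quoting.

```python
def strip_preference(line):
    """Keep only 'userID,itemID' from a 'userID,itemID,preference' line."""
    user_id, item_id = line.strip().split(",")[:2]
    return f"{user_id},{item_id}"


def load_associations(lines):
    """Group boolean data as user -> set of associated items."""
    items_by_user = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        user_id, item_id = strip_preference(line).split(",")
        items_by_user.setdefault(user_id, set()).add(item_id)
    return items_by_user


# Hypothetical rows in the "all preferences are 1" style described above.
sample = ["User1,Item1,1", "User1,Item2,1", "User2,Item2,1"]
boolean_rows = [strip_preference(row) for row in sample]
associations = load_associations(sample)
```

With the preference column gone, each user is described only by a set of items, which is the kind of input the "boolean" components mentioned above are meant for.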


On Thu, Jul 16, 2009 at 11:16 AM, Laya Patwa<la...@iitr.ernet.in> wrote:
> Thank you so much guys for discussing the problem of mine. I am getting the
> recommendations now!
> You mentioned something about improving the performance in one of the mails.

Re: problems with taste.

Posted by Laya Patwa <la...@iitr.ernet.in>.
Thank you all so much for discussing my problem. I am getting
recommendations now!
You mentioned something about improving the performance in one of the mails.

On Wed, Jul 15, 2009 at 5:36 PM, Thomas Rewig <tr...@mufin.com> wrote:

> In principle your code works (I tested it), I just think your Testdata
> isn't good or so realistic because it is not much and worse distributed - if
> you delete all 0.1 Items - ... but that is only my assumption.
>
> If you need some working testdata you can use that data I used as I started
> with taste and want to know how it works:
>
> User1,Item1,5
> User1,Item2,4
> User1,Item4,2
> User2,Item2,3
> User2,Item3,2
> User3,Item2,4
> User3,Item3,3
> User3,Item4,2
> User4,Item1,1
> User4,Item2,1
> User4,Item3,1
> User4,Item4,1
>
> You can compute it (For User 1-3 there is a recommendation, for User 4
> naturaly not because all Items are set)
> and than you can calculate the stuff manually or debug it and understand
> like taste is working.
>
> Perhaps this helps you a little bit.
>
> regards
> Thomas
>
> Laya Patwa wrote:
>
>  OK... but i am getting recommendations for this data file!
>> And i also tried the other data file after removing some of the items but
>> the recommendation list is still empty. Can you please have a look at it
>> that you have the code and the data now.
>>
>> On Wed, Jul 15, 2009 at 2:52 PM, Sean Owen <sr...@gmail.com> wrote:
>>
>>
>>
>>> I have them now. Yes, again Thomas's analysis is correct about why
>>> there are no recommendations from the .txt file. See the previous
>>> messages about what to do.
>>>
>>> The other data file has a different issue. All the ratings are the
>>> same. Correlation-based similarity metrics will not work as they
>>> cannot define a similarity in such a case (correlation is undefined).
>>> It will be unable to give recommendations as a result. You need to try
>>> a different metric like TanimotoCoefficientSimilarity.
>>>
>>> If that works well, there is a way to make this a lot faster. We can
>>> discuss it next.
>>>
>>> On Wed, Jul 15, 2009 at 1:48 PM, Laya Patwa<la...@iitr.ernet.in>
>>> wrote:
>>>
>>>
>>>> There is a problem in the mailing system I guess. I am sending 1 file in
>>>> 1 mail. This has the code file.
>>>>
>>>> On Wed, Jul 15, 2009 at 2:15 PM, Sean Owen <sr...@gmail.com> wrote:
>>>>
>>>>
>>>>> There is still only one text file attached. But anyway I believe Thomas
>>>>> has
>>>>> identified the problem.
>>>>>
>>>>> On Jul 15, 2009 1:03 PM, "Laya Patwa" <la...@iitr.ernet.in> wrote:
>>>>>
>>>>> You did not get the code? But I sent it. Anyways please find 3 files
>>>>> attached with this mail containing the code and the 2 data files.
>>>>> My apologies for the mistake.
>>>>>
>>>>> On Wed, Jul 15, 2009 at 1:57 PM, Sean Owen <sr...@gmail.com> wrote: >
>>>>>        Thomas is right, you have...
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>

Re: problems with taste.

Posted by Thomas Rewig <tr...@mufin.com>.
In principle your code works (I tested it). I just think your test data 
isn't very good or realistic, because there is not much of it and it is 
poorly distributed - if you delete all 0.1 items - ... but that is only my 
assumption.

If you need some working test data, you can use the data I used when I 
started with Taste and wanted to understand how it works:

User1,Item1,5
User1,Item2,4
User1,Item4,2
User2,Item2,3
User2,Item3,2
User3,Item2,4
User3,Item3,3
User3,Item4,2
User4,Item1,1
User4,Item2,1
User4,Item3,1
User4,Item4,1

You can run it (for Users 1-3 there is a recommendation; for User 4 
naturally not, because all items are already rated), 
and then you can calculate the results manually or debug it to understand 
how Taste works.
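
For the manual calculation, a rough sketch in plain Python might look like the following. It is not Taste's implementation: it assumes a textbook Pearson correlation over co-rated items and a similarity-weighted average of the neighbours' ratings, and the names pearson and recommend are made up for the example. Taste's own numbers may differ in detail.

```python
from math import sqrt

# The test data listed above, as user -> {item: preference}.
RATINGS = {
    "User1": {"Item1": 5, "Item2": 4, "Item4": 2},
    "User2": {"Item2": 3, "Item3": 2},
    "User3": {"Item2": 4, "Item3": 3, "Item4": 2},
    "User4": {"Item1": 1, "Item2": 1, "Item3": 1, "Item4": 1},
}


def pearson(a, b):
    """Textbook Pearson correlation over co-rated items; None if undefined."""
    common = sorted(set(a) & set(b))
    n = len(common)
    if n < 2:
        return None  # a single shared item gives no correlation
    xs = [a[i] for i in common]
    ys = [b[i] for i in common]
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None  # constant ratings: zero variance, correlation undefined
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def recommend(user, ratings):
    """Estimate a preference for every item the user has not rated yet."""
    own = ratings[user]
    all_items = {item for prefs in ratings.values() for item in prefs}
    estimates = {}
    for item in sorted(all_items - set(own)):
        num = den = 0.0
        for other, prefs in ratings.items():
            if other == user or item not in prefs:
                continue
            sim = pearson(own, prefs)
            if sim is None or sim <= 0:
                continue  # skip neighbours with no usable similarity
            num += sim * prefs[item]
            den += sim
        if den > 0:
            estimates[item] = num / den
    return estimates


for user in RATINGS:
    print(user, recommend(user, RATINGS))
```

Under these assumptions User 4 has no unrated items, so there are no candidates at all, and User 4's constant ratings also make every correlation involving User 4 undefined.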

Perhaps this helps you a little bit.

regards
Thomas

Laya Patwa wrote:
> OK... but i am getting recommendations for this data file!
> And i also tried the other data file after removing some of the items but
> the recommendation list is still empty. Can you please have a look at it
> that you have the code and the data now.
>
> On Wed, Jul 15, 2009 at 2:52 PM, Sean Owen <sr...@gmail.com> wrote:
>
>   
>> I have them now. Yes, again Thomas's analysis is correct about why
>> there are no recommendations from the .txt file. See the previous
>> messages about what to do.
>>
>> The other data file has a different issue. All the ratings are the
>> same. Correlation-based similarity metrics will not work as they
>> cannot define a similarity in such a case (correlation is undefined).
>> It will be unable to give recommendations as a result. You need to try
>> a different metric like TanimotoCoefficientSimilarity.
>>
>> If that works well, there is a way to make this a lot faster. We can
>> discuss it next.
>>
>> On Wed, Jul 15, 2009 at 1:48 PM, Laya Patwa<la...@iitr.ernet.in> wrote:
>>     
>>> There is a problem in the mailing system I guess. I am sending 1 file in
>>> 1 mail. This has the code file.
>>>
>>> On Wed, Jul 15, 2009 at 2:15 PM, Sean Owen <sr...@gmail.com> wrote:
>>>       
>>>> There is still only one text file attached. But anyway I believe Thomas
>>>> has
>>>> identified the problem.
>>>>
>>>> On Jul 15, 2009 1:03 PM, "Laya Patwa" <la...@iitr.ernet.in> wrote:
>>>>
>>>> You did not get the code? But I sent it. Anyways please find 3 files
>>>> attached with this mail containing the code and the 2 data files.
>>>> My apologies for the mistake.
>>>>
>>>> On Wed, Jul 15, 2009 at 1:57 PM, Sean Owen <sr...@gmail.com> wrote: >
>>>>         
>>>> Thomas is right, you have...
>>>>         
>>>       
>
>   

Re: problems with taste.

Posted by Laya Patwa <la...@iitr.ernet.in>.
OK... but I am getting recommendations for this data file!
I also tried the other data file after removing some of the items, but
the recommendation list is still empty. Can you please have a look at it,
now that you have the code and the data?

On Wed, Jul 15, 2009 at 2:52 PM, Sean Owen <sr...@gmail.com> wrote:

> I have them now. Yes, again Thomas's analysis is correct about why
> there are no recommendations from the .txt file. See the previous
> messages about what to do.
>
> The other data file has a different issue. All the ratings are the
> same. Correlation-based similarity metrics will not work as they
> cannot define a similarity in such a case (correlation is undefined).
> It will be unable to give recommendations as a result. You need to try
> a different metric like TanimotoCoefficientSimilarity.
>
> If that works well, there is a way to make this a lot faster. We can
> discuss it next.
>
> On Wed, Jul 15, 2009 at 1:48 PM, Laya Patwa<la...@iitr.ernet.in> wrote:
> > There is a problem in the mailing system I guess. I am sending 1 file in 1
> > mail. This has the code file.
> >
> > On Wed, Jul 15, 2009 at 2:15 PM, Sean Owen <sr...@gmail.com> wrote:
> >>
> >> There is still only one text file attached. But anyway I believe Thomas
> >> has
> >> identified the problem.
> >>
> >> On Jul 15, 2009 1:03 PM, "Laya Patwa" <la...@iitr.ernet.in> wrote:
> >>
> >> You did not get the code? But I sent it. Anyways please find 3 files
> >> attached with this mail containing the code and the 2 data files.
> >> My apologies for the mistake.
> >>
> >> On Wed, Jul 15, 2009 at 1:57 PM, Sean Owen <sr...@gmail.com> wrote: >
> >
> >> Thomas is right, you have...
> >
> >
>

Re: problems with taste.

Posted by Sean Owen <sr...@gmail.com>.
I have them now. Yes, again Thomas's analysis is correct about why
there are no recommendations from the .txt file. See the previous
messages about what to do.

The other data file has a different issue. All the ratings are the
same. Correlation-based similarity metrics will not work as they
cannot define a similarity in such a case (correlation is undefined).
It will be unable to give recommendations as a result. You need to try
a different metric like TanimotoCoefficientSimilarity.
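
To make the point concrete: Pearson's r divides the covariance by the product of the two standard deviations, so when all of a user's ratings are the same the denominator is zero and r is 0/0. The Tanimoto coefficient only looks at which items two users share. The sketch below is plain Python written for illustration, not Taste code; the function names and the colour data are assumptions.

```python
from statistics import pvariance


def correlation_is_defined(xs, ys):
    """Pearson's r needs at least two points and nonzero variance on both sides."""
    return len(xs) >= 2 and pvariance(xs) > 0 and pvariance(ys) > 0


def tanimoto(items_a, items_b):
    """Tanimoto (Jaccard) coefficient: shared items / all items of either user."""
    union = items_a | items_b
    return len(items_a & items_b) / len(union) if union else 0.0


# Hypothetical users whose preference values are all 1.
ratings_a = [1, 1, 1]
ratings_b = [1, 1, 1]
items_a = {"red", "blue", "green"}
items_b = {"blue", "green", "black"}

print(correlation_is_defined(ratings_a, ratings_b))  # False: zero variance
print(tanimoto(items_a, items_b))                    # 2 shared / 4 total = 0.5
```

The set-based measure still distinguishes users by what they have in common, even though the rating values themselves carry no information.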

If that works well, there is a way to make this a lot faster. We can
discuss it next.

On Wed, Jul 15, 2009 at 1:48 PM, Laya Patwa<la...@iitr.ernet.in> wrote:
> There is a problem in the mailing system I guess. I am sending 1 file in 1
> mail. This has the code file.
>
> On Wed, Jul 15, 2009 at 2:15 PM, Sean Owen <sr...@gmail.com> wrote:
>>
>> There is still only one text file attached. But anyway I believe Thomas
>> has
>> identified the problem.
>>
>> On Jul 15, 2009 1:03 PM, "Laya Patwa" <la...@iitr.ernet.in> wrote:
>>
>> You did not get the code? But I sent it. Anyways please find 3 files
>> attached with this mail containing the code and the 2 data files.
>> My apologies for the mistake.
>>
>> On Wed, Jul 15, 2009 at 1:57 PM, Sean Owen <sr...@gmail.com> wrote: > >
>> Thomas is right, you have...
>
>

Re: problems with taste.

Posted by Laya Patwa <la...@iitr.ernet.in>.
The data file.

On Wed, Jul 15, 2009 at 2:48 PM, Laya Patwa <la...@iitr.ernet.in> wrote:

> There is a problem in the mailing system I guess. I am sending 1 file in 1
> mail. This has the code file.
>
>
> On Wed, Jul 15, 2009 at 2:15 PM, Sean Owen <sr...@gmail.com> wrote:
>
>> There is still only one text file attached. But anyway I believe Thomas
>> has
>> identified the problem.
>>
>> On Jul 15, 2009 1:03 PM, "Laya Patwa" <la...@iitr.ernet.in> wrote:
>>
>> You did not get the code? But I sent it. Anyways please find 3 files
>> attached with this mail containing the code and the 2 data files.
>> My apologies for the mistake.
>>
>> On Wed, Jul 15, 2009 at 1:57 PM, Sean Owen <sr...@gmail.com> wrote: > >
>> Thomas is right, you have...
>>
>
>

Re: problems with taste.

Posted by Laya Patwa <la...@iitr.ernet.in>.
There is a problem in the mailing system I guess. I am sending 1 file in 1
mail. This has the code file.

On Wed, Jul 15, 2009 at 2:15 PM, Sean Owen <sr...@gmail.com> wrote:

> There is still only one text file attached. But anyway I believe Thomas has
> identified the problem.
>
> On Jul 15, 2009 1:03 PM, "Laya Patwa" <la...@iitr.ernet.in> wrote:
>
> You did not get the code? But I sent it. Anyways please find 3 files
> attached with this mail containing the code and the 2 data files.
> My apologies for the mistake.
>
> On Wed, Jul 15, 2009 at 1:57 PM, Sean Owen <sr...@gmail.com> wrote: > >
> Thomas is right, you have...
>

Re: problems with taste.

Posted by Sean Owen <sr...@gmail.com>.
There is still only one text file attached. But anyway I believe Thomas has
identified the problem.

On Jul 15, 2009 1:03 PM, "Laya Patwa" <la...@iitr.ernet.in> wrote:

You did not get the code? But I sent it. Anyways please find 3 files
attached with this mail containing the code and the 2 data files.
My apologies for the mistake.

On Wed, Jul 15, 2009 at 1:57 PM, Sean Owen <sr...@gmail.com> wrote: > >
Thomas is right, you have...

Re: problems with taste.

Posted by Laya Patwa <la...@iitr.ernet.in>.
You did not get the code? But I sent it. Anyways please find 3 files
attached with this mail containing the code and the 2 data files.
My apologies for the mistake.

On Wed, Jul 15, 2009 at 1:57 PM, Sean Owen <sr...@gmail.com> wrote:

> Thomas is right, you have all users expressing a preference for all items.
> 0 does not mean 'no preference' (but you have 0.1 in the file anyhow). So
> nothing new can be recommended to anyone. What behavior are you
> anticipating here?
>
> I don't have code or the other file you mention, perhaps that clarifies
> things.
>
> On Jul 15, 2009 12:52 PM, "Laya Patwa" <la...@iitr.ernet.in> wrote:
>
> All users do not own all colors! Because the preferences are different in
> case of different user. It is either 1 or 0.
>
> On Wed, Jul 15, 2009 at 1:34 PM, Thomas Rewig <tr...@mufin.com> wrote: >
> I
> just take a short look...
>

Re: problems with taste.

Posted by Sean Owen <sr...@gmail.com>.
Thomas is right, you have all users expressing a preference for all items. 0
does not mean 'no preference' (but you have 0.1 in the file anyhow). So
nothing new can be recommended to anyone. What behavior are you anticipating
here?
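
A small sketch of why this yields nothing, in plain Python rather than Taste code: a recommender can only suggest items the user has no entry for, and a line with 0 or 0.1 is still an entry. The helper candidate_items and the colour data are made up for the illustration.

```python
def candidate_items(prefs_by_user, user):
    """Items some user has an entry for, minus those this user already has."""
    all_items = {item for prefs in prefs_by_user.values() for item in prefs}
    return all_items - set(prefs_by_user[user])


# Every user has a line for every colour; low values are still preferences.
dense = {
    "alice": {"red": 1, "blue": 0.1, "green": 1},
    "bob": {"red": 0.1, "blue": 1, "green": 1},
}

# Same data with the 0.1 lines left out of the file altogether.
sparse = {
    "alice": {"red": 1, "green": 1},
    "bob": {"blue": 1, "green": 1},
}

print(candidate_items(dense, "alice"))   # set(): nothing left to recommend
print(candidate_items(sparse, "alice"))  # {'blue'}
```

To express "no preference", the line has to be absent from the data file rather than present with a low value.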

I don't have the code or the other file you mention; perhaps that would
clarify things.

On Jul 15, 2009 12:52 PM, "Laya Patwa" <la...@iitr.ernet.in> wrote:

All users do not own all colors! Because the preferences are different in
case of different user. It is either 1 or 0.

On Wed, Jul 15, 2009 at 1:34 PM, Thomas Rewig <tr...@mufin.com> wrote: > I
just take a short look...

Re: problems with taste.

Posted by Laya Patwa <la...@iitr.ernet.in>.
Not all users own all colors! The preferences differ from user to user:
each one is either 1 or 0.

On Wed, Jul 15, 2009 at 1:34 PM, Thomas Rewig <tr...@mufin.com> wrote:

> I just take a short look at the file. Maybe thats because all Users own all
> Collors - so there is nothing to recommend. Delete some Item for the User
> the recommendation is made for and you should get some recommendations.
>
> Laya Patwa wrote:
>
>  Did you get any recommendations?
>>
>> The preferences are different for each user in tdata.txt. In testdata.csv
>> the preferences are all 1, but the items are different for each user. I
>> think the pearson correlation is working, because the nearest n user list
>> is
>> working and the code generates the nearest n user list.
>>
>>
>>
>> On Wed, Jul 15, 2009 at 12:16 PM, Sean Owen <sr...@gmail.com> wrote:
>>
>>
>>
>>> OK, hmm well it was kind of a long shot anyway. It should have worked
>>> even
>>> so.
>>>
>>> I see one data file attached. There are some blank lines at the top,
>>> though I don't think that will matter.
>>>
>>> You are putting quotes around the names. That means the item IDs have
>>> quotes in their names, which is not what I think you intend. For
>>> example, you do not have an item named 'red', you have an item named
>>> '"red"' in your model. If you are looking for recommendations that
>>> include the item 'red' you will not find any. But somehow I suspect
>>> this is not the problem you are talking about.
>>>
>>> What algorithm are you using -- one involving a correlation-based
>>> similarity metric like Pearson? I ask because most of your ratings
>>> have exactly the same rating, which will cause correlations to be
>>> undefined in some cases. You also have relatively little data. It may
>>> be that there are simply few or no defined similarities between users
>>> in the model and so no recommendations can be made.
>>>
>>> Add more, or more realistic, data and you should see better results
>>> perhaps.
>>>
>>>
>>> But I still then can't explain why two copies of the same file give
>>> different results. I might have to see the code.
>>> Yes do not send anything confidential.
>>>
>>>
>>> On Wed, Jul 15, 2009 at 11:10 AM, Laya Patwa<la...@iitr.ernet.in>
>>> wrote:
>>>
>>>
>>>> Hey!
>>>> I followed your instructions. It doesn't work even when I put each file
>>>> in separate directory.
>>>> Maybe you should have a look at the code and the data files. I am
>>>> attaching 2 of the data files. I need to get permission for the 3rd one( it
>>>> is also a bit larger ). The code is almost the same as given in the
>>>> documentation. It is giving recommendations for the file testdata.csv.
>>>> Cheers,
>>>> Laya
>>>>
>>>> On Wed, Jul 15, 2009 at 11:43 AM, Sean Owen <sr...@gmail.com> wrote:
>>>>
>>>>
>>>>> Hmm, I might have guessed there is some file encoding issues, related
>>>>> to line breaks, since you say copying the file "breaks" it. But that
>>>>> would explain, I think, why a copy would *work* rather than fail.
>>>>>
>>>>> One thing to be careful of is that FileDataModel tries to be clever
>>>>> and allow you to post incremental updates to the data file by placing
>>>>> similarly-named files in the same directory. How have you named your
>>>>> files? To rule this out, put the files in separate directories, just
>>>>> to make sure.
>>>>>
>>>>> Otherwise perhaps you can send me a sample of the data file or a
>>>>> sample of your code to see what is going on.
>>>>>
>>>>> On Wed, Jul 15, 2009 at 10:23 AM, Laya Patwa<la...@iitr.ernet.in>
>>>>> wrote:
>>>>>
>>>>>
>>>>>> Hi!
>>>>>> I am a student and right now I am working on a project named CoEUD. My
>>>>>> task
>>>>>> is to build a recommender system and I am using the taste recommender
>>>>>> library that comes with mahout.
>>>>>>
>>>>>> I downloaded the subversion and installed it using maven and cygwin on
>>>>>> windows and also on MacOS. I tried the grouplens demo that is given in
>>>>>> the
>>>>>> taste documentation and it also worked. Then I tried the example for
>>>>>> user
>>>>>> based recommender, but it is giving some problems as follows:
>>>>>> 1) I have 3 different data files with me and they are in CSV format
>>>>>> having
>>>>>> values of userID, itemID and preference. The recommendations are
>>>>>> generated
>>>>>> for 1 data file and not for the others. There are no errors. The program
>>>>>> runs and generates an empty recommendation list for other data files
>>>>>> 2)Another peculiar thing that is happening is that when I make a copy of
>>>>>> the
>>>>>> data file for which the user based recommender example is working and
>>>>>> use
>>>>>> this data file, the recommendation list generated is empty.
>>>>>>
>>>>>> I am stuck with the above 2 problems and can't figure out why the
>>>>>> recommendations are not generated for all the data files. Can you please
>>>>>> help me out.
>>>>>>
>>>>>> I am using eclipse.
>>>>>>
>>>>>> Cheers,
>>>>>> Laya
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>

Re: problems with taste.

Posted by Thomas Rewig <tr...@mufin.com>.
I just took a quick look at the file. Maybe that's because all users own 
all colors, so there is nothing to recommend. Delete some items for the 
user the recommendation is made for and you should get some 
recommendations.

Laya Patwa wrote:
> Did you get any recommendations?
>
> The preferences are different for each user in tdata.txt. In testdata.csv
> the preferences are all 1, but the items are different for each user. I
> think the pearson correlation is working, because the nearest n user list is
> working and the code generates the nearest n user list.
>
>
>
> On Wed, Jul 15, 2009 at 12:16 PM, Sean Owen <sr...@gmail.com> wrote:
>
>   
>> OK, hmm well it was kind of a long shot anyway. It should have worked even
>> so.
>>
>> I see one data file attached. There are some blank lines at the top,
>> though I don't think that will matter.
>>
>> You are putting quotes around the names. That means the item IDs have
>> quotes in their names, which is not what I think you intend. For
>> example, you do not have an item named 'red', you have an item named
>> '"red"' in your model. If you are looking for recommendations that
>> include the item 'red' you will not find any. But somehow I suspect
>> this is not the problem you are talking about.
>>
>> What algorithm are you using -- one involving a correlation-based
>> similarity metric like Pearson? I ask because most of your ratings
>> have exactly the same rating, which will cause correlations to be
>> undefined in some cases. You also have relatively little data. It may
>> be that there are simply few or no defined similarities between users
>> in the model and so no recommendations can be made.
>>
>> Add more, or more realistic, data and you should see better results
>> perhaps.
>>
>>
>> But I still then can't explain why two copies of the same file give
>> different results. I might have to see the code.
>> Yes do not send anything confidential.
>>
>>
>> On Wed, Jul 15, 2009 at 11:10 AM, Laya Patwa<la...@iitr.ernet.in>
>> wrote:
>>     
>>> Hey!
>>> I followed your instructions. It doesn't work even when I put each file
>>> in separate directory.
>>> Maybe you should have a look at the code and the data files. I am
>>> attaching 2 of the data files. I need to get permission for the 3rd one( it
>>> is also a bit larger ). The code is almost the same as given in the
>>> documentation. It is giving recommendations for the file testdata.csv.
>>> Cheers,
>>> Laya
>>>
>>> On Wed, Jul 15, 2009 at 11:43 AM, Sean Owen <sr...@gmail.com> wrote:
>>>       
>>>> Hmm, I might have guessed there is some file encoding issues, related
>>>> to line breaks, since you say copying the file "breaks" it. But that
>>>> would explain, I think, why a copy would *work* rather than fail.
>>>>
>>>> One thing to be careful of is that FileDataModel tries to be clever
>>>> and allow you to post incremental updates to the data file by placing
>>>> similarly-named files in the same directory. How have you named your
>>>> files? To rule this out, put the files in separate directories, just
>>>> to make sure.
>>>>
>>>> Otherwise perhaps you can send me a sample of the data file or a
>>>> sample of your code to see what is going on.
>>>>
>>>> On Wed, Jul 15, 2009 at 10:23 AM, Laya Patwa<la...@iitr.ernet.in>
>>>> wrote:
>>>>         
>>>>> Hi!
>>>>> I am a student and right now I am working on a project named CoEUD. My
>>>>> task
>>>>> is to build a recommender system and I am using the taste recommender
>>>>> library that comes with mahout.
>>>>>
>>>>> I downloaded the subversion and installed it using maven and cygwin on
>>>>> windows and also on MacOS. I tried the grouplens demo that is given in
>>>>> the
>>>>> taste documentation and it also worked. Then I tried the example for
>>>>> user
>>>>> based recommender, but it is giving some problems as follows:
>>>>> 1) I have 3 different data files with me and they are in CSV format
>>>>> having
>>>>> values of userID, itemID and preference. The recommendations are
>>>>> generated
>>>>> for 1 data file and not for the others. There are no errors. The program
>>>>> runs and generates an empty recommendation list for other data files
>>>>> 2)Another peculiar thing that is happening is that when I make a copy of
>>>>> the
>>>>> data file for which the user based recommender example is working and
>>>>> use
>>>>> this data file, the recommendation list generated is empty.
>>>>>
>>>>> I am stuck with the above 2 problems and can't figure out why the
>>>>> recommendations are not generated for all the data files. Can you please
>>>>> help me out.
>>>>>
>>>>> I am using eclipse.
>>>>>
>>>>> Cheers,
>>>>> Laya
>>>>>
>>>>>           
>>>       
>
>   

Re: problems with taste.

Posted by Laya Patwa <la...@iitr.ernet.in>.
Did you get any recommendations?

The preferences are different for each user in tdata.txt. In testdata.csv
the preferences are all 1, but the items are different for each user. I
think the Pearson correlation is working, because the code does generate
the nearest-N user list.



On Wed, Jul 15, 2009 at 12:16 PM, Sean Owen <sr...@gmail.com> wrote:

> OK, hmm well it was kind of a long shot anyway. It should have worked even
> so.
>
> I see one data file attached. There are some blank lines at the top,
> though I don't think that will matter.
>
> You are putting quotes around the names. That means the item IDs have
> quotes in their names, which is not what I think you intend. For
> example, you do not have an item named 'red', you have an item named
> '"red"' in your model. If you are looking for recommendations that
> include the item 'red' you will not find any. But somehow I suspect
> this is not the problem you are talking about.
>
> What algorithm are you using -- one involving a correlation-based
> similarity metric like Pearson? I ask because most of your ratings
> have exactly the same rating, which will cause correlations to be
> undefined in some cases. You also have relatively little data. It may
> be that there are simply few or no defined similarities between users
> in the model and so no recommendations can be made.
>
> Add more, or more realistic, data and you should see better results
> perhaps.
>
>
> But I still then can't explain why two copies of the same file give
> different results. I might have to see the code.
> Yes do not send anything confidential.
>
>
> On Wed, Jul 15, 2009 at 11:10 AM, Laya Patwa<la...@iitr.ernet.in>
> wrote:
> > Hey!
> > I followed your instructions. It doesn't work even when I put each file
> > in separate directory.
> > Maybe you should have a look at the code and the data files. I am
> > attaching 2 of the data files. I need to get permission for the 3rd one( it
> > is also a bit larger ). The code is almost the same as given in the
> > documentation. It is giving recommendations for the file testdata.csv.
> > Cheers,
> > Laya

Re: problems with taste.

Posted by Sean Owen <sr...@gmail.com>.
OK, hmm well it was kind of a long shot anyway. It should have worked even so.

I see one data file attached. There are some blank lines at the top,
though I don't think that will matter.

You are putting quotes around the names. That means the item IDs have
quotes in their names, which is not what I think you intend. For
example, you do not have an item named 'red', you have an item named
'"red"' in your model. If you are looking for recommendations that
include the item 'red' you will not find any. But somehow I suspect
this is not the problem you are talking about.
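
To illustrate the quoting point, here is a minimal sketch in plain Java. It is
not Mahout code: the class name, the helper methods and the sample line are all
made up for illustration, and it assumes fields never contain embedded commas.
It shows that a naive comma split keeps the quote characters as part of the ID,
and one possible way to strip them before the data reaches the model.

```java
class QuotedIdDemo {
    // Hypothetical helper: removes one pair of surrounding double quotes, if present.
    static String stripQuotes(String field) {
        String f = field.trim();
        if (f.length() >= 2 && f.startsWith("\"") && f.endsWith("\"")) {
            return f.substring(1, f.length() - 1);
        }
        return f;
    }

    // Splits a simple "userID,itemID,preference" line on commas and unquotes each field.
    // Assumption: no field contains an embedded comma.
    static String[] parseLine(String line) {
        String[] parts = line.split(",");
        for (int i = 0; i < parts.length; i++) {
            parts[i] = stripQuotes(parts[i]);
        }
        return parts;
    }

    public static void main(String[] args) {
        String line = "\"alice\",\"red\",5";  // made-up sample line
        String rawItem = line.split(",")[1];  // still carries the quote characters
        System.out.println(rawItem.equals("red"));             // false
        System.out.println(parseLine(line)[1].equals("red"));  // true
    }
}
```

Cleaning the file once, before loading it, is probably simpler than handling
quoted IDs everywhere else.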

What algorithm are you using -- one involving a correlation-based
similarity metric like Pearson? I ask because most of your preferences
have exactly the same value, which will cause correlations to be
undefined in some cases: a user whose ratings are all identical has
zero variance, so the correlation's denominator is zero. You also have
relatively little data. It may be that there are simply few or no
defined similarities between users in the model and so no
recommendations can be made.

Add more, or more realistic, data and you should see better results perhaps.
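
As a rough illustration of the undefined-correlation case, the sketch below
computes the textbook Pearson correlation in plain Java. This is only the
standard formula, not necessarily how the library computes it, and the class
name and sample ratings are invented. When one user's ratings are all the same,
the result is 0/0, which Java reports as NaN.

```java
class PearsonDemo {
    // Textbook Pearson correlation over two equal-length rating arrays.
    // Returns NaN when either array has zero variance (the division is 0/0).
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sumXY = 0, sumXX = 0, sumYY = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sumXY += dx * dy;
            sumXX += dx * dx;
            sumYY += dy * dy;
        }
        return sumXY / Math.sqrt(sumXX * sumYY);
    }

    public static void main(String[] args) {
        double[] allSame = {5, 5, 5};  // every rating identical: zero variance
        double[] varied = {1, 3, 5};
        System.out.println(pearson(allSame, varied));                   // NaN
        System.out.println(pearson(varied, new double[] {2, 4, 6}));    // 1.0
    }
}
```

A user pair with an undefined similarity cannot contribute to a neighborhood,
which is one way a recommender can end up with nothing to recommend.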


But then I still can't explain why two copies of the same file give
different results. I might have to see the code.
And yes, do not send anything confidential.


On Wed, Jul 15, 2009 at 11:10 AM, Laya Patwa<la...@iitr.ernet.in> wrote:
> Hey!
> I followed your instructions. It doesn't work even when I put each file in
> separate directory.
> Maybe you should have a look at the code and the data files. I am attaching
> 2 of the data files. I need to get permission for the 3rd one( it is also a
> bit larger ). The code is almost the same as given in the documentation. It
> is giving recommendations for the file testdata.csv.
> Cheers,
> Laya

Re: problems with taste.

Posted by Laya Patwa <la...@iitr.ernet.in>.
Hey!
I followed your instructions. It doesn't work even when I put each file in
a separate directory.
Maybe you should have a look at the code and the data files. I am attaching
2 of the data files. I need to get permission for the 3rd one (it is also a
bit larger). The code is almost the same as given in the documentation. It
gives recommendations for the file testdata.csv.

Cheers,
Laya

On Wed, Jul 15, 2009 at 11:43 AM, Sean Owen <sr...@gmail.com> wrote:

> Hmm, I might have guessed there is some file encoding issues, related
> to line breaks, since you say copying the file "breaks" it. But that
> would explain, I think, why a copy would *work* rather than fail.
>
> One thing to be careful of is that FileDataModel tries to be clever
> and allow you to post incremental updates to the data file by placing
> similarly-named files in the same directory. How have you named your
> files? To rule this out, put the files in separate directories, just
> to make sure.
>
> Otherwise perhaps you can send me a sample of the data file or a
> sample of your code to see what is going on.

Re: problems with taste.

Posted by Sean Owen <sr...@gmail.com>.
Hmm, I might have guessed there are some file encoding issues, related
to line breaks, since you say copying the file "breaks" it. But that
would explain, I think, why a copy would *work* rather than fail.
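
One way to test the line-break theory is to check whether the copy is really
byte-for-byte identical to the original and how many carriage returns each
file contains. The following is a small sketch in plain Java; the class and
method names are made up, and the two file paths are expected as command-line
arguments.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

class FileCopyCheck {
    // True when both files contain exactly the same bytes.
    static boolean sameBytes(Path a, Path b) throws IOException {
        return Arrays.equals(Files.readAllBytes(a), Files.readAllBytes(b));
    }

    // Counts '\r' bytes; a non-zero count suggests Windows (CRLF) or old Mac (CR) line endings.
    static int countCarriageReturns(byte[] data) {
        int count = 0;
        for (byte b : data) {
            if (b == '\r') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: FileCopyCheck <original> <copy>");
            return;
        }
        Path original = Paths.get(args[0]);
        Path copy = Paths.get(args[1]);
        System.out.println("identical bytes: " + sameBytes(original, copy));
        System.out.println("CRs in original: " + countCarriageReturns(Files.readAllBytes(original)));
        System.out.println("CRs in copy:     " + countCarriageReturns(Files.readAllBytes(copy)));
    }
}
```

If the two files turn out to be byte-identical, the difference in behavior has
to come from somewhere other than the file contents, such as the file name or
its location.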

One thing to be careful of is that FileDataModel tries to be clever
and allow you to post incremental updates to the data file by placing
similarly-named files in the same directory. How have you named your
files? To rule this out, put the files in separate directories, just
to make sure.
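
A quick way to do that, sketched in plain Java with made-up class and method
names: copy the data file into a freshly created directory that contains
nothing else, then point the data model at the copy. The "taste-data-" prefix
is just an arbitrary choice for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class IsolateDataFile {
    // Copies dataFile into a newly created, otherwise empty directory
    // and returns the path of the copy.
    static Path isolate(Path dataFile) throws IOException {
        Path dir = Files.createTempDirectory("taste-data-");
        return Files.copy(dataFile, dir.resolve(dataFile.getFileName()));
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: IsolateDataFile <data file>");
            return;
        }
        Path isolated = isolate(Paths.get(args[0]));
        System.out.println("Load the model from: " + isolated);
    }
}
```

With only one file in the directory, no similarly-named neighbor can be picked
up as an update.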

Otherwise perhaps you can send me a sample of the data file or a
sample of your code to see what is going on.

On Wed, Jul 15, 2009 at 10:23 AM, Laya Patwa<la...@iitr.ernet.in> wrote:
> Hi!
> I am a student and right now I am working on a project named CoEUD. My task
> is to build a recommender system and I am using the taste recommender
> library that comes with mahout.
>
> I downloaded the subversion and installed it using maven and cygwin on
> windows and also on MacOS. I tried the grouplens demo that is given in the
> taste documentation and it also worked. Then I tried the example for user
> based recommender, but it is giving some problems as follows:
> 1) I have 3 different data files with me and they are in CSV format having
> values of userID, itemID and preference. The recommendations are generated
> for 1 data file and not for the others. There are no errors. The program
> runs and generates an empty recommendation list for other data files
> 2)Another peculiar thing that is happening is that when I make a copy of the
> data file for which the user based recommender example is working and use
> this data file, the recommendation list generated is empty.
>
> I am stuck with the above 2 problems and can't figure out why the
> recommendations are not generated for all the data files. Can you please
> help me out.
>
> I am using eclipse.
>
> Cheers,
> Laya
>