Posted to user@mahout.apache.org by 刘逸哲 <zh...@alibaba-inc.com> on 2009/12/23 12:15:25 UTC

Does the FileDataModel still require the UserID and ItemID numeric?

I saw this "Further, this ID value must be
numeric; it is a Java long type through the APIs" in the Taste documentation.
But the javadoc of FileDataModel says "The user and item IDs are ready literally as Strings and treated as such in the API".
Does this mean that user IDs and item IDs can be strings now? For example, my data looks like this:
"Auser,apple,1
User2,banana,2
User3,car,3"
Can I use FileDataModel directly?
Thanks



This email (including any attachments) is confidential and may be legally privileged. If you received this email in error, please delete it immediately and do not copy it or use it for any purpose or disclose its contents to any other person. Thank you.

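This email (including any attachments) may contain confidential information and is protected by law. If you are not the intended recipient, please delete this email immediately. Please do not copy this email, use it for any other purpose, or disclose its contents. Thank you.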

Re: Re: Does the FileDataModel still require the UserID and ItemID numeric?

Posted by 刘逸哲 <zh...@alibaba-inc.com>.
Thanks^_^

-----Original Message-----
From: Sean Owen [mailto:srowen@gmail.com]
Sent: 2009-12-23 19:42
To: mahout-user@lucene.apache.org
Subject: Re: Re: Does the FileDataModel still require the UserID and ItemID numeric?

org.apache.mahout.cf.taste.model.IDMigrator -- you are using Mahout
0.2? It is documented.

I just fixed the FileDataModel javadoc in Subversion.


Re: Re: Does the FileDataModel still require the UserID and ItemID numeric?

Posted by Sean Owen <sr...@gmail.com>.
org.apache.mahout.cf.taste.model.IDMigrator -- you are using Mahout
0.2? It is documented.

I just fixed the FileDataModel javadoc in Subversion.

On Wed, Dec 23, 2009 at 11:39 AM, 刘逸哲 <zh...@alibaba-inc.com> wrote:
> Thanks for your reply.
> Where is the IDMigrator class? Is there an example that shows how to use it?
> I can't find it in the Mahout javadoc.

Re: Does the FileDataModel still require the UserID and ItemID numeric?

Posted by 刘逸哲 <zh...@alibaba-inc.com>.
Thanks for your reply.
Where is the IDMigrator class? Is there an example that shows how to use it?
I can't find it in the Mahout javadoc.

-----Original Message-----
From: Sean Owen [mailto:srowen@gmail.com]
Sent: 2009-12-23 19:32
To: mahout-user@lucene.apache.org
Subject: Re: Does the FileDataModel still require the UserID and ItemID numeric?

That is probably a typo, if it hasn't been fixed already. Yes, the
user and item IDs must be numeric. You can't use "apple", no.

However look at the IDMigrator class for solutions for translating
between strings and numeric IDs if you must. This will be a lot slower
though. It's much better to be able to use numeric IDs.


Re: Does the FileDataModel still require the UserID and ItemID numeric?

Posted by Sean Owen <sr...@gmail.com>.
That is probably a typo, if it hasn't been fixed already. Yes, the
user and item IDs must be numeric. You can't use "apple", no.

However look at the IDMigrator class for solutions for translating
between strings and numeric IDs if you must. This will be a lot slower
though. It's much better to be able to use numeric IDs.

On Wed, Dec 23, 2009 at 11:15 AM, 刘逸哲 <zh...@alibaba-inc.com> wrote:
> I saw this "Further, this ID value must be
> numeric; it is a Java long type through the APIs" in the Taste documentation.
> But the javadoc of FileDataModel says "The user and item IDs are ready literally as Strings and treated as such in the API".
> Does this mean that user IDs and item IDs can be strings now? For example, my data looks like this:
> "Auser,apple,1
> User2,banana,2
> User3,car,3"
> Can I use FileDataModel directly?
> Thanks

Re: Re: Does the FileDataModel still require the UserID and ItemID numeric?

Posted by Jeff Zhang <zj...@gmail.com>.
Hi Liu,

For a small data set, I suggest you use MemoryIDMigrator for experiments.


Jeff Zhang


On Wed, Dec 23, 2009 at 3:48 AM, 刘逸哲 <zh...@alibaba-inc.com> wrote:

> Hi, Zhang,
> Could you give an example that shows how to use it?

Re: Re: Re: Does the FileDataModel still require the UserID and ItemID numeric?

Posted by 刘逸哲 <zh...@alibaba-inc.com>.
Thanks.
I think I should extend FileDataModel. And if a lot of applications have string ID data like mine, it would be better to have a default implementation that deals with this kind of data ^_^

-----Original Message-----
From: Jeff Zhang [mailto:zjffdu@gmail.com]
Sent: 2009-12-24 15:06
To: mahout-user@lucene.apache.org
Subject: Re: Re: Re: Does the FileDataModel still require the UserID and ItemID numeric?

I mean you can use the MemoryIDMigrator for experiments. The
MemoryIDMigrator stores the mapping between strings and longs in memory, which
suits experiments with small data. When you want to put it into production,
maybe you can use the MySQLJDBCIDMigrator, which stores the mapping in a
table.
I also think you need to make a small change to FileDataModel, or extend
FileDataModel: modify the load process so that it converts each string
to a long using an IDMigrator every time you load a record.


Jeff Zhang




Re: Re: Re: Does the FileDataModel still require the UserID and ItemID numeric?

Posted by Jeff Zhang <zj...@gmail.com>.
I mean you can use the MemoryIDMigrator for experiments. The
MemoryIDMigrator stores the mapping between strings and longs in memory, which
suits experiments with small data. When you want to put it into production,
maybe you can use the MySQLJDBCIDMigrator, which stores the mapping in a
table.
I also think you need to make a small change to FileDataModel, or extend
FileDataModel: modify the load process so that it converts each string
to a long using an IDMigrator every time you load a record.
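
The conversion step described above can be pictured with a minimal sketch in plain Java. It does not use or extend Mahout's FileDataModel or IDMigrator; the class StringIdFileConverter and its methods are hypothetical names made up for illustration. It assumes comma-separated "user,item,preference" lines like the sample data in the original question, and it assigns longs sequentially, which is only one possible scheme.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical preprocessing helper, independent of Mahout's classes.
final class StringIdFileConverter {
    // Separate mappings for users and items; both are assumptions of this sketch.
    private final Map<String, Long> userIds = new HashMap<>();
    private final Map<String, Long> itemIds = new HashMap<>();

    // Look up the long for a string ID, assigning 1, 2, 3, ... on first sight.
    private static long idFor(Map<String, Long> ids, String key) {
        Long existing = ids.get(key);
        if (existing == null) {
            existing = (long) ids.size() + 1;
            ids.put(key, existing);
        }
        return existing;
    }

    // Rewrite "userString,itemString,preference" lines as "userLong,itemLong,preference".
    List<String> convert(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            String[] fields = line.trim().split(",");
            if (fields.length < 3) {
                continue; // skip blank or malformed lines
            }
            long user = idFor(userIds, fields[0]);
            long item = idFor(itemIds, fields[1]);
            out.add(user + "," + item + "," + fields[2]);
        }
        return out;
    }
}
```

For the three sample lines from the original question this yields 1,1,1 then 2,2,2 then 3,3,3. The two maps would have to be kept (or stored in a table) so that recommended item longs can be turned back into strings afterwards.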


Jeff Zhang



On Wed, Dec 23, 2009 at 10:46 PM, 刘逸哲 <zh...@alibaba-inc.com> wrote:

> I mean, how does it interact with the Mahout framework?
> Do you mean that I should use one of the IDMigrator implementations to convert
> strings to longs, save the result to a data file, and open it with FileDataModel?
> Or do I need to implement a new FileDataModel that uses the IDMigrator?

Re: Re: Does the FileDataModel still require the UserID and ItemID numeric?

Posted by 刘逸哲 <zh...@alibaba-inc.com>.
I mean, how does it interact with the Mahout framework?
Do you mean that I should use one of the IDMigrator implementations to convert strings to longs, save the result to a data file, and open it with FileDataModel?
Or do I need to implement a new FileDataModel that uses the IDMigrator?

-----Original Message-----
From: Sean Owen [mailto:srowen@gmail.com]
Sent: 2009-12-23 20:03
To: mahout-user@lucene.apache.org
Subject: Re: Re: Does the FileDataModel still require the UserID and ItemID numeric?

Should be pretty self-explanatory; do you see the javadoc? It maps
strings to longs, and longs to strings for you. Strings from your app
get converted to longs for use in the recommender, and then back to
strings. You do the conversion yourself, but this class helps.


Re: Re: Does the FileDataModel still require the UserID and ItemID numeric?

Posted by Sean Owen <sr...@gmail.com>.
Should be pretty self-explanatory; do you see the javadoc? It maps
strings to longs, and longs to strings for you. Strings from your app
get converted to longs for use in the recommender, and then back to
strings. You do the conversion yourself, but this class helps.
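
To make that round trip concrete, here is a minimal sketch in plain Java. InMemoryIdMapper and its method names are hypothetical and are not Mahout's IDMigrator API (see the IDMigrator javadoc for the real interface). The sketch assigns longs sequentially, which is an assumption for illustration and not necessarily how Mahout's own implementations derive their IDs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not Mahout's IDMigrator: keeps both directions in memory.
final class InMemoryIdMapper {
    private final Map<String, Long> stringToLong = new HashMap<>();
    private final Map<Long, String> longToString = new HashMap<>();
    private long nextId = 1L;

    // Return the long already assigned to this string, or assign the next free one.
    long toLongId(String stringId) {
        Long existing = stringToLong.get(stringId);
        if (existing != null) {
            return existing;
        }
        long id = nextId++;
        stringToLong.put(stringId, id);
        longToString.put(id, stringId);
        return id;
    }

    // Return the original string, or null if this long was never assigned.
    String toStringId(long longId) {
        return longToString.get(longId);
    }

    public static void main(String[] args) {
        InMemoryIdMapper items = new InMemoryIdMapper();
        long id = items.toLongId("apple");          // string from the app, long for the recommender
        System.out.println(items.toStringId(id));   // long from the recommender, string for the app
    }
}
```

The application calls toLongId before handing IDs to the recommender and toStringId on whatever comes back, which is the "you do the conversion yourself" part.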

On Wed, Dec 23, 2009 at 11:48 AM, 刘逸哲 <zh...@alibaba-inc.com> wrote:
> Hi, Zhang,
> Could you give an example that shows how to use it?

Re: Does the FileDataModel still require the UserID and ItemID numeric?

Posted by 刘逸哲 <zh...@alibaba-inc.com>.
Hi, Zhang,
Could you give an example that shows how to use it?

-----Original Message-----
From: Jeff Zhang [mailto:zjffdu@gmail.com]
Sent: 2009-12-23 19:41
To: mahout-user@lucene.apache.org
Subject: Re: Does the FileDataModel still require the UserID and ItemID numeric?

Hi Liu,

In Mahout 0.2, the user and item IDs must be of type long. This is for
performance.
If your data model's IDs are strings, you should implement the
IDMigrator interface, which helps you convert strings to longs. There are also some
implementations in Mahout, such as MySQLJDBCIDMigrator, which you can refer to.


Jeff Zhang




Re: Does the FileDataModel still require the UserID and ItemID numeric?

Posted by Jeff Zhang <zj...@gmail.com>.
Hi Liu,

In Mahout 0.2, the user and item IDs must be of type long. This is for
performance.
If your data model's IDs are strings, you should implement the
IDMigrator interface, which helps you convert strings to longs. There are also some
implementations in Mahout, such as MySQLJDBCIDMigrator, which you can refer to.
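
As a rough illustration of turning a string ID into a long, here is a minimal sketch that hashes the string with MD5 and packs the first 8 bytes of the digest into a long. StringIdHasher and toLongId are hypothetical names, not Mahout's IDMigrator API, and the hashing scheme is an assumption of this sketch. Hashing only covers the string-to-long direction, so a real implementation would also need to store the mapping (in memory or in a table) to get back from a long to the original string.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Mahout: derives a long ID from a string ID.
final class StringIdHasher {

    private StringIdHasher() {}

    // Hash the string with MD5 and pack the first 8 bytes of the digest into a long.
    static long toLongId(String stringId) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(stringId.getBytes(StandardCharsets.UTF_8));
            long id = 0L;
            for (int i = 0; i < 8; i++) {
                id = (id << 8) | (digest[i] & 0xFFL);
            }
            return id;
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String id : new String[] {"Auser", "User2", "apple", "banana"}) {
            System.out.println(id + " -> " + toLongId(id));
        }
    }
}
```

The same string always produces the same long, so the mapping does not depend on the order in which records are loaded. Two different strings could in principle collide, although that is very unlikely with 64 bits.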


Jeff Zhang



On Wed, Dec 23, 2009 at 3:15 AM, 刘逸哲 <zh...@alibaba-inc.com> wrote:

> I saw this "Further, this ID value must be
> numeric; it is a Java long type through the APIs" in the Taste documentation.
> But the javadoc of FileDataModel says "The user and item IDs are
> ready literally as Strings and treated as such in the API".
> Does this mean that user IDs and item IDs can be strings now? For example, my
> data looks like this:
> "Auser,apple,1
> User2,banana,2
> User3,car,3"
> Can I use FileDataModel directly?
> Thanks