Posted to user@mahout.apache.org by Lakshminarayana Motamarri <na...@gmail.com> on 2012/12/04 22:10:08 UTC

Regarding Mahout Item Recommendation engine - numberformatexception on noninteger column (eg: ISBN (alphanumeric value))

Hi

Just wondering if someone has come across this problem:
I know that Mahout (the item recommender engine) needs its input in a
specific format: (integer, integer, floating-point).

But what if the data set has non-integer values, such as alphanumeric IDs
(e.g. userID, movieID, or in my case bookID, which is an ISBN, an
alphanumeric value)?

Are there any workarounds for using Mahout in such cases?

Thanks,
Narayan Motamarri.

Re: Regarding Mahout Item Recommendation engine - numberformatexception on noninteger column (eg: ISBN (alphanumeric value))

Posted by Sean Owen <sr...@gmail.com>.
You would need to translate these values into longs, and then back
into Strings. There is some help in that regard in the project -- look
for IDMigrator and its subclasses. This will help translate
automatically, though it's a little bit of a hack.

Sean
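
For readers who want to see the shape of the translation described above, here is a minimal, self-contained sketch of the general idea: turn each string ID into a long, and remember the reverse mapping so the long can be turned back into the string later. The class name HashingIdTranslator, its method names, and the choice of an MD5-based hash are assumptions made for this illustration; this is not Mahout's IDMigrator code, and distinct strings could in principle collide on the same long.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the idea of an ID translator; not a Mahout class. */
final class HashingIdTranslator {
    // Reverse mapping, filled in as string IDs are translated.
    private final Map<Long, String> longToString = new HashMap<>();

    /** Derives a long from the string's digest and remembers how to get back. */
    long toLongId(String stringId) {
        long id = hash(stringId);
        longToString.put(id, stringId);
        return id;
    }

    /** Returns the string previously translated to this long, or null if unknown. */
    String toStringId(long longId) {
        return longToString.get(longId);
    }

    // Assumption for this sketch: pack the first 8 bytes of an MD5 digest into a long.
    private static long hash(String value) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(value.getBytes(StandardCharsets.UTF_8));
            long result = 0L;
            for (int i = 0; i < 8; i++) {
                result = (result << 8) | (digest[i] & 0xFFL);
            }
            return result;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 digest is unavailable", e);
        }
    }

    public static void main(String[] args) {
        HashingIdTranslator translator = new HashingIdTranslator();
        // "034545104X" is only a sample ISBN-like string.
        long itemId = translator.toLongId("034545104X");
        System.out.println(itemId + " <-> " + translator.toStringId(itemId));
    }
}
```

With something like this, the long goes into the (long, long, float) input, and toStringId is applied to the IDs in the recommender's output.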


Re: Regarding Mahout Item Recommendation engine - numberformatexception on noninteger column (eg: ISBN (alphanumeric value))

Posted by Sean Owen <sr...@gmail.com>.
(This is more or less exactly what FileIDMigrator does.)

On Wed, Dec 5, 2012 at 5:56 AM, Utkarsh Gupta <Ut...@infosys.com> wrote:
> Before giving your data to Mahout, you can create a mapping from the original IDs to new int/long IDs.
> I faced the same problem and used this code to solve it.
> You can save the contents of the map to a file and load it back later to map the item recommendation output for each user back to the original user ID.

RE: Regarding Mahout Item Recommendation engine - numberformatexception on noninteger column (eg: ISBN (alphanumeric value))

Posted by Utkarsh Gupta <Ut...@infosys.com>.
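Before giving your data to Mahout, you can create a mapping from the original IDs to new int/long IDs.
I faced the same problem and used this code to solve it.
You can save the contents of the map to a file and load it back later to map the item recommendation output for each user back to the original user ID.

    import java.io.*;
    import java.util.*;

    // Fragment: runs inside a method that declares "throws IOException".
    // fileName is the original CSV; file2 receives the rewritten copy.
    int nextUserId = 1;
    Map<String, Integer> userIdMap = new HashMap<>();
    try (BufferedReader reader = new BufferedReader(new FileReader(fileName));
         BufferedWriter writer = new BufferedWriter(new FileWriter(file2))) {
        String line;
        while ((line = reader.readLine()) != null) {
            String[] columns = line.split(",");
            // Reuse the numeric ID if this original user ID was seen before.
            Integer userId = userIdMap.get(columns[0]);
            if (userId == null) {
                userId = nextUserId++;
                userIdMap.put(columns[0], userId);
            }
            // Only the first column (user ID) is remapped here.
            writer.write(userId + "," + columns[1] + "," + columns[2]);
            writer.newLine();
        }
    }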
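
The step of saving the map and loading it back to recover the original IDs could look roughly like the sketch below. The class IdMapping, its method names, and the "numericId,originalId" line format are all assumptions made for this illustration. It assumes the original IDs contain no newlines, and it uses in-memory readers and writers so it runs on its own; a FileReader and FileWriter would be used for real files. The same helper can be applied to an item column such as an ISBN, not only to user IDs.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: assigns sequential numeric IDs to string IDs and can undo the mapping. */
final class IdMapping {
    private final Map<String, Long> toNumeric = new LinkedHashMap<>();
    private final Map<Long, String> toOriginal = new HashMap<>();

    /** Returns the numeric ID for an original ID, assigning the next free one if it is new. */
    long idFor(String originalId) {
        Long existing = toNumeric.get(originalId);
        if (existing != null) {
            return existing;
        }
        long assigned = toNumeric.size() + 1L;
        toNumeric.put(originalId, assigned);
        toOriginal.put(assigned, originalId);
        return assigned;
    }

    /** Returns the original ID for a numeric ID, or null if it was never assigned. */
    String originalFor(long numericId) {
        return toOriginal.get(numericId);
    }

    /** Writes one "numericId,originalId" line per entry. */
    void save(Writer out) {
        try {
            for (Map.Entry<String, Long> entry : toNumeric.entrySet()) {
                out.write(entry.getValue() + "," + entry.getKey() + "\n");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Rebuilds a mapping from the lines written by save(). */
    static IdMapping load(Reader in) {
        IdMapping mapping = new IdMapping();
        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Split on the first comma only, so original IDs may contain commas.
                int comma = line.indexOf(',');
                long numericId = Long.parseLong(line.substring(0, comma));
                String originalId = line.substring(comma + 1);
                mapping.toNumeric.put(originalId, numericId);
                mapping.toOriginal.put(numericId, originalId);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return mapping;
    }

    public static void main(String[] args) {
        IdMapping items = new IdMapping();
        items.idFor("034545104X");               // sample ISBN-like strings
        long second = items.idFor("0155061224");

        StringWriter saved = new StringWriter();
        items.save(saved);

        // Later, e.g. when post-processing the recommender output:
        IdMapping reloaded = IdMapping.load(new StringReader(saved.toString()));
        System.out.println(second + " -> " + reloaded.originalFor(second));
    }
}
```

Keeping both directions in one object means the numeric IDs written to the input file and the lookups applied to the recommendations always come from the same table.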
                    


Re: Regarding Mahout Item Recommendation engine - numberformatexception on noninteger column (eg: ISBN (alphanumeric value))

Posted by Lakshminarayana Motamarri <na...@gmail.com>.
Thanks Utkarsh and Sean for your inputs.