Posted to user@mahout.apache.org by "F.Ozgur Catak" <f....@gmail.com> on 2009/12/09 16:35:25 UTC

Simple problem

Hi,

I have started using Mahout and have a problem with FileDataModel. When I use a
CSV file that contains non-numeric values, I get an error like this:

09.Ara.2009 16:54:27 org.slf4j.impl.JCLLoggerAdapter info
INFO: Creating FileDataModel for file c:\input.csv
09.Ara.2009 16:54:27 org.slf4j.impl.JCLLoggerAdapter info
INFO: Reading file info...
Exception in thread "main" java.lang.NumberFormatException: For input string: "SPTBY-1711"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
    at java.lang.Long.parseLong(Long.java:403)
    at java.lang.Long.parseLong(Long.java:461)
    at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.processLineWithoutID(FileDataModel.java:315)
    at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.processFileWithoutID(FileDataModel.java:293)
    at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.buildModel(FileDataModel.java:158)
    at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.reload(FileDataModel.java:129)
    at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.checkLoaded(FileDataModel.java:332)
    at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.getNumItems(FileDataModel.java:372)
    at TestMahout.main(TestMahout.java:21)



My input file looks like this:

5572,SPTBY-1711
5572,SPTBL-1711
5572,SPTKP-1711
5581,TDTBY-861
5581,TDTBL-861


and the relevant source is:

FileDataModel dataModel = new FileDataModel( new File("c:\\input.csv"));
System.out.println(dataModel.getNumItems());

What is causing the error?

Thanks

Re: Simple problem

Posted by Sean Owen <sr...@gmail.com>.
Yes, you cannot use non-numeric IDs (anymore). See the docs for FileDataModel.

If you need to translate between Strings and numeric IDs, see the
IDMigrator interface and implementations which can help you.
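
For illustration, the translation between string and numeric IDs can be
sketched without any Mahout dependency. The class below is hypothetical:
StringIdMapper, toLongId, toStringId and translateLine are names made up
here and are not Mahout's IDMigrator API. It assumes each line has the form
userID,itemCode as in the input file above, and it simply assigns sequential
numbers to item codes in the order they are first seen.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of Mahout): maps string IDs to longs and back. */
final class StringIdMapper {
    // Forward lookup: string ID -> assigned numeric ID.
    private final Map<String, Long> byString = new HashMap<>();
    // Reverse lookup: the list index is the numeric ID.
    private final List<String> byLong = new ArrayList<>();

    /** Returns the numeric ID for a string ID, assigning the next free number on first sight. */
    long toLongId(String stringId) {
        Long existing = byString.get(stringId);
        if (existing != null) {
            return existing;
        }
        long id = byLong.size();
        byString.put(stringId, id);
        byLong.add(stringId);
        return id;
    }

    /** Returns the original string ID; throws IndexOutOfBoundsException for unassigned IDs. */
    String toStringId(long longId) {
        return byLong.get((int) longId);
    }

    /** Rewrites one "userID,itemCode" line as "userID,numericItemID". */
    String translateLine(String line) {
        String[] fields = line.split(",", 2);
        return fields[0].trim() + "," + toLongId(fields[1].trim());
    }

    public static void main(String[] args) {
        StringIdMapper mapper = new StringIdMapper();
        String[] input = {"5572,SPTBY-1711", "5572,SPTBL-1711", "5581,TDTBY-861"};
        for (String line : input) {
            // Each line is printed with the item code replaced by its numeric ID.
            System.out.println(mapper.translateLine(line));
        }
    }
}
```

Running every line of input.csv through translateLine and writing the result
to a new file would give FileDataModel purely numeric IDs, and toStringId
would map recommended item IDs back to the original codes. Inside a Taste
application, the IDMigrator implementations mentioned above serve the same
purpose and are probably the better choice.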
