Posted to user@mahout.apache.org by Jerry Ye <je...@yahoo-inc.com> on 2010/01/19 22:16:33 UTC

Converting Arff to MVC

Hi,
I've been trying to convert a simple ARFF file, and I'm getting the following error:

-bash-3.1$ java -cp mahout-core-0.3-SNAPSHOT.jar:mahout-utils-0.3-SNAPSHOT.jar:$(echo dependency/*.jar . | sed 's/ /:/g') org.apache.mahout.utils.vectors.arff.Driver -d vehicle.arff -o iris -t iris/dict.txt
Jan 19, 2010 8:58:36 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Output Dir: iris
Jan 19, 2010 8:58:36 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Converting File: vehicle.arff
outfile: iris/vehicle.arff.mvc
Exception in thread "main" java.lang.NullPointerException
    at org.apache.hadoop.io.serializer.SerializationFactory.getSerializer(SerializationFactory.java:73)
    at org.apache.hadoop.io.SequenceFile$Writer.init(SequenceFile.java:910)
    at org.apache.hadoop.io.SequenceFile$RecordCompressWriter.<init>(SequenceFile.java:1074)
    at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:397)
    at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:284)
    at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:265)
    at org.apache.mahout.utils.vectors.arff.Driver.getSeqFileWriter(Driver.java:180)
    at org.apache.mahout.utils.vectors.arff.Driver.writeFile(Driver.java:167)
    at org.apache.mahout.utils.vectors.arff.Driver.main(Driver.java:132)

My data is:

@relation iris

@attribute f1 numeric
@attribute f2 numeric
@attribute f3 numeric
@attribute f4 numeric

@data
5.1,3.5,1.4,0.2
4.9,3.0,1.4,0.2
4.7,3.2,1.3,0.2
4.6,3.1,1.5,0.2
5.0,3.6,1.4,0.2
5.4,3.9,1.7,0.4
4.6,3.4,1.4,0.3
5.0,3.4,1.5,0.2
4.4,2.9,1.4,0.2
4.9,3.1,1.5,0.1


Any guidance?  Thanks.

- jerry
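
As a sanity check on the input above, here is a minimal sketch that parses a dense, numeric-only ARFF file and confirms every data row has as many values as there are @attribute lines. This is not Mahout code: the class name ArffSanityCheck and the method parseNumericRows are hypothetical, and the sketch assumes no quoted, nominal, or sparse values. If it accepts the file, the input is probably not the cause of the exception.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper, not part of Mahout: checks dense, numeric-only ARFF input. */
final class ArffSanityCheck {

    /** Returns one double[] per data row; throws if a row's width differs from the attribute count. */
    static List<double[]> parseNumericRows(List<String> lines) {
        int attributeCount = 0;
        boolean inData = false;
        List<double[]> rows = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("%")) {
                continue; // skip blank lines and ARFF comments
            }
            if (!inData) {
                String lower = line.toLowerCase(Locale.ROOT);
                if (lower.startsWith("@attribute")) {
                    attributeCount++;
                } else if (lower.startsWith("@data")) {
                    inData = true;
                }
                continue; // other header lines (e.g. @relation) carry no row data
            }
            String[] fields = line.split(",");
            if (fields.length != attributeCount) {
                throw new IllegalArgumentException(
                    "Expected " + attributeCount + " values but found " + fields.length + ": " + line);
            }
            double[] row = new double[fields.length];
            for (int i = 0; i < fields.length; i++) {
                row[i] = Double.parseDouble(fields[i].trim());
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // First two rows of the data set from this message.
        List<String> sample = Arrays.asList(
            "@relation iris",
            "",
            "@attribute f1 numeric",
            "@attribute f2 numeric",
            "@attribute f3 numeric",
            "@attribute f4 numeric",
            "",
            "@data",
            "5.1,3.5,1.4,0.2",
            "4.9,3.0,1.4,0.2");
        List<double[]> rows = parseNumericRows(sample);
        // prints "2 rows, 4 attributes"
        System.out.println(rows.size() + " rows, " + rows.get(0).length + " attributes");
    }
}
```

Run against the full file, a check like this would be expected to report 10 rows of 4 attributes.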

Re: Converting Arff to MVC

Posted by Grant Ingersoll <gs...@apache.org>.
Thanks, Jerry!

On Jan 20, 2010, at 11:59 AM, Jerry Ye wrote:

> Ticket created as https://issues.apache.org/jira/browse/MAHOUT-265. I noticed that I don't get this error with revision 897299; however, the output reports 0 vectors written both when converting from the Solr index and when using the ARFF converter.
> 
> - jerry


Re: Converting Arff to MVC

Posted by Jerry Ye <je...@yahoo-inc.com>.
Ticket created as https://issues.apache.org/jira/browse/MAHOUT-265. I noticed that I don't get this error with revision 897299; however, the output reports 0 vectors written both when converting from the Solr index and when using the ARFF converter.

- jerry


On 1/20/10 9:22 AM, "Grant Ingersoll" <gs...@apache.org> wrote:

Looks like a bug. Can you open a JIRA ticket? Note that the ARFF support may not be complete yet, although I wonder whether this is an issue with the upgrade to a newer version of Hadoop. I'll try to take a look in the next few days.

-Grant




Re: Converting Arff to MVC

Posted by Grant Ingersoll <gs...@apache.org>.
Looks like a bug. Can you open a JIRA ticket? Note that the ARFF support may not be complete yet, although I wonder whether this is an issue with the upgrade to a newer version of Hadoop. I'll try to take a look in the next few days.

-Grant
