Posted to dev@hama.apache.org by Martin Illecker <mi...@apache.org> on 2013/08/20 21:38:29 UTC

Problems with VectorWritable as OutputValueClass

Hi,

Sorry for asking such a simple question, but I am having trouble using
*ml.writable.VectorWritable* as the OutputValueClass.

The output file is always empty!

$ hama seqdumper -seqFile output/hama/rootbeer/examples/matrixmultiplication/CPU-1377026544751/MatrixC.seq
13/08/20 21:22:50 INFO util.SequenceFileDumper: DEBUG: Hama SequenceFileDumper started!
13/08/20 21:22:50 DEBUG util.SequenceFileDumper: DEBUG: Arguments: [Ljava.lang.String;@48d6c16c
Input Path: output/hama/rootbeer/examples/matrixmultiplication/CPU-1377026544751/MatrixC.seq
Key class: class org.apache.hadoop.io.IntWritable Value Class: class org.apache.hama.ml.writable.VectorWritable
Count: 0

I had no problems with the Mahout VectorWritable before.

Of course I have set the OutputFormat [1] to

job.setOutputFormat(SequenceFileOutputFormat.class);
job.setOutputKeyClass(IntWritable.class);
job.setOutputValueClass(VectorWritable.class);
job.setOutputPath(outPath);

The source [2] within the BSP task looks like this:

DenseDoubleVector rowValues = new DenseDoubleVector(new double[]{1,2,3,4,5});

if (isDebuggingEnabled) {
  logger.writeChars("bsp,write,key=" + rowIndex + ",value="
    + rowValues.toString() + "\n");
}
peer.write(new IntWritable(rowIndex), new VectorWritable(rowValues));

I hope someone can help me with this strange behavior!

Thank you very much!

Martin

[1]
https://github.com/millecker/applications/blob/master/hama/rootbeer/matrixmultiplication/src/at/illecker/hama/rootbeer/examples/matrixmultiplication/cpu/MatrixMultiplicationBSPCpu.java#L219-222
[2]
https://github.com/millecker/applications/blob/master/hama/rootbeer/matrixmultiplication/src/at/illecker/hama/rootbeer/examples/matrixmultiplication/cpu/MatrixMultiplicationBSPCpu.java#L166-178

Re: Problems with VectorWritable as OutputValueClass

Posted by Martin Illecker <mi...@apache.org>.
The problem is not related to the VectorWritable class!

I was running my example with bspTaskNum 8, and only task0 [1] is able to
write out to HDFS.
If any task other than task0 writes out the results, the output file
is created but empty.

I don't understand this behavior!?
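
One way to narrow this down is to compare the sizes of the files the job left in its output directory. The following is only a minimal sketch under stated assumptions: it assumes the output directory is reachable through the local filesystem (on HDFS the same check would have to go through the Hadoop FileSystem API instead), and the class OutputSizeCheck and its method fileSizes are hypothetical helpers, not part of Hama. Note that a SequenceFile without records still contains a header, so a file with "Count: 0" will be small rather than zero bytes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

// Hypothetical helper, not part of Hama: reports the size of every regular
// file in a job output directory so that header-only outputs stand out.
class OutputSizeCheck {

  // Returns file name -> size in bytes, sorted by file name.
  static Map<String, Long> fileSizes(Path outputDir) throws IOException {
    Map<String, Long> sizes = new TreeMap<>();
    try (Stream<Path> entries = Files.list(outputDir)) {
      for (Path p : (Iterable<Path>) entries::iterator) {
        if (Files.isRegularFile(p)) {
          sizes.put(p.getFileName().toString(), Files.size(p));
        }
      }
    }
    return sizes;
  }

  public static void main(String[] args) throws IOException {
    // Assumption: the output directory is passed as the first argument;
    // without an argument the current directory is used.
    Path dir = Paths.get(args.length > 0 ? args[0] : ".");
    fileSizes(dir).forEach((name, size) ->
        System.out.println(name + "\t" + size + " bytes"));
  }
}
```

Files that are much smaller than the others are the ones worth running through seqdumper first.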

Martin

[1]
https://github.com/millecker/applications/blob/master/hama/rootbeer/matrixmultiplication/src/at/illecker/hama/rootbeer/examples/matrixmultiplication/cpu/MatrixMultiplicationBSPCpu.java#L93

2013/8/20 Martin Illecker <ma...@illecker.at>

> Hi,
>
> Sorry for asking such a simple question, but I am having trouble using
> *ml.writable.VectorWritable* as the OutputValueClass.
>
> The output file is always empty!
>
> $ hama seqdumper -seqFile output/hama/rootbeer/examples/matrixmultiplication/CPU-1377026544751/MatrixC.seq
> 13/08/20 21:22:50 INFO util.SequenceFileDumper: DEBUG: Hama SequenceFileDumper started!
> 13/08/20 21:22:50 DEBUG util.SequenceFileDumper: DEBUG: Arguments: [Ljava.lang.String;@48d6c16c
> Input Path: output/hama/rootbeer/examples/matrixmultiplication/CPU-1377026544751/MatrixC.seq
> Key class: class org.apache.hadoop.io.IntWritable Value Class: class org.apache.hama.ml.writable.VectorWritable
> Count: 0
>
> I had no problems with the Mahout VectorWritable before!
>
>
> Of course I have set the OutputFormat [1] to
>
> job.setOutputFormat(SequenceFileOutputFormat.class);
> job.setOutputKeyClass(IntWritable.class);
> job.setOutputValueClass(VectorWritable.class);
> job.setOutputPath(outPath);
>
> The source [2] within the BSP task looks like this:
>
> DenseDoubleVector rowValues = new DenseDoubleVector(new double[]{1,2,3,4,5});
>
> if (isDebuggingEnabled) {
>   logger.writeChars("bsp,write,key=" + rowIndex + ",value="
>     + rowValues.toString() + "\n");
> }
> peer.write(new IntWritable(rowIndex), new VectorWritable(rowValues));
>
> My logger shows that the key / value pairs have been written out:
>
> bsp,write,key=0,value=[1.0, 2.0, 3.0, 4.0, 5.0]
> bsp,write,key=1,value=[1.0, 2.0, 3.0, 4.0, 5.0]
> bsp,write,key=2,value=[1.0, 2.0, 3.0, 4.0, 5.0]
> bsp,write,key=3,value=[1.0, 2.0, 3.0, 4.0, 5.0]
> bsp,write,key=4,value=[1.0, 2.0, 3.0, 4.0, 5.0]
> bsp,write,key=5,value=[1.0, 2.0, 3.0, 4.0, 5.0]
>
>
> But the output file does not contain any key / value pairs!
>
> I hope someone can help me with this strange behavior!
>
> Thank you very much!
>
> Martin
>
> [1]
> https://github.com/millecker/applications/blob/master/hama/rootbeer/matrixmultiplication/src/at/illecker/hama/rootbeer/examples/matrixmultiplication/cpu/MatrixMultiplicationBSPCpu.java#L219-222
> [2]
> https://github.com/millecker/applications/blob/master/hama/rootbeer/matrixmultiplication/src/at/illecker/hama/rootbeer/examples/matrixmultiplication/cpu/MatrixMultiplicationBSPCpu.java#L166-178
>
>
