Posted to user@mahout.apache.org by Ahmed Elgohary <aa...@gmail.com> on 2012/09/02 06:42:41 UTC

Inconsistent split cardinality in DistributedRowMatrix.times()

Hi,

I was multiplying two matrices using
DistributedRowMatrix.times(DistributedRowMatrix other). Both matrices
were 15x30 (a small experiment on my local machine). I got the
following exception:

12/09/01 23:33:46 ERROR security.UserGroupInformation: PriviledgedActionException as:ahmed cause:java.io.IOException: Inconsistent split cardinality from child 1 (2/1)
Exception in thread "main" java.io.IOException: Inconsistent split cardinality from child 1 (2/1)
    at org.apache.hadoop.mapred.join.Parser$CNode.getSplits(Parser.java:369)
    at org.apache.hadoop.mapred.join.CompositeInputFormat.getSplits(CompositeInputFormat.java:117)
    at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:989)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:981)
    at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:897)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1261)
    at org.apache.mahout.math.hadoop.DistributedRowMatrix.times(DistributedRowMatrix.java:189)
    at baselines.ReconstructionError.clacReconstructionErr(ReconstructionError.java:45)
    at baselines.Main.runRandomSelection(Main.java:108)
    at baselines.Main.main(Main.java:154)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

Any idea what might be causing this error?

--ahmed

Re: Inconsistent split cardinality in DistributedRowMatrix.times()

Posted by Ahmed Elgohary <aa...@gmail.com>.
Well, I figured out the cause of the problem. Mahout's
MatrixMultiplicationJob uses a CompositeInputFormat to join the input rows
of the two matrices being multiplied. CompositeInputFormat requires that
the two matrices be partitioned the same way. My first matrix was in
a single sequence file, while the second matrix was the output of a previous
MapReduce job with multiple reducers (so it was partitioned into multiple
sequence files). Am I missing something here, or is this a limitation of the
multiplication job? Is there an efficient way to avoid this limitation?

thanks,
--ahmed
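
To make the failure mode concrete, here is a minimal sketch of the kind of
check that rejects the join. It is a toy model in plain Java, not Hadoop's
actual code: the class and method names are hypothetical, it throws an
unchecked exception where Hadoop throws an IOException, and it assumes each
sequence file of a matrix becomes exactly one split. Reading "(2/1)" as the
split counts of the first and second input is my interpretation of the
message.

```java
/**
 * Toy model of the consistency check behind the error above.
 * Hypothetical names; not part of Hadoop or Mahout.
 */
public class SplitCardinalityCheck {

    /**
     * splitsPerChild[i] is the number of splits produced by the i-th join
     * input (assumed here to equal the number of sequence files it has).
     * Throws if any input has a different count than the one before it.
     */
    public static void check(int[] splitsPerChild) {
        for (int i = 1; i < splitsPerChild.length; i++) {
            if (splitsPerChild[i - 1] != splitsPerChild[i]) {
                // Message format copied from the exception in this thread.
                throw new IllegalStateException(
                    "Inconsistent split cardinality from child " + i
                    + " (" + splitsPerChild[i - 1] + "/" + splitsPerChild[i] + ")");
            }
        }
    }

    public static void main(String[] args) {
        // Both matrices stored as a single file: the join is accepted.
        check(new int[] {1, 1});

        // One matrix in two part files, the other in one: rejected.
        try {
            check(new int[] {2, 1});
        } catch (IllegalStateException e) {
            // prints "Inconsistent split cardinality from child 1 (2/1)"
            System.out.println(e.getMessage());
        }
    }
}
```

Under this model, the multiplication only goes through when both matrices
are stored in the same number of files, which matches what I observed.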
