Posted to user@crunch.apache.org by Lucy Chen <lu...@gmail.com> on 2015/04/14 01:12:47 UTC

org.apache.avro.UnresolvedUnionException

Hi,

     I am getting an org.apache.avro.UnresolvedUnionException thrown by the
following code:


PType<ABCData> ABCDataType = Avros.records(ABCData.class);

PTable<String, ABCData> ABC = input.mapValues(
    new ConvertToABCData(feat_index_mapping, addIntercept), ABCDataType);

*******************************************************************************************************


PTable<String, String> lgr = ABC.groupByKey().mapValues(
    new MapFn<Iterable<ABCData>, String>() {
      @Override
      public String map(Iterable<LingPipeData> input)
      {
        Iterator<LingPipeData> ite1 = input.iterator();
        int counter = 0;
        while (ite1.hasNext())
        {
          ite1.next();   // advance the iterator
          counter++;
        }
        return Integer.toString(counter);
      }
    }, Avros.strings());

lgr.write(At.textFile(output_path), WriteMode.OVERWRITE);

****************************************************************************************************************


public class ConvertToABCData extends MapFn<InputType, ABCData> {

  private FeatIndexMapping feat_index_mapping;
  private boolean addIntercept;

  public ConvertToABCData(FeatIndexMapping feat_index_mapping, boolean addIntercept)
  {
    this.feat_index_mapping = feat_index_mapping;
    this.addIntercept = addIntercept;
  }

  @Override
  public ABCData map(InputType input)
  {
    return new ABCData(input, feat_index_mapping, addIntercept);
  }
}


public class ABCData implements java.io.Serializable, Cloneable {

  private int label;
  private Vector feature;
  private int dim;

  private final static Logger logger =
      Logger.getLogger(ABCData.class.getName());

  ......
}


Here Vector is a third-party class, com.aliasi.matrix.Vector. The code runs
fine up to the line of stars, but as soon as ABC.groupByKey().mapValues() is
included, the exception below is thrown. Can anyone tell me how to solve the
problem?


Thanks.


Lucy


 The logs look like:


org.apache.crunch.CrunchRuntimeException:
org.apache.avro.file.DataFileWriter$AppendWriteException:
org.apache.avro.UnresolvedUnionException: Not in union
["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466

at
org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:45)

at org.apache.crunch.MapFn.process(MapFn.java:34)

at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)

at
org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)

at org.apache.crunch.MapFn.process(MapFn.java:34)

at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)

at
org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)

at
com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:14)

at
com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:10)

at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)

at
org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)

at org.apache.crunch.lib.join.InnerJoinFn.join(InnerJoinFn.java:66)

at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:79)

at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:32)

at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)

at
org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)

at org.apache.crunch.MapFn.process(MapFn.java:34)

at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)

at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:113)

at org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:57)

at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)

at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)

at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)

at
org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)

at java.util.concurrent.FutureTask.run(FutureTask.java:262)

at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:744)

Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException:
org.apache.avro.UnresolvedUnionException: Not in union
["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466

at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)

at
org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:87)

at
org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:84)

at org.apache.crunch.io.CrunchOutputs.write(CrunchOutputs.java:133)

at
org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:41)

... 28 more

Caused by: org.apache.avro.UnresolvedUnionException: Not in union
["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466

at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:561)

at
org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:144)

at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)

at
org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)

at
org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)

at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)

at
org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)

at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)

at
org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)

at
org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)

at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)

at
org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)

at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)

at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:257)

... 32 more

2015-04-13 15:49:16,876 INFO  [Thread-500] mapred.LocalJobRunner
(LocalJobRunner.java:runTasks(456)) - reduce task executor complete.

2015-04-13 15:49:16,879 WARN  [Thread-500] mapred.LocalJobRunner
(LocalJobRunner.java:run(560)) - job_local918028004_0008

java.lang.Exception: org.apache.crunch.CrunchRuntimeException:
org.apache.avro.file.DataFileWriter$AppendWriteException:
org.apache.avro.UnresolvedUnionException: Not in union
["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466

at
org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)

at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)

Caused by: org.apache.crunch.CrunchRuntimeException:
org.apache.avro.file.DataFileWriter$AppendWriteException:
org.apache.avro.UnresolvedUnionException: Not in union
["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466

at
org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:45)

at org.apache.crunch.MapFn.process(MapFn.java:34)

at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)

at
org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)

at org.apache.crunch.MapFn.process(MapFn.java:34)

at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)

at
org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)

at
com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:14)

at
com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:10)

at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)

at
org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)

at org.apache.crunch.lib.join.InnerJoinFn.join(InnerJoinFn.java:66)

at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:79)

at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:32)

at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)

at
org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)

at org.apache.crunch.MapFn.process(MapFn.java:34)

at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)

at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:113)

at org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:57)

at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)

at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)

at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)

at
org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)

at java.util.concurrent.FutureTask.run(FutureTask.java:262)

at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:744)

Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException:
org.apache.avro.UnresolvedUnionException: Not in union
["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466

at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)

at
org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:87)

at
org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:84)

at org.apache.crunch.io.CrunchOutputs.write(CrunchOutputs.java:133)

at
org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:41)

... 28 more

Caused by: org.apache.avro.UnresolvedUnionException: Not in union
["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466

at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:561)

at
org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:144)

at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)

at
org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)

at
org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)

at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)

at
org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)

at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)

at
org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)

at
org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)

at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)

at
org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)

at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)

at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:257)

... 32 more

2 job failure(s) occurred:

(5): Depending job with jobID 1 failed.

com.apple.rsp.CrossValidation.CrossValidationDriver:
[[Text(/Users/luren/Lu/Project/model_testing/training_set... ID=1 (5/6)(1):
Job failed!

Re: org.apache.avro.UnresolvedUnionException

Posted by Josh Wills <jw...@cloudera.com>.
Huh. I guess the only thing I can suspect is that Avro reflection can't
handle serializing the Vector class automatically for some reason, although
I'm not enough of an expert on LingPipe (or Avro reflection-based
serialization, for that matter) to know why. My best advice is to try to
store the contents of the Vector in some other intermediate format that is
known to work w/Avro reflection, like maybe a double[].
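
A minimal sketch of that idea (field and method names are illustrative; it
assumes LingPipe's Vector exposes numDimensions() and value(int) and that a
DenseVector can be built from a double[]):

import com.aliasi.matrix.DenseVector;
import com.aliasi.matrix.Vector;

public class ABCData implements java.io.Serializable, Cloneable {

  private int label;
  private double[] feature;   // dense copy of the original Vector's values
  private int dim;

  public ABCData() { }        // no-arg constructor so Avro reflection can instantiate it

  // Copy the LingPipe Vector into a plain array that Avro reflection handles.
  public void setFeature(Vector v) {
    dim = v.numDimensions();
    feature = new double[dim];
    for (int i = 0; i < dim; i++) {
      feature[i] = v.value(i);
    }
  }

  // Rebuild a LingPipe Vector on demand inside the DoFns that need it.
  public Vector getFeature() {
    return new DenseVector(feature);
  }
}

The vectors in the log look sparse, so parallel int[] index / double[] value
arrays would be a more compact variant of the same idea.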

J

On Mon, Apr 13, 2015 at 10:07 PM, Lucy Chen <lu...@gmail.com>
wrote:

> Hi Josh,
>
>          Here are my revised codes as you suggested,
>
> PTable<String, String> lgr = PTables.asPTable(ABC.groupByKey().
>
>         parallelDo(new DoFn<Pair<String, Iterable<ABCData>>, Pair<String,
> String>>()
>
>         {
>
>               @Override
>
>               public void process(Pair<String, Iterable<ABCData>> input,
> Emitter<Pair<String, String>> emitter)
>
>               {
>
>              Iterable<ABCData> temp = input.second();
>
>               Iterator<ABCPipeData> ite1 = temp.iterator();
>
>           int counter=0;
>
>           while(ite1.hasNext())
>
>           {
>
>           counter++;
>
>           }
>
>
>
>           emitter.emit(Pair.of(input.first(), Integer.toString(counter)));
>
>
>
>               }
>
>
>
>         }, Avros.pairs(Avros.strings(), Avros.strings())));
>
>
>
>         lgr.write(At.textFile(output_path), WriteMode.OVERWRITE);
>
>
> and I got the similar exception as follows. I feel that the value in the
> ABC table made the groupByKey() failed, but did not figure out why. I did
> some experiment with a normal PTable<String, String> as the input and wrote
> some similar functions, then it worked. But the input ABC I included here
> failed the job.
>
>
>      Thanks!
>
>
> Lu
>
>
>
>
> org.apache.crunch.CrunchRuntimeException:
> org.apache.avro.file.DataFileWriter$AppendWriteException:
> org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at
> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:45)
>
> at org.apache.crunch.MapFn.process(MapFn.java:34)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at org.apache.crunch.MapFn.process(MapFn.java:34)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at
> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:14)
>
> at
> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:10)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at org.apache.crunch.lib.join.InnerJoinFn.join(InnerJoinFn.java:66)
>
> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:79)
>
> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:32)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at org.apache.crunch.MapFn.process(MapFn.java:34)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:113)
>
> at
> org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:57)
>
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
>
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>
> at
> org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>
> at java.lang.Thread.run(Thread.java:744)
>
> Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException:
> org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)
>
> at
> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:87)
>
> at
> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:84)
>
> at org.apache.crunch.io.CrunchOutputs.write(CrunchOutputs.java:133)
>
> at
> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:41)
>
> ... 28 more
>
> Caused by: org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:561)
>
> at
> org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:144)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
>
> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:257)
>
> ... 32 more
>
> 2015-04-13 18:58:34,157 INFO  [Thread-449] mapred.LocalJobRunner
> (LocalJobRunner.java:runTasks(456)) - reduce task executor complete.
>
> 2015-04-13 18:58:34,160 WARN  [Thread-449] mapred.LocalJobRunner
> (LocalJobRunner.java:run(560)) - job_local2029702519_0007
>
> java.lang.Exception: org.apache.crunch.CrunchRuntimeException:
> org.apache.avro.file.DataFileWriter$AppendWriteException:
> org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
>
> Caused by: org.apache.crunch.CrunchRuntimeException:
> org.apache.avro.file.DataFileWriter$AppendWriteException:
> org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at
> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:45)
>
> at org.apache.crunch.MapFn.process(MapFn.java:34)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at org.apache.crunch.MapFn.process(MapFn.java:34)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at
> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:14)
>
> at
> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:10)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at org.apache.crunch.lib.join.InnerJoinFn.join(InnerJoinFn.java:66)
>
> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:79)
>
> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:32)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at org.apache.crunch.MapFn.process(MapFn.java:34)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:113)
>
> at
> org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:57)
>
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
>
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>
> at
> org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>
> at java.lang.Thread.run(Thread.java:744)
>
> Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException:
> org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)
>
> at
> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:87)
>
> at
> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:84)
>
> at org.apache.crunch.io.CrunchOutputs.write(CrunchOutputs.java:133)
>
> at
> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:41)
>
> ... 28 more
>
> Caused by: org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:561)
>
> at
> org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:144)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
>
> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:257)
>
> ... 32 more
>
> 2 job failure(s) occurred:
>
> (6): Depending job with jobID 2 failed.
>
> com.apple.rsp.CrossValidation.CrossValidationDriver:
> [[Text(/Users/luren/Lu/Project/model_testing/training_set... ID=2 (4/6)(2):
> Job failed!
>
> On Mon, Apr 13, 2015 at 6:37 PM, Josh Wills <jw...@cloudera.com> wrote:
>
>> Oh, okay. Would you humor me and try to use parallelDo(...) instead of
>> mapValues(...) after the groupByKey() call and see if that works? I have
>> this weird feeling that mapValues is doing something it shouldn't be doing
>> to the Iterable.
>>
>> J
>>
>> On Mon, Apr 13, 2015 at 9:34 PM, Lucy Chen <lu...@gmail.com>
>> wrote:
>>
>>> Hi Josh,
>>>
>>>          Thanks for your quick response. The code should be as follows.
>>> I renamed LingPipeData when copying the code into the email and forgot to
>>> change a couple of places; I simply changed LingPipeData to ABCData to make
>>> it easier to follow. I am using the LingPipe package inside my Crunch jobs,
>>> and I suspect that the Vector included in ABCData causes trouble when it is
>>> serialized by the Avro type. When the code excludes the part after "*******"
>>> and just writes ABC as an output, it works fine; but after adding
>>> ABC.groupByKey().mapValues(...), it throws the exception.
>>>
>>>           Sorry about the typos in my last email.
>>>
>>>           Thanks!
>>>
>>> Lucy
>>>
>>> PType<ABCData> ABCDataType = Avros.records(ABCData.class);
>>>
>>> PTable<String, ABCData> ABC = input.mapValues(new ConvertToABCData(feat_index_mapping,
>>> addIntercept), ABCDataType);
>>>
>>>
>>> *******************************************************************************************************
>>>
>>>
>>> PTable<String, String> lgr = ABC.groupByKey().
>>>
>>>                        mapValues(new MapFn<Iterable<ABCData>, String> {
>>>
>>>                          @Override
>>>
>>> public String map(Iterable<ABCData> input)
>>>
>>> {
>>>
>>> Iterator<ABCData> ite1 = input.iterator();
>>>
>>> int counter=0;
>>>
>>> while(ite1.hasNext())
>>>
>>> {
>>>
>>> counter++;
>>>
>>> }
>>>
>>>  return Integer.toString(counter);
>>>
>>>
>>>                        }
>>>
>>>                           }, Avros.strings());
>>>
>>> lgr.write(At.textFile(output_path), WriteMode.OVERWRITE);
>>>
>>>
>>> ****************************************************************************************************************
>>>
>>>
>>> public class ConvertToABCData extends MapFn<InputType, ABCData>{
>>>
>>>
>>> private FeatIndexMapping feat_index_mapping;
>>>
>>> private boolean addIntercept;
>>>
>>>  public ConvertToABCData(FeatIndexMapping feat_index_mapping, boolean
>>> addIntercept)
>>>
>>> {
>>>
>>> this.feat_index_mapping = feat_index_mapping;
>>>
>>> this.addIntercept = addIntercept;
>>>
>>> }
>>>
>>>  @Override
>>>
>>> public  ABCData map(InputType input)
>>>
>>> {
>>>
>>> return new ABCData(input, feat_index_mapping, addIntercept);
>>>
>>>  }
>>>
>>>
>>> }
>>>
>>>
>>> public class ABCData implements java.io.Serializable, Cloneable{
>>>
>>>
>>> private int label;
>>>
>>> private Vector feature;
>>>
>>> private int dim;
>>>
>>> private final static Logger logger = Logger
>>>
>>>       .getLogger(ABCData.class.getName());
>>>
>>>         ......
>>>
>>> }
>>>
>>>
>>> On Mon, Apr 13, 2015 at 4:24 PM, Josh Wills <jo...@gmail.com>
>>> wrote:
>>>
>>>> Hey Lucy,
>>>>
>>>> I don't grok the last MapFn before the lgr gets written out; it looks
>>>> like it's defined over an Iterable<ABCData>, but the map() function defined
>>>> inside the class is over Iterable<LingPipeData>. I assume that's the source
>>>> of the problem-- the value that is getting printed out is the string form
>>>> of a LingPipeData object, which isn't what the system expects to see.
>>>>
>>>> J
>>>>
>>>>
>>>
>>
>>
>> --
>> Director of Data Science
>> Cloudera <http://www.cloudera.com>
>> Twitter: @josh_wills <http://twitter.com/josh_wills>
>>
>
>


-- 
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>

Re: org.apache.avro.UnresolvedUnionException

Posted by Lucy Chen <lu...@gmail.com>.
Hi Josh,

         Here is my revised code, as you suggested:

PTable<String, String> lgr = PTables.asPTable(ABC.groupByKey().parallelDo(
    new DoFn<Pair<String, Iterable<ABCData>>, Pair<String, String>>()
    {
      @Override
      public void process(Pair<String, Iterable<ABCData>> input,
                          Emitter<Pair<String, String>> emitter)
      {
        Iterable<ABCData> temp = input.second();
        Iterator<ABCData> ite1 = temp.iterator();
        int counter = 0;
        while (ite1.hasNext())
        {
          ite1.next();   // advance the iterator
          counter++;
        }
        emitter.emit(Pair.of(input.first(), Integer.toString(counter)));
      }
    }, Avros.pairs(Avros.strings(), Avros.strings())));

lgr.write(At.textFile(output_path), WriteMode.OVERWRITE);


and I got a similar exception, shown below. I feel that the value type in the
ABC table is what makes the groupByKey() fail, but I have not figured out why.
I did an experiment with a plain PTable<String, String> as the input and some
similar functions, and that worked; but the ABC input included here fails the
job.
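
One way to narrow this down (a sketch using the same Avro reflect classes that
appear in the trace) is to serialize a single ABCData outside of Crunch; if
this throws the same UnresolvedUnionException, the problem is the
ABCData/Vector schema rather than groupByKey():

import java.io.ByteArrayOutputStream;
import org.apache.avro.Schema;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.reflect.ReflectData;
import org.apache.avro.reflect.ReflectDatumWriter;

public class ReflectWriteCheck {
  // Pass in one ABCData built the same way the pipeline builds it.
  static void check(ABCData sample) throws java.io.IOException {
    Schema schema = ReflectData.get().getSchema(ABCData.class);
    ReflectDatumWriter<ABCData> writer = new ReflectDatumWriter<ABCData>(schema);
    BinaryEncoder encoder =
        EncoderFactory.get().binaryEncoder(new ByteArrayOutputStream(), null);
    writer.write(sample, encoder);   // this is where union resolution happens
    encoder.flush();
  }
}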


     Thanks!


Lu




org.apache.crunch.CrunchRuntimeException:
org.apache.avro.file.DataFileWriter$AppendWriteException:
org.apache.avro.UnresolvedUnionException: Not in union
["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466

at
org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:45)

at org.apache.crunch.MapFn.process(MapFn.java:34)

at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)

at
org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)

at org.apache.crunch.MapFn.process(MapFn.java:34)

at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)

at
org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)

at
com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:14)

at
com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:10)

at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)

at
org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)

at org.apache.crunch.lib.join.InnerJoinFn.join(InnerJoinFn.java:66)

at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:79)

at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:32)

at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)

at
org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)

at org.apache.crunch.MapFn.process(MapFn.java:34)

at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)

at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:113)

at org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:57)

at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)

at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)

at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)

at
org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)

at java.util.concurrent.FutureTask.run(FutureTask.java:262)

at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:744)

Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException:
org.apache.avro.UnresolvedUnionException: Not in union
["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466

at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)

at
org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:87)

at
org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:84)

at org.apache.crunch.io.CrunchOutputs.write(CrunchOutputs.java:133)

at
org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:41)

... 28 more

Caused by: org.apache.avro.UnresolvedUnionException: Not in union
["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466

at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:561)

at
org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:144)

at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)

at
org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)

at
org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)

at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)

at
org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)

at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)

at
org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)

at
org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)

at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)

at
org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)

at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)

at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:257)

... 32 more

2015-04-13 18:58:34,157 INFO  [Thread-449] mapred.LocalJobRunner
(LocalJobRunner.java:runTasks(456)) - reduce task executor complete.

2015-04-13 18:58:34,160 WARN  [Thread-449] mapred.LocalJobRunner
(LocalJobRunner.java:run(560)) - job_local2029702519_0007

java.lang.Exception: org.apache.crunch.CrunchRuntimeException:
org.apache.avro.file.DataFileWriter$AppendWriteException:
org.apache.avro.UnresolvedUnionException: Not in union
["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466

at
org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)

at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)

Caused by: org.apache.crunch.CrunchRuntimeException:
org.apache.avro.file.DataFileWriter$AppendWriteException:
org.apache.avro.UnresolvedUnionException: Not in union
["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466

at
org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:45)

at org.apache.crunch.MapFn.process(MapFn.java:34)

at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)

at
org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)

at org.apache.crunch.MapFn.process(MapFn.java:34)

at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)

at
org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)

at
com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:14)

at
com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:10)

at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)

at
org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)

at org.apache.crunch.lib.join.InnerJoinFn.join(InnerJoinFn.java:66)

at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:79)

at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:32)

at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)

at
org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)

at org.apache.crunch.MapFn.process(MapFn.java:34)

at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)

at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:113)

at org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:57)

at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)

at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)

at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)

at
org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)

at java.util.concurrent.FutureTask.run(FutureTask.java:262)

at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:744)

Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException:
org.apache.avro.UnresolvedUnionException: Not in union
["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466

at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)

at
org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:87)

at
org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:84)

at org.apache.crunch.io.CrunchOutputs.write(CrunchOutputs.java:133)

at
org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:41)

... 28 more

Caused by: org.apache.avro.UnresolvedUnionException: Not in union
["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466

at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:561)

at
org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:144)

at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)

at
org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)

at
org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)

at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)

at
org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)

at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)

at
org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)

at
org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)

at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)

at
org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)

at
org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)

at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:257)

... 32 more

2 job failure(s) occurred:

(6): Depending job with jobID 2 failed.

com.apple.rsp.CrossValidation.CrossValidationDriver:
[[Text(/Users/luren/Lu/Project/model_testing/training_set... ID=2 (4/6)(2):
Job failed!
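
The "fields":[] in the union above suggests that Avro reflection derives an empty record schema for com.aliasi.matrix.Vector, so the reflect writer has nothing it can write for that field. A quick, hypothetical check (assuming ABCData is on the classpath) is to print the schema that reflection actually produces:

import org.apache.avro.Schema;
import org.apache.avro.reflect.ReflectData;

public class PrintABCDataSchema {
  public static void main(String[] args) {
    // Print the schema Avro reflection derives for ABCData; if the Vector
    // member shows up as a record with no fields, reflection cannot
    // serialize it and the UnresolvedUnionException above is expected.
    Schema schema = ReflectData.get().getSchema(ABCData.class);
    System.out.println(schema.toString(true));
  }
}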

On Mon, Apr 13, 2015 at 6:37 PM, Josh Wills <jw...@cloudera.com> wrote:

> Oh, okay. Would you humor me and try to use parallelDo(...) instead of
> mapValues(...) after the groupByKey() call and see if that works? I have
> this weird feeling that mapValues is doing something it shouldn't be doing
> to the Iterable.
>
> J
>
> On Mon, Apr 13, 2015 at 9:34 PM, Lucy Chen <lu...@gmail.com>
> wrote:
>
>> Hi Josh,
>>
>>          Thanks for your quick response. The codes should be as follows.
>> I just renamed the LingPipeData and copied the codes in the email, I forgot
>> to change a couple of places. I just simply change LingPipeData to ABCData
>> to make it easier for you to understand. Here I used LingPipe package
>> inside my Crunch jobs. I doubt whether the Vector included in the ABCData
>> caused some troubles when it was serialized by an Avro type. However, when
>> the codes exclude the parts after "*******" and just write ABC as an
>> output. It worked fine; but after adding ABC.groupByKey().mapValues(...),
>> it throws the exception.
>>
>>           Sorry about the typos in my last email.
>>
>>           Thanks!
>>
>> Lucy
>>
>> PType<ABCData> ABCDataType = Avros.records(ABCData.class);
>>
>> PTable<String, ABCData> ABC = input.mapValues(new ConvertToABCData(feat_index_mapping,
>> addIntercept), ABCDataType);
>>
>>
>> *******************************************************************************************************
>>
>>
>> PTable<String, String> lgr = ABC.groupByKey().
>>
>>                        mapValues(new MapFn<Iterable<ABCData>, String> {
>>
>>                          @Override
>>
>> public String map(Iterable<ABCData> input)
>>
>> {
>>
>> Iterator<ABCData> ite1 = input.iterator();
>>
>> int counter=0;
>>
>> while(ite1.hasNext())
>>
>> {
>>
>> counter++;
>>
>> }
>>
>>  return Integer.toString(counter);
>>
>>
>>                        }
>>
>>                           }, Avros.strings());
>>
>> lgr.write(At.textFile(output_path), WriteMode.OVERWRITE);
>>
>>
>> ****************************************************************************************************************
>>
>>
>> public class ConvertToABCData extends MapFn<InputType, ABCData>{
>>
>>
>> private FeatIndexMapping feat_index_mapping;
>>
>> private boolean addIntercept;
>>
>>  public ConvertToABCData(FeatIndexMapping feat_index_mapping, boolean
>> addIntercept)
>>
>> {
>>
>> this.feat_index_mapping = feat_index_mapping;
>>
>> this.addIntercept = addIntercept;
>>
>> }
>>
>>  @Override
>>
>> public  ABCData map(InputType input)
>>
>> {
>>
>> return new ABCData(input, feat_index_mapping, addIntercept);
>>
>>  }
>>
>>
>> }
>>
>>
>> public class ABCData implements java.io.Serializable, Cloneable{
>>
>>
>> private int label;
>>
>> private Vector feature;
>>
>> private int dim;
>>
>> private final static Logger logger = Logger
>>
>>       .getLogger(ABCData.class.getName());
>>
>>         ......
>>
>> }
>>
>>
>> On Mon, Apr 13, 2015 at 4:24 PM, Josh Wills <jo...@gmail.com> wrote:
>>
>>> Hey Lucy,
>>>
>>> I don't grok the last MapFn before the lgr gets written out; it looks
>>> like it's defined over an Iterable<ABCData>, but the map() function defined
>>> inside the class is over Iterable<LingPipeData>. I assume that's the source
>>> of the problem-- the value that is getting printed out is the string form
>>> of a LingPipeData object, which isn't what the system expects to see.
>>>
>>> J
>>>
>>> On Mon, Apr 13, 2015 at 7:12 PM, Lucy Chen <lu...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>>      I have an exception of org.apache.avro.UnresolvedUnionException
>>>> thrown out by the following codes:
>>>>
>>>>
>>>> PType<ABCData> ABCDataType = Avros.records(ABCData.class);
>>>>
>>>> PTable<String, ABCData> ABC = input.mapValues(new
>>>> ConvertToABCData(feat_index_mapping, addIntercept), ABCDataType);
>>>>
>>>>
>>>> *******************************************************************************************************
>>>>
>>>>
>>>> PTable<String, String> lgr = ABC.groupByKey().
>>>>
>>>>                        mapValues(new MapFn<Iterable<ABCData>, String> {
>>>>
>>>>                          @Override
>>>>
>>>> public String map(Iterable<LingPipeData> input)
>>>>
>>>> {
>>>>
>>>> Iterator<LingPipeData> ite1 = input.iterator();
>>>>
>>>> int counter=0;
>>>>
>>>> while(ite1.hasNext())
>>>>
>>>> {
>>>>
>>>> counter++;
>>>>
>>>> }
>>>>
>>>>  return Integer.toString(counter);
>>>>
>>>>
>>>>                        }
>>>>
>>>>                           }, Avros.strings());
>>>>
>>>> lgr.write(At.textFile(output_path), WriteMode.OVERWRITE);
>>>>
>>>>
>>>> ****************************************************************************************************************
>>>>
>>>>
>>>> public class ConvertToABCData extends MapFn<InputType, ABCData>{
>>>>
>>>>
>>>> private FeatIndexMapping feat_index_mapping;
>>>>
>>>> private boolean addIntercept;
>>>>
>>>>  public ConvertToABCData(FeatIndexMapping feat_index_mapping, boolean
>>>> addIntercept)
>>>>
>>>> {
>>>>
>>>> this.feat_index_mapping = feat_index_mapping;
>>>>
>>>> this.addIntercept = addIntercept;
>>>>
>>>> }
>>>>
>>>>  @Override
>>>>
>>>> public  ABCData map(InputType input)
>>>>
>>>> {
>>>>
>>>> return new ABCData(input, feat_index_mapping, addIntercept);
>>>>
>>>>  }
>>>>
>>>>
>>>> }
>>>>
>>>>
>>>> public class ABCData implements java.io.Serializable, Cloneable{
>>>>
>>>>
>>>> private int label;
>>>>
>>>> private Vector feature;
>>>>
>>>> private int dim;
>>>>
>>>> private final static Logger logger = Logger
>>>>
>>>>        .getLogger(ABCData.class.getName());
>>>>
>>>>         ......
>>>>
>>>> }
>>>>
>>>>
>>>> Here Vector is defined from third party: com.aliasi.matrix.Vector; The
>>>> codes can run well until the line of star. But when the codes include
>>>> ABC.groupByKey().mapValues(), the following exception will be caught. Can
>>>> any one tell me how to solve the problem?
>>>>
>>>>
>>>> Thanks.
>>>>
>>>>
>>>> Lucy
>>>>
>>>>
>>>>  The logs look like:
>>>>
>>>>
>>>> org.apache.crunch.CrunchRuntimeException:
>>>> org.apache.avro.file.DataFileWriter$AppendWriteException:
>>>> org.apache.avro.UnresolvedUnionException: Not in union
>>>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>>>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>>>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>>>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>>>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>>>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>>
>>>> at
>>>> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:45)
>>>>
>>>> at org.apache.crunch.MapFn.process(MapFn.java:34)
>>>>
>>>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>>
>>>> at
>>>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>>
>>>> at org.apache.crunch.MapFn.process(MapFn.java:34)
>>>>
>>>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>>
>>>> at
>>>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>>
>>>> at
>>>> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:14)
>>>>
>>>> at
>>>> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:10)
>>>>
>>>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>>
>>>> at
>>>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>>
>>>> at org.apache.crunch.lib.join.InnerJoinFn.join(InnerJoinFn.java:66)
>>>>
>>>> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:79)
>>>>
>>>> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:32)
>>>>
>>>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>>
>>>> at
>>>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>>
>>>> at org.apache.crunch.MapFn.process(MapFn.java:34)
>>>>
>>>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>>
>>>> at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:113)
>>>>
>>>> at
>>>> org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:57)
>>>>
>>>> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
>>>>
>>>> at
>>>> org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>>>>
>>>> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>>>>
>>>> at
>>>> org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>>>>
>>>> at
>>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>
>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>
>>>> at
>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>
>>>> at
>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>
>>>> at java.lang.Thread.run(Thread.java:744)
>>>>
>>>> Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException:
>>>> org.apache.avro.UnresolvedUnionException: Not in union
>>>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>>>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>>>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>>>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>>>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>>>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>>
>>>> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)
>>>>
>>>> at
>>>> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:87)
>>>>
>>>> at
>>>> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:84)
>>>>
>>>> at org.apache.crunch.io.CrunchOutputs.write(CrunchOutputs.java:133)
>>>>
>>>> at
>>>> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:41)
>>>>
>>>> ... 28 more
>>>>
>>>> Caused by: org.apache.avro.UnresolvedUnionException: Not in union
>>>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>>>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>>>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>>>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>>>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>>>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>>
>>>> at
>>>> org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:561)
>>>>
>>>> at
>>>> org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:144)
>>>>
>>>> at
>>>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)
>>>>
>>>> at
>>>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>>
>>>> at
>>>> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>>>>
>>>> at
>>>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>>>>
>>>> at
>>>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>>
>>>> at
>>>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
>>>>
>>>> at
>>>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>>
>>>> at
>>>> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>>>>
>>>> at
>>>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>>>>
>>>> at
>>>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>>
>>>> at
>>>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
>>>>
>>>> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:257)
>>>>
>>>> ... 32 more
>>>>
>>>> 2015-04-13 15:49:16,876 INFO  [Thread-500] mapred.LocalJobRunner
>>>> (LocalJobRunner.java:runTasks(456)) - reduce task executor complete.
>>>>
>>>> 2015-04-13 15:49:16,879 WARN  [Thread-500] mapred.LocalJobRunner
>>>> (LocalJobRunner.java:run(560)) - job_local918028004_0008
>>>>
>>>> java.lang.Exception: org.apache.crunch.CrunchRuntimeException:
>>>> org.apache.avro.file.DataFileWriter$AppendWriteException:
>>>> org.apache.avro.UnresolvedUnionException: Not in union
>>>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>>>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>>>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>>>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>>>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>>>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>>
>>>> at
>>>> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>>>>
>>>> at
>>>> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
>>>>
>>>> Caused by: org.apache.crunch.CrunchRuntimeException:
>>>> org.apache.avro.file.DataFileWriter$AppendWriteException:
>>>> org.apache.avro.UnresolvedUnionException: Not in union
>>>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>>>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>>>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>>>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>>>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>>>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>>
>>>> at
>>>> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:45)
>>>>
>>>> at org.apache.crunch.MapFn.process(MapFn.java:34)
>>>>
>>>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>>
>>>> at
>>>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>>
>>>> at org.apache.crunch.MapFn.process(MapFn.java:34)
>>>>
>>>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>>
>>>> at
>>>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>>
>>>> at
>>>> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:14)
>>>>
>>>> at
>>>> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:10)
>>>>
>>>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>>
>>>> at
>>>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>>
>>>> at org.apache.crunch.lib.join.InnerJoinFn.join(InnerJoinFn.java:66)
>>>>
>>>> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:79)
>>>>
>>>> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:32)
>>>>
>>>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>>
>>>> at
>>>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>>
>>>> at org.apache.crunch.MapFn.process(MapFn.java:34)
>>>>
>>>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>>
>>>> at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:113)
>>>>
>>>> at
>>>> org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:57)
>>>>
>>>> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
>>>>
>>>> at
>>>> org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>>>>
>>>> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>>>>
>>>> at
>>>> org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>>>>
>>>> at
>>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>
>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>>
>>>> at
>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>>
>>>> at
>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>>
>>>> at java.lang.Thread.run(Thread.java:744)
>>>>
>>>> Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException:
>>>> org.apache.avro.UnresolvedUnionException: Not in union
>>>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>>>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>>>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>>>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>>>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>>>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>>
>>>> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)
>>>>
>>>> at
>>>> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:87)
>>>>
>>>> at
>>>> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:84)
>>>>
>>>> at org.apache.crunch.io.CrunchOutputs.write(CrunchOutputs.java:133)
>>>>
>>>> at
>>>> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:41)
>>>>
>>>> ... 28 more
>>>>
>>>> Caused by: org.apache.avro.UnresolvedUnionException: Not in union
>>>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>>>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>>>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>>>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>>>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>>>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>>
>>>> at
>>>> org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:561)
>>>>
>>>> at
>>>> org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:144)
>>>>
>>>> at
>>>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)
>>>>
>>>> at
>>>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>>
>>>> at
>>>> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>>>>
>>>> at
>>>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>>>>
>>>> at
>>>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>>
>>>> at
>>>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
>>>>
>>>> at
>>>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>>
>>>> at
>>>> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>>>>
>>>> at
>>>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>>>>
>>>> at
>>>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>>
>>>> at
>>>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
>>>>
>>>> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:257)
>>>>
>>>> ... 32 more
>>>>
>>>> 2 job failure(s) occurred:
>>>>
>>>> (5): Depending job with jobID 1 failed.
>>>>
>>>> com.apple.rsp.CrossValidation.CrossValidationDriver:
>>>> [[Text(/Users/luren/Lu/Project/model_testing/training_set... ID=1 (5/6)(1):
>>>> Job failed!
>>>>
>>>>
>>>
>>
>
>
> --
> Director of Data Science
> Cloudera <http://www.cloudera.com>
> Twitter: @josh_wills <http://twitter.com/josh_wills>
>

Re: org.apache.avro.UnresolvedUnionException

Posted by Josh Wills <jw...@cloudera.com>.
Oh, okay. Would you humor me and try to use parallelDo(...) instead of
mapValues(...) after the groupByKey() call and see if that works? I have
this weird feeling that mapValues is doing something it shouldn't be doing
to the Iterable.

J
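
For concreteness, a rough sketch of what the parallelDo(...) version might look like (same ABC / lgr names as in the earlier code; the DoFn shape and PType calls here are an untested assumption, not a verified fix):

// Hypothetical rewrite of the counting step using parallelDo(...) on the
// grouped table instead of mapValues(...).
PTable<String, String> lgr = ABC.groupByKey().parallelDo(
    new DoFn<Pair<String, Iterable<ABCData>>, Pair<String, String>>() {
      @Override
      public void process(Pair<String, Iterable<ABCData>> input,
                          Emitter<Pair<String, String>> emitter) {
        int counter = 0;
        for (ABCData d : input.second()) {
          counter++;
        }
        emitter.emit(Pair.of(input.first(), Integer.toString(counter)));
      }
    },
    Avros.tableOf(Avros.strings(), Avros.strings()));

lgr.write(At.textFile(output_path), WriteMode.OVERWRITE);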

On Mon, Apr 13, 2015 at 9:34 PM, Lucy Chen <lu...@gmail.com>
wrote:

> Hi Josh,
>
>          Thanks for your quick response. The codes should be as follows. I
> just renamed the LingPipeData and copied the codes in the email, I forgot
> to change a couple of places. I just simply change LingPipeData to ABCData
> to make it easier for you to understand. Here I used LingPipe package
> inside my Crunch jobs. I doubt whether the Vector included in the ABCData
> caused some troubles when it was serialized by an Avro type. However, when
> the codes exclude the parts after "*******" and just write ABC as an
> output. It worked fine; but after adding ABC.groupByKey().mapValues(...),
> it throws the exception.
>
>           Sorry about the typos in my last email.
>
>           Thanks!
>
> Lucy
>
> PType<ABCData> ABCDataType = Avros.records(ABCData.class);
>
> PTable<String, ABCData> ABC = input.mapValues(new ConvertToABCData(feat_index_mapping,
> addIntercept), ABCDataType);
>
>
> *******************************************************************************************************
>
>
> PTable<String, String> lgr = ABC.groupByKey().
>
>                        mapValues(new MapFn<Iterable<ABCData>, String> {
>
>                          @Override
>
> public String map(Iterable<ABCData> input)
>
> {
>
> Iterator<ABCData> ite1 = input.iterator();
>
> int counter=0;
>
> while(ite1.hasNext())
>
> {
>
> counter++;
>
> }
>
>  return Integer.toString(counter);
>
>
>                        }
>
>                           }, Avros.strings());
>
> lgr.write(At.textFile(output_path), WriteMode.OVERWRITE);
>
>
> ****************************************************************************************************************
>
>
> public class ConvertToABCData extends MapFn<InputType, ABCData>{
>
>
> private FeatIndexMapping feat_index_mapping;
>
> private boolean addIntercept;
>
>  public ConvertToABCData(FeatIndexMapping feat_index_mapping, boolean
> addIntercept)
>
> {
>
> this.feat_index_mapping = feat_index_mapping;
>
> this.addIntercept = addIntercept;
>
> }
>
>  @Override
>
> public  ABCData map(InputType input)
>
> {
>
> return new ABCData(input, feat_index_mapping, addIntercept);
>
>  }
>
>
> }
>
>
> public class ABCData implements java.io.Serializable, Cloneable{
>
>
> private int label;
>
> private Vector feature;
>
> private int dim;
>
> private final static Logger logger = Logger
>
>       .getLogger(ABCData.class.getName());
>
>         ......
>
> }
>
>
> On Mon, Apr 13, 2015 at 4:24 PM, Josh Wills <jo...@gmail.com> wrote:
>
>> Hey Lucy,
>>
>> I don't grok the last MapFn before the lgr gets written out; it looks
>> like it's defined over an Iterable<ABCData>, but the map() function defined
>> inside the class is over Iterable<LingPipeData>. I assume that's the source
>> of the problem-- the value that is getting printed out is the string form
>> of a LingPipeData object, which isn't what the system expects to see.
>>
>> J
>>
>> On Mon, Apr 13, 2015 at 7:12 PM, Lucy Chen <lu...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>>      I have an exception of org.apache.avro.UnresolvedUnionException
>>> thrown out by the following codes:
>>>
>>>
>>> PType<ABCData> ABCDataType = Avros.records(ABCData.class);
>>>
>>> PTable<String, ABCData> ABC = input.mapValues(new
>>> ConvertToABCData(feat_index_mapping, addIntercept), ABCDataType);
>>>
>>>
>>> *******************************************************************************************************
>>>
>>>
>>> PTable<String, String> lgr = ABC.groupByKey().
>>>
>>>                        mapValues(new MapFn<Iterable<ABCData>, String> {
>>>
>>>                          @Override
>>>
>>> public String map(Iterable<LingPipeData> input)
>>>
>>> {
>>>
>>> Iterator<LingPipeData> ite1 = input.iterator();
>>>
>>> int counter=0;
>>>
>>> while(ite1.hasNext())
>>>
>>> {
>>>
>>> counter++;
>>>
>>> }
>>>
>>>  return Integer.toString(counter);
>>>
>>>
>>>                        }
>>>
>>>                           }, Avros.strings());
>>>
>>> lgr.write(At.textFile(output_path), WriteMode.OVERWRITE);
>>>
>>>
>>> ****************************************************************************************************************
>>>
>>>
>>> public class ConvertToABCData extends MapFn<InputType, ABCData>{
>>>
>>>
>>> private FeatIndexMapping feat_index_mapping;
>>>
>>> private boolean addIntercept;
>>>
>>>  public ConvertToABCData(FeatIndexMapping feat_index_mapping, boolean
>>> addIntercept)
>>>
>>> {
>>>
>>> this.feat_index_mapping = feat_index_mapping;
>>>
>>> this.addIntercept = addIntercept;
>>>
>>> }
>>>
>>>  @Override
>>>
>>> public  ABCData map(InputType input)
>>>
>>> {
>>>
>>> return new ABCData(input, feat_index_mapping, addIntercept);
>>>
>>>  }
>>>
>>>
>>> }
>>>
>>>
>>> public class ABCData implements java.io.Serializable, Cloneable{
>>>
>>>
>>> private int label;
>>>
>>> private Vector feature;
>>>
>>> private int dim;
>>>
>>> private final static Logger logger = Logger
>>>
>>>        .getLogger(ABCData.class.getName());
>>>
>>>         ......
>>>
>>> }
>>>
>>>
>>> Here Vector is defined from third party: com.aliasi.matrix.Vector; The
>>> codes can run well until the line of star. But when the codes include
>>> ABC.groupByKey().mapValues(), the following exception will be caught. Can
>>> any one tell me how to solve the problem?
>>>
>>>
>>> Thanks.
>>>
>>>
>>> Lucy
>>>
>>>
>>>  The logs look like:
>>>
>>>
>>> org.apache.crunch.CrunchRuntimeException:
>>> org.apache.avro.file.DataFileWriter$AppendWriteException:
>>> org.apache.avro.UnresolvedUnionException: Not in union
>>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>
>>> at
>>> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:45)
>>>
>>> at org.apache.crunch.MapFn.process(MapFn.java:34)
>>>
>>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>
>>> at
>>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>
>>> at org.apache.crunch.MapFn.process(MapFn.java:34)
>>>
>>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>
>>> at
>>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>
>>> at
>>> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:14)
>>>
>>> at
>>> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:10)
>>>
>>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>
>>> at
>>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>
>>> at org.apache.crunch.lib.join.InnerJoinFn.join(InnerJoinFn.java:66)
>>>
>>> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:79)
>>>
>>> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:32)
>>>
>>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>
>>> at
>>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>
>>> at org.apache.crunch.MapFn.process(MapFn.java:34)
>>>
>>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>
>>> at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:113)
>>>
>>> at
>>> org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:57)
>>>
>>> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
>>>
>>> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>>>
>>> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>>>
>>> at
>>> org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>>>
>>> at
>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>
>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>
>>> at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>
>>> at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>
>>> at java.lang.Thread.run(Thread.java:744)
>>>
>>> Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException:
>>> org.apache.avro.UnresolvedUnionException: Not in union
>>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>
>>> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)
>>>
>>> at
>>> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:87)
>>>
>>> at
>>> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:84)
>>>
>>> at org.apache.crunch.io.CrunchOutputs.write(CrunchOutputs.java:133)
>>>
>>> at
>>> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:41)
>>>
>>> ... 28 more
>>>
>>> Caused by: org.apache.avro.UnresolvedUnionException: Not in union
>>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>
>>> at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:561)
>>>
>>> at
>>> org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:144)
>>>
>>> at
>>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)
>>>
>>> at
>>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>
>>> at
>>> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>>>
>>> at
>>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>>>
>>> at
>>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>
>>> at
>>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
>>>
>>> at
>>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>
>>> at
>>> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>>>
>>> at
>>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>>>
>>> at
>>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>
>>> at
>>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
>>>
>>> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:257)
>>>
>>> ... 32 more
>>>
>>> 2015-04-13 15:49:16,876 INFO  [Thread-500] mapred.LocalJobRunner
>>> (LocalJobRunner.java:runTasks(456)) - reduce task executor complete.
>>>
>>> 2015-04-13 15:49:16,879 WARN  [Thread-500] mapred.LocalJobRunner
>>> (LocalJobRunner.java:run(560)) - job_local918028004_0008
>>>
>>> java.lang.Exception: org.apache.crunch.CrunchRuntimeException:
>>> org.apache.avro.file.DataFileWriter$AppendWriteException:
>>> org.apache.avro.UnresolvedUnionException: Not in union
>>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>
>>> at
>>> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>>>
>>> at
>>> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
>>>
>>> Caused by: org.apache.crunch.CrunchRuntimeException:
>>> org.apache.avro.file.DataFileWriter$AppendWriteException:
>>> org.apache.avro.UnresolvedUnionException: Not in union
>>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>
>>> at
>>> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:45)
>>>
>>> at org.apache.crunch.MapFn.process(MapFn.java:34)
>>>
>>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>
>>> at
>>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>
>>> at org.apache.crunch.MapFn.process(MapFn.java:34)
>>>
>>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>
>>> at
>>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>
>>> at
>>> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:14)
>>>
>>> at
>>> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:10)
>>>
>>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>
>>> at
>>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>
>>> at org.apache.crunch.lib.join.InnerJoinFn.join(InnerJoinFn.java:66)
>>>
>>> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:79)
>>>
>>> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:32)
>>>
>>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>
>>> at
>>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>>
>>> at org.apache.crunch.MapFn.process(MapFn.java:34)
>>>
>>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>>
>>> at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:113)
>>>
>>> at
>>> org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:57)
>>>
>>> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
>>>
>>> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>>>
>>> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>>>
>>> at
>>> org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>>>
>>> at
>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>
>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>
>>> at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>
>>> at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>
>>> at java.lang.Thread.run(Thread.java:744)
>>>
>>> Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException:
>>> org.apache.avro.UnresolvedUnionException: Not in union
>>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>
>>> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)
>>>
>>> at
>>> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:87)
>>>
>>> at
>>> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:84)
>>>
>>> at org.apache.crunch.io.CrunchOutputs.write(CrunchOutputs.java:133)
>>>
>>> at
>>> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:41)
>>>
>>> ... 28 more
>>>
>>> Caused by: org.apache.avro.UnresolvedUnionException: Not in union
>>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>>
>>> at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:561)
>>>
>>> at
>>> org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:144)
>>>
>>> at
>>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)
>>>
>>> at
>>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>
>>> at
>>> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>>>
>>> at
>>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>>>
>>> at
>>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>
>>> at
>>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
>>>
>>> at
>>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>
>>> at
>>> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>>>
>>> at
>>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>>>
>>> at
>>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>>
>>> at
>>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
>>>
>>> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:257)
>>>
>>> ... 32 more
>>>
>>> 2 job failure(s) occurred:
>>>
>>> (5): Depending job with jobID 1 failed.
>>>
>>> com.apple.rsp.CrossValidation.CrossValidationDriver:
>>> [[Text(/Users/luren/Lu/Project/model_testing/training_set... ID=1 (5/6)(1):
>>> Job failed!
>>>
>>>
>>
>


-- 
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>

Re: org.apache.avro.UnresolvedUnionException

Posted by Lucy Chen <lu...@gmail.com>.
Hi Josh,

         Thanks for your quick response. The code should read as follows. When
I copied the code into the email I renamed LingPipeData to ABCData to make it
easier to follow, but forgot to change it in a couple of places. I use the
LingPipe package inside my Crunch jobs, and I suspect the Vector field inside
ABCData is causing trouble when it is serialized as an Avro type. When the
code excludes everything after the "*******" line and just writes ABC as
output, it works fine; but after adding ABC.groupByKey().mapValues(...), it
throws the exception.

          Sorry about the typos in my last email.

          Thanks!

Lucy

PType<ABCData> ABCDataType = Avros.records(ABCData.class);

PTable<String, ABCData> ABC = input.mapValues(
    new ConvertToABCData(feat_index_mapping, addIntercept), ABCDataType);

*******************************************************************************************************


PTable<String, String> lgr = ABC.groupByKey().mapValues(
    new MapFn<Iterable<ABCData>, String>() {
      @Override
      public String map(Iterable<ABCData> input) {
        Iterator<ABCData> ite1 = input.iterator();
        int counter = 0;
        while (ite1.hasNext()) {
          ite1.next();  // advance the iterator; otherwise the loop never ends
          counter++;
        }
        return Integer.toString(counter);
      }
    }, Avros.strings());

lgr.write(At.textFile(output_path), WriteMode.OVERWRITE);

****************************************************************************************************************


public class ConvertToABCData extends MapFn<InputType, ABCData> {

  private FeatIndexMapping feat_index_mapping;
  private boolean addIntercept;

  public ConvertToABCData(FeatIndexMapping feat_index_mapping, boolean addIntercept) {
    this.feat_index_mapping = feat_index_mapping;
    this.addIntercept = addIntercept;
  }

  @Override
  public ABCData map(InputType input) {
    return new ABCData(input, feat_index_mapping, addIntercept);
  }
}


public class ABCData implements java.io.Serializable, Cloneable {

  private int label;
  private Vector feature;
  private int dim;

  private final static Logger logger = Logger.getLogger(ABCData.class.getName());

  ......
}
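
If the Vector field is indeed what trips up Avro here (the union in the exception shows it reflected into a record with no fields), one possible workaround (only a sketch, with illustrative field names) is to keep the sparse vector in Avro-friendly primitives inside ABCData and rebuild the com.aliasi.matrix.Vector only where it is actually used:

// Hypothetical variant of ABCData that stores the sparse feature vector as
// parallel primitive arrays, which Avro reflection can serialize, instead of
// holding a com.aliasi.matrix.Vector field directly.
public class ABCData implements java.io.Serializable, Cloneable {

  private int label;
  private int dim;
  // Non-zero positions and their values (illustrative names).
  private int[] featureIndices;
  private double[] featureValues;

  public ABCData() { }  // no-arg constructor for Avro reflection

  public ABCData(int label, int dim, int[] featureIndices, double[] featureValues) {
    this.label = label;
    this.dim = dim;
    this.featureIndices = featureIndices;
    this.featureValues = featureValues;
  }

  // Rebuild the LingPipe Vector from these arrays wherever it is needed;
  // the exact Vector implementation to construct is left out here.
}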


On Mon, Apr 13, 2015 at 4:24 PM, Josh Wills <jo...@gmail.com> wrote:

> Hey Lucy,
>
> I don't grok the last MapFn before the lgr gets written out; it looks like
> it's defined over an Iterable<ABCData>, but the map() function defined
> inside the class is over Iterable<LingPipeData>. I assume that's the source
> of the problem-- the value that is getting printed out is the string form
> of a LingPipeData object, which isn't what the system expects to see.
>
> J
>
> On Mon, Apr 13, 2015 at 7:12 PM, Lucy Chen <lu...@gmail.com>
> wrote:
>
>> Hi,
>>
>>      I have an exception of org.apache.avro.UnresolvedUnionException
>> thrown out by the following codes:
>>
>>
>> PType<ABCData> ABCDataType = Avros.records(ABCData.class);
>>
>> PTable<String, ABCData> ABC = input.mapValues(new
>> ConvertToABCData(feat_index_mapping, addIntercept), ABCDataType);
>>
>>
>> *******************************************************************************************************
>>
>>
>> PTable<String, String> lgr = ABC.groupByKey().
>>
>>                        mapValues(new MapFn<Iterable<ABCData>, String> {
>>
>>                          @Override
>>
>> public String map(Iterable<LingPipeData> input)
>>
>> {
>>
>> Iterator<LingPipeData> ite1 = input.iterator();
>>
>> int counter=0;
>>
>> while(ite1.hasNext())
>>
>> {
>>
>> counter++;
>>
>> }
>>
>>  return Integer.toString(counter);
>>
>>
>>                        }
>>
>>                           }, Avros.strings());
>>
>> lgr.write(At.textFile(output_path), WriteMode.OVERWRITE);
>>
>>
>> ****************************************************************************************************************
>>
>>
>> public class ConvertToABCData extends MapFn<InputType, ABCData>{
>>
>>
>> private FeatIndexMapping feat_index_mapping;
>>
>> private boolean addIntercept;
>>
>>  public ConvertToABCData(FeatIndexMapping feat_index_mapping, boolean
>> addIntercept)
>>
>> {
>>
>> this.feat_index_mapping = feat_index_mapping;
>>
>> this.addIntercept = addIntercept;
>>
>> }
>>
>>  @Override
>>
>> public  ABCData map(InputType input)
>>
>> {
>>
>> return new ABCData(input, feat_index_mapping, addIntercept);
>>
>>  }
>>
>>
>> }
>>
>>
>> public class ABCData implements java.io.Serializable, Cloneable{
>>
>>
>> private int label;
>>
>> private Vector feature;
>>
>> private int dim;
>>
>> private final static Logger logger = Logger
>>
>>        .getLogger(ABCData.class.getName());
>>
>>         ......
>>
>> }
>>
>>
>> Here Vector is defined from third party: com.aliasi.matrix.Vector; The
>> codes can run well until the line of star. But when the codes include
>> ABC.groupByKey().mapValues(), the following exception will be caught. Can
>> any one tell me how to solve the problem?
>>
>>
>> Thanks.
>>
>>
>> Lucy
>>
>>
>>  The logs look like:
>>
>>
>> org.apache.crunch.CrunchRuntimeException:
>> org.apache.avro.file.DataFileWriter$AppendWriteException:
>> org.apache.avro.UnresolvedUnionException: Not in union
>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>
>> at
>> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:45)
>>
>> at org.apache.crunch.MapFn.process(MapFn.java:34)
>>
>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>
>> at
>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>
>> at org.apache.crunch.MapFn.process(MapFn.java:34)
>>
>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>
>> at
>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>
>> at
>> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:14)
>>
>> at
>> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:10)
>>
>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>
>> at
>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>
>> at org.apache.crunch.lib.join.InnerJoinFn.join(InnerJoinFn.java:66)
>>
>> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:79)
>>
>> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:32)
>>
>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>
>> at
>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>
>> at org.apache.crunch.MapFn.process(MapFn.java:34)
>>
>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>
>> at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:113)
>>
>> at
>> org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:57)
>>
>> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
>>
>> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>>
>> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>>
>> at
>> org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>>
>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>
>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>
>> at java.lang.Thread.run(Thread.java:744)
>>
>> Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException:
>> org.apache.avro.UnresolvedUnionException: Not in union
>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>
>> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)
>>
>> at
>> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:87)
>>
>> at
>> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:84)
>>
>> at org.apache.crunch.io.CrunchOutputs.write(CrunchOutputs.java:133)
>>
>> at
>> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:41)
>>
>> ... 28 more
>>
>> Caused by: org.apache.avro.UnresolvedUnionException: Not in union
>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>
>> at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:561)
>>
>> at
>> org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:144)
>>
>> at
>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)
>>
>> at
>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>
>> at
>> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>>
>> at
>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>>
>> at
>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>
>> at
>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
>>
>> at
>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>
>> at
>> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>>
>> at
>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>>
>> at
>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>
>> at
>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
>>
>> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:257)
>>
>> ... 32 more
>>
>> 2015-04-13 15:49:16,876 INFO  [Thread-500] mapred.LocalJobRunner
>> (LocalJobRunner.java:runTasks(456)) - reduce task executor complete.
>>
>> 2015-04-13 15:49:16,879 WARN  [Thread-500] mapred.LocalJobRunner
>> (LocalJobRunner.java:run(560)) - job_local918028004_0008
>>
>> java.lang.Exception: org.apache.crunch.CrunchRuntimeException:
>> org.apache.avro.file.DataFileWriter$AppendWriteException:
>> org.apache.avro.UnresolvedUnionException: Not in union
>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>
>> at
>> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>>
>> at
>> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
>>
>> Caused by: org.apache.crunch.CrunchRuntimeException:
>> org.apache.avro.file.DataFileWriter$AppendWriteException:
>> org.apache.avro.UnresolvedUnionException: Not in union
>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>
>> at
>> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:45)
>>
>> at org.apache.crunch.MapFn.process(MapFn.java:34)
>>
>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>
>> at
>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>
>> at org.apache.crunch.MapFn.process(MapFn.java:34)
>>
>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>
>> at
>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>
>> at
>> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:14)
>>
>> at
>> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:10)
>>
>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>
>> at
>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>
>> at org.apache.crunch.lib.join.InnerJoinFn.join(InnerJoinFn.java:66)
>>
>> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:79)
>>
>> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:32)
>>
>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>
>> at
>> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>>
>> at org.apache.crunch.MapFn.process(MapFn.java:34)
>>
>> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>>
>> at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:113)
>>
>> at
>> org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:57)
>>
>> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
>>
>> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>>
>> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>>
>> at
>> org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>>
>> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>
>> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>
>> at java.lang.Thread.run(Thread.java:744)
>>
>> Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException:
>> org.apache.avro.UnresolvedUnionException: Not in union
>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>
>> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)
>>
>> at
>> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:87)
>>
>> at
>> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:84)
>>
>> at org.apache.crunch.io.CrunchOutputs.write(CrunchOutputs.java:133)
>>
>> at
>> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:41)
>>
>> ... 28 more
>>
>> Caused by: org.apache.avro.UnresolvedUnionException: Not in union
>> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
>> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
>> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
>> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
>> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
>> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>>
>> at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:561)
>>
>> at
>> org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:144)
>>
>> at
>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)
>>
>> at
>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>
>> at
>> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>>
>> at
>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>>
>> at
>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>
>> at
>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
>>
>> at
>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>
>> at
>> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>>
>> at
>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>>
>> at
>> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>>
>> at
>> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
>>
>> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:257)
>>
>> ... 32 more
>>
>> 2 job failure(s) occurred:
>>
>> (5): Depending job with jobID 1 failed.
>>
>> com.apple.rsp.CrossValidation.CrossValidationDriver:
>> [[Text(/Users/luren/Lu/Project/model_testing/training_set... ID=1 (5/6)(1):
>> Job failed!
>>
>>
>

Re: org.apache.avro.UnresolvedUnionException

Posted by Josh Wills <jo...@gmail.com>.
Hey Lucy,

I don't grok the last MapFn before lgr gets written out; the class is
declared over Iterable<ABCData>, but the map() function inside it is
defined over Iterable<LingPipeData>. I assume that's the source of the
problem -- the value that is getting printed out is the string form of a
LingPipeData object, which isn't what the system expects to see.
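
For reference, a minimal sketch of that counting step with the generics
aligned (this assumes ABCData really is the grouped value type; note the
original while loop also never calls next() on the iterator, so it would
spin forever even once the types agree):

PTable<String, String> lgr = ABC.groupByKey().mapValues(
    new MapFn<Iterable<ABCData>, String>() {
      @Override
      public String map(Iterable<ABCData> input) {
        // count the grouped ABCData values, letting for-each advance the iterator
        int counter = 0;
        for (ABCData value : input) {
          counter++;
        }
        return Integer.toString(counter);
      }
    }, Avros.strings());

lgr.write(At.textFile(output_path), WriteMode.OVERWRITE);

Same imports as in your snippet; the only changes are the matching
Iterable<ABCData> type parameter, the missing () on the anonymous class,
and actually advancing the iterator.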

J

On Mon, Apr 13, 2015 at 7:12 PM, Lucy Chen <lu...@gmail.com>
wrote:

> Hi,
>
>      I have an exception of org.apache.avro.UnresolvedUnionException
> thrown out by the following codes:
>
>
> PType<ABCData> ABCDataType = Avros.records(ABCData.class);
>
> PTable<String, ABCData> ABC = input.mapValues(new
> ConvertToABCData(feat_index_mapping, addIntercept), ABCDataType);
>
>
> *******************************************************************************************************
>
>
> PTable<String, String> lgr = ABC.groupByKey().
>
>                        mapValues(new MapFn<Iterable<ABCData>, String> {
>
>                          @Override
>
> public String map(Iterable<LingPipeData> input)
>
> {
>
> Iterator<LingPipeData> ite1 = input.iterator();
>
> int counter=0;
>
> while(ite1.hasNext())
>
> {
>
> counter++;
>
> }
>
>  return Integer.toString(counter);
>
>
>                        }
>
>                           }, Avros.strings());
>
> lgr.write(At.textFile(output_path), WriteMode.OVERWRITE);
>
>
> ****************************************************************************************************************
>
>
> public class ConvertToABCData extends MapFn<InputType, ABCData>{
>
>
> private FeatIndexMapping feat_index_mapping;
>
> private boolean addIntercept;
>
>  public ConvertToABCData(FeatIndexMapping feat_index_mapping, boolean
> addIntercept)
>
> {
>
> this.feat_index_mapping = feat_index_mapping;
>
> this.addIntercept = addIntercept;
>
> }
>
>  @Override
>
> public  ABCData map(InputType input)
>
> {
>
> return new ABCData(input, feat_index_mapping, addIntercept);
>
>  }
>
>
> }
>
>
> public class ABCData implements java.io.Serializable, Cloneable{
>
>
> private int label;
>
> private Vector feature;
>
> private int dim;
>
> private final static Logger logger = Logger
>
>        .getLogger(ABCData.class.getName());
>
>         ......
>
> }
>
>
> Here Vector is defined from third party: com.aliasi.matrix.Vector; The
> codes can run well until the line of star. But when the codes include
> ABC.groupByKey().mapValues(), the following exception will be caught. Can
> any one tell me how to solve the problem?
>
>
> Thanks.
>
>
> Lucy
>
>
>  The logs look like:
>
>
> org.apache.crunch.CrunchRuntimeException:
> org.apache.avro.file.DataFileWriter$AppendWriteException:
> org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at
> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:45)
>
> at org.apache.crunch.MapFn.process(MapFn.java:34)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at org.apache.crunch.MapFn.process(MapFn.java:34)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at
> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:14)
>
> at
> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:10)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at org.apache.crunch.lib.join.InnerJoinFn.join(InnerJoinFn.java:66)
>
> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:79)
>
> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:32)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at org.apache.crunch.MapFn.process(MapFn.java:34)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:113)
>
> at
> org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:57)
>
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
>
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>
> at
> org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>
> at java.lang.Thread.run(Thread.java:744)
>
> Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException:
> org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)
>
> at
> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:87)
>
> at
> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:84)
>
> at org.apache.crunch.io.CrunchOutputs.write(CrunchOutputs.java:133)
>
> at
> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:41)
>
> ... 28 more
>
> Caused by: org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:561)
>
> at
> org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:144)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
>
> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:257)
>
> ... 32 more
>
> 2015-04-13 15:49:16,876 INFO  [Thread-500] mapred.LocalJobRunner
> (LocalJobRunner.java:runTasks(456)) - reduce task executor complete.
>
> 2015-04-13 15:49:16,879 WARN  [Thread-500] mapred.LocalJobRunner
> (LocalJobRunner.java:run(560)) - job_local918028004_0008
>
> java.lang.Exception: org.apache.crunch.CrunchRuntimeException:
> org.apache.avro.file.DataFileWriter$AppendWriteException:
> org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>
> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
>
> Caused by: org.apache.crunch.CrunchRuntimeException:
> org.apache.avro.file.DataFileWriter$AppendWriteException:
> org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at
> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:45)
>
> at org.apache.crunch.MapFn.process(MapFn.java:34)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at org.apache.crunch.MapFn.process(MapFn.java:34)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at
> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:14)
>
> at
> com.apple.rsp.Utils.RetrieveDataFromJoin.process(RetrieveDataFromJoin.java:10)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at org.apache.crunch.lib.join.InnerJoinFn.join(InnerJoinFn.java:66)
>
> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:79)
>
> at org.apache.crunch.lib.join.JoinFn.process(JoinFn.java:32)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at
> org.apache.crunch.impl.mr.emit.IntermediateEmitter.emit(IntermediateEmitter.java:56)
>
> at org.apache.crunch.MapFn.process(MapFn.java:34)
>
> at org.apache.crunch.impl.mr.run.RTNode.process(RTNode.java:98)
>
> at org.apache.crunch.impl.mr.run.RTNode.processIterable(RTNode.java:113)
>
> at
> org.apache.crunch.impl.mr.run.CrunchReducer.reduce(CrunchReducer.java:57)
>
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
>
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>
> at
> org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>
> at java.lang.Thread.run(Thread.java:744)
>
> Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException:
> org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:263)
>
> at
> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:87)
>
> at
> org.apache.crunch.types.avro.AvroOutputFormat$1.write(AvroOutputFormat.java:84)
>
> at org.apache.crunch.io.CrunchOutputs.write(CrunchOutputs.java:133)
>
> at
> org.apache.crunch.impl.mr.emit.MultipleOutputEmitter.emit(MultipleOutputEmitter.java:41)
>
> ... 28 more
>
> Caused by: org.apache.avro.UnresolvedUnionException: Not in union
> ["null",{"type":"record","name":"Vector","namespace":"com.aliasi.matrix","fields":[]}]:
> 0=1.0 8=0.0917 9=0.0734 14=0.0336 22=0.0485 36=0.0795 40=0.0611 59=0.079
> 101=0.1065 127=0.1101 131=0.0969 135=0.1016 151=0.079 154=0.1847 177=0.0858
> 199=0.1131 200=0.0485 269=0.1096 271=0.1275 335=0.1299 588=0.165 799=0.2264
> 1200=0.1321 1286=0.2796 1482=0.1299 1702=0.4409 2170=0.2236 3644=0.2319
> 4824=0.2624 5040=0.3815 5584=0.2258 5937=0.2466
>
> at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:561)
>
> at
> org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:144)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:71)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:106)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:66)
>
> at
> org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:104)
>
> at
> org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:58)
>
> at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:257)
>
> ... 32 more
>
> 2 job failure(s) occurred:
>
> (5): Depending job with jobID 1 failed.
>
> com.apple.rsp.CrossValidation.CrossValidationDriver:
> [[Text(/Users/luren/Lu/Project/model_testing/training_set... ID=1 (5/6)(1):
> Job failed!
>
>