Posted to user@pig.apache.org by Matt Christiansen <ad...@nikore.net> on 2013/09/04 03:55:50 UTC

SchemaTuple doesn't seem to work on YARN

Hello; we are running a YARN cluster, and I tried out the
pig.schematuple option on it (yes, I know it's experimental).
It doesn't seem to work: the jobs all error out with:


2013-09-03 15:20:49,732 INFO [main]
org.apache.pig.data.SchemaTupleBackend: Copying files in key
[pig.schematuple.classes] from distributed cache:
SchemaTuple_14$1.class,SchemaTuple_63.class,SchemaTuple_30.class,SchemaTuple_9$1.class,SchemaTuple_36.class,SchemaTuple_5$1.class,SchemaTuple_49.class,SchemaTuple_47.class,SchemaTuple_8.class,SchemaTuple_23.class,SchemaTuple_19$1.class,SchemaTuple_32.class,SchemaTuple_11.class,SchemaTuple_71.class,SchemaTuple_23$1.class,SchemaTuple_73.class,SchemaTuple_1$1.class,SchemaTuple_58.class,SchemaTuple_65.class,SchemaTuple_75$1.class,SchemaTuple_26.class,SchemaTuple_6.class,SchemaTuple_10$1.class,SchemaTuple_1.class,SchemaTuple_58$1.class,SchemaTuple_51$1.class,SchemaTuple_3$1.class,SchemaTuple_67$1.class,SchemaTuple_35.class,SchemaTuple_38.class,SchemaTuple_11$1.class,SchemaTuple_74$1.class,SchemaTuple_45$1.class,SchemaTuple_12$1.class,SchemaTuple_0$1.class,SchemaTuple_10.class,SchemaTuple_14.class,SchemaTuple_76$1.class,SchemaTuple_49$1.class,SchemaTuple_30$1.class,SchemaTuple_56.class,SchemaTuple_48$1.class,SchemaTuple_43$1.class,SchemaTuple_16$1.class,SchemaTuple_61$1.class,SchemaTuple_15$1.class,SchemaTuple_21$1.class,SchemaTuple_59$1.class,SchemaTuple_76.class,SchemaTuple_50$1.class,SchemaTuple_64$1.class,SchemaTuple_44$1.class,SchemaTuple_72.class,SchemaTuple_51.class,SchemaTuple_25$1.class,SchemaTuple_43.class,SchemaTuple_57.class,SchemaTuple_62$1.class,SchemaTuple_16.class,SchemaTuple_66.class,SchemaTuple_57$1.class,SchemaTuple_20$1.class,SchemaTuple_33.class,SchemaTuple_21.class,SchemaTuple_68.class,SchemaTuple_74.class,SchemaTuple_48.class,SchemaTuple_54$1.class,SchemaTuple_19.class,SchemaTuple_71$1.class,SchemaTuple_38$1.class,SchemaTuple_42.class,SchemaTuple_18.class,SchemaTuple_37$1.class,SchemaTuple_39$1.class,SchemaTuple_64.class,SchemaTuple_41$1.class,SchemaTuple_52$1.class,SchemaTuple_7.class,SchemaTuple_28$1.class,SchemaTuple_13.class,SchemaTuple_69.class,SchemaTuple_72$1.class,SchemaTuple_41.class,SchemaTuple_56$1.class,SchemaTuple_0.class,SchemaTuple_53.class,SchemaTuple_60.class,SchemaTuple_40.class,SchemaTuple_66$1.class,SchemaTuple_24$1.class,SchemaTuple_60$1.class,SchemaTuple_2$1.class,SchemaTuple_47$1.class,SchemaTuple_28.class,SchemaTuple_68$1.class,SchemaTuple_39.class,SchemaTuple_6$1.class,SchemaTuple_69$1.class,SchemaTuple_50.class,SchemaTuple_40$1.class,SchemaTuple_62.class,SchemaTuple_31$1.class,SchemaTuple_46$1.class,SchemaTuple_20.class,SchemaTuple_13$1.class,SchemaTuple_37.class,SchemaTuple_24.class,SchemaTuple_8$1.class,SchemaTuple_9.class,SchemaTuple_22$1.class,SchemaTuple_46.class,SchemaTuple_65$1.class,SchemaTuple_29.class,SchemaTuple_22.class,SchemaTuple_29$1.class,SchemaTuple_67.class,SchemaTuple_45.class,SchemaTuple_44.class,SchemaTuple_4$1.class,SchemaTuple_18$1.class,SchemaTuple_12.class,SchemaTuple_17$1.class,SchemaTuple_34.class,SchemaTuple_53$1.class,SchemaTuple_35$1.class,SchemaTuple_75.class,SchemaTuple_31.class,SchemaTuple_70.class,SchemaTuple_7$1.class,SchemaTuple_32$1.class,SchemaTuple_17.class,SchemaTuple_5.class,SchemaTuple_61.class,SchemaTuple_25.class,SchemaTuple_63$1.class,SchemaTuple_70$1.class,SchemaTuple_55$1.class,SchemaTuple_55.class,SchemaTuple_4.class,SchemaTuple_34$1.class,SchemaTuple_59.class,SchemaTuple_36$1.class,SchemaTuple_2.class,SchemaTuple_3.class,SchemaTuple_54.class,SchemaTuple_73$1.class,SchemaTuple_27$1.class,SchemaTuple_27.class,SchemaTuple_15.class,SchemaTuple_26$1.class,SchemaTuple_33$1.class,SchemaTuple_52.class,SchemaTuple_42$1.class
2013-09-03 15:20:49,732 INFO [main]
org.apache.pig.data.SchemaTupleBackend: Attempting to read file:
SchemaTuple_14$1.class
2013-09-03 15:20:49,732 ERROR [main]
org.apache.hadoop.security.UserGroupInformation:
PriviledgedActionException as:candiru (auth:SIMPLE)
cause:java.io.FileNotFoundException: SchemaTuple_14$1.class (No such
file or directory)
2013-09-03 15:20:49,733 WARN [main]
org.apache.hadoop.mapred.YarnChild: Exception running child :
java.io.FileNotFoundException: SchemaTuple_14$1.class (No such file or
directory)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at org.apache.pig.data.SchemaTupleBackend.copyAllFromDistributedCache(SchemaTupleBackend.java:187)
        at org.apache.pig.data.SchemaTupleBackend.copyAndResolve(SchemaTupleBackend.java:160)
        at org.apache.pig.data.SchemaTupleBackend.initialize(SchemaTupleBackend.java:278)
        at org.apache.pig.data.SchemaTupleBackend.initialize(SchemaTupleBackend.java:268)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.setup(PigGenericMapBase.java:175)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:338)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)

2013-09-03 15:20:49,740 INFO [main] org.apache.hadoop.mapred.Task:
Runnning cleanup for the task


I was wondering if anyone has had any luck with this feature on a
YARN cluster, or has any ideas about what I could do to get it to work.

Re: SchemaTuple doesn't seem to work on YARN

Posted by Jonathan Coveney <jc...@gmail.com>.
Hello!

I implemented the SchemaTuple stuff. Glad to hear you're trying it out! I
did not test it with YARN at all. It looks like the way the filesystem and
distributed cache work has changed. I am not very familiar with those
changes myself, but perhaps there is documentation on how it differs? The
way it works is that when the Pig script is processed, a bunch of code is
generated and added to the distributed cache. Then each mapper and reducer
copies that code locally so that a URLClassLoader can load it. Any piece
of that could have changed...
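
To make the three steps above concrete (generate code, publish it, copy it
locally and load it through a URLClassLoader), here is a minimal standalone
sketch. It is not Pig's SchemaTupleBackend: the class name
SchemaTupleLoadingSketch, the generated class GeneratedTupleSketch, and the
use of plain local directories in place of the distributed cache are all
assumptions made for illustration. It needs a JDK, because it compiles the
generated source at run time.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical sketch of "generate, publish, copy locally, load".
// Plain directories stand in for the distributed cache and the task's
// local working directory; none of this is Pig's actual code.
class SchemaTupleLoadingSketch {

    // Hypothetical name standing in for a generated class such as SchemaTuple_0.
    static final String CLASS_NAME = "GeneratedTupleSketch";

    static String generateCopyAndLoad() {
        try {
            Path work = Files.createTempDirectory("schematuple-sketch");
            Path cacheDir = Files.createDirectories(work.resolve("cache"));
            Path localDir = Files.createDirectories(work.resolve("local"));

            // Step 1: generate source and compile it into the "cache" directory.
            Path source = work.resolve(CLASS_NAME + ".java");
            String code = "public class " + CLASS_NAME
                    + " { public String toString() { return \"generated tuple\"; } }";
            Files.write(source, code.getBytes(StandardCharsets.UTF_8));
            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
            if (compiler == null) {
                throw new IllegalStateException("a JDK (not only a JRE) is required");
            }
            int status = compiler.run(null, null, null,
                    "-d", cacheDir.toString(), source.toString());
            if (status != 0) {
                throw new IllegalStateException("compilation failed");
            }

            // Step 2: copy the class file from the "cache" to the local directory.
            // This fails if the file is not where the task expects it to be.
            String classFile = CLASS_NAME + ".class";
            Files.copy(cacheDir.resolve(classFile), localDir.resolve(classFile));

            // Step 3: load the class from the local directory only.
            // A null parent means only the bootstrap loader is consulted first.
            URL[] roots = { localDir.toUri().toURL() };
            try (URLClassLoader loader = new URLClassLoader(roots, null)) {
                Class<?> loaded = loader.loadClass(CLASS_NAME);
                return loaded.getDeclaredConstructor().newInstance().toString();
            }
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException("sketch failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(generateCopyAndLoad());
    }
}
```

In the stack trace from the original message, the FileNotFoundException is
raised inside copyAllFromDistributedCache, which corresponds to step 2 of
this sketch: the task opens the class file by name and does not find it.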

