Posted to issues@spark.apache.org by "Poorvi Lashkary (JIRA)" <ji...@apache.org> on 2015/08/04 12:42:04 UTC

[jira] [Comment Edited] (SPARK-9594) Failed to get broadcast_33_piece0 while using Accumulators in UDF

    [ https://issues.apache.org/jira/browse/SPARK-9594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653440#comment-14653440 ] 

Poorvi Lashkary edited comment on SPARK-9594 at 8/4/15 10:41 AM:
-----------------------------------------------------------------

Use case: I need to create an auto-increment sequence column for a DataFrame. I created a UDF that increments an accumulator's value and returns it; an accumulator is used so that all executors share the same counter.
Sample code snippet:

static Accumulator<Integer> startCtr = sc.accumulator(0);
sqlContext.udf().register("seq", new UDF1<Integer, Integer>() {
    public Integer call(Integer l) throws Exception {
        // increment the shared counter and return the new value
        int next = startCtr.value() + 1;
        startCtr.setValue(next);
        return next;
    }
}, DataTypes.IntegerType);

Query: "SELECT seq(" + startCtr.value() + ") AS ID FROM df"
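For illustration only, the increment-and-return logic the UDF is meant to implement can be sketched in plain Java, without Spark. Here an AtomicInteger stands in for the Accumulator, and the class and method names are hypothetical; this is a sketch of the intended counter semantics, not the reported code:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SeqSketch {
    // Stands in for the shared Accumulator<Integer> in the report.
    static final AtomicInteger startCtr = new AtomicInteger(0);

    // Mirrors the UDF body: bump the shared counter, return the new value.
    static int seq() {
        return startCtr.incrementAndGet();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            System.out.println(seq()); // 1, 2, 3
        }
    }
}
```

Note that a Spark accumulator does not behave like this single-process counter: each task works on its own local copy, so reading `value()` inside a UDF running on an executor does not see a globally sequential count.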



>  Failed to get broadcast_33_piece0 while using Accumulators in UDF
> ------------------------------------------------------------------
>
>                 Key: SPARK-9594
>                 URL: https://issues.apache.org/jira/browse/SPARK-9594
>             Project: Spark
>          Issue Type: Test
>         Environment: Amazon Linux AMI release 2014.09
>            Reporter: Poorvi Lashkary
>
> Getting the exception below while using an accumulator in a UDF.
>  java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_33_piece0 of broadcast_33
>         at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1156)
>         at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164)
>         at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
>         at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
>         at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87)
>         at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:58)
>         at org.apache.spark.scheduler.Task.run(Task.scala:64)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.spark.SparkException: Failed to get broadcast_33_piece0 of broadcast_33
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:137)
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:137)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply$mcVI$sp(TorrentBroadcast.scala:136)
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:119)
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1.apply(TorrentBroadcast.scala:119)
>         at scala.collection.immutable.List.foreach(List.scala:318)
>         at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$readBlocks(TorrentBroadcast.scala:119)
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:174)
>         at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1153)
>         ... 11 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org