Posted to user@spark.apache.org by Rahul Bhojwani <ra...@gmail.com> on 2014/07/08 21:17:24 UTC

Error: Could not delete temporary files.

Hi,

I am getting this error. Can anyone help explain why this error is occurring?

########

Exception in thread "delete Spark temp dir
C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560"
 java.io.IOException: Failed to delete:
C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560\tmp
cmenlp
        at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:483)
        at
org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:479)
        at
org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:478)
        at
scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at
scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
        at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:478)
        at org.apache.spark.util.Utils$$anon$4.run(Utils.scala:212)
PS>
############




Thanks in advance
-- 
Rahul K Bhojwani
3rd Year B.Tech
Computer Science and Engineering
National Institute of Technology, Karnataka

Re: Error: Could not delete temporary files.

Posted by Marcelo Vanzin <va...@cloudera.com>.
Have you tried the obvious (increase the heap size of your JVM)?
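
For reference, a minimal sketch of what raising the JVM heap could look
like in the local-mode PySpark setup used in this thread. This is only an
illustration: the SPARK_MEM environment variable and the 2g value are
assumptions for Spark 0.9.x (the launcher scripts are expected to use
SPARK_MEM when sizing the JVM), so it has to be set before the SparkContext
and its JVM gateway are created:

#######################
# Hedged sketch, not a verified fix. SPARK_MEM and the 2g figure are
# assumptions; adjust to whatever the machine can actually spare.
import os
os.environ["SPARK_MEM"] = "2g"   # set before the JVM gateway starts

from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setMaster("local[6]")
        .setAppName("My app")
        .set("spark.executor.memory", "2g"))
sc = SparkContext(conf=conf)
#######################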

On Tue, Jul 8, 2014 at 2:02 PM, Rahul Bhojwani
<ra...@gmail.com> wrote:
> Thanks Marcelo.
> I was having another problem. My code was running properly and then it
> suddenly stopped with the error:
>
> java.lang.OutOfMemoryError: Java heap space
>         at java.io.BufferedOutputStream.<init>(Unknown Source)
>         at
> org.apache.spark.api.python.PythonRDD$$anon$2.run(PythonRDD.scala:62)
>
> Can you help in that?
>
>
> On Wed, Jul 9, 2014 at 2:07 AM, Marcelo Vanzin <va...@cloudera.com> wrote:
>>
>> Sorry, that would be sc.stop() (not close).
>>
>> On Tue, Jul 8, 2014 at 1:31 PM, Marcelo Vanzin <va...@cloudera.com>
>> wrote:
>> > Hi Rahul,
>> >
>> > Can you try calling "sc.close()" at the end of your program, so Spark
>> > can clean up after itself?
>> >
>> > On Tue, Jul 8, 2014 at 12:40 PM, Rahul Bhojwani
>> > <ra...@gmail.com> wrote:
>> >> Here I am adding my code. If you can have a look to help me out.
>> >> Thanks
>> >> #######################
>> >>
>> >> import tokenizer
>> >> import gettingWordLists as gl
>> >> from pyspark.mllib.classification import NaiveBayes
>> >> from numpy import array
>> >> from pyspark import SparkContext, SparkConf
>> >>
>> >> conf = (SparkConf().setMaster("local[6]").setAppName("My
>> >> app").set("spark.executor.memory", "1g"))
>> >>
>> >> sc=SparkContext(conf = conf)
>> >> # Getting the positive dict:
>> >>
>> >> pos_list = []
>> >> pos_list = gl.getPositiveList()
>> >> neg_list = gl.getNegativeList()
>> >>
>> >> #print neg_list
>> >> tok = tokenizer.Tokenizer(preserve_case=False)
>> >> train_data  = []
>> >>
>> >> with open("training_file_coach.csv","r") as train_file:
>> >>     for line in train_file:
>> >>         tokens = line.split("######")
>> >>         msg = tokens[0]
>> >>         sentiment = tokens[1]
>> >>         pos_count = 0
>> >>         neg_count = 0
>> >> #        print sentiment + "\n\n"
>> >> #        print msg
>> >>         tokens = set(tok.tokenize(msg))
>> >>         for i in tokens:
>> >>             if i.encode('utf-8') in pos_list:
>> >>                 pos_count+=1
>> >>             if i.encode('utf-8') in neg_list:
>> >>                 neg_count+=1
>> >>         if sentiment.__contains__('NEG'):
>> >>             label = 0.0
>> >>         else:
>> >>             label = 1.0
>> >>
>> >>         feature = []
>> >>         feature.append(label)
>> >>         feature.append(float(pos_count))
>> >>         feature.append(float(neg_count))
>> >>         train_data.append(feature)
>> >>     train_file.close()
>> >>
>> >> model = NaiveBayes.train(sc.parallelize(array(train_data)))
>> >>
>> >>
>> >> file_predicted = open("predicted_file_coach.csv","w")
>> >>
>> >> with open("prediction_file_coach.csv","r") as predict_file:
>> >>     for line in predict_file:
>> >>         msg = line[0:-1]
>> >>         pos_count = 0
>> >>         neg_count = 0
>> >> #        print sentiment + "\n\n"
>> >> #        print msg
>> >>         tokens = set(tok.tokenize(msg))
>> >>         for i in tokens:
>> >>             if i.encode('utf-8') in pos_list:
>> >>                 pos_count+=1
>> >>             if i.encode('utf-8') in neg_list:
>> >>                 neg_count+=1
>> >>         prediction =
>> >> model.predict(array([float(pos_count),float(neg_count)]))
>> >>         if prediction == 0:
>> >>             sentiment = "NEG"
>> >>         elif prediction == 1:
>> >>             sentiment = "POS"
>> >>         else:
>> >>             print "ERROR\n\n\n\n\n\n\nERROR"
>> >>
>> >>         feature = []
>> >>         feature.append(float(prediction))
>> >>         feature.append(float(pos_count))
>> >>         feature.append(float(neg_count))
>> >>         print feature
>> >>         train_data.append(feature)
>> >>         model = NaiveBayes.train(sc.parallelize(array(train_data)))
>> >>         file_predicted.write(msg + "######" + sentiment + "\n")
>> >>
>> >> file_predicted.close()
>> >> ###################
>> >>
>> >> If you can have a look at the code and help me out, It would be great
>> >>
>> >> Thanks
>> >>
>> >>
>> >> On Wed, Jul 9, 2014 at 12:54 AM, Rahul Bhojwani
>> >> <ra...@gmail.com> wrote:
>> >>>
>> >>> Hi Marcelo.
>> >>> Thanks for the quick reply. Can you suggest me how to increase the
>> >>> memory
>> >>> limits or how to tackle this problem. I am a novice. If you want I can
>> >>> post
>> >>> my code here.
>> >>>
>> >>>
>> >>> Thanks
>> >>>
>> >>>
>> >>> On Wed, Jul 9, 2014 at 12:50 AM, Marcelo Vanzin <va...@cloudera.com>
>> >>> wrote:
>> >>>>
>> >>>> This is generally a side effect of your executor being killed. For
>> >>>> example, Yarn will do that if you're going over the requested memory
>> >>>> limits.
>> >>>>
>> >>>> On Tue, Jul 8, 2014 at 12:17 PM, Rahul Bhojwani
>> >>>> <ra...@gmail.com> wrote:
>> >>>> > HI,
>> >>>> >
>> >>>> > I am getting this error. Can anyone help out to explain why is this
>> >>>> > error
>> >>>> > coming.
>> >>>> >
>> >>>> > ########
>> >>>> >
>> >>>> > Exception in thread "delete Spark temp dir
>> >>>> >
>> >>>> >
>> >>>> > C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560"
>> >>>> >  java.io.IOException: Failed to delete:
>> >>>> >
>> >>>> >
>> >>>> > C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560\tmp
>> >>>> > cmenlp
>> >>>> >         at
>> >>>> > org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:483)
>> >>>> >         at
>> >>>> >
>> >>>> >
>> >>>> > org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:479)
>> >>>> >         at
>> >>>> >
>> >>>> >
>> >>>> > org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:478)
>> >>>> >         at
>> >>>> >
>> >>>> >
>> >>>> > scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>> >>>> >         at
>> >>>> >
>> >>>> > scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
>> >>>> >         at
>> >>>> > org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:478)
>> >>>> >         at org.apache.spark.util.Utils$$anon$4.run(Utils.scala:212)
>> >>>> > PS>
>> >>>> > ############
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>> > Thanks in advance
>> >>>> > --
>> >>>> > Rahul K Bhojwani
>> >>>> > 3rd Year B.Tech
>> >>>> > Computer Science and Engineering
>> >>>> > National Institute of Technology, Karnataka
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Marcelo
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Rahul K Bhojwani
>> >>> 3rd Year B.Tech
>> >>> Computer Science and Engineering
>> >>> National Institute of Technology, Karnataka
>> >>
>> >>
>> >>
>> >>
>> >> --
>> >> Rahul K Bhojwani
>> >> 3rd Year B.Tech
>> >> Computer Science and Engineering
>> >> National Institute of Technology, Karnataka
>> >
>> >
>> >
>> > --
>> > Marcelo
>>
>>
>>
>> --
>> Marcelo
>
>
>
>
> --
> Rahul K Bhojwani
> 3rd Year B.Tech
> Computer Science and Engineering
> National Institute of Technology, Karnataka



-- 
Marcelo

Re: Error: Could not delete temporary files.

Posted by Rahul Bhojwani <ra...@gmail.com>.
Thanks Marcelo.
I was having another problem. My code was running properly and then it
suddenly stopped with the error:

java.lang.OutOfMemoryError: Java heap space
        at java.io.BufferedOutputStream.<init>(Unknown Source)
        at
org.apache.spark.api.python.PythonRDD$$anon$2.run(PythonRDD.scala:62)

Can you help with that?


On Wed, Jul 9, 2014 at 2:07 AM, Marcelo Vanzin <va...@cloudera.com> wrote:

> Sorry, that would be sc.stop() (not close).
>
> On Tue, Jul 8, 2014 at 1:31 PM, Marcelo Vanzin <va...@cloudera.com>
> wrote:
> > Hi Rahul,
> >
> > Can you try calling "sc.close()" at the end of your program, so Spark
> > can clean up after itself?
> >
> > On Tue, Jul 8, 2014 at 12:40 PM, Rahul Bhojwani
> > <ra...@gmail.com> wrote:
> >> Here I am adding my code. If you can have a look to help me out.
> >> Thanks
> >> #######################
> >>
> >> import tokenizer
> >> import gettingWordLists as gl
> >> from pyspark.mllib.classification import NaiveBayes
> >> from numpy import array
> >> from pyspark import SparkContext, SparkConf
> >>
> >> conf = (SparkConf().setMaster("local[6]").setAppName("My
> >> app").set("spark.executor.memory", "1g"))
> >>
> >> sc=SparkContext(conf = conf)
> >> # Getting the positive dict:
> >>
> >> pos_list = []
> >> pos_list = gl.getPositiveList()
> >> neg_list = gl.getNegativeList()
> >>
> >> #print neg_list
> >> tok = tokenizer.Tokenizer(preserve_case=False)
> >> train_data  = []
> >>
> >> with open("training_file_coach.csv","r") as train_file:
> >>     for line in train_file:
> >>         tokens = line.split("######")
> >>         msg = tokens[0]
> >>         sentiment = tokens[1]
> >>         pos_count = 0
> >>         neg_count = 0
> >> #        print sentiment + "\n\n"
> >> #        print msg
> >>         tokens = set(tok.tokenize(msg))
> >>         for i in tokens:
> >>             if i.encode('utf-8') in pos_list:
> >>                 pos_count+=1
> >>             if i.encode('utf-8') in neg_list:
> >>                 neg_count+=1
> >>         if sentiment.__contains__('NEG'):
> >>             label = 0.0
> >>         else:
> >>             label = 1.0
> >>
> >>         feature = []
> >>         feature.append(label)
> >>         feature.append(float(pos_count))
> >>         feature.append(float(neg_count))
> >>         train_data.append(feature)
> >>     train_file.close()
> >>
> >> model = NaiveBayes.train(sc.parallelize(array(train_data)))
> >>
> >>
> >> file_predicted = open("predicted_file_coach.csv","w")
> >>
> >> with open("prediction_file_coach.csv","r") as predict_file:
> >>     for line in predict_file:
> >>         msg = line[0:-1]
> >>         pos_count = 0
> >>         neg_count = 0
> >> #        print sentiment + "\n\n"
> >> #        print msg
> >>         tokens = set(tok.tokenize(msg))
> >>         for i in tokens:
> >>             if i.encode('utf-8') in pos_list:
> >>                 pos_count+=1
> >>             if i.encode('utf-8') in neg_list:
> >>                 neg_count+=1
> >>         prediction =
> >> model.predict(array([float(pos_count),float(neg_count)]))
> >>         if prediction == 0:
> >>             sentiment = "NEG"
> >>         elif prediction == 1:
> >>             sentiment = "POS"
> >>         else:
> >>             print "ERROR\n\n\n\n\n\n\nERROR"
> >>
> >>         feature = []
> >>         feature.append(float(prediction))
> >>         feature.append(float(pos_count))
> >>         feature.append(float(neg_count))
> >>         print feature
> >>         train_data.append(feature)
> >>         model = NaiveBayes.train(sc.parallelize(array(train_data)))
> >>         file_predicted.write(msg + "######" + sentiment + "\n")
> >>
> >> file_predicted.close()
> >> ###################
> >>
> >> If you can have a look at the code and help me out, It would be great
> >>
> >> Thanks
> >>
> >>
> >> On Wed, Jul 9, 2014 at 12:54 AM, Rahul Bhojwani
> >> <ra...@gmail.com> wrote:
> >>>
> >>> Hi Marcelo.
> >>> Thanks for the quick reply. Can you suggest me how to increase the
> memory
> >>> limits or how to tackle this problem. I am a novice. If you want I can
> post
> >>> my code here.
> >>>
> >>>
> >>> Thanks
> >>>
> >>>
> >>> On Wed, Jul 9, 2014 at 12:50 AM, Marcelo Vanzin <va...@cloudera.com>
> >>> wrote:
> >>>>
> >>>> This is generally a side effect of your executor being killed. For
> >>>> example, Yarn will do that if you're going over the requested memory
> >>>> limits.
> >>>>
> >>>> On Tue, Jul 8, 2014 at 12:17 PM, Rahul Bhojwani
> >>>> <ra...@gmail.com> wrote:
> >>>> > HI,
> >>>> >
> >>>> > I am getting this error. Can anyone help out to explain why is this
> >>>> > error
> >>>> > coming.
> >>>> >
> >>>> > ########
> >>>> >
> >>>> > Exception in thread "delete Spark temp dir
> >>>> >
> >>>> >
> C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560"
> >>>> >  java.io.IOException: Failed to delete:
> >>>> >
> >>>> >
> C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560\tmp
> >>>> > cmenlp
> >>>> >         at
> >>>> > org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:483)
> >>>> >         at
> >>>> >
> >>>> >
> org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:479)
> >>>> >         at
> >>>> >
> >>>> >
> org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:478)
> >>>> >         at
> >>>> >
> >>>> >
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
> >>>> >         at
> >>>> > scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
> >>>> >         at
> >>>> > org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:478)
> >>>> >         at org.apache.spark.util.Utils$$anon$4.run(Utils.scala:212)
> >>>> > PS>
> >>>> > ############
> >>>> >
> >>>> >
> >>>> >
> >>>> >
> >>>> > Thanks in advance
> >>>> > --
> >>>> > Rahul K Bhojwani
> >>>> > 3rd Year B.Tech
> >>>> > Computer Science and Engineering
> >>>> > National Institute of Technology, Karnataka
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Marcelo
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> Rahul K Bhojwani
> >>> 3rd Year B.Tech
> >>> Computer Science and Engineering
> >>> National Institute of Technology, Karnataka
> >>
> >>
> >>
> >>
> >> --
> >> Rahul K Bhojwani
> >> 3rd Year B.Tech
> >> Computer Science and Engineering
> >> National Institute of Technology, Karnataka
> >
> >
> >
> > --
> > Marcelo
>
>
>
> --
> Marcelo
>



-- 
Rahul K Bhojwani
3rd Year B.Tech
Computer Science and Engineering
National Institute of Technology, Karnataka

Re: Error: Could not delete temporary files.

Posted by Marcelo Vanzin <va...@cloudera.com>.
Sorry, that would be sc.stop() (not close).
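
In other words, something like the sketch below. Only the sc.stop() call
itself is the actual suggestion; the try/finally is my addition so that the
context is stopped even if the script fails partway through:

#######################
from pyspark import SparkConf, SparkContext

conf = (SparkConf().setMaster("local[6]").setAppName("My app")
        .set("spark.executor.memory", "1g"))
sc = SparkContext(conf=conf)
try:
    # ... build train_data, train the NaiveBayes model, write predictions ...
    pass
finally:
    sc.stop()   # let Spark shut down and clean up after itself
#######################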

On Tue, Jul 8, 2014 at 1:31 PM, Marcelo Vanzin <va...@cloudera.com> wrote:
> Hi Rahul,
>
> Can you try calling "sc.close()" at the end of your program, so Spark
> can clean up after itself?
>
> On Tue, Jul 8, 2014 at 12:40 PM, Rahul Bhojwani
> <ra...@gmail.com> wrote:
>> Here I am adding my code. If you can have a look to help me out.
>> Thanks
>> #######################
>>
>> import tokenizer
>> import gettingWordLists as gl
>> from pyspark.mllib.classification import NaiveBayes
>> from numpy import array
>> from pyspark import SparkContext, SparkConf
>>
>> conf = (SparkConf().setMaster("local[6]").setAppName("My
>> app").set("spark.executor.memory", "1g"))
>>
>> sc=SparkContext(conf = conf)
>> # Getting the positive dict:
>>
>> pos_list = []
>> pos_list = gl.getPositiveList()
>> neg_list = gl.getNegativeList()
>>
>> #print neg_list
>> tok = tokenizer.Tokenizer(preserve_case=False)
>> train_data  = []
>>
>> with open("training_file_coach.csv","r") as train_file:
>>     for line in train_file:
>>         tokens = line.split("######")
>>         msg = tokens[0]
>>         sentiment = tokens[1]
>>         pos_count = 0
>>         neg_count = 0
>> #        print sentiment + "\n\n"
>> #        print msg
>>         tokens = set(tok.tokenize(msg))
>>         for i in tokens:
>>             if i.encode('utf-8') in pos_list:
>>                 pos_count+=1
>>             if i.encode('utf-8') in neg_list:
>>                 neg_count+=1
>>         if sentiment.__contains__('NEG'):
>>             label = 0.0
>>         else:
>>             label = 1.0
>>
>>         feature = []
>>         feature.append(label)
>>         feature.append(float(pos_count))
>>         feature.append(float(neg_count))
>>         train_data.append(feature)
>>     train_file.close()
>>
>> model = NaiveBayes.train(sc.parallelize(array(train_data)))
>>
>>
>> file_predicted = open("predicted_file_coach.csv","w")
>>
>> with open("prediction_file_coach.csv","r") as predict_file:
>>     for line in predict_file:
>>         msg = line[0:-1]
>>         pos_count = 0
>>         neg_count = 0
>> #        print sentiment + "\n\n"
>> #        print msg
>>         tokens = set(tok.tokenize(msg))
>>         for i in tokens:
>>             if i.encode('utf-8') in pos_list:
>>                 pos_count+=1
>>             if i.encode('utf-8') in neg_list:
>>                 neg_count+=1
>>         prediction =
>> model.predict(array([float(pos_count),float(neg_count)]))
>>         if prediction == 0:
>>             sentiment = "NEG"
>>         elif prediction == 1:
>>             sentiment = "POS"
>>         else:
>>             print "ERROR\n\n\n\n\n\n\nERROR"
>>
>>         feature = []
>>         feature.append(float(prediction))
>>         feature.append(float(pos_count))
>>         feature.append(float(neg_count))
>>         print feature
>>         train_data.append(feature)
>>         model = NaiveBayes.train(sc.parallelize(array(train_data)))
>>         file_predicted.write(msg + "######" + sentiment + "\n")
>>
>> file_predicted.close()
>> ###################
>>
>> If you can have a look at the code and help me out, It would be great
>>
>> Thanks
>>
>>
>> On Wed, Jul 9, 2014 at 12:54 AM, Rahul Bhojwani
>> <ra...@gmail.com> wrote:
>>>
>>> Hi Marcelo.
>>> Thanks for the quick reply. Can you suggest me how to increase the memory
>>> limits or how to tackle this problem. I am a novice. If you want I can post
>>> my code here.
>>>
>>>
>>> Thanks
>>>
>>>
>>> On Wed, Jul 9, 2014 at 12:50 AM, Marcelo Vanzin <va...@cloudera.com>
>>> wrote:
>>>>
>>>> This is generally a side effect of your executor being killed. For
>>>> example, Yarn will do that if you're going over the requested memory
>>>> limits.
>>>>
>>>> On Tue, Jul 8, 2014 at 12:17 PM, Rahul Bhojwani
>>>> <ra...@gmail.com> wrote:
>>>> > HI,
>>>> >
>>>> > I am getting this error. Can anyone help out to explain why is this
>>>> > error
>>>> > coming.
>>>> >
>>>> > ########
>>>> >
>>>> > Exception in thread "delete Spark temp dir
>>>> >
>>>> > C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560"
>>>> >  java.io.IOException: Failed to delete:
>>>> >
>>>> > C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560\tmp
>>>> > cmenlp
>>>> >         at
>>>> > org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:483)
>>>> >         at
>>>> >
>>>> > org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:479)
>>>> >         at
>>>> >
>>>> > org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:478)
>>>> >         at
>>>> >
>>>> > scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>>>> >         at
>>>> > scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
>>>> >         at
>>>> > org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:478)
>>>> >         at org.apache.spark.util.Utils$$anon$4.run(Utils.scala:212)
>>>> > PS>
>>>> > ############
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > Thanks in advance
>>>> > --
>>>> > Rahul K Bhojwani
>>>> > 3rd Year B.Tech
>>>> > Computer Science and Engineering
>>>> > National Institute of Technology, Karnataka
>>>>
>>>>
>>>>
>>>> --
>>>> Marcelo
>>>
>>>
>>>
>>>
>>> --
>>> Rahul K Bhojwani
>>> 3rd Year B.Tech
>>> Computer Science and Engineering
>>> National Institute of Technology, Karnataka
>>
>>
>>
>>
>> --
>> Rahul K Bhojwani
>> 3rd Year B.Tech
>> Computer Science and Engineering
>> National Institute of Technology, Karnataka
>
>
>
> --
> Marcelo



-- 
Marcelo

Re: Error: Could not delete temporary files.

Posted by Marcelo Vanzin <va...@cloudera.com>.
Hi Rahul,

Can you try calling "sc.close()" at the end of your program, so Spark
can clean up after itself?

On Tue, Jul 8, 2014 at 12:40 PM, Rahul Bhojwani
<ra...@gmail.com> wrote:
> Here I am adding my code. If you can have a look to help me out.
> Thanks
> #######################
>
> import tokenizer
> import gettingWordLists as gl
> from pyspark.mllib.classification import NaiveBayes
> from numpy import array
> from pyspark import SparkContext, SparkConf
>
> conf = (SparkConf().setMaster("local[6]").setAppName("My
> app").set("spark.executor.memory", "1g"))
>
> sc=SparkContext(conf = conf)
> # Getting the positive dict:
>
> pos_list = []
> pos_list = gl.getPositiveList()
> neg_list = gl.getNegativeList()
>
> #print neg_list
> tok = tokenizer.Tokenizer(preserve_case=False)
> train_data  = []
>
> with open("training_file_coach.csv","r") as train_file:
>     for line in train_file:
>         tokens = line.split("######")
>         msg = tokens[0]
>         sentiment = tokens[1]
>         pos_count = 0
>         neg_count = 0
> #        print sentiment + "\n\n"
> #        print msg
>         tokens = set(tok.tokenize(msg))
>         for i in tokens:
>             if i.encode('utf-8') in pos_list:
>                 pos_count+=1
>             if i.encode('utf-8') in neg_list:
>                 neg_count+=1
>         if sentiment.__contains__('NEG'):
>             label = 0.0
>         else:
>             label = 1.0
>
>         feature = []
>         feature.append(label)
>         feature.append(float(pos_count))
>         feature.append(float(neg_count))
>         train_data.append(feature)
>     train_file.close()
>
> model = NaiveBayes.train(sc.parallelize(array(train_data)))
>
>
> file_predicted = open("predicted_file_coach.csv","w")
>
> with open("prediction_file_coach.csv","r") as predict_file:
>     for line in predict_file:
>         msg = line[0:-1]
>         pos_count = 0
>         neg_count = 0
> #        print sentiment + "\n\n"
> #        print msg
>         tokens = set(tok.tokenize(msg))
>         for i in tokens:
>             if i.encode('utf-8') in pos_list:
>                 pos_count+=1
>             if i.encode('utf-8') in neg_list:
>                 neg_count+=1
>         prediction =
> model.predict(array([float(pos_count),float(neg_count)]))
>         if prediction == 0:
>             sentiment = "NEG"
>         elif prediction == 1:
>             sentiment = "POS"
>         else:
>             print "ERROR\n\n\n\n\n\n\nERROR"
>
>         feature = []
>         feature.append(float(prediction))
>         feature.append(float(pos_count))
>         feature.append(float(neg_count))
>         print feature
>         train_data.append(feature)
>         model = NaiveBayes.train(sc.parallelize(array(train_data)))
>         file_predicted.write(msg + "######" + sentiment + "\n")
>
> file_predicted.close()
> ###################
>
> If you can have a look at the code and help me out, It would be great
>
> Thanks
>
>
> On Wed, Jul 9, 2014 at 12:54 AM, Rahul Bhojwani
> <ra...@gmail.com> wrote:
>>
>> Hi Marcelo.
>> Thanks for the quick reply. Can you suggest me how to increase the memory
>> limits or how to tackle this problem. I am a novice. If you want I can post
>> my code here.
>>
>>
>> Thanks
>>
>>
>> On Wed, Jul 9, 2014 at 12:50 AM, Marcelo Vanzin <va...@cloudera.com>
>> wrote:
>>>
>>> This is generally a side effect of your executor being killed. For
>>> example, Yarn will do that if you're going over the requested memory
>>> limits.
>>>
>>> On Tue, Jul 8, 2014 at 12:17 PM, Rahul Bhojwani
>>> <ra...@gmail.com> wrote:
>>> > HI,
>>> >
>>> > I am getting this error. Can anyone help out to explain why is this
>>> > error
>>> > coming.
>>> >
>>> > ########
>>> >
>>> > Exception in thread "delete Spark temp dir
>>> >
>>> > C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560"
>>> >  java.io.IOException: Failed to delete:
>>> >
>>> > C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560\tmp
>>> > cmenlp
>>> >         at
>>> > org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:483)
>>> >         at
>>> >
>>> > org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:479)
>>> >         at
>>> >
>>> > org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:478)
>>> >         at
>>> >
>>> > scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>>> >         at
>>> > scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
>>> >         at
>>> > org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:478)
>>> >         at org.apache.spark.util.Utils$$anon$4.run(Utils.scala:212)
>>> > PS>
>>> > ############
>>> >
>>> >
>>> >
>>> >
>>> > Thanks in advance
>>> > --
>>> > Rahul K Bhojwani
>>> > 3rd Year B.Tech
>>> > Computer Science and Engineering
>>> > National Institute of Technology, Karnataka
>>>
>>>
>>>
>>> --
>>> Marcelo
>>
>>
>>
>>
>> --
>> Rahul K Bhojwani
>> 3rd Year B.Tech
>> Computer Science and Engineering
>> National Institute of Technology, Karnataka
>
>
>
>
> --
> Rahul K Bhojwani
> 3rd Year B.Tech
> Computer Science and Engineering
> National Institute of Technology, Karnataka



-- 
Marcelo

Re: Error: Could not delete temporary files.

Posted by Rahul Bhojwani <ra...@gmail.com>.
I have pasted the logs below:


PS F:\spark-0.9.1\codes\sentiment analysis> pyspark
.\naive_bayes_analyser.py
Running python with PYTHONPATH=F:\spark-0.9.1\spark-0.9.1\bin\..\python;
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
[jar:file:/F:/spark-0.9.1/spark-0.9.1/assembly/target/scala-2.10/spark-assembly-0.9.1-hadoop1.0.
4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/F:/spark-0.9.1/spark-0.9.1/tools/target/scala-2.10/spark-tools-assembly-0.9.1.jar!/or
g/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
log4j:WARN No appenders could be found for logger
(akka.event.slf4j.Slf4jLogger).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
more info.
14/07/09 00:57:25 INFO SparkEnv: Using Spark's default log4j profile:
org/apache/spark/log4j-defaults.properties
14/07/09 00:57:25 INFO SparkEnv: Registering BlockManagerMaster
14/07/09 00:57:25 INFO DiskBlockManager: Created local directory at
C:\Users\shawn\AppData\Local\Temp\spark-local-201407
09005725-fe99
14/07/09 00:57:25 INFO MemoryStore: MemoryStore started with capacity 297.0
MB.
14/07/09 00:57:25 INFO ConnectionManager: Bound socket to port 51231 with
id = ConnectionManagerId(shawn-PC,51231)
14/07/09 00:57:25 INFO BlockManagerMaster: Trying to register BlockManager
14/07/09 00:57:25 INFO BlockManagerMasterActor$BlockManagerInfo:
Registering block manager shawn-PC:51231 with 297.0 MB
RAM
14/07/09 00:57:25 INFO BlockManagerMaster: Registered BlockManager
14/07/09 00:57:26 INFO HttpServer: Starting HTTP Server
14/07/09 00:57:26 INFO HttpBroadcast: Broadcast server started at
http://192.168.1.100:51232
14/07/09 00:57:26 INFO SparkEnv: Registering MapOutputTracker
14/07/09 00:57:26 INFO HttpFileServer: HTTP File server directory is
C:\Users\shawn\AppData\Local\Temp\spark-339491dd-68
f4-4027-b661-00f2c5f95494
14/07/09 00:57:26 INFO HttpServer: Starting HTTP Server
14/07/09 00:57:26 INFO SparkUI: Started Spark Web UI at http://shawn-PC:4040
14/07/09 00:57:39 INFO SparkContext: Starting job: aggregate at
NaiveBayes.scala:81
14/07/09 00:57:39 INFO DAGScheduler: Got job 0 (aggregate at
NaiveBayes.scala:81) with 6 output partitions (allowLocal=f
alse)
14/07/09 00:57:39 INFO DAGScheduler: Final stage: Stage 0 (aggregate at
NaiveBayes.scala:81)
14/07/09 00:57:39 INFO DAGScheduler: Parents of final stage: List()
14/07/09 00:57:39 INFO DAGScheduler: Missing parents: List()
14/07/09 00:57:39 INFO DAGScheduler: Submitting Stage 0 (MappedRDD[2] at
map at PythonMLLibAPI.scala:190), which has no
missing parents
14/07/09 00:57:39 INFO DAGScheduler: Submitting 6 missing tasks from Stage
0 (MappedRDD[2] at map at PythonMLLibAPI.scal
a:190)
14/07/09 00:57:39 INFO TaskSchedulerImpl: Adding task set 0.0 with 6 tasks
14/07/09 00:57:39 INFO TaskSetManager: Starting task 0.0:0 as TID 0 on
executor localhost: localhost (PROCESS_LOCAL)
14/07/09 00:57:39 INFO TaskSetManager: Serialized task 0.0:0 as 52792 bytes
in 4 ms
14/07/09 00:57:39 INFO TaskSetManager: Starting task 0.0:1 as TID 1 on
executor localhost: localhost (PROCESS_LOCAL)
14/07/09 00:57:39 INFO TaskSetManager: Serialized task 0.0:1 as 52792 bytes
in 0 ms
14/07/09 00:57:39 INFO TaskSetManager: Starting task 0.0:2 as TID 2 on
executor localhost: localhost (PROCESS_LOCAL)
14/07/09 00:57:39 INFO TaskSetManager: Serialized task 0.0:2 as 52792 bytes
in 1 ms
14/07/09 00:57:39 INFO TaskSetManager: Starting task 0.0:3 as TID 3 on
executor localhost: localhost (PROCESS_LOCAL)
14/07/09 00:57:39 INFO TaskSetManager: Serialized task 0.0:3 as 52792 bytes
in 0 ms
14/07/09 00:57:39 INFO TaskSetManager: Starting task 0.0:4 as TID 4 on
executor localhost: localhost (PROCESS_LOCAL)
14/07/09 00:57:39 INFO TaskSetManager: Serialized task 0.0:4 as 52792 bytes
in 0 ms
14/07/09 00:57:39 INFO TaskSetManager: Starting task 0.0:5 as TID 5 on
executor localhost: localhost (PROCESS_LOCAL)
14/07/09 00:57:39 INFO TaskSetManager: Serialized task 0.0:5 as 53011 bytes
in 0 ms
14/07/09 00:57:39 INFO Executor: Running task ID 3
14/07/09 00:57:39 INFO Executor: Running task ID 1
14/07/09 00:57:39 INFO Executor: Running task ID 2
14/07/09 00:57:39 INFO Executor: Running task ID 5
14/07/09 00:57:39 INFO Executor: Running task ID 4
14/07/09 00:57:39 INFO Executor: Running task ID 0
14/07/09 00:57:39 INFO CacheManager: Partition rdd_1_4 not found, computing
it
14/07/09 00:57:39 INFO CacheManager: Partition rdd_1_0 not found, computing
it
14/07/09 00:57:39 INFO CacheManager: Partition rdd_1_5 not found, computing
it
14/07/09 00:57:39 INFO CacheManager: Partition rdd_1_2 not found, computing
it
14/07/09 00:57:39 INFO CacheManager: Partition rdd_1_1 not found, computing
it
14/07/09 00:57:39 INFO CacheManager: Partition rdd_1_3 not found, computing
it
14/07/09 00:57:39 INFO PythonRDD: Times: total = 290, boot = 176, init =
102, finish = 12
14/07/09 00:57:39 WARN SizeEstimator: Failed to check whether
UseCompressedOops is set; assuming yes
14/07/09 00:57:39 INFO MemoryStore: ensureFreeSpace(57465) called with
curMem=0, maxMem=311387750
14/07/09 00:57:39 INFO MemoryStore: Block rdd_1_4 stored as values to
memory (estimated size 56.1 KB, free 296.9 MB)
14/07/09 00:57:39 INFO BlockManagerMasterActor$BlockManagerInfo: Added
rdd_1_4 in memory on shawn-PC:51231 (size: 56.1 K
B, free: 296.9 MB)
14/07/09 00:57:39 INFO BlockManagerMaster: Updated info of block rdd_1_4
14/07/09 00:57:39 INFO Executor: Serialized size of result for 4 is 967
14/07/09 00:57:39 INFO Executor: Sending result for 4 directly to driver
14/07/09 00:57:39 INFO Executor: Finished task ID 4
14/07/09 00:57:39 INFO TaskSetManager: Finished TID 4 in 438 ms on
localhost (progress: 1/6)
14/07/09 00:57:39 INFO DAGScheduler: Completed ResultTask(0, 4)
14/07/09 00:57:39 INFO PythonRDD: Times: total = 457, boot = 334, init =
111, finish = 12
14/07/09 00:57:39 INFO MemoryStore: ensureFreeSpace(57465) called with
curMem=57465, maxMem=311387750
14/07/09 00:57:39 INFO MemoryStore: Block rdd_1_3 stored as values to
memory (estimated size 56.1 KB, free 296.9 MB)
14/07/09 00:57:39 INFO BlockManagerMasterActor$BlockManagerInfo: Added
rdd_1_3 in memory on shawn-PC:51231 (size: 56.1 K
B, free: 296.9 MB)
14/07/09 00:57:39 INFO BlockManagerMaster: Updated info of block rdd_1_3
14/07/09 00:57:39 INFO Executor: Serialized size of result for 3 is 967
14/07/09 00:57:39 INFO Executor: Sending result for 3 directly to driver
14/07/09 00:57:39 INFO Executor: Finished task ID 3
14/07/09 00:57:39 INFO DAGScheduler: Completed ResultTask(0, 3)
14/07/09 00:57:39 INFO TaskSetManager: Finished TID 3 in 522 ms on
localhost (progress: 2/6)
14/07/09 00:57:39 INFO PythonRDD: Times: total = 622, boot = 513, init =
98, finish = 11
14/07/09 00:57:39 INFO MemoryStore: ensureFreeSpace(57465) called with
curMem=114930, maxMem=311387750
14/07/09 00:57:39 INFO MemoryStore: Block rdd_1_1 stored as values to
memory (estimated size 56.1 KB, free 296.8 MB)
14/07/09 00:57:39 INFO BlockManagerMasterActor$BlockManagerInfo: Added
rdd_1_1 in memory on shawn-PC:51231 (size: 56.1 K
B, free: 296.8 MB)
14/07/09 00:57:39 INFO BlockManagerMaster: Updated info of block rdd_1_1
14/07/09 00:57:39 INFO Executor: Serialized size of result for 1 is 967
14/07/09 00:57:39 INFO Executor: Sending result for 1 directly to driver
14/07/09 00:57:39 INFO Executor: Finished task ID 1
14/07/09 00:57:39 INFO DAGScheduler: Completed ResultTask(0, 1)
14/07/09 00:57:39 INFO TaskSetManager: Finished TID 1 in 677 ms on
localhost (progress: 3/6)
14/07/09 00:57:39 INFO PythonRDD: Times: total = 787, boot = 678, init =
98, finish = 11
14/07/09 00:57:39 INFO MemoryStore: ensureFreeSpace(57465) called with
curMem=172395, maxMem=311387750
14/07/09 00:57:39 INFO MemoryStore: Block rdd_1_2 stored as values to
memory (estimated size 56.1 KB, free 296.7 MB)
14/07/09 00:57:39 INFO BlockManagerMasterActor$BlockManagerInfo: Added
rdd_1_2 in memory on shawn-PC:51231 (size: 56.1 K
B, free: 296.7 MB)
14/07/09 00:57:39 INFO BlockManagerMaster: Updated info of block rdd_1_2
14/07/09 00:57:39 INFO Executor: Serialized size of result for 2 is 967
14/07/09 00:57:39 INFO Executor: Sending result for 2 directly to driver
14/07/09 00:57:39 INFO Executor: Finished task ID 2
14/07/09 00:57:39 INFO DAGScheduler: Completed ResultTask(0, 2)
14/07/09 00:57:39 INFO TaskSetManager: Finished TID 2 in 838 ms on
localhost (progress: 4/6)
14/07/09 00:57:40 INFO PythonRDD: Times: total = 950, boot = 842, init =
96, finish = 12
14/07/09 00:57:40 INFO MemoryStore: ensureFreeSpace(57465) called with
curMem=229860, maxMem=311387750
14/07/09 00:57:40 INFO MemoryStore: Block rdd_1_5 stored as values to
memory (estimated size 56.1 KB, free 296.7 MB)
14/07/09 00:57:40 INFO BlockManagerMasterActor$BlockManagerInfo: Added
rdd_1_5 in memory on shawn-PC:51231 (size: 56.1 K
B, free: 296.7 MB)
14/07/09 00:57:40 INFO BlockManagerMaster: Updated info of block rdd_1_5
14/07/09 00:57:40 INFO Executor: Serialized size of result for 5 is 967
14/07/09 00:57:40 INFO Executor: Sending result for 5 directly to driver
14/07/09 00:57:40 INFO Executor: Finished task ID 5
14/07/09 00:57:40 INFO TaskSetManager: Finished TID 5 in 995 ms on
localhost (progress: 5/6)
14/07/09 00:57:40 INFO DAGScheduler: Completed ResultTask(0, 5)
14/07/09 00:57:40 INFO PythonRDD: Times: total = 1114, boot = 1004, init =
98, finish = 12
14/07/09 00:57:40 INFO MemoryStore: ensureFreeSpace(57465) called with
curMem=287325, maxMem=311387750
14/07/09 00:57:40 INFO MemoryStore: Block rdd_1_0 stored as values to
memory (estimated size 56.1 KB, free 296.6 MB)
14/07/09 00:57:40 INFO BlockManagerMasterActor$BlockManagerInfo: Added
rdd_1_0 in memory on shawn-PC:51231 (size: 56.1 K
B, free: 296.6 MB)
14/07/09 00:57:40 INFO BlockManagerMaster: Updated info of block rdd_1_0
14/07/09 00:57:40 INFO Executor: Serialized size of result for 0 is 967
14/07/09 00:57:40 INFO Executor: Sending result for 0 directly to driver
14/07/09 00:57:40 INFO Executor: Finished task ID 0
14/07/09 00:57:40 INFO DAGScheduler: Completed ResultTask(0, 0)
14/07/09 00:57:40 INFO TaskSetManager: Finished TID 0 in 1173 ms on
localhost (progress: 6/6)
14/07/09 00:57:40 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks
have all completed, from pool
14/07/09 00:57:40 INFO DAGScheduler: Stage 0 (aggregate at
NaiveBayes.scala:81) finished in 1.180 s
14/07/09 00:57:40 INFO SparkContext: Job finished: aggregate at
NaiveBayes.scala:81, took 1.24974564 s
[0.0, 329.0, 231.0]
14/07/09 00:57:41 INFO SparkContext: Starting job: aggregate at
NaiveBayes.scala:81
14/07/09 00:57:41 INFO DAGScheduler: Got job 1 (aggregate at
NaiveBayes.scala:81) with 6 output partitions (allowLocal=f
alse)
14/07/09 00:57:41 INFO DAGScheduler: Final stage: Stage 1 (aggregate at
NaiveBayes.scala:81)
14/07/09 00:57:41 INFO DAGScheduler: Parents of final stage: List()
14/07/09 00:57:41 INFO DAGScheduler: Missing parents: List()
14/07/09 00:57:41 INFO DAGScheduler: Submitting Stage 1 (MappedRDD[5] at
map at PythonMLLibAPI.scala:190), which has no
missing parents
14/07/09 00:57:41 INFO DAGScheduler: Submitting 6 missing tasks from Stage
1 (MappedRDD[5] at map at PythonMLLibAPI.scal
a:190)
14/07/09 00:57:41 INFO TaskSchedulerImpl: Adding task set 1.0 with 6 tasks
14/07/09 00:57:41 INFO TaskSetManager: Starting task 1.0:0 as TID 6 on
executor localhost: localhost (PROCESS_LOCAL)
14/07/09 00:57:41 INFO TaskSetManager: Serialized task 1.0:0 as 52790 bytes
in 1 ms
14/07/09 00:57:41 INFO TaskSetManager: Starting task 1.0:1 as TID 7 on
executor localhost: localhost (PROCESS_LOCAL)
14/07/09 00:57:41 INFO TaskSetManager: Serialized task 1.0:1 as 52790 bytes
in 1 ms
14/07/09 00:57:41 INFO TaskSetManager: Starting task 1.0:2 as TID 8 on
executor localhost: localhost (PROCESS_LOCAL)
14/07/09 00:57:41 INFO TaskSetManager: Serialized task 1.0:2 as 52790 bytes
in 0 ms
14/07/09 00:57:41 INFO TaskSetManager: Starting task 1.0:3 as TID 9 on
executor localhost: localhost (PROCESS_LOCAL)
14/07/09 00:57:41 INFO TaskSetManager: Serialized task 1.0:3 as 52790 bytes
in 0 ms
14/07/09 00:57:41 INFO TaskSetManager: Starting task 1.0:4 as TID 10 on
executor localhost: localhost (PROCESS_LOCAL)
14/07/09 00:57:41 INFO TaskSetManager: Serialized task 1.0:4 as 52790 bytes
in 1 ms
14/07/09 00:57:41 INFO TaskSetManager: Starting task 1.0:5 as TID 11 on
executor localhost: localhost (PROCESS_LOCAL)
14/07/09 00:57:41 INFO TaskSetManager: Serialized task 1.0:5 as 53060 bytes
in 1 ms
14/07/09 00:57:41 INFO Executor: Running task ID 10
14/07/09 00:57:41 INFO Executor: Running task ID 6
14/07/09 00:57:41 INFO Executor: Running task ID 9
14/07/09 00:57:41 INFO Executor: Running task ID 8
14/07/09 00:57:41 INFO Executor: Running task ID 11
14/07/09 00:57:41 INFO CacheManager: Partition rdd_4_2 not found, computing
it
14/07/09 00:57:41 INFO CacheManager: Partition rdd_4_4 not found, computing
it
14/07/09 00:57:41 INFO Executor: Running task ID 7
14/07/09 00:57:41 INFO CacheManager: Partition rdd_4_5 not found, computing
it
14/07/09 00:57:41 INFO CacheManager: Partition rdd_4_0 not found, computing
it
14/07/09 00:57:41 INFO CacheManager: Partition rdd_4_3 not found, computing
it
14/07/09 00:57:41 INFO CacheManager: Partition rdd_4_1 not found, computing
it
14/07/09 00:57:42 INFO PythonRDD: Times: total = 268, boot = 157, init =
100, finish = 11
14/07/09 00:57:42 INFO MemoryStore: ensureFreeSpace(57465) called with
curMem=344790, maxMem=311387750
14/07/09 00:57:42 INFO MemoryStore: Block rdd_4_2 stored as values to
memory (estimated size 56.1 KB, free 296.6 MB)
14/07/09 00:57:42 INFO BlockManagerMasterActor$BlockManagerInfo: Added
rdd_4_2 in memory on shawn-PC:51231 (size: 56.1 K
B, free: 296.6 MB)
14/07/09 00:57:42 INFO BlockManagerMaster: Updated info of block rdd_4_2
14/07/09 00:57:42 INFO Executor: Serialized size of result for 8 is 967
14/07/09 00:57:42 INFO Executor: Sending result for 8 directly to driver
14/07/09 00:57:42 INFO Executor: Finished task ID 8
14/07/09 00:57:42 INFO TaskSetManager: Finished TID 8 in 289 ms on
localhost (progress: 1/6)
14/07/09 00:57:42 INFO DAGScheduler: Completed ResultTask(1, 2)
14/07/09 00:57:42 INFO PythonRDD: Times: total = 425, boot = 316, init =
97, finish = 12
14/07/09 00:57:42 INFO MemoryStore: ensureFreeSpace(57465) called with
curMem=402255, maxMem=311387750
14/07/09 00:57:42 INFO MemoryStore: Block rdd_4_1 stored as values to
memory (estimated size 56.1 KB, free 296.5 MB)
14/07/09 00:57:42 INFO BlockManagerMasterActor$BlockManagerInfo: Added
rdd_4_1 in memory on shawn-PC:51231 (size: 56.1 K
B, free: 296.5 MB)
14/07/09 00:57:42 INFO BlockManagerMaster: Updated info of block rdd_4_1
14/07/09 00:57:42 INFO Executor: Serialized size of result for 7 is 967
14/07/09 00:57:42 INFO Executor: Sending result for 7 directly to driver
14/07/09 00:57:42 INFO Executor: Finished task ID 7
14/07/09 00:57:42 INFO TaskSetManager: Finished TID 7 in 452 ms on
localhost (progress: 2/6)
14/07/09 00:57:42 INFO DAGScheduler: Completed ResultTask(1, 1)
14/07/09 00:57:42 INFO PythonRDD: Times: total = 589, boot = 479, init =
98, finish = 12
14/07/09 00:57:42 INFO MemoryStore: ensureFreeSpace(57465) called with
curMem=459720, maxMem=311387750
14/07/09 00:57:42 INFO MemoryStore: Block rdd_4_3 stored as values to
memory (estimated size 56.1 KB, free 296.5 MB)
14/07/09 00:57:42 INFO BlockManagerMasterActor$BlockManagerInfo: Added
rdd_4_3 in memory on shawn-PC:51231 (size: 56.1 K
B, free: 296.5 MB)
14/07/09 00:57:42 INFO BlockManagerMaster: Updated info of block rdd_4_3
14/07/09 00:57:42 INFO Executor: Serialized size of result for 9 is 967
14/07/09 00:57:42 INFO Executor: Sending result for 9 directly to driver
14/07/09 00:57:42 INFO Executor: Finished task ID 9
14/07/09 00:57:42 INFO TaskSetManager: Finished TID 9 in 612 ms on
localhost (progress: 3/6)
14/07/09 00:57:42 INFO DAGScheduler: Completed ResultTask(1, 3)
14/07/09 00:57:42 INFO PythonRDD: Times: total = 752, boot = 642, init =
98, finish = 12
14/07/09 00:57:42 INFO MemoryStore: ensureFreeSpace(57465) called with
curMem=517185, maxMem=311387750
14/07/09 00:57:42 INFO MemoryStore: Block rdd_4_0 stored as values to
memory (estimated size 56.1 KB, free 296.4 MB)
14/07/09 00:57:42 INFO BlockManagerMasterActor$BlockManagerInfo: Added
rdd_4_0 in memory on shawn-PC:51231 (size: 56.1 K
B, free: 296.4 MB)
14/07/09 00:57:42 INFO BlockManagerMaster: Updated info of block rdd_4_0
14/07/09 00:57:42 INFO Executor: Serialized size of result for 6 is 967
14/07/09 00:57:42 INFO Executor: Sending result for 6 directly to driver
14/07/09 00:57:42 INFO Executor: Finished task ID 6
14/07/09 00:57:42 INFO TaskSetManager: Finished TID 6 in 777 ms on
localhost (progress: 4/6)
14/07/09 00:57:42 INFO DAGScheduler: Completed ResultTask(1, 0)
14/07/09 00:57:42 INFO PythonRDD: Times: total = 919, boot = 806, init =
101, finish = 12
14/07/09 00:57:42 INFO MemoryStore: ensureFreeSpace(57465) called with
curMem=574650, maxMem=311387750
14/07/09 00:57:42 INFO MemoryStore: Block rdd_4_5 stored as values to
memory (estimated size 56.1 KB, free 296.4 MB)
14/07/09 00:57:42 INFO BlockManagerMasterActor$BlockManagerInfo: Added
rdd_4_5 in memory on shawn-PC:51231 (size: 56.1 K
B, free: 296.4 MB)
14/07/09 00:57:42 INFO BlockManagerMaster: Updated info of block rdd_4_5
14/07/09 00:57:42 INFO Executor: Serialized size of result for 11 is 967
14/07/09 00:57:42 INFO Executor: Sending result for 11 directly to driver
14/07/09 00:57:42 INFO Executor: Finished task ID 11
14/07/09 00:57:42 INFO DAGScheduler: Completed ResultTask(1, 5)
14/07/09 00:57:42 INFO TaskSetManager: Finished TID 11 in 938 ms on
localhost (progress: 5/6)
14/07/09 00:57:42 INFO PythonRDD: Times: total = 1079, boot = 972, init =
95, finish = 12
14/07/09 00:57:42 INFO MemoryStore: ensureFreeSpace(57465) called with
curMem=632115, maxMem=311387750
14/07/09 00:57:42 INFO MemoryStore: Block rdd_4_4 stored as values to
memory (estimated size 56.1 KB, free 296.3 MB)
14/07/09 00:57:42 INFO BlockManagerMasterActor$BlockManagerInfo: Added
rdd_4_4 in memory on shawn-PC:51231 (size: 56.1 K
B, free: 296.3 MB)
14/07/09 00:57:42 INFO BlockManagerMaster: Updated info of block rdd_4_4
14/07/09 00:57:42 INFO Executor: Serialized size of result for 10 is 967
14/07/09 00:57:42 INFO Executor: Sending result for 10 directly to driver
14/07/09 00:57:42 INFO Executor: Finished task ID 10
14/07/09 00:57:42 INFO TaskSetManager: Finished TID 10 in 1098 ms on
localhost (progress: 6/6)
14/07/09 00:57:42 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks
have all completed, from pool
14/07/09 00:57:42 INFO DAGScheduler: Completed ResultTask(1, 4)
14/07/09 00:57:42 INFO DAGScheduler: Stage 1 (aggregate at
NaiveBayes.scala:81) finished in 1.106 s
14/07/09 00:57:42 INFO SparkContext: Job finished: aggregate at
NaiveBayes.scala:81, took 1.114280889 s
Exception in thread "delete Spark temp dir
C:\Users\shawn\AppData\Local\Temp\spark-45a2e1f4-8229-4614-a428-593886dac0c6"
 java.io.IOException: Failed to delete:
C:\Users\shawn\AppData\Local\Temp\spark-45a2e1f4-8229-4614-a428-593886dac0c6\tmp
d7r1h0
        at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:483)
        at
org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:479)
        at
org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:478)
        at
scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
        at
scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
        at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:478)
        at org.apache.spark.util.Utils$$anon$4.run(Utils.scala:212)

################################
These are the logs. Can you suggest something after looking at them?



On Wed, Jul 9, 2014 at 1:10 AM, Rahul Bhojwani <ra...@gmail.com>
wrote:

> Here I am adding my code. If you can have a look to help me out.
> Thanks
> #######################
>
> import tokenizer
> import gettingWordLists as gl
> from pyspark.mllib.classification import NaiveBayes
> from numpy import array
> from pyspark import SparkContext, SparkConf
>
> conf = (SparkConf().setMaster("local[6]").setAppName("My
> app").set("spark.executor.memory", "1g"))
>
> sc=SparkContext(conf = conf)
> # Getting the positive dict:
>
> pos_list = []
> pos_list = gl.getPositiveList()
> neg_list = gl.getNegativeList()
>
> #print neg_list
> tok = tokenizer.Tokenizer(preserve_case=False)
> train_data  = []
>
> with open("training_file_coach.csv","r") as train_file:
>     for line in train_file:
>         tokens = line.split("######")
>         msg = tokens[0]
>         sentiment = tokens[1]
>         pos_count = 0
>         neg_count = 0
> #        print sentiment + "\n\n"
> #        print msg
>         tokens = set(tok.tokenize(msg))
>         for i in tokens:
>             if i.encode('utf-8') in pos_list:
>                 pos_count+=1
>             if i.encode('utf-8') in neg_list:
>                 neg_count+=1
>         if sentiment.__contains__('NEG'):
>             label = 0.0
>         else:
>             label = 1.0
>
>         feature = []
>         feature.append(label)
>         feature.append(float(pos_count))
>         feature.append(float(neg_count))
>         train_data.append(feature)
>     train_file.close()
>
> model = NaiveBayes.train(sc.parallelize(array(train_data)))
>
>
> file_predicted = open("predicted_file_coach.csv","w")
>
> with open("prediction_file_coach.csv","r") as predict_file:
>     for line in predict_file:
>         msg = line[0:-1]
>         pos_count = 0
>         neg_count = 0
> #        print sentiment + "\n\n"
> #        print msg
>         tokens = set(tok.tokenize(msg))
>         for i in tokens:
>             if i.encode('utf-8') in pos_list:
>                 pos_count+=1
>             if i.encode('utf-8') in neg_list:
>                 neg_count+=1
>         prediction =
> model.predict(array([float(pos_count),float(neg_count)]))
>         if prediction == 0:
>             sentiment = "NEG"
>         elif prediction == 1:
>             sentiment = "POS"
>         else:
>             print "ERROR\n\n\n\n\n\n\nERROR"
>
>         feature = []
>         feature.append(float(prediction))
>         feature.append(float(pos_count))
>         feature.append(float(neg_count))
>         print feature
>         train_data.append(feature)
>         model = NaiveBayes.train(sc.parallelize(array(train_data)))
>         file_predicted.write(msg + "######" + sentiment + "\n")
>
> file_predicted.close()
> ###################
>
> If you can have a look at the code and help me out, It would be great
>
> Thanks
>
>
> On Wed, Jul 9, 2014 at 12:54 AM, Rahul Bhojwani <
> rahulbhojwani2003@gmail.com> wrote:
>
>> Hi Marcelo.
>> Thanks for the quick reply. Can you suggest me how to increase the memory
>> limits or how to tackle this problem. I am a novice. If you want I can post
>> my code here.
>>
>>
>> Thanks
>>
>>
>> On Wed, Jul 9, 2014 at 12:50 AM, Marcelo Vanzin <va...@cloudera.com>
>> wrote:
>>
>>> This is generally a side effect of your executor being killed. For
>>> example, Yarn will do that if you're going over the requested memory
>>> limits.
>>>
>>> On Tue, Jul 8, 2014 at 12:17 PM, Rahul Bhojwani
>>> <ra...@gmail.com> wrote:
>>> > HI,
>>> >
>>> > I am getting this error. Can anyone help out to explain why is this
>>> error
>>> > coming.
>>> >
>>> > ########
>>> >
>>> > Exception in thread "delete Spark temp dir
>>> >
>>> C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560"
>>> >  java.io.IOException: Failed to delete:
>>> >
>>> C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560\tmp
>>> > cmenlp
>>> >         at
>>> org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:483)
>>> >         at
>>> >
>>> org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:479)
>>> >         at
>>> >
>>> org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:478)
>>> >         at
>>> >
>>> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>>> >         at
>>> > scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
>>> >         at
>>> org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:478)
>>> >         at org.apache.spark.util.Utils$$anon$4.run(Utils.scala:212)
>>> > PS>
>>> > ############
>>> >
>>> >
>>> >
>>> >
>>> > Thanks in advance
>>> > --
>>> > Rahul K Bhojwani
>>> > 3rd Year B.Tech
>>> > Computer Science and Engineering
>>> > National Institute of Technology, Karnataka
>>>
>>>
>>>
>>> --
>>> Marcelo
>>>
>>
>>
>>
>> --
>> Rahul K Bhojwani
>> 3rd Year B.Tech
>> Computer Science and Engineering
>> National Institute of Technology, Karnataka
>>
>
>
>
> --
> Rahul K Bhojwani
> 3rd Year B.Tech
> Computer Science and Engineering
> National Institute of Technology, Karnataka
>



-- 
Rahul K Bhojwani
3rd Year B.Tech
Computer Science and Engineering
National Institute of Technology, Karnataka

Re: Error: Could not delete temporary files.

Posted by Rahul Bhojwani <ra...@gmail.com>.
Here I am adding my code. It would be great if you could have a look and
help me out.
Thanks
#######################

import tokenizer
import gettingWordLists as gl
from pyspark.mllib.classification import NaiveBayes
from numpy import array
from pyspark import SparkContext, SparkConf

conf = (SparkConf().setMaster("local[6]").setAppName("My
app").set("spark.executor.memory", "1g"))

sc=SparkContext(conf = conf)
# Getting the positive dict:

pos_list = []
pos_list = gl.getPositiveList()
neg_list = gl.getNegativeList()

#print neg_list
tok = tokenizer.Tokenizer(preserve_case=False)
train_data  = []

with open("training_file_coach.csv","r") as train_file:
    for line in train_file:
        tokens = line.split("######")
        msg = tokens[0]
        sentiment = tokens[1]
        pos_count = 0
        neg_count = 0
#        print sentiment + "\n\n"
#        print msg
        tokens = set(tok.tokenize(msg))
        for i in tokens:
            if i.encode('utf-8') in pos_list:
                pos_count+=1
            if i.encode('utf-8') in neg_list:
                neg_count+=1
        if sentiment.__contains__('NEG'):
            label = 0.0
        else:
            label = 1.0

        feature = []
        feature.append(label)
        feature.append(float(pos_count))
        feature.append(float(neg_count))
        train_data.append(feature)
    train_file.close()

model = NaiveBayes.train(sc.parallelize(array(train_data)))


file_predicted = open("predicted_file_coach.csv","w")

with open("prediction_file_coach.csv","r") as predict_file:
    for line in predict_file:
        msg = line[0:-1]
        pos_count = 0
        neg_count = 0
#        print sentiment + "\n\n"
#        print msg
        tokens = set(tok.tokenize(msg))
        for i in tokens:
            if i.encode('utf-8') in pos_list:
                pos_count+=1
            if i.encode('utf-8') in neg_list:
                neg_count+=1
        prediction =
model.predict(array([float(pos_count),float(neg_count)]))
        if prediction == 0:
            sentiment = "NEG"
        elif prediction == 1:
            sentiment = "POS"
        else:
            print "ERROR\n\n\n\n\n\n\nERROR"

        feature = []
        feature.append(float(prediction))
        feature.append(float(pos_count))
        feature.append(float(neg_count))
        print feature
        train_data.append(feature)
        model = NaiveBayes.train(sc.parallelize(array(train_data)))
        file_predicted.write(msg + "######" + sentiment + "\n")

file_predicted.close()
###################

If you can have a look at the code and help me out, it would be great.

Thanks


On Wed, Jul 9, 2014 at 12:54 AM, Rahul Bhojwani <rahulbhojwani2003@gmail.com
> wrote:

> Hi Marcelo.
> Thanks for the quick reply. Can you suggest me how to increase the memory
> limits or how to tackle this problem. I am a novice. If you want I can post
> my code here.
>
>
> Thanks
>
>
> On Wed, Jul 9, 2014 at 12:50 AM, Marcelo Vanzin <va...@cloudera.com>
> wrote:
>
>> This is generally a side effect of your executor being killed. For
>> example, Yarn will do that if you're going over the requested memory
>> limits.
>>
>> On Tue, Jul 8, 2014 at 12:17 PM, Rahul Bhojwani
>> <ra...@gmail.com> wrote:
>> > HI,
>> >
>> > I am getting this error. Can anyone help out to explain why is this
>> error
>> > coming.
>> >
>> > ########
>> >
>> > Exception in thread "delete Spark temp dir
>> >
>> C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560"
>> >  java.io.IOException: Failed to delete:
>> >
>> C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560\tmp
>> > cmenlp
>> >         at
>> org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:483)
>> >         at
>> >
>> org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:479)
>> >         at
>> >
>> org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:478)
>> >         at
>> >
>> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>> >         at
>> > scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
>> >         at
>> org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:478)
>> >         at org.apache.spark.util.Utils$$anon$4.run(Utils.scala:212)
>> > PS>
>> > ############
>> >
>> >
>> >
>> >
>> > Thanks in advance
>> > --
>> > Rahul K Bhojwani
>> > 3rd Year B.Tech
>> > Computer Science and Engineering
>> > National Institute of Technology, Karnataka
>>
>>
>>
>> --
>> Marcelo
>>
>
>
>
> --
> Rahul K Bhojwani
> 3rd Year B.Tech
> Computer Science and Engineering
> National Institute of Technology, Karnataka
>



-- 
Rahul K Bhojwani
3rd Year B.Tech
Computer Science and Engineering
National Institute of Technology, Karnataka

Re: Error: Could not delete temporary files.

Posted by Marcelo Vanzin <va...@cloudera.com>.
Note I didn't say that was your problem - it would be if (i) you're
running your job on Yarn and (ii) you look at the Yarn NodeManager
logs and see that it's actually killing your process.

I just said that the exception shows up in those kinds of situations.
You haven't provided enough information (=logs) for us to know what
exactly is happening with your job.

On Tue, Jul 8, 2014 at 12:24 PM, Rahul Bhojwani
<ra...@gmail.com> wrote:
> Hi Marcelo.
> Thanks for the quick reply. Can you suggest me how to increase the memory
> limits or how to tackle this problem. I am a novice. If you want I can post
> my code here.
>
>
> Thanks
>
>
> On Wed, Jul 9, 2014 at 12:50 AM, Marcelo Vanzin <va...@cloudera.com> wrote:
>>
>> This is generally a side effect of your executor being killed. For
>> example, Yarn will do that if you're going over the requested memory
>> limits.
>>
>> On Tue, Jul 8, 2014 at 12:17 PM, Rahul Bhojwani
>> <ra...@gmail.com> wrote:
>> > HI,
>> >
>> > I am getting this error. Can anyone help out to explain why is this
>> > error
>> > coming.
>> >
>> > ########
>> >
>> > Exception in thread "delete Spark temp dir
>> >
>> > C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560"
>> >  java.io.IOException: Failed to delete:
>> >
>> > C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560\tmp
>> > cmenlp
>> >         at
>> > org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:483)
>> >         at
>> >
>> > org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:479)
>> >         at
>> >
>> > org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:478)
>> >         at
>> >
>> > scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>> >         at
>> > scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
>> >         at
>> > org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:478)
>> >         at org.apache.spark.util.Utils$$anon$4.run(Utils.scala:212)
>> > PS>
>> > ############
>> >
>> >
>> >
>> >
>> > Thanks in advance
>> > --
>> > Rahul K Bhojwani
>> > 3rd Year B.Tech
>> > Computer Science and Engineering
>> > National Institute of Technology, Karnataka
>>
>>
>>
>> --
>> Marcelo
>
>
>
>
> --
> Rahul K Bhojwani
> 3rd Year B.Tech
> Computer Science and Engineering
> National Institute of Technology, Karnataka



-- 
Marcelo

Re: Error: Could not delete temporary files.

Posted by Rahul Bhojwani <ra...@gmail.com>.
Hi Marcelo.
Thanks for the quick reply. Can you suggest how to increase the memory
limits, or how else to tackle this problem? I am a novice. If you want, I can
post my code here.


Thanks


On Wed, Jul 9, 2014 at 12:50 AM, Marcelo Vanzin <va...@cloudera.com> wrote:

> This is generally a side effect of your executor being killed. For
> example, Yarn will do that if you're going over the requested memory
> limits.
>
> On Tue, Jul 8, 2014 at 12:17 PM, Rahul Bhojwani
> <ra...@gmail.com> wrote:
> > HI,
> >
> > I am getting this error. Can anyone help out to explain why is this error
> > coming.
> >
> > ########
> >
> > Exception in thread "delete Spark temp dir
> >
> C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560"
> >  java.io.IOException: Failed to delete:
> >
> C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560\tmp
> > cmenlp
> >         at
> org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:483)
> >         at
> >
> org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:479)
> >         at
> >
> org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:478)
> >         at
> >
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
> >         at
> > scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
> >         at
> org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:478)
> >         at org.apache.spark.util.Utils$$anon$4.run(Utils.scala:212)
> > PS>
> > ############
> >
> >
> >
> >
> > Thanks in advance
> > --
> > Rahul K Bhojwani
> > 3rd Year B.Tech
> > Computer Science and Engineering
> > National Institute of Technology, Karnataka
>
>
>
> --
> Marcelo
>



-- 
Rahul K Bhojwani
3rd Year B.Tech
Computer Science and Engineering
National Institute of Technology, Karnataka

Re: Error: Could not delete temporary files.

Posted by Marcelo Vanzin <va...@cloudera.com>.
This is generally a side effect of your executor being killed. For
example, Yarn will do that if you're going over the requested memory
limits.
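
If that is what is happening, the usual first step is to request more
executor memory up front. A minimal, hedged sketch (it reuses the
spark.executor.memory key that already appears in the code posted in this
thread; the 2g value is just an example, and this only matters when the job
is actually submitted to YARN rather than run in local mode):

#######################
from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setAppName("My app")
        .set("spark.executor.memory", "2g"))   # ask for bigger executors
sc = SparkContext(conf=conf)
#######################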

On Tue, Jul 8, 2014 at 12:17 PM, Rahul Bhojwani
<ra...@gmail.com> wrote:
> HI,
>
> I am getting this error. Can anyone help out to explain why is this error
> coming.
>
> ########
>
> Exception in thread "delete Spark temp dir
> C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560"
>  java.io.IOException: Failed to delete:
> C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560\tmp
> cmenlp
>         at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:483)
>         at
> org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:479)
>         at
> org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:478)
>         at
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>         at
> scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
>         at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:478)
>         at org.apache.spark.util.Utils$$anon$4.run(Utils.scala:212)
> PS>
> ############
>
>
>
>
> Thanks in advance
> --
> Rahul K Bhojwani
> 3rd Year B.Tech
> Computer Science and Engineering
> National Institute of Technology, Karnataka



-- 
Marcelo