Posted to user@spark.apache.org by Sourav Chandra <so...@livestream.com> on 2014/03/10 07:37:02 UTC

Re: [External] Re: no stdout output from worker

Hi Ranjan,

Whatever code is passed as a closure to Spark operations like map,
flatMap, filter, etc. runs as part of a task.

Everything else runs in the driver.
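
For example, in rough sketch form (the master URL and variable names here
are just illustrative, not from your code):

import java.util.Arrays;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;

JavaSparkContext sc = new JavaSparkContext("spark://master:7077", "stdout-demo");
JavaRDD<Integer> nums = sc.parallelize(Arrays.asList(1, 2, 3));

// Driver side: prints on the console where you launched the application.
System.out.println("count = " + nums.count());

// Task side: this closure is shipped to the workers, so its println
// ends up in <spark>/work/<appid>/<executor>/stdout on each worker.
JavaRDD<Integer> doubled = nums.map(new Function<Integer, Integer>() {
    public Integer call(Integer x) {
        System.out.println("processing " + x);
        return x * 2;
    }
});
doubled.count(); // map is lazy; an action must run before the tasks print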

Thanks,
Sourav


On Mon, Mar 10, 2014 at 12:03 PM, Sen, Ranjan [USA] <Se...@bah.com> wrote:

> Hi Patrick
>
> How do I know which part of the code runs in the driver and which in a
> task? The structure of my code is as below:
>
> …
>
> static boolean done = false;
> …
>
> public static void main(..
>
> ..
>
> JavaRDD<String> lines = ..
>
> ..
>
> while (!done) {
>
> ..
> while (..) {
>
> JavaPairRDD<Integer, List<Integer>> labs1 = labs.map(new PairFunction<… );
>
> !! Here I have System.out.println (A)
>
> } // inner while
>
> !! Here I have System.out.println (B)
>
>
> if (…) {
>         done = true;
>
>         !! Also here some System.out.println (C)
>
>         break;
>
> } else {
>
>         if (…) {
>
>                 !! More System.out.println (D)
>
>                 labs = labs.map(…);
>         }
> }
>
> } // outer while
>
> !! Even more System.out.println (E)
>
> } // main
>
> } // class
>
> I get the console output on the master for (B) and (E), but I do not see
> any stdout on the worker node. I do find stdout and stderr files under
> <spark>/work/<appid>/0/; there is output in stderr but nothing in stdout.
>
> I do get all the output on the console when I run it in local mode.
>
> Sorry, I am new and may be asking a naïve question, but this is really
> confusing to me. Thanks for your help.
>
> Ranjan
>
> On 3/9/14, 10:50 PM, "Patrick Wendell" <pw...@gmail.com> wrote:
>
> >Hey Sen,
> >
> >Is your code in the driver code or inside one of the tasks?
> >
> >If it's in the tasks, the place you would expect these to be is the
> >stdout file under <spark>/work/<appid>/[stdout|stderr]. Are you seeing
> >at least stderr logs in that folder? If not, the tasks might not be
> >running on the worker machines. If you see stderr but not stdout,
> >that's a bit of a puzzler, since they both go through the same
> >mechanism.
> >
> >- Patrick
> >
> >On Sun, Mar 9, 2014 at 2:32 PM, Sen, Ranjan [USA] <Se...@bah.com>
> >wrote:
> >> Hi
> >> I have some System.out.println calls in my Java code that work fine in
> >> a local environment. But when I run the same code in standalone mode on
> >> an EC2 cluster, I do not see them in the worker stdout (on the worker
> >> node under <spark location>/work ) or on the driver console. Could you
> >> help me understand how to troubleshoot?
> >>
> >> Thanks
> >> Ranjan
>
>


-- 

Sourav Chandra

Senior Software Engineer

· · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · ·

sourav.chandra@livestream.com

o: +91 80 4121 8723

m: +91 988 699 3746

skype: sourav.chandra

Livestream

"Ajmera Summit", First Floor, #3/D, 68 Ward, 3rd Cross, 7th C Main, 3rd
Block, Koramangala Industrial Area,

Bangalore 560034

www.livestream.com

Re: [External] Re: no stdout output from worker

Posted by Patrick Wendell <pw...@gmail.com>.
Hey Sen,

Sourav is right, and I think all of your print statements are inside the
driver program rather than inside a closure. How are you running your
program (i.e. what do you run that starts this job)? Wherever you run the
driver is where you should expect to see the output.
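
If you want output in the worker stdout files, the println has to live
inside the function object itself. A rough sketch (assuming labs is a
JavaRDD<String> as in your snippet; the key/value logic is made up just to
have something concrete):

import java.util.Arrays;
import java.util.List;
import scala.Tuple2;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.function.PairFunction;

JavaPairRDD<Integer, List<Integer>> labs1 =
    labs.map(new PairFunction<String, Integer, List<Integer>>() {
        public Tuple2<Integer, List<Integer>> call(String line) {
            // Runs inside a task, so this lands in the worker's stdout file.
            System.out.println("task saw: " + line);
            return new Tuple2<Integer, List<Integer>>(
                line.hashCode(), Arrays.asList(line.length()));
        }
    });
labs1.count(); // remember to run an action, or the map never executes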

- Patrick


On Mon, Mar 10, 2014 at 8:56 AM, Sen, Ranjan [USA] <Se...@bah.com> wrote:

> Hi Sourav
> That makes so much sense. Thanks much.
> Ranjan
>
> [earlier quoted messages trimmed]

Re: [External] Re: no stdout output from worker

Posted by "Sen, Ranjan [USA]" <Se...@bah.com>.
Hi Sourav
That makes so much sense. Thanks much.
Ranjan

[earlier quoted messages trimmed]