Posted to user@spark.apache.org by Sourav Chandra <so...@livestream.com> on 2014/03/10 07:37:02 UTC
Re: [External] Re: no stdout output from worker
Hi Ranjan,
Whatever code is passed as a closure to Spark operations like map,
flatMap, filter, etc. runs as part of a task on the workers.
Everything else runs in the driver.
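To make the distinction concrete, here is a minimal sketch (the class name and sample data are hypothetical, and it assumes spark-core is on the classpath). In local mode the driver and executor share one JVM, so both prints reach the same console; on a cluster, the println inside the closure lands in the worker's stdout file instead.

```java
import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;

public class WherePrintlnGoes {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("println-demo").setMaster("local");
        JavaSparkContext sc = new JavaSparkContext(conf);

        JavaRDD<String> lines = sc.parallelize(Arrays.asList("a", "bb", "ccc"));

        // Driver code: prints on the machine where you launched the app.
        System.out.println("driver: before the job");

        JavaRDD<Integer> lengths = lines.map(new Function<String, Integer>() {
            public Integer call(String s) {
                // Task code: this closure is serialized and run on executors,
                // so on a cluster this output goes to each worker's
                // <spark>/work/<app-id>/<executor-id>/stdout file, not here.
                System.out.println("task: processing " + s);
                return s.length();
            }
        });

        // Driver code again: count() triggers the job; the result prints locally.
        System.out.println("driver: total = " + lengths.count());
        sc.stop();
    }
}
```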
Thanks,
Sourav
On Mon, Mar 10, 2014 at 12:03 PM, Sen, Ranjan [USA] <Se...@bah.com> wrote:
> Hi Patrick
>
> How do I know which part of the code is in the driver and which in task?
> The structure of my code is as below-
>
> …
>
> static boolean done = false;
> …
>
> public static void main(..
>
> ..
>
> JavaRDD<String> lines = ..
>
> ..
>
> while (!done) {
>
> ..
> while (..) {
>
> JavaPairRDD<Integer, List<Integer>> labs1 = labs.map(new PairFunction<… );
>
> !! Here I have System.out.println (A)
>
> } // inner while
>
> !! Here I have System.out.println (B)
>
>
> if (…) {
> done = true;
>
> !! Also here some System.out.println (C)
>
> break;
> }
>
> else {
>
> if (…) {
>
> !! More System.out.println (D)
>
>
> labs = labs.map(…);
>
> }
> }
>
> } // outer while
>
> !! Even more System.out.println (E)
>
> } // main
>
> } //class
>
> I get the console outputs on the master for (B) and (E). I do not see any
> stdout in the worker node. I find the stdout and stderr in the
> <spark>/work/<appid>/0/. I see output
> in stderr but not in stdout.
>
> I do get all the outputs on the console when I run it in local mode.
>
> Sorry, I am new and may be asking a naïve question, but it is really
> confusing to me. Thanks for your help.
>
> Ranjan
>
> On 3/9/14, 10:50 PM, "Patrick Wendell" <pw...@gmail.com> wrote:
>
> >Hey Sen,
> >
> >Is your code in the driver code or inside one of the tasks?
> >
> >If it's in the tasks, the place you would expect these to be is the
> >stdout file under <spark>/<appid>/work/[stdout/stderr]. Are you seeing
> >at least stderr logs in that folder? If not, then the tasks might not
> >be running on the worker machines. If you see stderr but not stdout,
> >that's a bit of a puzzler, since they both go through the same
> >mechanism.
> >
> >- Patrick
> >
> >On Sun, Mar 9, 2014 at 2:32 PM, Sen, Ranjan [USA] <Se...@bah.com>
> >wrote:
> >> Hi
> >> I have some System.out.println calls in my Java code that work fine in a
> >>local
> >> environment. But when I run the same code in standalone mode on an EC2
> >> cluster, I do not see them in the worker stdout (on the worker node under
> >> <spark location>/work ) or at the driver console. Could you help me
> >> understand how to troubleshoot this?
> >>
> >> Thanks
> >> Ranjan
>
>
--
Sourav Chandra
Senior Software Engineer
· · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · ·
sourav.chandra@livestream.com
o: +91 80 4121 8723
m: +91 988 699 3746
skype: sourav.chandra
Livestream
"Ajmera Summit", First Floor, #3/D, 68 Ward, 3rd Cross, 7th C Main, 3rd
Block, Koramangala Industrial Area,
Bangalore 560034
www.livestream.com
Re: [External] Re: no stdout output from worker
Posted by Patrick Wendell <pw...@gmail.com>.
Hey Sen,
Sourav is right, and I think all of your print statements are inside the
driver program rather than inside a closure. How are you running your
program (i.e., what do you run to start this job)? Wherever you run the
driver is where you should expect to see the output.
- Patrick
Re: [External] Re: no stdout output from worker
Posted by "Sen, Ranjan [USA]" <Se...@bah.com>.
Hi Sourav
That makes so much sense. Thanks much.
Ranjan