Posted to users@zeppelin.apache.org by Jeff Zhang <zj...@gmail.com> on 2018/09/27 07:08:31 UTC

[Discuss] 0.8.1 Release

Hi folks,

It has been a while since the 0.8.0 release, and we have received a lot of
feedback on it, so I think it is time to make a 0.8.1 release to fix the
bugs in 0.8.0. Here is the umbrella ticket for the 0.8.1 release:
https://jira.apache.org/jira/browse/ZEPPELIN-3629

If you find a ticket that is necessary for 0.8.1 but is not under this
umbrella ticket, feel free to link it. I will start the 0.8.1 release at
the beginning of October.

Re: [Discuss] 0.8.1 Release

Posted by Paul Brenner <pb...@placeiq.com>.
I added 3 tickets that I think are critical to basic functionality, but feel free to remove them if you don’t agree:

https://jira.apache.org/jira/browse/ZEPPELIN-3253 - `Tab` key indentation is broken in paragraph editor - this replaces 3692, which was closed as a duplicate

https://jira.apache.org/jira/browse/ZEPPELIN-3616 - Paragraph editor sections auto-collapse - this doesn’t happen all the time, but when it does it makes Zeppelin very hard to use

https://jira.apache.org/jira/browse/ZEPPELIN-2931 - JS memory/object leak makes browser to lag in rendering Zeppelin notebooks - this is another fundamental usability issue


Paul Brenner
SR. DATA SCIENTIST
(217) 390-3033

Re: Interpreter behavior

Posted by Jeff Zhang <zj...@gmail.com>.
There's no standard way to calculate the memory requirement for the driver. It
depends on your app, e.g. if you want to fetch a large amount of data into the
driver, you had better set a large value for driver memory.
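
As a rough illustration only, a first guess can be formed by adding a fixed JVM baseline, a per-note REPL overhead (in scoped mode each note's REPL lives in the same driver), and whatever data is collected to the driver. The function names and every number below are hypothetical placeholders, not measured Zeppelin or Spark values. The result is only a starting point to tune by observation, formatted in the `<n>m` style that `spark.driver.memory` accepts.

```python
# Back-of-the-envelope driver memory estimate for one shared (scoped) Spark driver.
# All figures are hypothetical placeholders, not measured Zeppelin or Spark values.

def estimate_driver_memory_mb(num_notes, base_mb=1024, per_repl_mb=512, collected_mb=0):
    """Sum a fixed JVM baseline, one REPL overhead per note, and data pulled to the driver."""
    return base_mb + num_notes * per_repl_mb + collected_mb


def to_spark_memory_string(mb):
    """Format megabytes in the '<n>m' form accepted by spark.driver.memory."""
    return f"{mb}m"


# Example: 4 notes sharing one driver, with about 2 GB of results collected to it.
estimate = estimate_driver_memory_mb(num_notes=4, collected_mb=2048)
print(to_spark_memory_string(estimate))
```

If the driver still runs out of memory with the estimated value, the placeholders are too low for the workload and should be raised.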

Regarding running paragraphs simultaneously: for Scala/Python/R code, the
paragraphs in one note are executed sequentially. The reason is that there may
be dependencies between paragraphs, e.g. paragraph 2 may use a variable
defined in paragraph 1.
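
The dependency problem can be shown with a minimal Python sketch. It is a toy model, not Zeppelin code: the two "paragraphs" are plain strings, and the helper names `run_in_order` and `run_alone` are made up for illustration. Running the paragraphs in order in one shared namespace works, while running the second one without the first fails because the variable it needs does not exist yet.

```python
# Toy model: paragraphs of one note share a single namespace, like one REPL session.
paragraphs = [
    "total = sum(range(10))",  # paragraph 1 defines `total`
    "doubled = total * 2",     # paragraph 2 reads `total`
]


def run_in_order(sources):
    """Execute paragraphs one after another in a shared namespace."""
    namespace = {}
    for source in sources:
        exec(source, namespace)
    return namespace


def run_alone(source):
    """Execute a single paragraph in a fresh namespace, as if it had not waited."""
    try:
        exec(source, {})
        return "ok"
    except NameError as error:
        return f"failed: {error}"


print(run_in_order(paragraphs)["doubled"])
print(run_alone(paragraphs[1]))
```

A scheduler that started both paragraphs at the same time could hit the second case, which is why in-order execution within a note is the safe default.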



Ajay Viswanathan <aj...@miqdigital.com.invalid> wrote on Fri, Nov 9, 2018 at 3:49 PM:

> I use version 0.7.3. I have been trying to investigate the reasons for the
> timeouts. I was trying to tune the number of cores available. Now that
> you've mentioned the driver memory issue, I'll try it out again and let you
> know if that solves the problem. What would be a back-of-the-envelope
> calculation for the resource requirements per notebook in a scoped + per
> note configuration?
>
> Also, is there a provision for executing multiple spark paragraphs in a
> notebook simultaneously? I wasn't able to achieve that and hence the
> workaround by running each paragraph in its own notebook.
>
> Thanks
> Ajay Viswanathan
> Sr. Software Engineer, MiQ
>

Re: Interpreter behavior

Posted by Ajay Viswanathan <aj...@miqdigital.com.INVALID>.
I use version 0.7.3. I have been trying to investigate the reasons for the
timeouts. I was trying to tune the number of cores available. Now that
you've mentioned the driver memory issue, I'll try it out again and let you
know if that solves the problem. What would be a back-of-the-envelope
calculation for the resource requirements per notebook in a scoped + per
note configuration?

Also, is there a provision for executing multiple spark paragraphs in a
notebook simultaneously? I wasn't able to achieve that and hence the
workaround by running each paragraph in its own notebook.

Thanks
Ajay Viswanathan
Sr. Software Engineer, MiQ

On Fri, 9 Nov 2018 at 13:12, Jeff Zhang <zj...@gmail.com> wrote:

> Hi Ajay,
>
> Thanks for reporting this. Which version do you use? One known issue of
> Spark scoped mode is that each Spark REPL occupies a large amount of memory
> that is never released. One workaround is to increase the driver memory.
>
> https://jira.apache.org/jira/browse/ZEPPELIN-3389
>
>


-- 
Ajay Viswanathan
*Sr. Software Engineer, CAPS - Processing*
*A: *5th & 6th Floor | Skav 909 | 9/1 | Lavelle Road | Bangalore | 560001
*E: *ajayviswanathan@miqdigital.com
*M: *+00 (0)0 0000 0000
*W: *wearemiq.com
[image: MiQ] <http://wearemiq.com/>
*Disclaimer: *This email and its attachments are confidential and are
intended solely for the use of the individual to whom it is addressed. If
you are not the intended recipient of this email and its attachments, you
must take no action based upon them, nor must you copy or show them to
anyone. No contracts or official orders shall be concluded by means of this
email. Please contact the sender if you believe you have received this
email in error.

Re: Interpreter behavior

Posted by Jeff Zhang <zj...@gmail.com>.
Hi Ajay,

Thanks for reporting this. Which version do you use? One known issue of
Spark scoped mode is that each Spark REPL occupies a large amount of memory
that is never released. One workaround is to increase the driver memory.

https://jira.apache.org/jira/browse/ZEPPELIN-3389


Ajay Viswanathan <aj...@miqdigital.com.invalid> wrote on Fri, Nov 9, 2018 at 3:31 PM:

> I am currently facing this issue in my project as well. By running the
> Spark interpreter in Scoped + Per Note mode, I do manage to execute
> paragraphs in parallel, but it becomes very resource intensive and times
> out if I run more than 3-4 jobs in parallel on a 4-core cloud instance.
> Typically a Thrift Transport exception is thrown after a prolonged period
> of inactivity.
>

Re: Interpreter behavior

Posted by Ajay Viswanathan <aj...@miqdigital.com.INVALID>.
I am currently facing this issue in my project as well. By running the
Spark interpreter in Scoped + Per Note mode, I do manage to execute
paragraphs in parallel, but it becomes very resource intensive and times
out if I run more than 3-4 jobs in parallel on a 4-core cloud instance.
Typically a Thrift Transport exception is thrown after a prolonged period
of inactivity.

On Fri, 9 Nov 2018 at 06:54, Jeff Zhang <zj...@gmail.com> wrote:

> Which version do you use? This seems to be a bug. Each note should have its
> own scheduler in scoped per-note mode.
>


Re: Interpreter behavior

Posted by Jeff Zhang <zj...@gmail.com>.
Which version do you use? This seems to be a bug. Each note should have its
own scheduler in scoped per-note mode.

<te...@yahoo.com.invalid> wrote on Fri, Nov 9, 2018 at 1:56 AM:

> Hi
>
> We use Zeppelin in a multi-user environment, and the scoped interpreter
> mode seems to allow only serial notebook execution. If multiple users run
> their notebooks concurrently, the notebooks are queued for serial
> execution. If one notebook takes a long time to complete, it blocks the
> other notebooks from executing. To enable parallel notebook execution, it
> seems we need to use the isolated mode, which creates a new interpreter
> instance (running in a separate JVM) per user. But this can become
> expensive (compute resource intensive). What is the suggested interpreter
> mode for a multi-user environment?
>
> Thanks
> Denny
>

Re: Interpreter behavior

Posted by te...@yahoo.com.INVALID.
Hi 

We use Zeppelin in a multi-user environment, and the scoped interpreter mode seems to allow only serial notebook execution. If multiple users run their notebooks concurrently, the notebooks are queued for serial execution. If one notebook takes a long time to complete, it blocks the other notebooks from executing. To enable parallel notebook execution, it seems we need to use the isolated mode, which creates a new interpreter instance (running in a separate JVM) per user. But this can become expensive (compute resource intensive). What is the suggested interpreter mode for a multi-user environment?
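
The blocking effect described above can be made concrete with a small Python sketch. It is a toy model under stated assumptions, not Zeppelin's actual scheduler: the note names and runtimes are invented, all notes are submitted at time 0, and the per-note case assumes there are enough free resources to run every note side by side.

```python
# Toy scheduling model, not Zeppelin's actual scheduler.
# Each note is (name, runtime_seconds); all notes are submitted at time 0.

def shared_fifo_finish_times(notes):
    """One queue for everyone: each note starts when the previous one finishes."""
    finish, clock = {}, 0
    for name, runtime in notes:
        clock += runtime
        finish[name] = clock
    return finish


def per_note_finish_times(notes):
    """One queue per note: notes run side by side, assuming enough free resources."""
    return {name: runtime for name, runtime in notes}


notes = [("long_etl", 600), ("quick_plot", 5), ("quick_query", 10)]
print(shared_fifo_finish_times(notes))
print(per_note_finish_times(notes))
```

With a single shared queue, the two short notes finish only after the long one, while with one queue per note they finish after their own runtime. The trade-off is that the parallel case needs the resources to run all notes at once.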

Thanks
Denny

Re: [Discuss] 0.8.1 Release

Posted by Jeff Zhang <zj...@gmail.com>.
I have merged ZEPPELIN-3528 into branch-0.8


andreas.weise@gmail.com <an...@gmail.com> wrote on Sat, Oct 6, 2018 at 2:51 AM:

> Sorry for the late response.
>
> Would you mind merging ZEPPELIN-3528 into branch-0.8 before 0.8.1? It is a
> small change but really helpful.
>

Re: [Discuss] 0.8.1 Release

Posted by an...@gmail.com, an...@gmail.com.
Sorry for the late response.

Would you mind merging ZEPPELIN-3528 into branch-0.8 before 0.8.1? It is a small change but really helpful.
