Posted to users@zeppelin.apache.org by Sylvain Gibier <sy...@apefactory.com> on 2021/05/01 06:37:13 UTC

Re: Zeppelin 0.9 / Kubernetes / Spark interpreter

Hi,

Cf. ZEPPELIN-5337.

Switching to isolated mode is not really an option - as it means one spark
interpreter per note, per user -- which consumes a lot of resources, as
there is no mechanism to clean up the k8s pods created afterwards. The scoped
mode allows us to share the spark interpreter among our 100+ analysts.
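[Editor's note: for reference, the binding mode discussed here is the one stored
in the interpreter setting's "option" block. The fragment below is a sketch from
memory of the 0.9-era conf/interpreter.json schema; verify the field names
against your own deployment before relying on them.]

```json
{
  "option": {
    "perNote": "scoped",
    "perUser": "shared",
    "remote": true
  }
}
```

Setting "perNote" to "isolated" is what spawns one interpreter process (and, on
k8s, one pod) per note, which is the resource cost described above.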


On Fri, Apr 30, 2021 at 5:05 PM moon soo Lee <mo...@apache.org> wrote:

> Hi,
>
> Thanks for sharing the issue.
>
> I tried zeppelin 0.9+ on k8s with per-note scoped mode, scala 2.12, spark
> 3.0+, and I could reproduce the problem. Isolated mode, however, works
> without problems. Does isolated mode work for your use case?
>
> Best,
> moon
>
>
>
> On Tue, Apr 27, 2021 at 12:39 PM Sylvain Gibier <sy...@apefactory.com>
> wrote:
>
>> Any idea?
>>
>> Is anyone actually using zeppelin 0.9+ on k8s with the spark interpreter
>> scoped per note?
>>
>>
>> On 2021/04/24 10:46:06, Sylvain Gibier <sy...@apefactory.com> wrote:
>> > Hi,
>> >
>> > we have an issue with our current deployment of zeppelin on k8s, and
>> > more precisely with the spark interpreter.
>> >
>> > For reference - the spark context is: scala 2.12.10 / spark 2.4.7
>> >
>> > We see weird behaviour when running the spark interpreter in per-note
>> > scoped mode.
>> >
>> > To reproduce: we restart the spark interpreter in scoped per-note mode,
>> > and create two notebooks (A & B), each with the following code:
>> >
>> > %spark
>> > import spark.implicits._
>> >
>> > List(1, 2, 3).toDS.map(_ + 1).show
>> >
>> > 1 - we run notebook A successfully
>> > 2 - we run notebook B - it fails with a ClassCastException:
>> >
>> > org.apache.spark.SparkException: Job aborted due to stage failure: Task 0
>> > in stage 24.0 failed 4 times, most recent failure: Lost task 0.3 in stage
>> > 24.0 (TID 161, 10.11.18.133, executor 2): java.lang.ClassCastException:
>> > cannot assign instance of java.lang.invoke.SerializedLambda to field
>> > org.apache.spark.rdd.MapPartitionsRDD.f of type scala.Function3 in
>> > instance of org.apache.spark.rdd.MapPartitionsRDD
>> >   at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2287)
>> >   at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1417)
>> >   at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2293)
>> >
>> > Does anyone have a working zeppelin deployment with k8s / spark 2.4 /
>> > scala 2.12?
>> >
>> > Or is anyone interested in making some $$$ by helping us fix the issue?
>> >
>> > cheers
>> >
>>
>
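[Editor's note: the ClassCastException quoted above is the classic symptom of a
lambda serialized by one REPL classloader and deserialized against another, as
happens when scoped notes share an interpreter process but generate their own
REPL classes. The sketch below (plain Scala, no Spark; the object and method
names are mine) shows the Java-serialization round trip involved. It succeeds
here because both ends share one classloader; in Zeppelin the executor resolves
the SerializedLambda against different REPL-generated classes and fails.]

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

object LambdaRoundTrip {
  // Serialize and deserialize an object with plain Java serialization,
  // the same machinery Spark uses to ship closures to executors.
  def roundTrip[A <: AnyRef](obj: A): A = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(obj)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    in.readObject().asInstanceOf[A]
  }

  def main(args: Array[String]): Unit = {
    // Scala 2.12 compiles this literal via invokedynamic; it serializes as
    // java.lang.invoke.SerializedLambda, and deserialization resolves it
    // against the *receiving* classloader's generated classes.
    val f: Int => Int = x => x + 1
    val g = roundTrip(f)
    println(g(41)) // same classloader on both ends, so this works: 42
  }
}
```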

Re: Zeppelin 0.9 / Kubernetes / Spark interpreter

Posted by Jeff Zhang <zj...@gmail.com>.
It is fixed here https://github.com/apache/zeppelin/pull/4105



Sylvain Gibier <sy...@apefactory.com> wrote on Sat, May 1, 2021 at 2:37 PM:


-- 
Best Regards

Jeff Zhang