Posted to dev@mahout.apache.org by Trevor Grant <tr...@gmail.com> on 2016/06/01 00:31:40 UTC

Re: Zeppelin Integration PR

For what it is worth, simply removing the dependencies from pom.xml breaks
the Mahout interpreter.

Upon a little further testing in cluster mode: so long as the dependencies
are included in pom.xml, the appropriate Mahout jars are shipped off to the
cluster and everything works swimmingly. (In Zeppelin there is a local Spark
interpreter internal to Zeppelin, and then the 'real' cluster that
everything gets shipped off to; sometimes you can make things work in local
mode that won't work in cluster mode.)

The moral of this story is that the patch DOES in fact work in both local
and cluster mode, so we just need to work out the dependencies and the
licensing (plus a couple of fail-safes to make sure the user is running a
Spark version > 1.5.2) and we should be good to go.
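
For the Spark-version fail-safe, something like the sketch below could work.
It is only an illustration (the helper name and the 1.6 cutoff are
assumptions, not part of the patch), and it assumes the interpreter's
SparkContext is available as sc:

    // Hypothetical guard, not part of the patch: refuse to run against an old Spark.
    def checkSparkVersion(sc: org.apache.spark.SparkContext): Unit = {
      // sc.version is e.g. "1.6.1"; treat 1.6.0 as the practical "> 1.5.2" cutoff.
      val Array(major, minor) = sc.version.split("\\.").take(2).map(_.toInt)
      require(major > 1 || (major == 1 && minor >= 6),
        s"Mahout interpreter needs a Spark version newer than 1.5.2, found ${sc.version}")
    }

    checkSparkVersion(sc)  // sc being the interpreter's SparkContext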


Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things."  -Virgil*


On Tue, May 31, 2016 at 4:22 PM, Trevor Grant <tr...@gmail.com>
wrote:

> Hey folks,
>
> looks like we're making progress on the Mahout-Zeppelin integration.
>
> Anyone who is interested, check out:
> https://github.com/apache/incubator-zeppelin/pull/928
>
> Regarding Moon's last comments:
> Does anyone know offhand if anything will break if we roll back the
> conflicting packages to the Spark 1.6 versions?
>
> Also regarding the pom.xml and:
> "Packaging
> If mahout requires to be loaded in spark executor's classpath, then adding
> mahout dependency in pom.xml will not be enough to work with Spark cluster.
> Could you clarify if mahout need to be loaded in spark executor?"
>
> All we need to do is load the appropriate Mahout jars. I'm not familiar
> enough with the Spark Interpreter, Spark, or Java to know exactly what
> would happen; any thoughts on this?
>
> Tonight I might just try removing the Mahout dependencies from pom.xml and
> seeing what happens; that would solve all of these problems, I think. As
> long as the user has 'mvn install'ed Mahout, we should be good to go?
>
> Trevor Grant
> Data Scientist
> https://github.com/rawkintrevo
> http://stackexchange.com/users/3002022/rawkintrevo
> http://trevorgrant.org
>
> *"Fortunate is he, who is able to know the causes of things."  -Virgil*
>
>

Re: Zeppelin Integration PR

Posted by Suneel Marthi <sm...@apache.org>.
As regards downgrading Kryo, it may break stuff on the Flink end of things.
I would let Andy speak to that; I can't recall off the top of my head the
issues we ran into when trying to balance Spark and Flink dependencies.

Re: Zeppelin Integration PR

Posted by Suneel Marthi <sm...@apache.org>.
I guessed so after my last email. Well, it's OK to downgrade Jackson then.
Hopefully the next Spark version that Zeppelin supports uses the latest
Jackson version.

Re: Zeppelin Integration PR

Posted by Trevor Grant <tr...@gmail.com>.
Well, it's a conflict with Spark... and keep in mind Zeppelin is supporting
Spark all the way back to 1.1, so maybe something in there?

I'm going to make a serious effort to try the local jar loading from
MAHOUT_HOME as outlined last night, because if that works, then all of these
problems magically go away and are much less likely to haunt us in the
future.
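
A rough sketch of what that local loading could look like; it assumes the
built mahout-*.jar files sit directly under $MAHOUT_HOME and that the
interpreter's SparkContext is in scope as sc, neither of which is settled
yet:

    import java.io.File

    // Sketch only: collect the mahout-*.jar files from a local Mahout install.
    val mahoutHome = sys.env.getOrElse("MAHOUT_HOME", sys.error("MAHOUT_HOME is not set"))
    val mahoutJars = Option(new File(mahoutHome).listFiles()).getOrElse(Array.empty[File])
      .filter(f => f.getName.startsWith("mahout-") && f.getName.endsWith(".jar"))

    // sc.addJar ships each jar to the executors; the driver/interpreter classpath
    // still has to see these classes, which is the part that needs verifying.
    mahoutJars.foreach(jar => sc.addJar(jar.getAbsolutePath))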

tg


Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things."  -Virgil*


Re: Zeppelin Integration PR

Posted by Suneel Marthi <sm...@apache.org>.
I think we should be fine with Jackson 2.4.4 (it's not fasterxml :)).
But what's the reason for the massive downgrade? Elasticsearch 2.x and
above need at least Jackson 2.6.2 to function.

On downgrading Kryo I have no opinion and would let others weigh in.

Re: Zeppelin Integration PR

Posted by Trevor Grant <tr...@gmail.com>.
Can we use Kryo v2.21 (instead of 2.24),

and fasterxml (Jackson) 2.4.4 instead of 2.7.2? (This is what worries me.)

I am also working on fishing the Mahout jars directly out of
MAHOUT_HOME=..../ ... instead of including them in the pom, so let's not get
too carried away with pruning dependencies.

At this point it's more of a "can anyone think of a specific reason this
shouldn't work?".


Mahout

com.esotericsoftware.kryo:kryo:jar:2.24:compile
com.fasterxml.jackson.core:jackson-core:jar:2.7.2:compile

Spark (1.6)

com.esotericsoftware.kryo:kryo:jar:2.21:compile
com.fasterxml.jackson.core:jackson-core:jar:2.4.4:compile
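
One way to sanity-check which of the conflicting versions actually wins on
the interpreter's classpath, as a purely diagnostic sketch (the class names
are just taken from the coordinates above):

    // Ask the JVM which jar a class was actually loaded from.
    // Note: getCodeSource can be null for bootstrap classes, but not for these.
    def whichJar(className: String): String =
      Class.forName(className).getProtectionDomain.getCodeSource.getLocation.toString

    println(whichJar("com.esotericsoftware.kryo.Kryo"))           // which Kryo won
    println(whichJar("com.fasterxml.jackson.core.JsonFactory"))   // which jackson-core won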


Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things."  -Virgil*


Re: Zeppelin Integration PR

Posted by Andrew Palumbo <ap...@outlook.com>.
Which dependencies need to be removed?

I saw the Kryo version on the PR was conflicting also; that may be a Flink thing.  I think the version was at one point being enforced in the Flink module, at least.

Re: Zeppelin Integration PR

Posted by Trevor Grant <tr...@gmail.com>.
As a follow-up to this: it would be nice to remove the dependencies from
the pom.xml...

All we REALLY need to do is make sure we can get to the required jars and
load them.  By including them in the pom we are ensuring they are
available, but there is surely some other way to get ahold of them.  Since
we have assumed that Mahout is installed on the system and MAHOUT_HOME=...
is set, we can probably leverage that...
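
One possible "other way", sketched under the assumption that the user has
'mvn install'ed Mahout: probe the local Maven repository directly. The
module names, Scala suffix, and version below are illustrative guesses, not
something the PR does:

    import java.io.File

    // Sketch only: after 'mvn install', Mahout artifacts land in ~/.m2/repository.
    val mahoutVersion = "0.12.0"  // whatever version was installed locally
    val localRepo = new File(sys.props("user.home"), ".m2/repository/org/apache/mahout")
    val modules   = Seq("mahout-math", "mahout-math-scala_2.10", "mahout-spark_2.10", "mahout-hdfs")

    val jars = modules
      .map(m => new File(localRepo, s"$m/$mahoutVersion/$m-$mahoutVersion.jar"))
      .filter(_.exists())

    jars.foreach(j => println(s"found ${j.getAbsolutePath}"))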



Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things."  -Virgil*

