Posted to dev@ignite.apache.org by Iker Huerga <ik...@gmail.com> on 2015/04/27 23:20:08 UTC

New contributor?

Hi Ignite team,

My name is Iker Huerga. I'm a software engineer, data scientist and
entrepreneur with more than 8 years of experience in Java. I was a
Lucene/Solr contributor in the past, and I have been using Hadoop in
production for more than 3 years now.

After being contacted by one of the members of this community, I got intrigued
by the project you guys are working on. I took a look at the code and
documentation, and would like to say 'kudos' to all of you. It's clear that
there is a huge amount of work behind Ignite.

I would like to see whether I can be a contributor to Ignite, but there's
been a question in the back of my mind since I started reading about
Ignite: what is the main difference from Apache Spark?

Please note that I've already read the proposal [1], and I get the point
that Ignite is a more general in-memory engine. But Spark also provides
stream processing, MapReduce computations, etc. Would you say the main
difference is ACID transactions in memory?

Also, what is the roadmap for Ignite? Is it production ready?

Sorry for so many questions. In exchange for an answer, I can take care
of https://issues.apache.org/jira/browse/IGNITE-640 if you guys want to
assign it to me.

Thanks in advance!
Iker


[1] https://wiki.apache.org/incubator/IgniteProposal

-- 
Iker Huerga
http://www.ikerhuerga.com/

Re: New contributor?

Posted by Iker Huerga <ik...@gmail.com>.

Thanks!

Iker


Re: New contributor?

Posted by Konstantin Boudnik <co...@apache.org>.

I have added you to the Contributor's role in JIRA and assigned the ticket to
you! Excited to see a patch soon! Thanks!

Cos


Re: New contributor?

Posted by Iker Huerga <ik...@gmail.com>.

Thanks so much for the detailed response, Cos. It was really helpful!

As far as contributing is concerned, how about assigning
https://issues.apache.org/jira/browse/IGNITE-640 to me?

Best
Iker



-- 
Iker Huerga
http://www.ikerhuerga.com/

Re: New contributor?

Posted by Konstantin Boudnik <co...@apache.org>.

On Mon, Apr 27, 2015 at 11:01PM, Ognen Duzlevski wrote:
> Nikita, thanks for the addition to an already nice answer by Cos! See my
> question below.
> 
> On Mon, Apr 27, 2015 at 6:46 PM, Nikita Ivanov <ni...@gmail.com> wrote:
> 
> > Just to add to Cos's excellent response:
> > - The code base for Apache Ignite has been in production usage since 2007.
> > It's the only in-memory system that I'm aware of that can boast over 2000
> > nodes in a single mission-critical installation working in a fully
> > transactional topology.
> >
> 
> Perhaps you can then answer my question from a different thread: what do
> people do when they want to add additional proprietary classes to be
> cached to a live Ignite topology? Surely they don't stop all 2,000 nodes,
> copy a new jar around and restart all 2,000 nodes? :-)

There's the so-called 'peer class loading', as per
  http://apacheignite.readme.io/v1.0/docs/zero-deployment

but it looks like you're after a somewhat different use case. I wonder whether
you could distribute the classes to the nodes and then do a rolling
restart. Just thinking out loud - not 100% sure. Perhaps some of the old-timers
can chime in.

Cos

Re: New contributor?

Posted by Ognen Duzlevski <og...@gmail.com>.

Nikita, thanks for the addition to an already nice answer by Cos! See my
question below.

On Mon, Apr 27, 2015 at 6:46 PM, Nikita Ivanov <ni...@gmail.com> wrote:

> Just to add to Cos's excellent response:
> - The code base for Apache Ignite has been in production usage since 2007.
> It's the only in-memory system that I'm aware of that can boast over 2000
> nodes in a single mission-critical installation working in a fully
> transactional topology.
>

Perhaps you can then answer my question from a different thread: what do
people do when they want to add additional proprietary classes to be
cached to a live Ignite topology? Surely they don't stop all 2,000 nodes,
copy a new jar around and restart all 2,000 nodes? :-)

Thanks!

Re: New contributor?

Posted by Konstantin Boudnik <co...@apache.org>.

On Mon, Apr 27, 2015 at 04:46PM, Nikita Ivanov wrote:
> Just to add to Cos's excellent response:
> - The code base for Apache Ignite has been in production usage since 2007.
> It's the only in-memory system that I'm aware of that can boast over 2000
> nodes in a single mission-critical installation working in a fully
> transactional topology.
> 
> P.S.
> Some of us prefer Scala over Java :) Yet, Apache Ignite can be natively
> used with Java, Scala or Groovy.

Very true! I am a bit of a fan of the latter, and I have found it extremely
easy to write in Groovy for Ignite.


Re: New contributor?

Posted by Nikita Ivanov <ni...@gmail.com>.

Just to add to Cos's excellent response:
- The code base for Apache Ignite has been in production usage since 2007.
It's the only in-memory system that I'm aware of that can boast over 2000
nodes in a single mission-critical installation working in a fully
transactional topology.

P.S.
Some of us prefer Scala over Java :) Yet, Apache Ignite can be natively
used with Java, Scala or Groovy.
--
Nikita Ivanov



Re: New contributor?

Posted by Konstantin Boudnik <co...@apache.org>.

Hi Iker and welcome!

It's nice to have more people getting involved in the project and bringing in
new ideas, feedback and code!

I'd like to touch on a couple of differences between Ignite and Spark, but I
am sure other people will add their views as well.

 - The main difference is, of course, that Ignite is an in-memory computing
   system, i.e. one that treats RAM as the primary storage facility, whereas
   others - Spark included - only use RAM for processing.

 - Ignite's MapReduce is fully compatible with the Hadoop MR APIs, which lets
   everyone simply reuse existing legacy MR code yet run it with a >30x
   performance improvement.

 - Also, unlike Spark's, the streaming in Ignite isn't quantized by the size
   of an RDD. In other words, you don't need to form an RDD first before
   processing it; you can actually do real streaming.

 - Unlike Spark, Ignite doesn't have the issue with data spillovers to disk
   (which Tachyon was an attempt to address).

 - As one of its components, Ignite provides a first-class file-system
   caching layer. Note, there's a Tachyon project and I have
   already addressed the differences between that and Ignite in [1], but it
   looks like my post got deleted for some reason. I wonder why? ;) [2]

 - Ignite uses off-heap memory to avoid GC pauses, etc., and does it highly
   efficiently.

 - Ignite guarantees strong consistency.

 - Ignite supports full SQL99 as one of the ways to process the data, with
   full support for ACID transactions (as you have pointed out).

 - With Ignite, a Java programmer doesn't have to learn the new ropes of
   Scala. And I will withhold my professional opinion about the latter in
   order to keep this thread polite and concise ;)

I could keep on rambling for a long time, but you might consider reading [3] and
[4], where Nikita Ivanov - one of the founders of this project - has a good
reflection on the key differences.

[1] http://bit.ly/1JvTAB6
[2] https://twitter.com/c0sin/status/592825217606688768
[3] http://www.infoq.com/articles/gridgain-apache-ignite
[4] http://www.odbms.org/blog/2015/02/interview-nikita-ivanov/

Hope it helps to clarify the differences a bit.
  Cos
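
The streaming point in the list above (handling each event as it arrives
versus first collecting a batch) may be easier to see with a toy sketch. The
code below is plain Java and uses no Ignite or Spark API; the class and method
names are hypothetical, and the fixed-count batch is an assumption made only
to keep the example deterministic (micro-batch systems usually cut batches by
time interval).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: contrasts per-event and micro-batch processing.
public class StreamingVsBatch {

    // Per-event processing: a running sum is emitted as soon as each event arrives.
    public static List<Integer> perEvent(List<Integer> events) {
        List<Integer> emitted = new ArrayList<>();
        int runningSum = 0;
        for (int e : events) {
            runningSum += e;
            emitted.add(runningSum);
        }
        return emitted;
    }

    // Micro-batch processing: events are buffered and a running sum is emitted
    // only once per batch. batchSize must be positive.
    public static List<Integer> microBatch(List<Integer> events, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<Integer> emitted = new ArrayList<>();
        List<Integer> buffer = new ArrayList<>();
        int runningSum = 0;
        for (int e : events) {
            buffer.add(e);
            if (buffer.size() == batchSize) {
                for (int b : buffer) runningSum += b;
                emitted.add(runningSum);
                buffer.clear();
            }
        }
        // Flush the trailing partial batch, if any.
        if (!buffer.isEmpty()) {
            for (int b : buffer) runningSum += b;
            emitted.add(runningSum);
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<Integer> events = List.of(1, 2, 3, 4, 5);
        System.out.println(perEvent(events));      // one result per event
        System.out.println(microBatch(events, 2)); // one result per batch
    }
}
```

With five events, the per-event version produces five results, while the
batched version with a batch size of 2 produces only three, each available
only after its batch has filled.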
