Posted to common-user@hadoop.apache.org by "Adaryl \"Bob\" Wakefield, MBA" <ad...@hotmail.com> on 2014/07/01 17:00:17 UTC

The future of MapReduce

“The Mahout community decided to move its codebase onto modern data processing systems that offer a richer programming model and more efficient execution than Hadoop MapReduce.”

Does this mean that learning MapReduce is a waste of time? Is Storm the future or are both technologies necessary?

B.

Re: The future of MapReduce

Posted by Peyman Mohajerian <mo...@gmail.com>.

This statement is inaccurate. Not all machine learning involves iterative
computation, and not all datasets fit in memory. I'm not an expert in
Machine Learning, but I know enough to know that talking about it in some
generic sense, from the standpoint of Spark vs. Mahout or R vs. Python, makes
no sense. Many Machine Learning algorithms involve creating models from
massive amounts of data, and in no context would it make sense to do that
in memory.
Also, people do map/reduce in memory; Shahab elaborated on that nicely later
in the same thread.

On Tue, Jul 1, 2014 at 2:17 PM, kartik saxena <ka...@gmail.com> wrote:

> Spark https://spark.apache.org/ is also getting a lot of attention with its
> in-memory computations and caching features. Performance-wise it is being
> touted as better than Mahout because machine learning involves iterative
> computations and Spark can cache these computations in memory for faster
> processing.
>
>
> On Tue, Jul 1, 2014 at 11:07 AM, Adaryl "Bob" Wakefield, MBA <
> adaryl.wakefield@hotmail.com> wrote:
>
>>   From your answer, it sounds like you need to be able to do both.
>>
>>  *From:* Marco Shaw <ma...@gmail.com>
>> *Sent:* Tuesday, July 01, 2014 10:24 AM
>> *To:* user <us...@hadoop.apache.org>
>> *Subject:* Re: The future of MapReduce
>>
>>  It depends...  It seems most are evolving from needing "lots of data
>> crunched", to "lots of data crunched right now".  Most are looking for
>> *real-time* fraud detection or recommendations, for example, which
>> MapReduce is not ideal for.
>>
>> Marco
>>
>>
>> On Tue, Jul 1, 2014 at 12:00 PM, Adaryl "Bob" Wakefield, MBA <
>> adaryl.wakefield@hotmail.com> wrote:
>>
>>>   “The Mahout community decided to move its codebase onto modern data
>>> processing systems that offer a richer programming model and more efficient
>>> execution than Hadoop MapReduce.”
>>>
>>> Does this mean that learning MapReduce is a waste of time? Is Storm the
>>> future or are both technologies necessary?
>>>
>>> B.
>>>
>>
>>
>
>

Re: The future of MapReduce

Posted by kartik saxena <ka...@gmail.com>.

Spark https://spark.apache.org/ is also getting a lot of attention with its
in-memory computations and caching features. Performance-wise it is being
touted as better than Mahout because machine learning involves iterative
computations and Spark can cache these computations in memory for faster
processing.
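
To make the caching argument concrete, here is a minimal sketch in plain
Python. It is not Spark code and does not use any Spark API; the names
CountingSource, fit_mean and cached are hypothetical and exist only for this
illustration. An iterative algorithm touches the same data on every pass, so
reading it once and keeping it in memory avoids repeated loads.

```python
class CountingSource:
    """Hypothetical stand-in for slow storage; counts how often it is read."""

    def __init__(self, values):
        self._values = list(values)
        self.reads = 0

    def load(self):
        self.reads += 1
        return list(self._values)


def fit_mean(load, iterations=20, step=0.5):
    """Gradient descent toward the mean; calls load() once per iteration."""
    estimate = 0.0
    for _ in range(iterations):
        data = load()
        gradient = sum(estimate - x for x in data) / len(data)
        estimate -= step * gradient
    return estimate


def cached(load):
    """Wrap a loader so the data is read once and then served from memory."""
    store = {}

    def wrapper():
        if "data" not in store:
            store["data"] = load()
        return store["data"]

    return wrapper


uncached_source = CountingSource([2.0, 4.0, 6.0])
cached_source = CountingSource([2.0, 4.0, 6.0])

uncached_result = fit_mean(uncached_source.load)
cached_result = fit_mean(cached(cached_source.load))

# Same answer either way; only the number of reads differs.
print(uncached_source.reads, cached_source.reads)
```

Both runs produce the same estimate, but the uncached run goes back to the
source on every iteration. This only helps when the working set fits in
memory, which is not a given for every workload.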

On Tue, Jul 1, 2014 at 11:07 AM, Adaryl "Bob" Wakefield, MBA <
adaryl.wakefield@hotmail.com> wrote:

>   From your answer, it sounds like you need to be able to do both.
>
>  *From:* Marco Shaw <ma...@gmail.com>
> *Sent:* Tuesday, July 01, 2014 10:24 AM
> *To:* user <us...@hadoop.apache.org>
> *Subject:* Re: The future of MapReduce
>
>  It depends...  It seems most are evolving from needing "lots of data
> crunched", to "lots of data crunched right now".  Most are looking for
> *real-time* fraud detection or recommendations, for example, which
> MapReduce is not ideal for.
>
> Marco
>
>
> On Tue, Jul 1, 2014 at 12:00 PM, Adaryl "Bob" Wakefield, MBA <
> adaryl.wakefield@hotmail.com> wrote:
>
>>   “The Mahout community decided to move its codebase onto modern data
>> processing systems that offer a richer programming model and more efficient
>> execution than Hadoop MapReduce.”
>>
>> Does this mean that learning MapReduce is a waste of time? Is Storm the
>> future or are both technologies necessary?
>>
>> B.
>>
>
>

Re: The future of MapReduce

Posted by Shahab Yunus <sh...@gmail.com>.

My personal thoughts on this.

I approach this problem in a different way. Map/Reduce is not a framework
or a technology. It is a paradigm for distributed and parallel processing
which can be implemented in different frameworks and styles. So given that,
I don't think there is any harm as such in learning this paradigm, as it
does not bind you to any specific framework or tool. What I mean to say is
that it is a general concept. You can pick any implementation and explore.
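
As a rough illustration of the paradigm without any framework, here is a
minimal single-process sketch in plain Python. The function names
(map_phase, shuffle, reduce_phase, word_count) are made up for this example
and are not part of Hadoop or any other library; a real implementation would
distribute each phase across machines.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Emit one (word, 1) pair per word in a single input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework does between the phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of strings."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["the quick fox", "the lazy dog", "The end"]))
```

The map and reduce functions know nothing about where the data lives or how
it is grouped, which is the part of the idea that carries over from one
implementation to another.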

Continuing on that, given the major breakthrough it was in the Big Data
world, even if it is eventually phased out, understanding it and looking
into it is still very beneficial. It can help in building a better
understanding of this relatively new field of Big Data and a solid
foundation. Given its seminal nature and scope, I would not consider it a
waste of time.

Lastly, an even more personal way of looking at this problem: if one
approaches M/R as a generic concept rather than a tool, then it is not
that difficult or time-consuming to learn or understand.

On the point of whether Google is the *only* one dealing with seriously
large amounts of data, I would say not only that Facebook will catch up
pretty soon, but also that one should take a look at this interview with
Chris Mattmann from NASA :)

"...on Big Data Infrastructure for Scientific Data Processing"
http://www.infoq.com/interviews/mattmann-science-data

Regards,
Shahab


On Tue, Jul 1, 2014 at 5:43 PM, snehil wakchaure <sn...@gmail.com> wrote:

> Heard about Google dataflow from last week
> On Jul 1, 2014 4:42 PM, "Marco Shaw" <ma...@gmail.com> wrote:
>
>> Interesting timing:
>> http://java.dzone.com/articles/there-future-mapreduce
>>
>> Google declared last week that "MapReduce was dead" more or less, but
>> there are very few that process data at Google's level.
>>
>> Makes me wonder what Yahoo has for a tech mix these days...
>>
>>
>>
>> On Tue, Jul 1, 2014 at 6:01 PM, Adaryl "Bob" Wakefield, MBA <
>> adaryl.wakefield@hotmail.com> wrote:
>>
>>>   It was a declarative statement designed to elicit further explanation.
>>>
>>> If someone is brand new and trying to figure out how to eat the elephant
>>> as it were, you kind of want to burn things down to their essentials. If
>>> MapReduce isn’t going to be part of the ecosystem in the future, one does
>>> not want to spend hours learning how to write MapReduce jobs.
>>>
>>> B.
>>>
>>>  *From:* Marco Shaw <ma...@gmail.com>
>>> *Sent:* Tuesday, July 01, 2014 3:50 PM
>>> *To:* user <us...@hadoop.apache.org>
>>> *Subject:* Re: The future of MapReduce
>>>
>>>  Sorry, not sure if that's a question.
>>>
>>> Hadoop v1=HDFS+MapReduce
>>> Hadoop v2=HDFS+YARN (+ MapReduce part of the core, but now considered
>>> optional to "get work done")
>>>
>>> v2 adds a better resourcing framework.  Now you can run Storm, Spark,
>>> MapReduce, etc. on Hadoop and mix-and-match jobs/tasks with whatever your
>>> requirements, which may actually be both batch "stuff" and/or real-time.
>>>
>>> Not sure if that clarifies things...  Just like you can evaluate all
>>> kinds of Apache ecosystem products to meet your needs, MapReduce is no
>>> longer the only kid on the block.
>>>
>>>
>>> On Tue, Jul 1, 2014 at 3:07 PM, Adaryl "Bob" Wakefield, MBA <
>>> adaryl.wakefield@hotmail.com> wrote:
>>>
>>>>   From your answer, it sounds like you need to be able to do both.
>>>>
>>>>  *From:* Marco Shaw <ma...@gmail.com>
>>>> *Sent:* Tuesday, July 01, 2014 10:24 AM
>>>> *To:* user <us...@hadoop.apache.org>
>>>> *Subject:* Re: The future of MapReduce
>>>>
>>>>  It depends...  It seems most are evolving from needing "lots of data
>>>> crunched", to "lots of data crunched right now".  Most are looking for
>>>> *real-time* fraud detection or recommendations, for example, which
>>>> MapReduce is not ideal for.
>>>>
>>>> Marco
>>>>
>>>>
>>>> On Tue, Jul 1, 2014 at 12:00 PM, Adaryl "Bob" Wakefield, MBA <
>>>> adaryl.wakefield@hotmail.com> wrote:
>>>>
>>>>>   “The Mahout community decided to move its codebase onto modern data
>>>>> processing systems that offer a richer programming model and more efficient
>>>>> execution than Hadoop MapReduce.”
>>>>>
>>>>> Does this mean that learning MapReduce is a waste of time? Is Storm
>>>>> the future or are both technologies necessary?
>>>>>
>>>>> B.
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>

Re: The future of MapReduce

Posted by snehil wakchaure <sn...@gmail.com>.

Heard about Google Dataflow last week.
On Jul 1, 2014 4:42 PM, "Marco Shaw" <ma...@gmail.com> wrote:

> Interesting timing:
> http://java.dzone.com/articles/there-future-mapreduce
>
> Google declared last week that "MapReduce was dead" more or less, but
> there are very few that process data at Google's level.
>
> Makes me wonder what Yahoo has for a tech mix these days...
>
>
>
> On Tue, Jul 1, 2014 at 6:01 PM, Adaryl "Bob" Wakefield, MBA <
> adaryl.wakefield@hotmail.com> wrote:
>
>>   It was a declarative statement designed to elicit further explanation.
>>
>> If someone is brand new and trying to figure out how to eat the elephant
>> as it were, you kind of want to burn things down to their essentials. If
>> MapReduce isn’t going to be part of the ecosystem in the future, one does
>> not want to spend hours learning how to write MapReduce jobs.
>>
>> B.
>>
>>  *From:* Marco Shaw <ma...@gmail.com>
>> *Sent:* Tuesday, July 01, 2014 3:50 PM
>> *To:* user <us...@hadoop.apache.org>
>> *Subject:* Re: The future of MapReduce
>>
>>  Sorry, not sure if that's a question.
>>
>> Hadoop v1=HDFS+MapReduce
>> Hadoop v2=HDFS+YARN (+ MapReduce part of the core, but now considered
>> optional to "get work done")
>>
>> v2 adds a better resourcing framework.  Now you can run Storm, Spark,
>> MapReduce, etc. on Hadoop and mix-and-match jobs/tasks with whatever your
>> requirements, which may actually be both batch "stuff" and/or real-time.
>>
>> Not sure if that clarifies things...  Just like you can evaluate all
>> kinds of Apache ecosystem products to meet your needs, MapReduce is no
>> longer the only kid on the block.
>>
>>
>> On Tue, Jul 1, 2014 at 3:07 PM, Adaryl "Bob" Wakefield, MBA <
>> adaryl.wakefield@hotmail.com> wrote:
>>
>>>   From your answer, it sounds like you need to be able to do both.
>>>
>>>  *From:* Marco Shaw <ma...@gmail.com>
>>> *Sent:* Tuesday, July 01, 2014 10:24 AM
>>> *To:* user <us...@hadoop.apache.org>
>>> *Subject:* Re: The future of MapReduce
>>>
>>>  It depends...  It seems most are evolving from needing "lots of data
>>> crunched", to "lots of data crunched right now".  Most are looking for
>>> *real-time* fraud detection or recommendations, for example, which
>>> MapReduce is not ideal for.
>>>
>>> Marco
>>>
>>>
>>> On Tue, Jul 1, 2014 at 12:00 PM, Adaryl "Bob" Wakefield, MBA <
>>> adaryl.wakefield@hotmail.com> wrote:
>>>
>>>>   “The Mahout community decided to move its codebase onto modern data
>>>> processing systems that offer a richer programming model and more efficient
>>>> execution than Hadoop MapReduce.”
>>>>
>>>> Does this mean that learning MapReduce is a waste of time? Is Storm the
>>>> future or are both technologies necessary?
>>>>
>>>> B.
>>>>
>>>
>>>
>>
>>
>
>

Re: The future of MapReduce

Posted by snehil wakchaure <sn...@gmail.com>.

Heard about Google dataflow from last week
On Jul 1, 2014 4:42 PM, "Marco Shaw" <ma...@gmail.com> wrote:

> Interesting timing:
> http://java.dzone.com/articles/there-future-mapreduce
>
> Google declared last week that "MapReduce was dead" more or less, but
> there are very few that process data at Google's level.
>
> Makes me wonder what Yahoo has for a tech mix these days...
>
>
>
> On Tue, Jul 1, 2014 at 6:01 PM, Adaryl "Bob" Wakefield, MBA <
> adaryl.wakefield@hotmail.com> wrote:
>
>>   It was a declarative statement designed to elicit further explanation.
>>
>> If someone is brand new and trying to figure out how to eat the elephant
>> as it were, you kind of want to burn things down to their essentials. If
>> MapReduce isn’t going to be part of the ecosystem in the future, one does
>> not want to spend hours learning how to write MapReduce jobs.
>>
>> B.
>>
>>  *From:* Marco Shaw <ma...@gmail.com>
>> *Sent:* Tuesday, July 01, 2014 3:50 PM
>> *To:* user <us...@hadoop.apache.org>
>> *Subject:* Re: The future of MapReduce
>>
>>  Sorry, not sure if that's a question.
>>
>> Hadoop v1=HDFS+MapReduce
>> Hadoop v2=HDFS+YARN (+ MapReduce part of the core, but now considered
>> optional to "get work done")
>>
>> v2 adds a better resourcing framework.  Now you can run Storm, Spark,
>> MapReduce, etc. on Hadoop and mix-and-match jobs/tasks to suit whatever your
>> requirements are, which may actually be both batch "stuff" and/or real-time.
>>
>> Not sure if that clarifies things...  Just like you can evaluate all
>> kinds of Apache ecosystem products to meet your needs, MapReduce is no
>> longer the only kid on the block.
>>
>>
>> On Tue, Jul 1, 2014 at 3:07 PM, Adaryl "Bob" Wakefield, MBA <
>> adaryl.wakefield@hotmail.com> wrote:
>>
>>>   From your answer, it sounds like you need to be able to do both.
>>>
>>>  *From:* Marco Shaw <ma...@gmail.com>
>>> *Sent:* Tuesday, July 01, 2014 10:24 AM
>>> *To:* user <us...@hadoop.apache.org>
>>> *Subject:* Re: The future of MapReduce
>>>
>>>  It depends...  It seems most are evolving from needing "lots of data
>>> crunched", to "lots of data crunched right now".  Most are looking for
>>> *real-time* fraud detection or recommendations, for example, which
>>> MapReduce is not ideal for.
>>>
>>> Marco
>>>
>>>
>>> On Tue, Jul 1, 2014 at 12:00 PM, Adaryl "Bob" Wakefield, MBA <
>>> adaryl.wakefield@hotmail.com> wrote:
>>>
>>>>   “The Mahout community decided to move its codebase onto modern data
>>>> processing systems that offer a richer programming model and more efficient
>>>> execution than Hadoop MapReduce.”
>>>>
>>>> Does this mean that learning MapReduce is a waste of time? Is Storm the
>>>> future or are both technologies necessary?
>>>>
>>>> B.
>>>>
>>>
>>>
>>
>>
>
>

Re: The future of MapReduce

Posted by Marco Shaw <ma...@gmail.com>.

Interesting timing:
http://java.dzone.com/articles/there-future-mapreduce

Google declared last week that "MapReduce was dead" more or less, but there
are very few that process data at Google's level.

Makes me wonder what Yahoo has for a tech mix these days...



On Tue, Jul 1, 2014 at 6:01 PM, Adaryl "Bob" Wakefield, MBA <
adaryl.wakefield@hotmail.com> wrote:

>   It was a declarative statement designed to elicit further explanation.
>
> If someone is brand new and trying to figure out how to eat the elephant
> as it were, you kind of want to boil things down to their essentials. If
> MapReduce isn’t going to be part of the ecosystem in the future, one does
> not want to spend hours learning how to write MapReduce jobs.
>
> B.
>
>  *From:* Marco Shaw <ma...@gmail.com>
> *Sent:* Tuesday, July 01, 2014 3:50 PM
> *To:* user <us...@hadoop.apache.org>
> *Subject:* Re: The future of MapReduce
>
>  Sorry, not sure if that's a question.
>
> Hadoop v1=HDFS+MapReduce
> Hadoop v2=HDFS+YARN (+ MapReduce part of the core, but now considered
> optional to "get work done")
>
> v2 adds a better resourcing framework.  Now you can run Storm, Spark,
> MapReduce, etc. on Hadoop and mix-and-match jobs/tasks to suit whatever your
> requirements are, which may actually be both batch "stuff" and/or real-time.
>
> Not sure if that clarifies things...  Just like you can evaluate all kinds
> of Apache ecosystem products to meet your needs, MapReduce is no longer
> the only kid on the block.
>
>
> On Tue, Jul 1, 2014 at 3:07 PM, Adaryl "Bob" Wakefield, MBA <
> adaryl.wakefield@hotmail.com> wrote:
>
>>   From your answer, it sounds like you need to be able to do both.
>>
>>  *From:* Marco Shaw <ma...@gmail.com>
>> *Sent:* Tuesday, July 01, 2014 10:24 AM
>> *To:* user <us...@hadoop.apache.org>
>> *Subject:* Re: The future of MapReduce
>>
>>  It depends...  It seems most are evolving from needing "lots of data
>> crunched", to "lots of data crunched right now".  Most are looking for
>> *real-time* fraud detection or recommendations, for example, which
>> MapReduce is not ideal for.
>>
>> Marco
>>
>>
>> On Tue, Jul 1, 2014 at 12:00 PM, Adaryl "Bob" Wakefield, MBA <
>> adaryl.wakefield@hotmail.com> wrote:
>>
>>>   “The Mahout community decided to move its codebase onto modern data
>>> processing systems that offer a richer programming model and more efficient
>>> execution than Hadoop MapReduce.”
>>>
>>> Does this mean that learning MapReduce is a waste of time? Is Storm the
>>> future or are both technologies necessary?
>>>
>>> B.
>>>
>>
>>
>
>

Re: The future of MapReduce

Posted by "Adaryl \"Bob\" Wakefield, MBA" <ad...@hotmail.com>.

It was a declarative statement designed to elicit further explanation.

If someone is brand new and trying to figure out how to eat the elephant as it were, you kind of want to boil things down to their essentials. If MapReduce isn’t going to be part of the ecosystem in the future, one does not want to spend hours learning how to write MapReduce jobs.

B.

From: Marco Shaw 
Sent: Tuesday, July 01, 2014 3:50 PM
To: user 
Subject: Re: The future of MapReduce

Sorry, not sure if that's a question.

Hadoop v1=HDFS+MapReduce
Hadoop v2=HDFS+YARN (+ MapReduce part of the core, but now considered optional to "get work done")

v2 adds a better resourcing framework.  Now you can run Storm, Spark, MapReduce, etc. on Hadoop and mix-and-match jobs/tasks to suit whatever your requirements are, which may actually be both batch "stuff" and/or real-time.

Not sure if that clarifies things...  Just like you can evaluate all kinds of Apache ecosystem products to meet your needs, MapReduce is no longer the only kid on the block.

On Tue, Jul 1, 2014 at 3:07 PM, Adaryl "Bob" Wakefield, MBA <ad...@hotmail.com> wrote:

  From your answer, it sounds like you need to be able to do both.

  From: Marco Shaw 
  Sent: Tuesday, July 01, 2014 10:24 AM
  To: user 
  Subject: Re: The future of MapReduce

  It depends...  It seems most are evolving from needing "lots of data crunched", to "lots of data crunched right now".  Most are looking for *real-time* fraud detection or recommendations, for example, which MapReduce is not ideal for.

  Marco

  On Tue, Jul 1, 2014 at 12:00 PM, Adaryl "Bob" Wakefield, MBA <ad...@hotmail.com> wrote:

    “The Mahout community decided to move its codebase onto modern data processing systems that offer a richer programming model and more efficient execution than Hadoop MapReduce.”

    Does this mean that learning MapReduce is a waste of time? Is Storm the future or are both technologies necessary?

    B.

Re: The future of MapReduce

Posted by Marco Shaw <ma...@gmail.com>.

Sorry, not sure if that's a question.

Hadoop v1=HDFS+MapReduce
Hadoop v2=HDFS+YARN (+ MapReduce part of the core, but now considered
optional to "get work done")

v2 adds a better resourcing framework.  Now you can run Storm, Spark,
MapReduce, etc. on Hadoop and mix-and-match jobs/tasks to suit whatever your
requirements are, which may actually be both batch "stuff" and/or real-time.

Not sure if that clarifies things...  Just like you can evaluate all kinds
of Apache ecosystem products to meet your needs, MapReduce is no longer
the only kid on the block.
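
For anyone new to the batch model under discussion, here is a minimal
sketch of the map / shuffle / reduce phases as a word count in plain
Python. It is not Hadoop's API: the function names are invented for
illustration, and a real job would spread each phase across the cluster
rather than run it in one process.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all the counts for one word into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big elephant"]))
```

The same three-step shape (emit key/value pairs, group by key, aggregate
per key) carries over to the newer engines mentioned in this thread, so
time spent understanding it is unlikely to be wasted.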

On Tue, Jul 1, 2014 at 3:07 PM, Adaryl "Bob" Wakefield, MBA <
adaryl.wakefield@hotmail.com> wrote:

>   From your answer, it sounds like you need to be able to do both.
>
>  *From:* Marco Shaw <ma...@gmail.com>
> *Sent:* Tuesday, July 01, 2014 10:24 AM
> *To:* user <us...@hadoop.apache.org>
> *Subject:* Re: The future of MapReduce
>
>  It depends...  It seems most are evolving from needing "lots of data
> crunched", to "lots of data crunched right now".  Most are looking for
> *real-time* fraud detection or recommendations, for example, which
> MapReduce is not ideal for.
>
> Marco
>
>
> On Tue, Jul 1, 2014 at 12:00 PM, Adaryl "Bob" Wakefield, MBA <
> adaryl.wakefield@hotmail.com> wrote:
>
>>   “The Mahout community decided to move its codebase onto modern data
>> processing systems that offer a richer programming model and more efficient
>> execution than Hadoop MapReduce.”
>>
>> Does this mean that learning MapReduce is a waste of time? Is Storm the
>> future or are both technologies necessary?
>>
>> B.
>>
>
>

Re: The future of MapReduce

Posted by kartik saxena <ka...@gmail.com>.

Spark https://spark.apache.org/ is also getting a lot of attention with its
in-memory computations and caching features. Performance-wise it is being
touted as better than Mahout because machine learning involves iterative
computations and Spark can cache these computations in memory for faster
processing.
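
A rough illustration of the caching point, in plain Python rather than
the Spark API. `CountingSource` and `fit_mean` are made-up names for this
sketch, and the "model" is just a running estimate of the mean. The only
thing it tries to show is that an iterative job which re-reads its input
on every pass pays the read cost each time, while one that keeps the data
in memory pays it once.

```python
class CountingSource:
    """Hypothetical stand-in for slow storage; counts how often it is read."""

    def __init__(self, rows):
        self.rows = rows
        self.reads = 0

    def read(self):
        self.reads += 1
        return list(self.rows)


def fit_mean(source, iterations, cache):
    """Iteratively move an estimate toward the mean of the data.

    With cache=False the input is re-read on every iteration, as a chain
    of independent batch jobs would do. With cache=True it is read once
    and reused from memory.
    """
    cached = source.read() if cache else None
    estimate = 0.0
    for _ in range(iterations):
        data = cached if cache else source.read()
        mean = sum(data) / len(data)
        estimate += 0.5 * (mean - estimate)  # step halfway toward the mean
    return estimate


if __name__ == "__main__":
    uncached = CountingSource([1.0, 2.0, 3.0, 4.0])
    cached = CountingSource([1.0, 2.0, 3.0, 4.0])
    fit_mean(uncached, iterations=10, cache=False)
    fit_mean(cached, iterations=10, cache=True)
    print("reads without cache:", uncached.reads)
    print("reads with cache:", cached.reads)
```

Both variants compute the same result; only the number of reads differs.
Whether that matters in practice depends on whether the working set
actually fits in memory.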

On Tue, Jul 1, 2014 at 11:07 AM, Adaryl "Bob" Wakefield, MBA <
adaryl.wakefield@hotmail.com> wrote:

>   From your answer, it sounds like you need to be able to do both.
>
>  *From:* Marco Shaw <ma...@gmail.com>
> *Sent:* Tuesday, July 01, 2014 10:24 AM
> *To:* user <us...@hadoop.apache.org>
> *Subject:* Re: The future of MapReduce
>
>  It depends...  It seems most are evolving from needing "lots of data
> crunched", to "lots of data crunched right now".  Most are looking for
> *real-time* fraud detection or recommendations, for example, which
> MapReduce is not ideal for.
>
> Marco
>
>
> On Tue, Jul 1, 2014 at 12:00 PM, Adaryl "Bob" Wakefield, MBA <
> adaryl.wakefield@hotmail.com> wrote:
>
>>   “The Mahout community decided to move its codebase onto modern data
>> processing systems that offer a richer programming model and more efficient
>> execution than Hadoop MapReduce.”
>>
>> Does this mean that learning MapReduce is a waste of time? Is Storm the
>> future or are both technologies necessary?
>>
>> B.
>>
>
>

Re: The future of MapReduce

Posted by kartik saxena <ka...@gmail.com>.

Spark https://spark.apache.org/ is also getting a lot of attention with its
in-memory computations and caching features. Performance-wise it is being
touted as better than Mahout because machine learning involves iterative
computations and Spark could cache these computations in-memory for faster
processing.
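
To illustrate the caching point, here is a minimal sketch in plain Python. It
is not Spark or Mahout code: make_loader and iterate are hypothetical names
made up for this example, and the four-element list stands in for a dataset
that would normally be read from disk or HDFS. It assumes the data fit in
memory, which will not hold for every workload. The sketch only counts how
often the "expensive" load runs when an iterative algorithm re-reads its input
on every pass versus when it reads once and reuses the result.

```python
# Toy illustration only: plain Python, not the Spark or Mahout API.
# make_loader and iterate are hypothetical names for this sketch.


def make_loader():
    """Return a loader function plus a counter of how often it was called."""
    calls = {"n": 0}

    def load():
        # Stand-in for an expensive read from disk or HDFS.
        calls["n"] += 1
        return [1.0, 2.0, 3.0, 4.0]

    return load, calls


def iterate(load, n_iter, cache):
    """Estimate the mean of the data with a simple iterative update.

    With cache=False the data are reloaded on every pass; with cache=True
    they are loaded once and reused, assuming they fit in memory.
    """
    cached = load() if cache else None
    estimate = 0.0
    for _ in range(n_iter):
        data = cached if cache else load()
        # Gradient of the mean squared error with respect to the estimate
        # (up to a constant factor), followed by a fixed step of 0.5.
        gradient = sum(estimate - x for x in data) / len(data)
        estimate -= 0.5 * gradient
    return estimate


if __name__ == "__main__":
    load, calls = make_loader()
    print(iterate(load, 10, cache=False), "loads:", calls["n"])

    load, calls = make_loader()
    print(iterate(load, 10, cache=True), "loads:", calls["n"])
```

Both runs compute the same estimate; the only difference is the number of
loads (one per iteration versus one in total). Whether that saving matters in
practice depends on the algorithm being iterative and on the data fitting in
memory.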

On Tue, Jul 1, 2014 at 11:07 AM, Adaryl "Bob" Wakefield, MBA <
adaryl.wakefield@hotmail.com> wrote:

>   From your answer, it sounds like you need to be able to do both.
>
>  *From:* Marco Shaw <ma...@gmail.com>
> *Sent:* Tuesday, July 01, 2014 10:24 AM
> *To:* user <us...@hadoop.apache.org>
> *Subject:* Re: The future of MapReduce
>
>  It depends...  It seems most are evolving from needing "lots of data
> crunched", to "lots of data crunched right now".  Most are looking for
> *real-time* fraud detection or recommendations, for example, which
> MapReduce is not ideal for.
>
> Marco
>
>
> On Tue, Jul 1, 2014 at 12:00 PM, Adaryl "Bob" Wakefield, MBA <
> adaryl.wakefield@hotmail.com> wrote:
>
>>   “The Mahout community decided to move its codebase onto modern data
>> processing systems that offer a richer programming model and more efficient
>> execution than Hadoop MapReduce.”
>>
>> Does this mean that learning MapReduce is a waste of time? Is Storm the
>> future or are both technologies necessary?
>>
>> B.
>>
>
>

Re: The future of MapReduce

Posted by "Adaryl \"Bob\" Wakefield, MBA" <ad...@hotmail.com>.

From your answer, it sounds like you need to be able to do both.

From: Marco Shaw 
Sent: Tuesday, July 01, 2014 10:24 AM
To: user 
Subject: Re: The future of MapReduce

It depends...  It seems most are evolving from needing "lots of data crunched", to "lots of data crunched right now".  Most are looking for *real-time* fraud detection or recommendations, for example, which MapReduce is not ideal for.

Marco

On Tue, Jul 1, 2014 at 12:00 PM, Adaryl "Bob" Wakefield, MBA <ad...@hotmail.com> wrote:

  “The Mahout community decided to move its codebase onto modern data processing systems that offer a richer programming model and more efficient execution than Hadoop MapReduce.”

  Does this mean that learning MapReduce is a waste of time? Is Storm the future or are both technologies necessary?

  B.

Re: The future of MapReduce

Posted by Marco Shaw <ma...@gmail.com>.

It depends...  It seems most are evolving from needing "lots of data
crunched", to "lots of data crunched right now".  Most are looking for
*real-time* fraud detection or recommendations, for example, which
MapReduce is not ideal for.

Marco

On Tue, Jul 1, 2014 at 12:00 PM, Adaryl "Bob" Wakefield, MBA <
adaryl.wakefield@hotmail.com> wrote:

>   “The Mahout community decided to move its codebase onto modern data
> processing systems that offer a richer programming model and more efficient
> execution than Hadoop MapReduce.”
>
> Does this mean that learning MapReduce is a waste of time? Is Storm the
> future or are both technologies necessary?
>
> B.
>
