Posted to dev@hama.apache.org by Tommaso Teofili <to...@gmail.com> on 2014/01/02 15:55:39 UTC

Re: Website Update

Hi all,

I just noticed that the graph on our homepage [0] looks very similar to the
one on the Spark homepage [1], so I wonder if we could at least make it a bit
clearer, either by writing the benchmark results in a table near it (it
seems Hama always takes ~0) or by something else I cannot think of right now.

The reference to the benchmarks wiki page is ok, but I cannot find the entry
for the comparison with Mahout; maybe I'm missing something ...
Regards and happy new year everyone.
Tommaso

[0] : http://hama.apache.org/images/mahout_vs_hama.png
[1] : http://spark.incubator.apache.org/images/spark-lr.png


2013/12/20 Edward J. Yoon <ed...@apache.org>

> Hi all,
>
> I published the new website for our community. If you have other ideas,
> please feel free to share your comments or file a JIRA ticket.
>
>
> On Fri, Dec 20, 2013 at 11:29 AM, Edward J. Yoon <edwardyoon@apache.org> wrote:
>
> > Thanks, I will.
> >
> > Sent from my iPhone
> >
> > > On 2013. 12. 19., at 10:41 PM, Yexi Jiang <ye...@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > It looks nice. Is it possible to add more description to the figure?
> > > When people first see it, they may not know what the x axis is (the
> > > number of cores or the number of groom servers?). Moreover, it would be
> > > better to tell the reader some specs of the dataset used.
> > >
> > > Regards,
> > > Yexi
> > >
> > >
> > > 2013/12/19 Edward J. Yoon <ed...@apache.org>
> > >
> > >> Thank you so much!
> > >>
> > >> Sent from my iPhone
> > >>
> > >>> On 2013. 12. 19., at 4:45 PM, Tommaso Teofili <tommaso.teofili@gmail.com> wrote:
> > >>>
> > >>> Hi Edward,
> > >>>
> > >>> I think it generally looks better than the current one; I would just
> > >>> change this:
> > >>>
> > >>> Many data analysis techniques such as machine learning and graph
> > >>> algorithms require iterative computations but MapReduce model doesn't
> > >>> fit for these iterative data analysis applications. To run these
> > >>> iterative data analysis applications more efficiently, Hama offers pure
> > >>> Bulk Synchronous Parallel computing engine.
> > >>>
> > >>>
> > >>> to something like this:
> > >>>
> > >>> Many data analysis techniques such as machine learning and graph
> > >>> algorithms require iterative computations; this is where the Bulk
> > >>> Synchronous Parallel model can be more effective than "plain"
> > >>> MapReduce. Therefore, to run such iterative data analysis applications
> > >>> more efficiently, Hama offers a pure Bulk Synchronous Parallel
> > >>> computing engine.
> > >>>
> > >>>
> > >>> I wouldn't say MR is inherently bad for iterative computations,
> > >>> just that BSP can be a better / more performant alternative.
> > >>> My 2 cents,
> > >>> Tommaso
> > >>>
> > >>>
> > >>> 2013/12/19 Edward J. Yoon <ed...@apache.org>
> > >>>
> > >>>> Hi,
> > >>>>
> > >>>> I've made some changes to our website -
> > >>>> http://people.apache.org/~edwardyoon/site/ - Please review and give
> > >>>> feedback here.
> > >>>>
> > >>>> --
> > >>>> Best Regards, Edward J. Yoon
> > >>>> @eddieyoon
> > >
> > >
> > >
> > > --
> > > ------
> > > Yexi Jiang,
> > > ECS 251,  yjian004@cs.fiu.edu
> > > School of Computer and Information Science,
> > > Florida International University
> > > Homepage: http://users.cis.fiu.edu/~yjian004/
> >
>
>
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon
>

Re: Website Update

Posted by "Edward J. Yoon" <ed...@apache.org>.

I found another bug in the Graph package and the SemiClustering example.

If the setup() method is meant to be called only once at the start of the
program, the variables below should be declared static (purely class
variables). Otherwise, it won't work with DiskVerticesInfo or the
Serialization version.

{code}
  private int semiClusterMaximumVertexCount;
  private int graphJobMessageSentCount;
  private int graphJobVertexMaxClusterCount;

  @Override
  public void setup(HamaConfiguration conf) {
    semiClusterMaximumVertexCount = conf.getInt("semicluster.max.vertex.count", 10);
    graphJobMessageSentCount = conf.getInt("semicluster.max.message.sent.count", 10);
    graphJobVertexMaxClusterCount = conf.getInt("vertex.max.cluster.count", 10);
  }
{code}
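
To illustrate the point about instance versus static fields, here is a
minimal, self-contained sketch. It does not use the Hama API: the class
names (SetupOnceSketch, InstanceFieldVertex, StaticFieldVertex) and the
helper methods are hypothetical, and the "re-created" object only stands in
for a vertex that is built again later, for example after being read back
from disk. It assumes setup() runs once, on one instance only.

```java
public class SetupOnceSketch {

  // Hypothetical vertex that keeps the configured value in an instance field.
  static class InstanceFieldVertex {
    private int maxVertexCount; // per object: a new object starts again at 0

    void setup(int configuredValue) {
      maxVertexCount = configuredValue;
    }

    int maxVertexCount() {
      return maxVertexCount;
    }
  }

  // Hypothetical vertex that keeps the configured value in a static field.
  static class StaticFieldVertex {
    private static int maxVertexCount; // shared by all instances of the class

    void setup(int configuredValue) {
      maxVertexCount = configuredValue;
    }

    int maxVertexCount() {
      return maxVertexCount;
    }
  }

  // setup() is called once on the first object; a second object is then
  // created without setup(), as an assumed stand-in for a rebuilt vertex.
  static int instanceFieldAfterRecreate(int configured) {
    InstanceFieldVertex first = new InstanceFieldVertex();
    first.setup(configured);
    InstanceFieldVertex recreated = new InstanceFieldVertex();
    return recreated.maxVertexCount();
  }

  static int staticFieldAfterRecreate(int configured) {
    StaticFieldVertex first = new StaticFieldVertex();
    first.setup(configured);
    StaticFieldVertex recreated = new StaticFieldVertex();
    return recreated.maxVertexCount();
  }

  public static void main(String[] args) {
    System.out.println(instanceFieldAfterRecreate(10)); // prints 0
    System.out.println(staticFieldAfterRecreate(10));   // prints 10
  }
}
```

Note that a static field is shared by every instance in the same JVM, so
this approach only fits values that are identical for all vertices of a job.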




-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: Website Update

Posted by "Edward J. Yoon" <ed...@apache.org>.

Almost fixed.

1. http://markmail.org/thread/vv4z3gskfms6bhix
2. HAMA-821

The graph examples still have a memory issue.




-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: Website Update

Posted by Yexi Jiang <ye...@gmail.com>.

I intended to fix the bugs you mentioned, but you have already fixed them.





-- 
------
Yexi Jiang,
ECS 251,  yjian004@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/

Re: Website Update

Posted by "Edward J. Yoon" <ed...@apache.org>.

> Is there any details about the test results?

As described there, an input of 6M points and Oracle BDA were used. Hama
finishes KMeans in a few seconds. Mahout takes about 500 to 1000 secs.
See also http://lambda.uta.edu/mrql-bsp.pdf.

Do you need more?




-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: Website Update

Posted by Yexi Jiang <ye...@gmail.com>.

Are there any details about the test results?





-- 
------
Yexi Jiang,
ECS 251,  yjian004@cs.fiu.edu
School of Computer and Information Science,
Florida International University
Homepage: http://users.cis.fiu.edu/~yjian004/

Re: Website Update

Posted by "Edward J. Yoon" <ed...@apache.org>.

The input was too small :/ So I'll update the website with more reliable benchmarks soon.

https://twitter.com/tjungblut/status/414717432293363712

After I saw this tweet, I thought we needed to update the website. Many
people seem to think that Hama is a graph processing framework. If you
have any good ideas, please let me know.

P.S. I recently tested the Hama examples and 80% of them didn't work.
Let's fix them all in the 0.7 release and make the website clearer.





-- 
Best Regards, Edward J. Yoon
@eddieyoon