Posted to user@bigtop.apache.org by Jay Vyas <ja...@gmail.com> on 2013/09/19 23:19:13 UTC

"bigpetstore" another idea that i forgot to mention which might fit into bigtop.

Hey bigtop:

Another idea, which I have been toying with for some time, is implementing
the old Hibernate/iBATIS app "jpetstore" for Hadoop.

I think Bigtop might be a good template for this, but I'm not sure if it
should go in Bigtop itself, i.e. put an entire big data workflow into Bigtop
as an example/template to help people better comprehend how MapReduce ETL
plays with ad hoc analytics (Hive/Pig), and how machine learning (Mahout
etc.) finally interacts with end sinks (HBase).

I'm not sure if this is in the scope of Bigtop, but I think that for people
getting into the Hadoop ecosystem and using Bigtop as a venue to do so, an
example app of this sort might be particularly useful.

Apologies if this is outside the scope of Bigtop, but let me know!

-- 
Jay Vyas
http://jayunit100.blogspot.com

Re: "bigpetstore" another idea that i forgot to mention which might fit into bigtop.

Posted by Bruno Mahé <bm...@apache.org>.
This is a great idea and this proposal looks good to me.

My only feedback would be:
Would "samples" be more obvious than "blueprints"?

On 09/23/2013 11:47 AM, Jay Vyas wrote:
> bump ^^ ... Any thoughts on where these blueprints should go and how to
> organize them?  At that point I'll roll it into a JIRA.


Re: "bigpetstore" another idea that i forgot to mention which might fit into bigtop.

Posted by Jay Vyas <ja...@gmail.com>.
bump ^^ ... Any thoughts on where these blueprints should go and how to
organize them?  At that point I'll roll it into a JIRA.


On Thu, Sep 19, 2013 at 5:41 PM, Jay Vyas <ja...@gmail.com> wrote:

> Okay, that makes sense.  Now time for my all-too-often asked Bigtop
> question:
>
> Where would this project go?
>
> My proposal:
>
> My initial thoughts are:
>
> 1) location: Simply a new submodule under top-level Bigtop, called
> blueprints/, with a single Java application, bigpetstore/, as its
> submodule.
>
> 2) extensibility: Others could then add their own submodules easily by
> just creating a new folder.
>
> 3) deliverable: The artifact created by this submodule would simply be a
> jar file, with a shell script for executing the whole pipeline.
>
> 4) bootstrap / input data: We could put CSV input data in a public S3
> bucket, and keep small input CSV files inside the repo as a failsafe, so
> people can always run it from the git repo alone.



-- 
Jay Vyas
http://jayunit100.blogspot.com

Re: "bigpetstore" another idea that i forgot to mention which might fit into bigtop.

Posted by Jay Vyas <ja...@gmail.com>.
Okay, that makes sense.  Now time for my all-too-often asked Bigtop question:

Where would this project go?

My proposal:

My initial thoughts are:

1) location: Simply a new submodule under top-level Bigtop, called
blueprints/, with a single Java application, bigpetstore/, as its
submodule.

2) extensibility: Others could then add their own submodules easily by just
creating a new folder.

3) deliverable: The artifact created by this submodule would simply be a
jar file, with a shell script for executing the whole pipeline.

4) bootstrap / input data: We could put CSV input data in a public S3
bucket, and keep small input CSV files inside the repo as a failsafe, so
people can always run it from the git repo alone.
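
As a rough illustration of item 4 (a sketch only, not code from this
thread or from Bigtop): the idea is to try the public copy of the CSV
first and fall back to the small file checked into the repo if the
download fails for any reason. The function names, the URL, and the file
path here are all hypothetical, and the example is in Python purely for
brevity, even though the proposed app itself would be Java.

```python
import csv
import io
import urllib.request
from pathlib import Path


def read_csv_rows(text):
    """Parse CSV text into a list of rows, skipping empty lines."""
    return [row for row in csv.reader(io.StringIO(text)) if row]


def load_input(remote_url, local_path, fetch=None, timeout=5):
    """Return (rows, source), preferring the remote copy of the data.

    `fetch` is an optional callable (url -> text) so the download step can
    be swapped out, e.g. in tests. By default it uses urllib.
    All names here are hypothetical and only illustrate the fallback idea.
    """
    def default_fetch(url):
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8")

    fetch = fetch or default_fetch
    try:
        return read_csv_rows(fetch(remote_url)), "remote"
    except Exception:
        # Any download or parse problem falls back to the file in the repo.
        text = Path(local_path).read_text(encoding="utf-8")
        return read_csv_rows(text), "local"
```

A pipeline driver script could call something like
load_input("<public URL>", "<path to bundled csv>") once at startup and
log which source was used, so a run from a bare git checkout still works
offline.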

On Thu, Sep 19, 2013 at 5:28 PM, Roman Shaposhnik <rv...@apache.org> wrote:

> On Thu, Sep 19, 2013 at 2:19 PM, Jay Vyas <ja...@gmail.com> wrote:
> > Hey bigtop:
> >
> > Another idea, which I have been toying with for some time, is
> > implementing the old Hibernate/iBATIS app "jpetstore" for Hadoop.
>
> I think providing examples would be very nice. I honestly think that
> perhaps the best place to start would be in Hue, though. Hue already
> comes with simple toy examples for things like Hive/Pig workflows, etc.
>
> Take a look at those.
>
> > I think Bigtop might be a good template for this, but I'm not sure if
> > it should go in Bigtop itself, i.e. put an entire big data workflow
> > into Bigtop as an example/template to help people better comprehend
> > how MapReduce ETL plays with ad hoc analytics (Hive/Pig), and how
> > machine learning (Mahout etc.) finally interacts with end sinks (HBase).
>
> Ah! That actually goes beyond examples and would also be quite appreciated.
> I'd call those 'Bigdata pipelines blueprints'. There I would encourage
> folks to approach it from the Oozie perspective. That's what most of the
> heavyweight Hadoop users seem to be doing -- they've got those complex
> pipelines with ingest coming from the Flume side of things, batch managed
> by Oozie, and analytics being provided by Hive/Pig/Spark and, most
> recently, Solr.
>
> > I'm not sure if this is in the scope of Bigtop, but I think that for
> > people getting into the Hadoop ecosystem and using Bigtop as a venue
> > to do so, an example app of this sort might be particularly useful.
> >
> > Apologies if this is outside the scope of Bigtop, but let me know!
>
> Personally I think Bigtop is a really good place for these types of
> blueprints to be developed and tested.
>
> Thanks,
> Roman.
>



-- 
Jay Vyas
http://jayunit100.blogspot.com

Re: "bigpetstore" another idea that i forgot to mention which might fit into bigtop.

Posted by Roman Shaposhnik <rv...@apache.org>.
On Thu, Sep 19, 2013 at 2:19 PM, Jay Vyas <ja...@gmail.com> wrote:
> Hey bigtop:
>
> Another idea, which I have been toying with for some time, is
> implementing the old Hibernate/iBATIS app "jpetstore" for Hadoop.

I think providing examples would be very nice. I honestly think that
perhaps the best place to start would be in Hue, though. Hue already
comes with simple toy examples for things like Hive/Pig workflows, etc.

Take a look at those.

> I think Bigtop might be a good template for this, but I'm not sure if
> it should go in Bigtop itself, i.e. put an entire big data workflow
> into Bigtop as an example/template to help people better comprehend
> how MapReduce ETL plays with ad hoc analytics (Hive/Pig), and how
> machine learning (Mahout etc.) finally interacts with end sinks (HBase).

Ah! That actually goes beyond examples and would also be quite appreciated.
I'd call those 'Bigdata pipelines blueprints'. There I would encourage folks
to approach it from the Oozie perspective. That's what most of the
heavyweight Hadoop users seem to be doing -- they've got those complex
pipelines with ingest coming from the Flume side of things, batch managed
by Oozie, and analytics being provided by Hive/Pig/Spark and, most
recently, Solr.

> I'm not sure if this is in the scope of Bigtop, but I think that for
> people getting into the Hadoop ecosystem and using Bigtop as a venue
> to do so, an example app of this sort might be particularly useful.
>
> Apologies if this is outside the scope of Bigtop, but let me know!

Personally I think Bigtop is a really good place for these types of
blueprints to be developed and tested.

Thanks,
Roman.