Posted to dev@crunch.apache.org by Matthias Friedrich <ma...@mafr.de> on 2012/09/17 19:04:47 UTC

Maven archetype for Crunch

Hi,

I think it would be a good idea to provide a Maven archetype for
Crunch to make it easier for users to play with it. I've created an
archetype [1] based on our example projects and uploaded it to my own
Maven repo for demo:

  mvn archetype:generate -DarchetypeCatalog=http://dev.mafr.de/repos/maven2/

Select the Crunch archetype, enter Maven coordinates and you'll get
a simple project with (hopefully) correct dependency setup and job
packaging. Crunch hit Maven Central this morning, so you don't even
need to mvn install Crunch first :)

Should we add the archetype to the Crunch code base? It's kind of
redundant (we have examples already), but it'll make writing a
"getting started" document for 0.4.0 really easy.

Another thing: In the archetype I had to add some dependencies that
are missing from hadoop-core but are needed to get LocalJobRunner
working. If we added those to Crunch, they would be added to the job
JAR's lib directory unnecessarily. But they don't cause any trouble
and it would make job setup easier for our users. What do you think?

Regards,
  Matthias

[1] hg clone http://dev.mafr.de/repos/hg/crunch-job-basic/

Re: Maven archetype for Crunch

Posted by Josh Wills <jw...@cloudera.com>.
On Mon, Sep 17, 2012 at 10:20 AM, Matthias Friedrich <ma...@mafr.de> wrote:
> On Monday, 2012-09-17, Josh Wills wrote:
>> On Mon, Sep 17, 2012 at 10:04 AM, Matthias Friedrich <ma...@mafr.de> wrote:
> [...]
>>> Another thing: In the archetype I had to add some dependencies that
>>> are missing from hadoop-core but are needed to get LocalJobRunner
>>> working. If we added those to Crunch, they would be added to the job
>>> JAR's lib directory unnecessarily. But they don't cause any trouble
>>> and it would make job setup easier for our users. What do you think?
>
>> How big are they, in terms of bytes?
>
> commons-io-2.1 is 163 KB and slf4j-api-1.4.3 is 15 KB. Without them,
> our lib directory is 3.9 MB for Hadoop 1.

Yeah, I'm fine with that.

>
> Regards,
>   Matthias



-- 
Director of Data Science
Cloudera
Twitter: @josh_wills

Re: Maven archetype for Crunch

Posted by Matthias Friedrich <ma...@mafr.de>.
On Monday, 2012-09-17, Josh Wills wrote:
> On Mon, Sep 17, 2012 at 10:04 AM, Matthias Friedrich <ma...@mafr.de> wrote:
[...] 
>> Another thing: In the archetype I had to add some dependencies that
>> are missing from hadoop-core but are needed to get LocalJobRunner
>> working. If we added those to Crunch, they would be added to the job
>> JAR's lib directory unnecessarily. But they don't cause any trouble
>> and it would make job setup easier for our users. What do you think?
 
> How big are they, in terms of bytes?
 
commons-io-2.1 is 163 KB and slf4j-api-1.4.3 is 15 KB. Without them,
our lib directory is 3.9 MB for Hadoop 1.
 
Regards,
  Matthias
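
For illustration, the two extra dependencies named above could be declared in a generated project's pom.xml roughly as follows. This is a minimal sketch, not the archetype's actual POM: the coordinates are the usual Maven Central ones for these artifacts, the versions are those quoted in the message, and leaving the scope at its default (compile) is an assumption.

```xml
<!-- Sketch only: extra dependencies reported as needed for LocalJobRunner
     with hadoop-core. Scope is left at the default (compile), which is an
     assumption and not something confirmed by the thread. -->
<dependencies>
  <dependency>
    <groupId>commons-io</groupId>
    <artifactId>commons-io</artifactId>
    <version>2.1</version>
  </dependency>
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>1.4.3</version>
  </dependency>
</dependencies>
```

If both JARs end up in the job JAR's lib directory, they would add roughly 178 KB (163 KB + 15 KB) to the 3.9 MB reported for Hadoop 1.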

Re: Maven archetype for Crunch

Posted by Josh Wills <jw...@cloudera.com>.
On Mon, Sep 17, 2012 at 10:04 AM, Matthias Friedrich <ma...@mafr.de> wrote:
> Hi,
>
> I think it would be a good idea to provide a Maven archetype for
> Crunch to make it easier for users to play with it. I've created an
> archetype [1] based on our example projects and uploaded it to my own
> Maven repo for demo:
>
>   mvn archetype:generate -DarchetypeCatalog=http://dev.mafr.de/repos/maven2/
>
> Select the Crunch archetype, enter Maven coordinates and you'll get
> a simple project with (hopefully) correct dependency setup and job
> packaging. Crunch hit Maven Central this morning, so you don't even
> need to mvn install Crunch first :)
>
> Should we add the archetype to the Crunch code base? It's kind of
> redundant (we have examples already), but it'll make writing a
> "getting started" document for 0.4.0 really easy.
>
> Another thing: In the archetype I had to add some dependencies that
> are missing from hadoop-core but are needed to get LocalJobRunner
> working. If we added those to Crunch, they would be added to the job
> JAR's lib directory unnecessarily. But they don't cause any trouble
> and it would make job setup easier for our users. What do you think?

How big are they, in terms of bytes?

>
> Regards,
>   Matthias
>
> [1] hg clone http://dev.mafr.de/repos/hg/crunch-job-basic/
