Posted to users@maven.apache.org by Christian Goetze <cg...@miaow.com> on 2007/01/04 20:54:52 UTC

First impressions of using maven (long)

In this post, I'd like to summarize my first impressions of using maven.

The product built in my company is a mixed bag of C/C++ code, java code 
and perl code. It is a classic three tier app, with the back end written 
in C/C++, the middle tier written in perl and java and the GUI written 
in java.

The java code is built via ant scripts, and they have grown to be quite 
intricate and unwieldy, and they also tended to obscure the actual 
dependencies between artifacts, causing multiple jars to contain varying 
subsets of the class files, and various random jars being duplicated 
into several places in the source tree -- in other words: your classic 
organically grown mess. Also, the way the ant scripts were written made 
it very hard to get incremental parallel builds to work, since they 
tended to spill files all over the tree while invoking each other in 
scary ways.

The top level build system is written in a make clone called "cook", 
written by a nice guy in Australia, Peter Miller. This clone has a 
variety of extras not available in any other build tool, which made it 
possible for me to write a system implementing a philosophy that matches 
almost perfectly with maven's philosophy. See 
http://www.cg-soft.com/tools/build/ for details. One thing to note is 
that "cook" is fairly old software (I think that the first version came 
out over 20 years ago), which makes the absence of many of its features 
in newer systems that much more depressing - more on that later.

So maven seems like a perfect fit, especially since our java code is 
fairly vanilla, using standard technologies and very few hacks. Indeed, 
it was possible for me to convert about 60% of the java build to maven 
builds within 2 days after reading the "Better Builds with Maven" book. 
Needless to say, I was impressed.

I really want to like maven. I think it has the right ideas, and the way 
it deals with deploying and reusing artifacts built elsewhere is a great 
model for dealing with third party software, and even in-house software 
built by separate teams. It effortlessly implements a "wink-in" scheme 
which usually requires a lot of effort to get to work. In particular, I 
like the fact that developers only need to have those subtrees checked 
out where they are actually making changes, and can still build the 
complete product by downloading the artifacts built and deployed by the 
continuous build loop.

I am in the process of integrating some of the maven ideas (mainly the 
plugin architecture) into my cook-based build system, and I almost 
wish maven had some support for C/C++ style artifacts - but I realize 
that this is a much harder problem than java artifacts - a testament to 
the wisdom of the java designers.

Nevertheless, there are difficulties and disappointments:

    * Incremental builds are not reliable;
    * Builds are not reproducible;
    * Builds are not parallelizable and distributable;
    * Reactor builds are "all or nothing";
    * Propagation of build parameters is undocumented and unpredictable;
    * Release process is bizarre.

I think these are important issues. I'll go into details later, but I 
cannot stress enough how important it is to have a reliable build tool 
that actually removes workload from developers. Most developers are not 
thrilled about having control taken away from them via an "opinionated 
tool" like maven. They will only go along if it provides tangible 
benefits. It is therefore -extremely-important- to not disappoint them. 
I don't think there is disagreement about that - after all, the idea 
plugin is a great step in the right direction, but I do fear that people 
do not fully appreciate the difficulties created by having a build tool 
that fails in mysterious and random, hard to debug ways. Developers have 
strong egos, and will go to great lengths to try to figure out things by 
themselves and will only come and ask questions when they are desperate 
- and then they will blame the tool and the people who brought it into 
their lives.

Now, in detail:

_Incremental Builds are not Reliable_

There are two well known failure modes:

    * A source file has been relocated or removed
    * A source file was updated, but with a timestamp older than the
      associated derived file(s).

Supporting those two cases is not really that hard: In the first case, 
you record a hash signature of the sorted list of all "ingredient" files 
used to produce the target file, and consider it out of date if that 
signature changes. In the second case, you record the timestamp and hash 
signature of the source file and consider it out of date if the 
timestamp and signature change. As a side effect, you get free build 
avoidance by comparing the hash signature of a generated file with the 
previous version and considering subsequent dependencies up to date if 
the signature did not change. "cook" has been doing this for years and it 
works great.
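
The two checks described above can be sketched roughly as follows. This 
is only an illustration, not how "cook" actually implements it: the 
function names, the JSON state file and the choice of SHA-256 are all 
assumptions made for the example.

```python
import hashlib
import json
import os


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def list_signature(names):
    """Hash the sorted list of ingredient file names (hypothetical helper)."""
    return hashlib.sha256("\n".join(sorted(names)).encode("utf-8")).hexdigest()


def record(ingredients, state_file):
    """Store the ingredient list signature plus per-file timestamp and hash."""
    state = {
        "list": list_signature(ingredients),
        "files": {
            name: {"mtime": os.path.getmtime(name), "sha256": file_digest(name)}
            for name in ingredients
        },
    }
    with open(state_file, "w") as f:
        json.dump(state, f)


def is_out_of_date(target, ingredients, state_file):
    """Return True if the target should be rebuilt."""
    if not os.path.exists(target) or not os.path.exists(state_file):
        return True
    with open(state_file) as f:
        old = json.load(f)
    # Case 1: a source file was added, relocated or removed.
    if old["list"] != list_signature(ingredients):
        return True
    # Case 2: the timestamp changed in either direction AND the content
    # changed. A file that was merely touched does not trigger a rebuild.
    for name in ingredients:
        prev = old["files"][name]
        if os.path.getmtime(name) != prev["mtime"] and file_digest(name) != prev["sha256"]:
            return True
    return False
```

A build step would call is_out_of_date() before running and record() 
after a successful run. The content hash is only computed when the 
timestamp differs, so the common "nothing changed" case stays cheap.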

The real problem seems to be a lack of awareness of why this is so bad. 
The classic shrug, followed by "just say mvn clean install"... Besides 
the fact that this is horrible for doing continuous builds, it also 
makes build script writers very lazy, since all they need to do is 
support the "clean" case. So, for example, jars will include obsolete 
.class files unless a clean build is used. Sites will contain obsolete 
html files unless a clean build is used...

(before I discovered maven, I was about 50% done with integrating java 
builds into my "cook" system, and one of the little tools I have is a 
perl script to determine the exact names of the class files generated by 
a java source file, so I will only pack the exact files produced by the 
current build)...

_Builds are not Reproducible_

This should be the holy grail of every release engineer. Arguably, it 
isn't really maven's fault, but rather the fault of the java designers 
to rely on a format that includes timestamps (jars). Run mvn install 
once, save the artifacts, run mvn clean install again, and the jars will 
look different. It requires trust in the system to accept that both 
builds are the same. QA engineers are very unwilling to trust - it's 
part of their job description. If you wish to reproduce a build, it 
would be great if it came out bitwise identical to the original build.

It is actually not impossible to do this, and "cook" has a feature to at 
least solve the timestamp issue, assuming that your version control 
system is good enough and can reproduce a source tree with the exact 
same timestamps as the original. The trick is to backdate every 
generated file to be one second younger than the youngest ingredient 
file. As a side effect, it will actually make your jars slightly 
smaller, since those timestamps will compress better - yeah, big deal :)
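
To illustrate the backdating trick for archives, here is a minimal 
sketch. It is not what "cook" or maven does; pack_backdated and its 
parameters are made up for the example, and it assumes the source tree 
has already been checked out with its original timestamps. Note that the 
zip format stores entry times with two-second resolution, so the "one 
second" offset may be rounded inside the archive.

```python
import os
import time
import zipfile


def pack_backdated(archive_path, source_paths, root):
    """Write source_paths into a zip/jar whose entry order and timestamps
    depend only on the inputs, not on when the build ran."""
    newest = int(max(os.path.getmtime(p) for p in source_paths))
    # One second after the newest ingredient, expressed in UTC so the
    # result does not depend on the build machine's time zone.
    stamp = time.gmtime(newest + 1)[:6]
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source_paths):  # fixed entry order
            name = os.path.relpath(path, root).replace(os.sep, "/")
            info = zipfile.ZipInfo(name, date_time=stamp)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.external_attr = 0o644 << 16  # fixed permissions
            with open(path, "rb") as f:
                archive.writestr(info, f.read())
    # Backdate the archive file itself as well.
    os.utime(archive_path, (newest + 1, newest + 1))
```

Packing the same inputs twice on the same machine should then give 
byte-identical archives, which is the property QA would want to check.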

Another wrench in the works is the fact that maven itself may change 
over time and may cause builds to change. I'm not sure I understand all 
the ramifications, but it would seem reasonable to ask how to ensure 
that the right version of maven and its plugins be used when reproducing 
any particular build.

_Builds are not Parallel_

I actually don't know this for sure - is javac multithreaded? I know 
that the reactor build isn't. Perhaps not such an urgent feature, but 
still a pity, since GNU Make and cook have supported parallel builds for 
over ten years...

My solution is to use "cook" to invoke "mvn -N compile" and "mvn -N 
install" in every directory that has a pom.xml, after extracting the 
dependencies from the pom.xml files, and let "cook" do the equivalent of 
the reactor build, using cook's parallelization (compile and install 
invoked separately, so that cook has a chance to insert JNI header 
generation in between, and pack the C/C++ shared objects built by cook 
into the assembly built in the "mvn install" step). This also addresses 
the next issue:

_Reactor Builds are "all or nothing"_

If I'm in a really large multi-module project, and I need to work on 
multiple modules at the same time, I only have two choices:

    * I use the reactor build (and wait, and wait....)
    * I run mvn in dependency order by hand.

I'd like to have a mode where mvn will do a reactor build for a specific 
module and its prerequisites only.
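
The selection I have in mind could look roughly like this. It is a 
sketch only: build_order and the dependency map are hypothetical, and 
the map is assumed to have been extracted from the pom.xml files 
beforehand.

```python
def build_order(target, dependencies):
    """Return target and its transitive prerequisites, prerequisites first.

    dependencies maps a module name to the modules it depends on.
    """
    order = []
    state = {}  # module -> "visiting" or "done"

    def visit(module):
        if state.get(module) == "done":
            return
        if state.get(module) == "visiting":
            raise ValueError("dependency cycle at " + module)
        state[module] = "visiting"
        for dep in dependencies.get(module, ()):
            visit(dep)
        state[module] = "done"
        order.append(module)  # appended only after all its prerequisites

    visit(target)
    return order


modules = {
    "gui": ["middle", "common"],
    "middle": ["common"],
    "common": [],
    "docs": [],
}
print(build_order("gui", modules))  # ['common', 'middle', 'gui']
```

Unrelated modules such as "docs" are never visited, and each module in 
the returned list could then be built with "mvn -N install" in turn.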

_Propagation of Build Parameters Unpredictable and Undocumented_

As far as I can tell, there are pom parameters which will accumulate 
(e.g. dependencies), override (various plugin settings, properties) or 
just be ignored. Then there are parameters for which I can put in a 
${...} expression, and some where I can't. The only way to find out is 
via trial and error (or perhaps by reading the source code). I think 
this is important enough to merit documentation.

Others have complained before me about the difficulties in actually 
figuring out how things are supposed to work. This is a serious 
disadvantage of the declarative style: the declarations are by 
definition arbitrary and must somehow be documented. As much as I hate 
"ant", I have to say that this is much less of a problem there, mainly 
because one needs to explicitly describe the operations in ant anyway, 
and the atomic tasks are not so hard to understand.

_Release Process is Bizarre_

I don't envy anyone with the task of defining this. Fact is, there still 
is no widespread consensus on how a release process is supposed to work, 
as far as I can tell. All I know is that I can't use the release plugin 
as it is. I do need to understand better the idea behind the snapshot 
versioning and how version ranges will work out. I am uncomfortable with 
some of the logic there.

I hope that this post will give the maven developers some insights on 
the problems faced by a novice user attempting to use the product. I 
still very much want to like it, and I'll struggle to make it work - but 
it's not looking great at this moment.
--
cg


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@maven.apache.org
For additional commands, e-mail: users-help@maven.apache.org


Re: First impressions of using maven (long)

Posted by Johannes Schneider <jo...@familieschneider.info>.
Hi,

thanks for your post. I think you have addressed several issues that 
should be fixed asap.
But unfortunately the development speed of Maven seems to be very low. 
There exist many bugs and shortcomings everybody complains about (e.g. 
site generation for multi module projects...) :-(.

The problem is that Maven is the best java build system available (the 
pearl among thorns). And apparently there isn't too much motivation for 
fixing the existing issues without competition...


Johannes Schneider



-- 
Johannes Schneider
Im Lindenwasen 15
72810 Gomaringen

+49 7072 9229972
johannes@familieschneider.info
http://www.johannes-schneider.info

Re: First impressions of using maven (long)

Posted by diroussel <na...@diroussel.xsmail.com>.
Great post.  You're right, maven is a great tool, but still has a way to go to
make it perfect for most people.

Part of the problem is that release processes are shaped by forces outside
of the development team, such as legacy systems, existing team structures,
and other sorts of company history.  So any tool will have to be flexible.  

I do have some comments on this ...

Christian Goetze-3 wrote:
> 
> Builds are not Reproducible
> 
> This should be the holy grail of every release engineer. Arguably, it 
> isn't really maven's fault, but rather the fault of the java designers 
> to rely on a format that includes timestamps (jars). Run mvn install 
> once, save the artifacts, run mvn clean install again, and the jars will 
> look different. It requires trust in the system to accept that both 
> builds are the same. QA engineers are very unwilling to trust - it's 
> part of their job description. If you wish to reproduce a build, it 
> would be great if it came out bitwise identical to the original build.
> 

That is indeed a shame, but would java adoption have been as strong if the
zip file format had not been used?  Perhaps the maven packager could
implement your "set the timestamp to be the oldest entry" hack?

Alternatively, if the deployment people want to be sure about the contents
of a jar, rather than looking at its md5sum, they should use jar signing to
compare two jars.  When signing a jar, only the contents of the files are
hashed, not the timestamps.   If you have two signed release jars from the
same release, but built on different machines, you can compare the two
manifest files, or the two digital signatures.
See: http://java.sun.com/docs/books/tutorial/deployment/jar/intro.html
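
The same idea can be tried without signing anything. The sketch below is 
not jar signing and does not use the jarsigner tool; it is a stand-in 
that hashes only the contents of each entry and ignores the timestamps 
and the entry order. The function names and the use of SHA-256 are my 
own choices for the example.

```python
import hashlib
import zipfile


def content_digests(jar_path):
    """Map each entry name in the jar to a SHA-256 digest of its contents."""
    with zipfile.ZipFile(jar_path) as jar:
        return {
            info.filename: hashlib.sha256(jar.read(info)).hexdigest()
            for info in jar.infolist()
            if not info.is_dir()
        }


def same_contents(jar_a, jar_b):
    """True if both jars hold the same entries with identical contents."""
    return content_digests(jar_a) == content_digests(jar_b)
```

Two jars from separate builds would then compare equal as long as the 
files inside them are identical, whatever their md5sums say.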


Christian Goetze-3 wrote:
> 
> Reactor builds are "all or nothing"
> 
I find this annoying too.  I have almost 20 modules in my project, and so I
hit this often.  Just thinking off the top of my head, when the build looks at
a module it should:
 1) check the "avoid rebuild" feature is turned on in the pom
 2) look in target\lastbuild.properties for info about the last build
 3) check the current buildVersion matches the one in lastbuild.properties
 4) check none of the source files are more recent than the timestamp in
lastbuild.properties
 5) if it's all ok, skip this module
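
As a rough sketch of those five steps (nothing like this exists in maven 
as far as I know): the layout of lastbuild.properties, its buildVersion 
and timestamp keys, and the function names are all invented for the 
example, and the "avoid rebuild" switch from the pom is simply passed in 
as a flag.

```python
import os


def read_properties(path):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, value = line.split("=", 1)
                props[key.strip()] = value.strip()
    return props


def can_skip_module(module_dir, build_version, avoid_rebuild, source_subdir="src"):
    """Return True if the module looks unchanged since its last build."""
    if not avoid_rebuild:                                  # step 1
        return False
    marker = os.path.join(module_dir, "target", "lastbuild.properties")
    if not os.path.exists(marker):                         # step 2
        return False
    props = read_properties(marker)
    if props.get("buildVersion") != build_version:         # step 3
        return False
    last_build = float(props.get("timestamp", "0"))        # epoch seconds
    source_root = os.path.join(module_dir, source_subdir)
    for root, _dirs, files in os.walk(source_root):        # step 4
        for name in files:
            if os.path.getmtime(os.path.join(root, name)) > last_build:
                return False
    return True                                            # step 5
```

Because step 4 relies on timestamps alone, this would still miss a 
source file that was removed or replaced by an older copy.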

Having the option to build a module and its dependencies would be great as
well.

David
