Posted to users@maven.apache.org by Christian Goetze <cg...@miaow.com> on 2007/01/04 20:54:52 UTC
First impressions of using maven (long)
In this post, I'd like to summarize my first impressions of using maven.
The product built in my company is a mixed bag of C/C++ code, java code
and perl code. It is a classic three tier app, with the back end written
in C/C++, the middle tier written in perl and java and the GUI written
in java.
The java code is built via ant scripts, and they have grown to be quite
intricate and unwieldy, and they also tended to obscure the actual
dependencies between artifacts, causing multiple jars to contain varying
subsets of the class files, and various random jars being duplicated
into several places in the source tree -- in other words: your classic
organically grown mess. Also, the way the ant scripts were written made
it very hard to get incremental parallel builds to work, since they
tended to spill files all over the tree while invoking each other in
scary ways.
The top level build system is written in a make clone called "cook",
written by a nice guy in Australia, Peter Miller. This clone has a
variety of extras not available in any other build tool, which made it
possible for me to write a system implementing a philosophy that matches
almost perfectly with maven's philosophy. See
http://www.cg-soft.com/tools/build/ for details. One thing to note is
that "cook" is fairly old software (I think that the first version came
out over 20 years ago), which makes the absence of many of its features
in newer systems that much more depressing - more on that later.
So maven seems like a perfect fit, especially since our java code is
fairly vanilla, using standard technologies and very few hacks. Indeed,
it was possible for me to convert about 60% of the java build to maven
builds within 2 days after reading the "Better Builds with Maven" book.
Needless to say, I was impressed.
I really want to like maven. I think it has the right ideas, and the way
it deals with deploying and reusing artifacts built elsewhere is a great
model for dealing with third party software, and even in-house software
built by separate teams. It effortlessly implements a "wink-in" scheme
which usually requires a lot of effort to get to work. In particular, I
like the fact that developers only need to have those subtrees checked
out where they are actually making changes, and can still build the
complete product by downloading the artifacts built and deployed by the
continuous build loop.
I am in the process of integrating some of the maven ideas (mainly the
plugin architecture) into my cook based build system, and I almost
wish maven had some support for C/C++ style artifacts - but I realize
that this is a much harder problem than java artifacts - a testament to
the wisdom of the java designers.
Nevertheless, there are difficulties and disappointments:
* Incremental builds are not reliable;
* Builds are not reproducible;
* Builds are not parallelizable and distributable;
* Reactor builds are "all or nothing";
* Propagation of build parameters is undocumented and unpredictable;
* Release process is bizarre.
I think these are important issues. I'll go into details later, but I
cannot stress how important it is to have a reliable build tool that
actually removes workload from developers. Most developers are not
thrilled about having control taken away from them via an "opinionated
tool" like maven. They will only go along if it provides tangible
benefits. It is therefore -extremely-important- to not disappoint them.
I don't think there is disagreement about that - after all, the idea
plugin is a great step in the right direction, but I do fear that people
do not fully appreciate the difficulties created by having a build tool
that fails in mysterious and random, hard to debug ways. Developers have
strong egos, and will go to great lengths to try to figure out things by
themselves and will only come and ask questions when they are desperate
- and then they will blame the tool and the people who brought it into
their lives.
Now, in detail:
_Incremental Builds are not Reliable_
There are two well known failure modes:
* A source file has been relocated or removed
* A source file was updated, but with a timestamp older than the
associated derived file(s).
Supporting those two cases is not really that hard: In the first case,
you record a hash signature of the sorted list of all "ingredient" files
used to produce the target file, and consider it out of date if that
signature changes. In the second case, you record timestamp and hash
signature of the source file and consider it out of date if the
timestamp and signature change. As a side effect, you get free build
avoidance by comparing the hash signature of a generated file with the
previous version and considering subsequent dependencies up to date if the
signature did not change. "cook" has been doing this for years and it
works great.
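
As a rough illustration of the two checks described above, here is a
minimal Python sketch. It is not how "cook" implements them; the function
names and the shape of the recorded fingerprint are made up for this
example.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def list_signature(ingredients):
    """Hash the sorted list of ingredient names (case 1: moved/removed files)."""
    names = sorted(ingredients)
    return hashlib.sha256("\n".join(names).encode("utf-8")).hexdigest()


def fingerprint(ingredients):
    """Record what a target was built from: list signature plus, per file,
    its timestamp and content hash (case 2: edits carrying an old timestamp)."""
    files = {n: (os.stat(n).st_mtime_ns, file_digest(n)) for n in ingredients}
    return {"list": list_signature(ingredients), "files": files}


def is_out_of_date(previous, ingredients):
    """Compare the fingerprint recorded at the last build with the current state."""
    if previous is None:
        return True  # never built before
    if list_signature(ingredients) != previous["list"]:
        return True  # an ingredient was added, moved or removed
    for name in ingredients:
        old_mtime, old_hash = previous["files"][name]
        # Any timestamp change (newer OR older) triggers a content check;
        # the target is stale only if the content really differs.
        if os.stat(name).st_mtime_ns != old_mtime and file_digest(name) != old_hash:
            return True
    return False
```

The fingerprint would be stored next to the target after each successful
build. Applying the same content comparison to generated files gives the
build avoidance mentioned above.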
The real problem seems to be a lack of awareness of why this is so bad.
The classic shrug, followed by "just say mvn clean install"... Besides
the fact that this is horrible for doing continuous builds, it also
makes build script writers very lazy, since all they need to do is
support the "clean" case. So, for example, jars will include obsolete
.class files unless a clean build is used. Sites will contain obsolete
html files unless a clean build is used...
(before I discovered maven, I was about 50% done with integrating java
builds into my "cook" system, and one of the little tools I have is a
perl script to determine the exact names of the class files generated by
a java source file, so I will only pack the exact files produced by the
current build)...
_Builds are not Reproducible_
This should be the holy grail of every release engineer. Arguably, it
isn't really maven's fault, but rather the fault of the java designers
to rely on a format that includes timestamps (jars). Run mvn install
once, save the artifacts, run mvn clean install again, and the jars will
look different. It requires trust in the system to accept that both
builds are the same. QA engineers are very unwilling to trust - it's
part of their job description. If you wish to reproduce a build, it
would be great if it came out bitwise identical to the original build.
It is actually not impossible to do this, and "cook" has a feature to at
least solve the timestamp issue, assuming that your version control
system is good enough and can reproduce a source tree with the exact
same timestamps as the original. The trick is to backdate every
generated file to be one second younger than the youngest ingredient
file. As a side effect, it will actually make your jars slightly
smaller, since those timestamps will compress better - yeah, big deal :)
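
To make the backdating trick concrete, here is a minimal Python sketch
that writes a jar (a zip archive) in which every entry carries the same
timestamp, derived from the newest ingredient. It is only an illustration
under assumptions: the function name and the input format are invented
for the example, and it is neither cook's nor maven's implementation.

```python
import time
import zipfile


def build_reproducible_jar(jar_path, sources):
    """Write a jar whose bytes depend only on the inputs.

    sources maps entry name -> (content bytes, mtime in seconds since epoch).
    """
    # Backdate every entry to one second after the newest ingredient.
    # Zip timestamps only have 2-second resolution, so the stored value
    # is rounded down to an even second.
    newest = max(mtime for _, mtime in sources.values())
    stamp = time.gmtime(newest + 1)[:6]  # UTC, independent of local timezone

    with zipfile.ZipFile(jar_path, "w") as jar:
        # Sort the names so the entry order does not depend on dict order.
        for name in sorted(sources):
            data, _ = sources[name]
            info = zipfile.ZipInfo(name, date_time=stamp)
            info.compress_type = zipfile.ZIP_DEFLATED
            jar.writestr(info, data)
```

Running this twice on the same inputs, on the same machine and Python
version, should give bitwise identical files, so a plain checksum
comparison is enough.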
Another wrench in the works is the fact that maven itself may change
over time and may cause builds to change. I'm not sure I understand all
the ramifications, but it would seem reasonable to ask how to ensure
that the right version of maven and its plugins be used when reproducing
any particular build.
_Builds are not Parallel_
I actually don't know this for sure - is javac multithreaded? I know
that the reactor build isn't. Perhaps not such an urgent feature, but
still a pity, since GNU Make and cook have supported parallel builds for
over ten years...
My solution is to use "cook" to invoke "mvn -N compile" and "mvn -N
install" in every directory that has a pom.xml, after extracting the
dependencies from the pom.xml files, and let "cook" do the equivalent of
the reactor build, using cook's parallelization (compile and install
invoked separately, so that cook has a chance to insert JNI header
generation in between, and pack the C/C++ shared objects built by cook
into the assembly built in the "mvn install" step) . This also addresses
the next issue:
_Reactor Builds are "all or nothing"_
If I'm in a really large multi-module project, and I need to work on
multiple modules at the same time, I only have two choices:
* I use the reactor build (and wait, and wait....)
* I run mvn in dependency order by hand.
I'd like to have a mode where mvn will do a reactor build for a specific
module and its prerequisites only.
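
What I have in mind can be sketched in a few lines of Python (3.9 or
later, for graphlib). The module names and the dependency table are
hypothetical; in practice the table would be extracted from the pom.xml
files.

```python
from graphlib import TopologicalSorter


def build_order(deps, target):
    """Return target and its transitive prerequisites, prerequisites first.

    deps maps each module name to the set of modules it depends on.
    """
    # Collect the target plus everything it needs, directly or indirectly.
    needed = set()
    stack = [target]
    while stack:
        module = stack.pop()
        if module not in needed:
            needed.add(module)
            stack.extend(deps.get(module, ()))

    # Sort only that subset; modules unrelated to the target are left out.
    subgraph = {m: set(deps.get(m, ())) for m in needed}
    return list(TopologicalSorter(subgraph).static_order())


if __name__ == "__main__":
    deps = {
        "core": set(),
        "net": {"core"},
        "gui": {"core", "net"},
        "docs": {"gui"},
    }
    # Building "net" only requires "core"; "gui" and "docs" are skipped.
    print(build_order(deps, "net"))
```

Each name in the result would then get its own non-recursive mvn
invocation, in that order.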
_Propagation of Build Parameters Unpredictable and Undocumented_
As far as I can tell, there are pom parameters which will accumulate
(e.g. dependencies), override (various plugin settings, properties) or
just be ignored. Then there are parameters for which I can put in a
${...} expression, and some where I can't. The only way to find out is
via trial and error (or perhaps by reading the source code). I think
this is important enough to merit documentation.
Others have complained before me about the difficulties in actually
figuring out how things are supposed to work. This is a serious
disadvantage of the declarative style: the declarations are by
definition arbitrary and must somehow be documented. As much as I hate
"ant", I have to say that this is much less of a problem there, mainly
because one needs to explicitly describe the operations in ant anyway,
and the atomic tasks are not so hard to understand.
_Release Process is Bizarre_
I don't envy anyone with the task of defining this. Fact is, there still
is no widespread consensus on how a release process is supposed to work,
as far as I can tell. All I know is that I can't use the release plugin
as it is. I do need to understand better the idea behind the snapshot
versioning and how version ranges will work out. I am uncomfortable with
some of the logic there.
I hope that this post will give the maven developers some insights on
the problems faced by a novice user attempting to use the product. I
still very much want to like it, and I'll struggle to make it work - but
it's not looking great at this moment.
--
cg
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@maven.apache.org
For additional commands, e-mail: users-help@maven.apache.org
Re: First impressions of using maven (long)
Posted by Johannes Schneider <jo...@familieschneider.info>.
Hi,
thanks for your post. I think you have addressed several issues that
should be fixed asap.
But unfortunately the development speed of Maven seems to be very low.
There exist many bugs and shortcomings everybody complains about (e.g.
site generation for multi module projects...) :-(.
The problem is that Maven is the best java build system available (the
perl among thorns). And apparently there isn't too much motivation for
fixing the existing issues without competition...
Johannes Schneider
--
Johannes Schneider
Im Lindenwasen 15
72810 Gomaringen
+49 7072 9229972
johannes@familieschneider.info
http://www.johannes-schneider.info
Re: First impressions of using maven (long)
Posted by diroussel <na...@diroussel.xsmail.com>.
Great post. You're right, maven is a great tool, but still has a way to go to
make it perfect for most people.
Part of the problem is that release processes are shaped by forces outside
of the development team, such as legacy systems, existing team structures,
and other sorts of company history. So any tool will have to be flexible.
I do have some comments on this ...
Christian Goetze-3 wrote:
>
> Builds are not Reproducible
>
> This should be the holy grail of every release engineer. Arguably, it
> isn't really maven's fault, but rather the fault of the java designers
> to rely on a format that includes timestamps (jars). Run mvn install
> once, save the artifacts, run mvn clean install again, and the jars will
> look different. It requires trust in the system to accept that both
> builds are the same. QA engineers are very unwilling to trust - it's
> part of their job description. If you wish to reproduce a build, it
> would be great if it came out bitwise identical to the original build.
>
That is indeed a shame, but if the zip file format had not been used, java
adoption might not have been as strong? Perhaps the maven packager could
implement your "set the timestamp to be the oldest entry" hack?
Alternatively, if the deployment people want to be sure about the contents
of a jar, rather than looking at its md5sum, they should use jar signing to
compare two jars. When signing a jar, only the contents of the files are
hashed, not the timestamps. If you have two signed release jars from the
same release, but built on different machines, you can compare the two
manifest files, or the two digital signatures.
See: http://java.sun.com/docs/books/tutorial/deployment/jar/intro.html
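
The same idea, comparing only the contents of the entries and ignoring
their timestamps, can be tried without signing anything. The Python
sketch below is a simpler stand-in for the jar signing approach, not jar
signing itself: it hashes each entry's content in both jars and compares
the results. The function names are made up for the example.

```python
import hashlib
import zipfile


def content_digests(jar_path):
    """Map each entry name in the jar to the SHA-256 digest of its content.

    Entry timestamps and other zip metadata are deliberately ignored.
    """
    with zipfile.ZipFile(jar_path) as jar:
        return {
            info.filename: hashlib.sha256(jar.read(info)).hexdigest()
            for info in jar.infolist()
            if not info.is_dir()
        }


def same_contents(jar_a, jar_b):
    """True if both jars hold the same entries with identical content."""
    return content_digests(jar_a) == content_digests(jar_b)
```

Two jars from the same sources but different build runs should then
compare as equal even when their md5sums differ.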
Christian Goetze-3 wrote:
>
> Reactor builds are "all or nothing"
>
I find this annoying too. I have almost 20 modules in my project, and so I
hit this often. Just thinking off the top of my head, when the build looks at
a module it should:
1) check the "avoid rebuild" feature is turned on in the pom
2) look in target\lastbuild.properties for info about the last build
3) check the current buildVersion matches the one in lastbuild.properties
4) check none of the source files are more recent than the timestamp in
lastbuild.properties
5) if it's all ok, skip this module
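
To show what I mean, here is a rough Python sketch of steps 1 to 5. None
of this exists in maven: lastbuild.properties is my suggestion, and the
property names "buildVersion" and "timestamp", the "src" directory and
the function names are assumptions made for the example. The properties
parser is a simplification that only handles plain key=value lines.

```python
import os


def read_properties(path):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, value = line.split("=", 1)
                props[key.strip()] = value.strip()
    return props


def can_skip_module(module_dir, build_version, avoid_rebuild=True):
    """Return True if the module looks unchanged since its last build."""
    # 1) the "avoid rebuild" feature must be turned on
    if not avoid_rebuild:
        return False
    # 2) look for info about the last build
    props_path = os.path.join(module_dir, "target", "lastbuild.properties")
    if not os.path.isfile(props_path):
        return False
    props = read_properties(props_path)
    # 3) the recorded version must match the current one
    if props.get("buildVersion") != build_version:
        return False
    # 4) no source file may be newer than the recorded timestamp
    last_build = float(props.get("timestamp", "0"))
    for root, _dirs, files in os.walk(os.path.join(module_dir, "src")):
        for name in files:
            if os.path.getmtime(os.path.join(root, name)) > last_build:
                return False
    # 5) all checks passed, the module can be skipped
    return True
```

Note that step 4 relies on timestamps only, so it would share the
weakness Christian describes for files updated with an older timestamp.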
Having the option to build a module and its dependencies would be great as
well.
David