Posted to users@maven.apache.org by Benjamin Bentmann <be...@udo.edu> on 2008/04/29 13:23:44 UTC

[POLL] Default Value for File Encoding

Dear community,

The Maven team is currently discussing a proposal about the future handling
of source file encoding by the various plugins. Please see our wiki article
[0] for all details.

A controversial aspect of this proposal is which file encoding should be
assumed in case the user did not specify this in the POM. This poll should
help us to come to a well-founded decision.

These are the two possible directions to go:

a) Use the current platform encoding, aka the system property
   "file.encoding".

b) Use a static/fixed value that is defined by convention, i.e. is not
   platform-dependent.

Approach a) matches the current behavior of most plugins and is as such
backwards-compatible. Approach b) on the other hand can potentially break
builds when users update to a newer version of an affected plugin if:
- the build relies on an encoding other than ASCII/Latin-1 and
- this encoding is not explicitly stated in the plugin configuration

The reason why b) was suggested is its positive effect on build
reproducibility: Unlike approach a), a build will out-of-the-box deliver the
same output for all team members regardless of their OS or locale. The
question now is whether this improvement is worth the potential breakage
illustrated above.
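
To make the difference between a) and b) concrete, here is a minimal Java
sketch. It is not taken from any plugin: the class EncodingDefaults and its
methods are hypothetical names chosen for illustration. It decodes the same
bytes once with the platform default and once with a fixed fallback, with
Latin-1 assumed as the fixed value because that is the one mentioned above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative only: EncodingDefaults and its methods are hypothetical names,
// not part of Maven or of any plugin.
class EncodingDefaults {

    // Approach a): fall back to the platform default charset. Historically the
    // JVM derived this from the "file.encoding" system property.
    static Charset platformDefault() {
        return Charset.defaultCharset();
    }

    // Approach b): fall back to a fixed value (Latin-1 is an assumption here).
    static Charset fixedDefault() {
        return StandardCharsets.ISO_8859_1;
    }

    // Decode file bytes with the explicitly configured encoding if there is
    // one, otherwise with the given fallback.
    static String decode(byte[] bytes, String explicitEncoding, Charset fallback) {
        Charset charset = (explicitEncoding != null)
                ? Charset.forName(explicitEncoding)
                : fallback;
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // The German word "Gruesse" with umlaut and sharp s, saved as UTF-8.
        byte[] source = "Gr\u00fc\u00dfe".getBytes(StandardCharsets.UTF_8);

        System.out.println("a) platform default " + platformDefault() + ": "
                + decode(source, null, platformDefault()));
        System.out.println("b) fixed default " + fixedDefault() + ": "
                + decode(source, null, fixedDefault()));
        System.out.println("explicit UTF-8: "
                + decode(source, "UTF-8", fixedDefault()));
    }
}
```

With a), the first line of output depends on the machine running the build;
with b), the second line is the same everywhere, but it is wrong for this
UTF-8 file unless the encoding is stated explicitly. Pure ASCII content is
decoded identically in all three cases.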

So, please let us know:

[a] Use platform default encoding, keep backward-compat
[b] Use fixed default encoding, be platform-independent

Regards,


Benjamin Bentmann


[0]
http://docs.codehaus.org/display/MAVENUSER/POM+Element+for+Source+File+Encoding


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@maven.apache.org
For additional commands, e-mail: users-help@maven.apache.org


Re: [POLL] Default Value for File Encoding

Posted by Mauro Talevi <ma...@aquilonia.org>.
+1 for b) - reproducibility is more important than the bother of having to 
define the encoding explicitly.




Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
Roger Ye wrote:

> we can survive if we explicitly set the source file encoding in the 
> project
> pom.xml

Yes, this is right, explicitly setting the encoding is the golden answer. 
But will you do so right from the beginning if your platform default 
encoding happens to build as you expect or will you just wait until somebody 
reports a problem with the build because his default encoding does not work?

> the context of this statement is within a standalone system, I think this 
> is
> exactly what the notepad.exe does, notepad surely works, in its place.

But are we talking about a "standalone system"? I really feel there is a 
little difference between Maven and Notepad... I mean Maven is quite a 
global player, building one project or another here and there, 
whereas Notepad, well, I don't know.

> by the way you're actually telling me that the two projects both have
> explicit encoding,

OK, then I didn't clearly express myself. With "using UTF-8" I mean that the 
sources are indeed UTF-8 encoded, but not necessarily that this encoding is 
also declared in the POM.

> Regarding SVN/CVS, I think the repository should have of strong type in 
> case
> of encoding, whether explicit or implicit.
> e.g. if the SVN repository is using UTF-8, then it's strange if the file
> checked out is in another one
> about this I don't know much of SVN/CVS, this is an interesting topic I'd
> like to know more.

To my knowledge, SVN is currently not aware of file encoding.


Benjamin 




Re: [POLL] Default Value for File Encoding

Posted by Roger Ye <ro...@gmail.com>.
On 4/30/08, Benjamin Bentmann <be...@udo.edu> wrote:
>
> Roger Ye wrote:
>
> > e.g., in Linux, if LC_ALL=en_US.UTF-16 has been set,
> > one will be very confused in case of option [b], when maven uses another
> > encoding such as utf-8
>
> Confusion, that is exactly my point. If one of your co-workers has
> "LC_ALL" set to a different value, won't he be confused why the build is
> failing for him when you just say "works for me"? The same POM should
> deliver the same build output, that's just what I consider of "highest
> weight".

we can survive if we explicitly set the source file encoding in the project
pom.xml, this is visible and the overriding logic is reasonable.

> > and always respecting platform default encoding is the correct way to make
> > an application encoding-transparent
>
> I feel I misunderstand you. From your description, I imagine a world where
> text editors don't bother to ask users for an encoding but simply always use
> platform default encoding. In such a world, I wonder how people would
> collaboratively work on the same sources.

the context of this statement is within a standalone system, I think this is
exactly what the notepad.exe does, notepad surely works, in its place.

> > so the application developer don't need to worry about converting
> > back-n-forth between several encodings/charsets,

ditto

> Considering the internet and its wonderful aspect of bringing people all
> over the world together, I really believe it is time that application
> developers *do* worry about encoding and converting file contents to pull
> down the walls that our different locales or OS impose.
>
> Imagine two open-source projects, one using UTF-8 and the other Big5. How
> would people participate on these projects (using the same machine) if we
> expected applications to always stick to one system-wide encoding setting?

by explicitly setting the source file encoding in each project's own
pom.xml, as UTF-8 and Big5, respectively.
surely this will be a problem for you if you don't explicitly specify the
encoding,
and please note that with option [b] there'll be no answer either if you
still insist on not setting the encoding explicitly.
by the way you're actually telling me that the two projects both have
explicit encoding,
this is not the case of the VOTE, which discusses projects without explicit
encoding.

> > IMO, e.g.,  networking related applications, have to deal with encoding,
> > this is by nature, since network is used to connect
> > people from different places.
>
> Let's remember that Maven is just sitting next to a "networking related
> application", i.e. source control management.

That's why I suggest explicit encoding in pom.xml.
Regarding SVN/CVS, I think the repository should be strongly typed with
regard to encoding, whether explicit or implicit.
e.g. if the SVN repository is using UTF-8, then it's strange if the file
checked out is in another one.
I don't know much about this in SVN/CVS; it is an interesting topic I'd
like to know more about.

Nice to discuss here
Roger

Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
walid joseph Gedeon wrote:

> Note: it would probably be a good idea to include the encoding used 
> (whether
> default or set) in the plugin report information.

Which kind of "plugin report information" are you referring to? E.g. where 
exactly should the encoding used by the Maven Compiler Plugin be documented?

But then again, if you say "People that don't care about it don't need to 
worry", what would be the motivation/benefit of having such a report output?


Benjamin 




Re: [POLL] Default Value for File Encoding

Posted by walid joseph Gedeon <wg...@gmail.com>.
+1 for a)

- People that don't care about it don't need to worry
- It works similarly within groups that share the same encodings
- When it breaks, because cross-unicode-script contributors are involved,
then it needs to be specified in the pom.

The downside of b) is that it forces all those who don't use latin-1 to set
it in the pom, even if they're all using the same default encoding.

Note: it would probably be a good idea to include the encoding used (whether
default or set) in the plugin report information.

W


Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
Roger Ye wrote:

> For projects involving developers from different countries (i.e. the
> developers use different default encodings from one another), it's a
> must for everyone in the team / project to understand that his/her default
> encoding is not the "default" for others

Yes, it would be great if this awareness of encoding differences was in
everybody's head. If you like, grab yourself a POM of a Maven
component/plugin, run "mvn help:effective-pom" on it and search for
"encoding"...

I mean, unless something breaks, people tend to just be happy with the
status quo. Also, not everybody cares about warnings ("Not an error and
works for me, so why bother?").

> e.g. I'm from China, I've created a Maven project, using my default
> encoding GBK, and then shared it with you, Benjamin, then how would you
> collaborate with me?

I usually follow the conventions set up by the project owner/leader, so
naturally I tell my IDE (but only my IDE, not my entire OS) to use GBK for
our imaginary joint venture and am fine with editing the sources. The
remaining question is what my build output will look like.

> surely you cannot assume the encoding to be iso-8859-xx (your system
> default, excuse me if I'm wrong)

Just to be clear: Latin-1 was not chosen as the proposed default
value because it happens to be similar to my encoding. It was merely
proposed as a matter of consistency with another plugin that already had
this default value.

> Then there are two solutions IMO:
> 1). we set GBK as source file encoding in pom.xml
> 2). we don't change pom.xml, but we both use an imaginary-maven-fork which
> treats every file as encoded in GBK; this would not be platform-dependent,
> as in option b)
>
> will you agree with solution 2)? even if there're 99 developers from China
> while only one of you from Germany :P

I'm not sure whether I got your point with solution 2) right: Of course we
shouldn't use some maven-fork, there should only be one Maven. So either
way, the solution for the project you sketched is 1), i.e. specify the
encoding GBK in the POM. Otherwise, if we leave the encoding unspecified, I
would produce garbage output on my Western machine when building the
GBK-encoded sources. The point of option b) would just have been that
the Chinese developers would already have noticed the requirement to specify
the encoding in the POM, preventing build failures for people outside of
China. Wrong build output is quite a severe illness and should be fixed even
if only a minority of developers experience it.
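
The "garbage output" case can be sketched in a few lines of Java. This is an
illustration under assumptions, not plugin code: GbkOnWesternMachine is a
made-up class name, the JVM is assumed to provide the GBK charset, and Latin-1
stands in for the default of a Western machine.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: GbkOnWesternMachine is a made-up name, and the JVM
// is assumed to ship the GBK charset (standard JDK distributions usually do).
class GbkOnWesternMachine {

    // What a plugin effectively does when it reads a source file: turn bytes
    // into characters using whatever encoding it assumes.
    static String readSource(byte[] fileBytes, Charset assumed) {
        return new String(fileBytes, assumed);
    }

    // True if text saved with one encoding survives being read with another.
    static boolean survives(String text, Charset written, Charset assumed) {
        return readSource(text.getBytes(written), assumed).equals(text);
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");
        Charset western = StandardCharsets.ISO_8859_1;

        // Two Chinese characters, written as escapes so that this file itself
        // does not depend on any source encoding.
        String chinese = "\u4e2d\u6587";
        String asciiOnly = "int x = 1;";

        System.out.println("GBK read as GBK:       " + survives(chinese, gbk, gbk));
        System.out.println("GBK read as Latin-1:   " + survives(chinese, gbk, western));
        System.out.println("ASCII read as Latin-1: " + survives(asciiOnly, gbk, western));
    }
}
```

Only the non-ASCII text is damaged, which is why such a project can build
fine for a long time before somebody on a different platform notices.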

> so, I insist option a)

We also have five more votes for a) over on the wiki, so it seems you need
not worry too much about this going through ;-)


Benjamin




Re: [POLL] Default Value for File Encoding

Posted by Roger Ye <ro...@gmail.com>.
Hi,

On 4/30/08, Benjamin Bentmann <be...@udo.edu> wrote:
>
> I agree, having users explicitly state the encoding in their POMs is the
> best we can have, the same applies to locking down plugin versions by the
> way. No guessing, no implicit default values, just full control, let's call
> it "heaven" ;-)
>
> But how to get their? The threat I see with continuing to use the platform
> default encoding is that people will be left unaware of the encoding issue
> because platform default encoding works just nicely most of time.


For projects involving developers from different countries (i.e. the
developers use different default encodings from one another), it's a must
for everyone in the team / project to understand that his/her default
encoding is not the "default" for others. E.g. I'm from China, I've created
a Maven project, using my default encoding GBK, and then shared it with
you, Benjamin; then how would you collaborate with me? Surely you cannot
assume the encoding to be iso-8859-xx (your system default, excuse me if I'm
wrong).
Then there are two solutions IMO:
1). we set GBK as source file encoding in pom.xml
2). we don't change pom.xml, but we both use an imaginary-maven-fork which
treats every file as encoded in GBK; this would not be platform-dependent,
as in option b)

Will you agree with solution 2)? Even if there are 99 developers from China
while only one of you from Germany :P

So, I insist on option a), and if it's problematic without explicit encoding,
it means an explicit encoding is required in the POM.

And I also insist that it's important for developers to understand the root
cause of the inconsistent build results generated by developers from
different countries / regions.
Such developers should understand Unicode, and different encodings, and how
the platform default encoding affects the build result.

Thanks
Roger


Re: [POLL] Default Value for File Encoding

Posted by Rainer Pruy <Ra...@Acrys.COM>.

Benjamin Bentmann schrieb:
> Rainer Pruy wrote:
> 
>> I'm still not convinced that we will get there by trading one problematic
>> default for another.
> 
> I am not saying that this is the ultimate solution. I only believe it's a
> compromise and improvement until we can introduce a new POM version in
> Maven 2.1, comparable to the Maven 2.0.9 Super POM locking down some
> plugin versions.
> 
>> As stated already, one way is creating and improving awareness, e.g. by
>> flagging any problematic access to a file or better stop working (for
>> "new" projects) if encoding is not stated explicitly.
> 
> Once it's time to discuss the POM 4.1, we can surely come back to this and
> consider if the encoding setting should have a default value or simply be
> required by the user.
> 
> Alternatively, we could right now for Maven 2.0.x make plugins declare
> their encoding parameter to be @required. This will definitively halt the
> build in case the user did not specify an encoding. With regard to
> awareness, that would surely be the cleanest solution. Is that where you
> would see Maven go?

Yes, I do consider this a cleaner solution, and one with a much more
positive effect than changing some defaults.

> 
>> Sigh, I'm a bit idealistic, I know....
> 
> Never mind, if you can accept me being a little of a radical ;-)
> 
Welcome to the club...

> 
> Benjamin
> 
> 

-- 
Rainer Pruy
Geschäftsführer

Acrys Consult GmbH & Co. KG
Untermainkai 29-30, D-60329 Frankfurt
Tel: +49-69-244506-0 - Fax: +49-69-244506-50
Web: http://www.acrys.com -  Email: office@acrys.com
Handelsregister: Frankfurt am Main, HRA 31151



Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
Rainer Pruy wrote:

> I'm still not convinced that we will get there by trading one problematic
> default for another.

I am not saying that this is the ultimate solution. I only believe it's a
compromise and improvement until we can introduce a new POM version in Maven
2.1, comparable to the Maven 2.0.9 Super POM locking down some plugin
versions.

> As stated already, one way is creating and improving awareness, e.g. by
> flagging any problematic access to a file or better stop working (for
> "new" projects) if encoding is not stated explicitly.

Once it's time to discuss the POM 4.1, we can surely come back to this and
consider if the encoding setting should have a default value or simply be
required by the user.

Alternatively, we could right now for Maven 2.0.x make plugins declare their
encoding parameter to be @required. This will definitively halt the build in
case the user did not specify an encoding. With regard to awareness, that
would surely be the cleanest solution. Is that where you would see Maven
go?
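
As a rough sketch of what "halt the build" would mean for the user, the
following Java snippet mimics the effect of a required encoding parameter. It
deliberately does not use the Maven plugin API: RequiredEncoding, resolve()
and the error message are hypothetical and only illustrate the behavior.

```java
import java.nio.charset.Charset;

// Hypothetical sketch: RequiredEncoding is not a Maven API. It only mimics
// the effect a required plugin parameter would have on the build.
class RequiredEncoding {

    // Return the configured charset, or fail loudly if none was configured.
    // There is intentionally no fallback to any default.
    static Charset resolve(String configuredEncoding) {
        if (configuredEncoding == null || configuredEncoding.trim().isEmpty()) {
            throw new IllegalStateException(
                    "No source file encoding configured, please specify one in the POM.");
        }
        return Charset.forName(configuredEncoding.trim());
    }

    public static void main(String[] args) {
        System.out.println("Configured: " + resolve("UTF-8"));
        try {
            resolve(null);
        } catch (IllegalStateException e) {
            System.out.println("Build halted: " + e.getMessage());
        }
    }
}
```

The design choice here is that a missing setting is an error rather than a
silent guess, so nobody stays unaware of the encoding issue.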

> Sigh, I'm a bit idealistic, I know....

Never mind, if you can accept me being a little of a radical ;-)


Benjamin




Re: [POLL] Default Value for File Encoding

Posted by Rainer Pruy <Ra...@Acrys.COM>.

Benjamin Bentmann schrieb:
> Rainer Pruy wrote:
> 
>> But isn't this more an argument for "get used to explicitly state
>> encoding" than for
>> "a maven wide default is better than a platform wide default"?
> 
> I agree, having users explicitly state the encoding in their POMs is the
> best we can have, the same applies to locking down plugin versions by
> the way. No guessing, no implicit default values, just full control,
> let's call it "heaven" ;-)
> 
> But how to get there? The threat I see with continuing to use the
> platform default encoding is that people will be left unaware of the
> encoding issue because platform default encoding works just nicely most
> of the time.

I'm still not convinced that we will get there by trading one problematic
default for another.
As stated already, one way is creating and improving awareness, e.g. by
flagging any problematic access to a file or better stop working (for
"new" projects) if encoding is not stated explicitly.

Sigh, I'm a bit idealistic, I know....


> 
>> Or just a warning for not to expect "whole world is just using your
>> preferred encoding"?
> 
> Yes, a nice warning is surely due if a) wins.
> 
> 
> Benjamin
> 




Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
Rainer Pruy wrote:

> But isn't this more an argument for "get used to explicitly state 
> encoding" than for
> "a maven wide default is better than a platform wide default"?

I agree, having users explicitly state the encoding in their POMs is the 
best we can have, the same applies to locking down plugin versions by the 
way. No guessing, no implicit default values, just full control, let's call 
it "heaven" ;-)

But how to get there? The threat I see with continuing to use the platform 
default encoding is that people will be left unaware of the encoding issue 
because platform default encoding works just nicely most of the time.

> Or just a warning for not to expect "whole world is just using your 
> preferred encoding"?

Yes, a nice warning is surely due if a) wins.
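
Such a warning could look roughly like the following Java sketch. Everything
in it is an assumption made for illustration: EncodingWithWarning is a
hypothetical class, the warning text is invented, and the list of warnings
stands in for whatever build log a plugin would really write to.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of option a) plus a warning; not taken from any plugin.
class EncodingWithWarning {

    // Collected warnings. A real plugin would presumably use the build log.
    static final List<String> warnings = new ArrayList<>();

    // Use the configured encoding if present; otherwise fall back to the
    // platform default and record a warning about it.
    static Charset resolve(String configuredEncoding) {
        if (configuredEncoding != null && !configuredEncoding.isEmpty()) {
            return Charset.forName(configuredEncoding);
        }
        Charset platform = Charset.defaultCharset();
        warnings.add("No encoding configured, using platform encoding "
                + platform.name() + ", i.e. the build is platform dependent.");
        return platform;
    }

    public static void main(String[] args) {
        resolve("UTF-8"); // explicit setting: no warning
        resolve(null);    // nothing configured: warning recorded
        for (String warning : warnings) {
            System.out.println("[WARNING] " + warning);
        }
    }
}
```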


Benjamin 




Re: [POLL] Default Value for File Encoding

Posted by Rainer Pruy <Ra...@Acrys.COM>.

Benjamin Bentmann schrieb:
> Roger Ye wrote:
> 
>> e.g., in Linux, if LC_ALL=en_US.UTF-16 has been set,
>> one will be very confused in case of option [b], when maven uses another
>> encoding such as utf-8
> 
> Confusion, that is exactly my point. If one of your co-workers has
> "LC_ALL" set to a different value, won't he be confused why the build is
> failing for him when you just say "works for me"? The same POM should
> deliver the same build output, that's just what I consider of "highest
> weight".

But isn't this more an argument for "get used to explicitly state encoding"
than for "a maven wide default is better than a platform wide default"?
Or just for a warning not to expect "whole world is just using your
preferred encoding"?

> 
>> and always respecting platform default encoding is the correct way to
>> make
>> an application encoding-transparent
> 
> I feel I misunderstand you. From your description, I imagine a world
> where text editors don't bother to ask users for an encoding but simply
> always use platform default encoding. In such a world, I wonder how
> people would collaboratively work on the same sources.
> 
>> so the application developer don't need to worry about converting
>> back-n-forth between several encodings/charsets,
> 
> Considering the internet and its wonderful aspect of bringing people all
> over the world together, I really believe it is time that application
> developers *do* worry about encoding and converting file contents to
> pull down the walls that our different locales or OS impose.

Fully agreed!
But the discussion is about implied defaults, not about evangelizing
explicit encoding declarations.
People cooperating across different encoding worlds are usually already
quite aware of those problems and used to tackling them.
Defining a default Maven encoding brings this problem to solitary users who
just happen to live in a different encoding world than the Maven default...

> 
> Imagine two open-source projects, one using UTF-8 and the other Big5.
> How would people participate on these projects (using the same machine)
> if we expected applications to always stick to one system-wide encoding
> setting?
> 
>> IMO, e.g.,  networking related applications, have to deal with encoding,
>> this is by nature, since network is used to connect
>> people from different places.
> 
> Let's remember that Maven is just sitting next to a "networking related
> application", i.e. source control management.
> 
> 
> Benjamin
> 

-- 
Rainer Pruy
Geschäftsführer

Acrys Consult GmbH & Co. KG
Untermainkai 29-30, D-60329 Frankfurt
Tel: +49-69-244506-0 - Fax: +49-69-244506-50
Web: http://www.acrys.com -  Email: office@acrys.com
Handelsregister: Frankfurt am Main, HRA 31151



Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
Roger Ye wrote:

> e.g., in Linux, if LC_ALL=en_US.UTF-16 has been set,
> one will be very confused in case of option [b], when maven uses another
> encoding such as utf-8

Confusion, that is exactly my point. If one of your co-workers has "LC_ALL" 
set to a different value, won't he be confused why the build is failing for 
him when you just say "works for me"? The same POM should deliver the same 
build output, that's just what I consider of "highest weight".

> and always respecting platform default encoding is the correct way to make
> an application encoding-transparent

I feel I misunderstand you. From your description, I imagine a world where 
text editors don't bother to ask users for an encoding but simply always use 
platform default encoding. In such a world, I wonder how people would 
collaboratively work on the same sources.

> so the application developer don't need to worry about converting
> back-n-forth between several encodings/charsets,

Considering the internet and its wonderful aspect of bringing people all 
over the world together, I really believe it is time that application 
developers *do* worry about encoding and converting file contents to pull 
down the walls that our different locales or OS impose.

Imagine two open-source projects, one using UTF-8 and the other Big5. How 
would people participate on these projects (using the same machine) if we 
expected applications to always stick to one system-wide encoding setting?

> IMO, e.g.,  networking related applications, have to deal with encoding,
> this is by nature, since network is used to connect
> people from different places.

Let's remember that Maven is just sitting next to a "networking related 
application", i.e. source control management.


Benjamin 




Re: [POLL] Default Value for File Encoding

Posted by Roger Ye <ro...@gmail.com>.
definitely option [a]

respecting platform default encoding is the convention with the highest
weight,
and option [b] simply breaks this convention by not respecting platform
default encoding.

e.g., in Linux, if LC_ALL=en_US.UTF-16 has been set,
one will be very confused in case of option [b], when maven uses another
encoding such as utf-8
this is just an example and may not be the actual case, surely I know utf-8
is a good thing

furthermore, if a lot of applications behave like option [b], but
unfortunately they use inconsistent default encoding,
then you know what a hell is.

and always respecting platform default encoding is the correct way to make
an application encoding-transparent
so the application developer don't need to worry about converting
back-n-forth between several encodings/charsets,
given the context of a standalone system as a sandbox.

IMO, e.g.,  networking related applications, have to deal with encoding,
this is by nature, since network is used to connect
people from different places.

If the developer of a multi-encoding application doesn't understand what
encoding is, he/she should learn it; you just
cannot treat a multi-encoding application as a single-encoding
application.

An encoding, to a text parser application, is like the language spoken to
the audience; you just cannot assume
that the speaker of any lecture always uses English.

It's easy for people to know that there are so many languages spoken in the
world; it's just one more step
to understand that "different" text parsers also read different encodings.



Re: [POLL] Default Value for File Encoding

Posted by Jesse McConnell <jm...@apache.org>.
for maven 2.0.x i would go +1 for option a

for maven 2.1 I would go +1 for option b with my caveat being a proper
element of the pom and not shoved into the properties.

jesse

-- 
jesse mcconnell
jesse.mcconnell@gmail.com

Re: [POLL] Default Value for File Encoding

Posted by Christian Kölle <ch...@switch.ch>.
Benjamin Bentmann wrote:
> Marat Radchenko wrote:
>> And let it be UTF-8.
> 
> Until a flood of users pushes in the direction of UTF-8,
> which is surely the more international/nicer choice, I believe we're better
> off sticking with Latin-1 and keeping consistency among the plugins.
> 

OK, start the flood. I would also recommend UTF-8. Latin-1 would only be
slightly better than ASCII.

Christian

Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
Marat Radchenko wrote:
> And let it be UTF-8.

Well, that's another story ;-)

The problem is that we already have two plugins out (Site and Javadoc) that
employ Latin-1 as the default value. Either we break them to use UTF-8 as
well, or we leave those two as exceptions to the rest of the plugins. Neither
way is ideal. Until a flood of users pushes in the direction of UTF-8,
which is surely the more international/nicer choice, I believe we're better
off sticking with Latin-1 and keeping consistency among the plugins.


Benjamin


Re: [POLL] Default Value for File Encoding

Posted by Marat Radchenko <sl...@gmail.com>.
+1 for b. And let it be UTF-8.

Re: [POLL] Default Value for File Encoding

Posted by nicolas de loof <ni...@apache.org>.
+1 for [b]

Many novice developers don't even know what character encoding is. I have had
to explain many times why the same application, compiled on a Unix server, did
not generate the same result for some text files with French characters.

Backward compatibility is nice, but this doesn't mean users don't have to read
the release notes to see deprecations, warnings and upgrade notices!

Nico


Re: [POLL] Default Value for File Encoding

Posted by Christian Kölle <ch...@switch.ch>.
Benjamin Bentmann wrote:
> 
> a) Use the current platform encoding, aka the system property
>   "file.encoding".
> 
> b) Use a static/fixed value that is defined by convention, i.e. is not
>   platform-dependent.
> 

Hi

I vote for b). The different file encodings on different environments
are a mess.

Christian

Re: [POLL] Default Value for File Encoding

Posted by Rainer Pruy <Ra...@Acrys.COM>.

Wayne Fay wrote:
> Correct me if I'm wrong, but old projects using old Maven builds will
> not be affected by this. So we eliminate those from the discussion.
> 
> Old projects moving to new Maven builds will need to add a single
> <property> in their pom, and then everything compiles fine etc. I
> consider this "maintenance" and based on my experience with moving
> across versions, I'd be very surprised if this was the only thing they
> needed to change in their pom (very few people lock down plugin
> versions, and new plugins sometimes require changes to the pom). If
> they simply kept using the old Maven build they originally built their
> project with, they wouldn't need to do this.
> 
> New projects using the new Maven builds will either use the default
> that we are discussing (I voted for b) or declare their own default
> with a single <property> in their pom.
> 
> Which of the above cases are you most concerned about??
> 

None - or both if you like.

I just feel there is no real argument for changing the current default encoding assumption.

Case one (old projects using old Maven versions) is not affected, yes.
Case two (old projects with new Maven versions) might get a hint that something needs to be fixed if the build actually happens to break.
But otherwise nothing will be improved. Thus we will need a different mechanism anyway.

The same goes for case three (new projects with new Maven): if one happens to get into trouble, all is fine.
Otherwise there is no incentive for "fixing" any encoding-related problem.

Thus, we need something more effective, but then why change the encoding default in the first place?

For case two

-- 
Rainer Pruy
Geschäftsführer

Acrys Consult GmbH & Co. KG
Untermainkai 29-30, D-60329 Frankfurt
Tel: +49-69-244506-0 - Fax: +49-69-244506-50
Web: http://www.acrys.com -  Email: office@acrys.com
Handelsregister: Frankfurt am Main, HRA 31151

Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
Wayne Fay wrote:

> Correct me if I'm wrong, but old projects using old Maven builds will
> not be affected by this. So we eliminate those from the discussion.

It's right that old projects are not affected as long as we assume they
have locked down their plugin versions. The change we discuss is bound to a
specific plugin version, so updating a source processing plugin (say from
maven-compiler 2.0.1 to maven-compiler 2.1) would require watching out for
the encoding change. Projects that didn't lock down their plugin versions
are naturally affected by this change, just like by any other change to the
plugins they use.

> New projects using the new Maven builds will either use the default
> that we are discussing (I voted for b) or declare their own default
> with a single <property> in their pom.

That's it. And if people need to have different encodings for different 
plugins, they still have the freedom to configure the plugins individually, 
too.


Benjamin 


Re: [POLL] Default Value for File Encoding

Posted by Wayne Fay <wa...@gmail.com>.
Correct me if I'm wrong, but old projects using old Maven builds will
not be affected by this. So we eliminate those from the discussion.

Old projects moving to new Maven builds will need to add a single
<property> in their pom, and then everything compiles fine etc. I
consider this "maintenance" and based on my experience with moving
across versions, I'd be very surprised if this was the only thing they
needed to change in their pom (very few people lock down plugin
versions, and new plugins sometimes require changes to the pom). If
they simply kept using the old Maven build they originally built their
project with, they wouldn't need to do this.

New projects using the new Maven builds will either use the default
that we are discussing (I voted for b) or declare their own default
with a single <property> in their pom.

Which of the above cases are you most concerned about??

Wayne

On 4/29/08, Rainer Pruy <Ra...@acrys.com> wrote:
>
>
> Wayne Fay wrote:
> > My vote is [b]. Consistent builds are the very foundation upon which we operate.
> >
>
> (Sorry Wayne it is not personal, I just came across that thought while reading your post.....)
>
> Putting up a default behaviour that deviates from current default, will not bring consistent builds for those projects.
> Most likely the files are not compatible with the new implied default.
>
> So the only intention can be ensuring consistent builds for any *future* project (version).
> Thus flagging encoding problems will improve awareness and will surely contribute more to consistent builds than "changing the rules"
> of the game...
> Rainer

Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
Rainer Pruy wrote:

> If we are already being impolite, why not in a way that will bring a major
> improvement to the situation by forcing users to state the encoding in any
> case

Yes, the more we talk about it, the more this becomes my personal favorite. I
guess a default value as originally proposed is only of value if it works for
a majority, i.e. "convention over configuration" only works if one has a
reasonable convention. However, Latin-1 is admittedly not international
enough to serve as one; UTF-8 might have been (but again, if a third of the
world uses Big5, GBK etc., that's questionable, too).

Requiring an explicit encoding in all cases would have been the most
consistent approach because it would have broken for everybody and as such
would have taught everybody to specify the encoding. It would have hurt once
but then never again.

Maybe in the next century.


Benjamin


Re: [POLL] Default Value for File Encoding

Posted by Rainer Pruy <Ra...@Acrys.COM>.

Benjamin Bentmann wrote:
> Rainer Pruy wrote:
> 
>> Putting up a default behaviour that deviates from current default,
>> will not bring consistent builds for those projects.
> 
> I would like to argue the opposite: If we consider a project whose POM
> does not explicitly specify file encodings for the plugins in use, each
> developer will implicitly use his platform default encoding during the
> build. Further assume that the platform default encoding among the
> project team differs (for whatever reason). This potentially causes the
> build output for developer A and developer B to differ although they are
> - building from the same POM
> - using the same Maven version
> - using the same plugin versions

Yes, I see your argument.
Nevertheless, there are large areas where a "breaking" build does not imply receiving some kind of error message.
I'd assume there are numerous cases where "breaking" just means strange results somewhere in an application.
This will not be improved by changing the default encoding; it will just happen to "break" in a different way.
So why not leave the bad situation as is and avoid making it worse by adding the chance that some builds will exhibit breaks while
still in "uncritical" environments? (Causing some improvements for the price of being "impolite", as you put it below.)

If we are already being impolite, why not in a way that will bring a major improvement to the situation, by forcing users to state the encoding in
any case while keeping current problems on current project settings?
That fixes some "by default" while breaking others (causing them to get fixed).

The same effect is achieved if *any* build flags bad usage of encoding (aka missing encoding declarations), building up some pressure on
people providing projects publicly.

No changes for "old" ones -- consistent improvement for anything else....

> 
> In contrast, if the unspecified file encoding defaulted to a
> platform-independent value defined by a Maven convention, the build will
> a) either work for both developers or
> b) work for none of them
> in both cases, they observe the same build output.
> 
> I mean, the major aspect of the Maven default encoding being Latin-1
> instead of UTF-8 or whatever people's platform encoding is, is that this
> value is platform-independent and as such applies to the entire team
> (unless they override it).
> 
>> Most likely the files are not compatible with the new implied default.
> 
> Yes, but you would simply need to fix your POM and are back on the road.
> 
>> Thus flagging encoding problems will improve awareness and will surely
>> contribute more to consistent builds than "changing the rules" of the
>> game...
> 
> If we change the rules such that the build of those people, that are
> currently unaware of the encoding issue and simply assume their platform
> encoding, can break, that's some kind (though not fully reliable) of
> flagging encoding problems, IMHO. Yes, yes, that might not be the most
> polite way of promoting things, but sometimes I feel a little emphasis
> is OK.
> 
> 
> Benjamin

-- 
Rainer Pruy
Geschäftsführer

Acrys Consult GmbH & Co. KG
Untermainkai 29-30, D-60329 Frankfurt
Tel: +49-69-244506-0 - Fax: +49-69-244506-50
Web: http://www.acrys.com -  Email: office@acrys.com
Handelsregister: Frankfurt am Main, HRA 31151

Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
Rainer Pruy wrote:

> Putting up a default behaviour that deviates from current default, will 
> not bring consistent builds for those projects.

I would like to argue the opposite: If we consider a project whose POM does 
not explicitly specify file encodings for the plugins in use, each developer 
will implicitly use his platform default encoding during the build. Further 
assume that the platform default encoding among the project team differs 
(for whatever reason). This potentially causes the build output for 
developer A and developer B to differ although they are
- building from the same POM
- using the same Maven version
- using the same plugin versions

In contrast, if the unspecified file encoding defaulted to a 
platform-independent value defined by a Maven convention, the build will
a) either work for both developers or
b) work for none of them
in both cases, they observe the same build output.

I mean, the major aspect of the Maven default encoding being Latin-1 instead
of UTF-8 or whatever people's platform encoding is, is that this value is
platform-independent and as such applies to the entire team (unless they
override it).
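
To make the developer A / developer B scenario concrete, here is a minimal
Java sketch. It is only an illustration under stated assumptions: the class
and method names are made up, and it does not reproduce the code of any Maven
plugin. It decodes the same file content once as UTF-8 and once as Latin-1,
the way two machines with different platform defaults would.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PlatformEncodingDemo {

    // Hypothetical helper: decode raw file content using whatever charset
    // the build happens to assume on this machine.
    static String read(byte[] fileContent, Charset assumed) {
        return new String(fileContent, assumed);
    }

    public static void main(String[] args) {
        // A file containing the single character U+00E9 (e with acute accent),
        // saved as UTF-8. The \u escape keeps this demo independent of the
        // encoding javac assumes for its own source files.
        byte[] fileContent = "\u00e9".getBytes(StandardCharsets.UTF_8);

        // Developer A: platform default is UTF-8.
        String seenByA = read(fileContent, StandardCharsets.UTF_8);
        // Developer B: platform default is Latin-1.
        String seenByB = read(fileContent, StandardCharsets.ISO_8859_1);

        System.out.println("A sees " + seenByA.length() + " character(s)");
        System.out.println("B sees " + seenByB.length() + " character(s)");
        System.out.println("same text? " + seenByA.equals(seenByB));
    }
}
```

With a fixed convention, both developers would pass the same charset to the
decoding step and would therefore either both get the right text or both get
the same wrong text.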

> Most likely the files are not compatible with the new implied default.

Yes, but you would simply need to fix your POM and are back on the road.

> Thus flagging encoding problems will improve awareness and will surely
> contribute more to consistent builds than "changing the rules" of the
> game...

If we change the rules such that the build of those people, that are 
currently unaware of the encoding issue and simply assume their platform 
encoding, can break, that's some kind (though not fully reliable) of 
flagging encoding problems, IMHO. Yes, yes, that might not be the most 
polite way of promoting things, but sometimes I feel a little emphasis is 
OK.


Benjamin


Re: [POLL] Default Value for File Encoding

Posted by Rainer Pruy <Ra...@Acrys.COM>.

Wayne Fay wrote:
> My vote is [b]. Consistent builds are the very foundation upon which we operate.
> 

(Sorry Wayne it is not personal, I just came across that thought while reading your post.....)

Introducing a default behaviour that deviates from the current default will not bring consistent builds for those projects.
Most likely the files are not compatible with the new implied default.

So the only intention can be ensuring consistent builds for any *future* project (version).
Thus flagging encoding problems will improve awareness and will surely contribute more to consistent builds than "changing the rules"
of the game...
Rainer

Re: [POLL] Default Value for File Encoding

Posted by Wayne Fay <wa...@gmail.com>.
My vote is [b]. Consistent builds are the very foundation upon which we operate.

Wayne

Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
Brian E. Fox wrote:

> Can you outline in what cases and in what ways this change could break
> existing builds

Surely. About the cases that might suffer from the change: We propose to use
Latin-1 as the default encoding in case the user did not specify it. So
first up, everybody who already explicitly declares an encoding will not
notice the change, i.e. if your POM looks like

  <plugin>
    <artifactId>maven-compiler-plugin</artifactId>
    <configuration>
      <encoding>big5</encoding>
      ...
    </configuration>
  </plugin>

the build will work just as before (using big5) when you switch to the newer
plugin version that incorporates our proposal.

In contrast, the build will likely break if you effectively use an encoding
other than Latin-1 or ASCII (ASCII is just a subset of Latin-1) but did not
declare this in the configuration for the various plugins. The prime example
for potentially affected builds seem to be Asian projects that naturally use
the Non-Western encoding of the platforms (compare the comments on our wiki
article).

As for the kind of break: The best case is a plugin that entirely refuses
to work and throws an exception because the file contents it is trying to
process violate the assumed encoding (e.g. Latin-1 byte sequences are in
general not valid UTF-8 byte sequences). Why do I call this build failure a
best case? Because it tells you straight out that the desired encoding needs
to be declared in the POM. The other case is a plugin that works but silently
outputs garbage. This is more subtle because it requires human review to
detect. That's easy if you know where to look (non-ASCII characters) but
again requires a user who is aware of the issue.
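
The two kinds of break can be reproduced with plain JDK classes. The
following is a rough sketch, not the behavior of any particular plugin; the
class and method names are invented for illustration. A strict decoder
reports malformed input (the "best case"), while the convenient
new String(bytes, charset) call silently substitutes replacement characters.
Note also that decoding as Latin-1 never fails, since every byte value maps
to some Latin-1 character.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictDecodeDemo {

    // Strict check: returns false if the bytes violate the given charset.
    static boolean isValid(byte[] bytes, Charset charset) {
        try {
            charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Lenient decoding: bad input is replaced with U+FFFD without any error.
    static String decodeLenient(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String text = "Stra\u00dfe"; // contains one non-ASCII character
        byte[] latin1Bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);

        // Loud failure: Latin-1 bytes are rejected by a strict UTF-8 decoder.
        System.out.println("Latin-1 bytes valid as UTF-8? "
                + isValid(latin1Bytes, StandardCharsets.UTF_8));

        // Silent garbage: the lenient decoder carries on regardless.
        String garbled = decodeLenient(latin1Bytes, StandardCharsets.UTF_8);
        System.out.println("lenient result equals original? " + garbled.equals(text));

        // Latin-1 accepts anything, so UTF-8 bytes read as Latin-1 never fail.
        System.out.println("UTF-8 bytes valid as Latin-1? "
                + isValid(utf8Bytes, StandardCharsets.ISO_8859_1));
    }
}
```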

> and what it would take for the user to fix?

In one line: State the encoding you want to use in the POM.

The POM is our means to configure a build. If its default values don't fit
your need, you can always go ahead and explicitly add the configuration
element.

When we consider the state as is, i.e. the release versions of the plugins
and Maven, that means configuring each and every plugin separately. Once we
have the plugin versions released that follow our proposal and adhere to the
convention of evaluating the POM property "${project.build.sourceEncoding}",
this configuration can in most cases be reduced to adding

  <properties>
    <project.build.sourceEncoding>...</project.build.sourceEncoding>
  </properties>

>  Could a tool be created to correct it automatically?

I believe the answer is "no". This is basically related to the discussion we
had over on dev@ with Jason regarding the usage of JChardet [0]. A machine
tool cannot reliably tell what file encoding your sources use (because it
would need to semantically understand text). So this is a human task, but one
that should be easily done.
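
A small Java sketch of why automatic detection is ambiguous (this is not how
JChardet works; the class name, method name and candidate list are
assumptions picked for illustration): the same two bytes pass strict
validation in several encodings, so validity alone cannot tell which one the
author meant.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class EncodingGuessDemo {

    // Candidate list chosen for this demo only.
    static final Charset[] CANDIDATES = {
        StandardCharsets.UTF_8,
        StandardCharsets.ISO_8859_1,
        StandardCharsets.UTF_16BE
    };

    // Returns the names of all candidates under which the bytes decode
    // without any malformed-input error.
    static List<String> plausibleCharsets(byte[] bytes) {
        List<String> result = new ArrayList<>();
        for (Charset candidate : CANDIDATES) {
            try {
                candidate.newDecoder()
                        .onMalformedInput(CodingErrorAction.REPORT)
                        .onUnmappableCharacter(CodingErrorAction.REPORT)
                        .decode(ByteBuffer.wrap(bytes));
                result.add(candidate.name());
            } catch (CharacterCodingException e) {
                // Not valid in this charset; skip it.
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // 0xC3 0xA9 is U+00E9 in UTF-8, two characters in Latin-1,
        // and a single Hangul syllable (U+C3A9) in UTF-16BE.
        byte[] bytes = { (byte) 0xC3, (byte) 0xA9 };
        System.out.println("plausible encodings: " + plausibleCharsets(bytes));
    }
}
```

Picking the right candidate requires knowing which of the decoded texts
makes sense, which is the semantic understanding mentioned above.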


Benjamin


[0]
http://www.nabble.com/-VOTE--POM-Element-for-Source-File-Encoding-to16515820s177.html


RE: [POLL] Default Value for File Encoding

Posted by "Brian E. Fox" <br...@reply.infinity.nu>.
Benjamin,
Can you outline in what cases and in what ways this change could break
existing builds, and what it would take for the user to fix? Could a
tool be created to correct it automatically?

--Brian

Re: [POLL] Default Value for File Encoding

Posted by Roger Ye <ro...@gmail.com>.
Hi Sherali,

On 4/29/08, Sherali Karimov <sh...@karimov.org> wrote:
>
> +1 for the option b.
> We had our share of builds behaving differently from OS to OS and from
> region to region. :(


Excuse me, but I think this is your fault.

This is exactly the case where you should use an explicit encoding.

Just as in a multinational meeting, you should agree on a common language
such as English, or you'll have a mess instead of a meeting :P

Re: [POLL] Default Value for File Encoding

Posted by Sherali Karimov <sh...@karimov.org>.
+1 for the option b.
We had our share of builds behaving differently from OS to OS and  
from region to region. :(

cheers,
sherali

Re: [POLL] Default Value for File Encoding

Posted by Paul Benedict <pb...@apache.org>.
I definitely vote for A, but I see the votes for B as valid as well. A
is basically "today's choice" since the default today is the platform's
encoding. If people want to override the default and forget about it, that
tells me the setting belongs in a corporate POM, which implies A again.

Paul

Re: [POLL] Default Value for File Encoding

Posted by Felix Knecht <fe...@apache.org>.
> b) Use a static/fixed value that is defined by convention, i.e. is not
>   platform-dependent.
>
+1
Starting a new Maven project without being aware of this thread or the
encoding problem, I (speaking as a Maven user) will surely not set an
encoding and will rely on the 'default' one. Doing so may cause trouble
when sharing the project across different OSes, countries, ...

Regards
Felix




Re: [POLL] Default Value for File Encoding

Posted by Hervé BOUTEMY <he...@free.fr>.
+1 for a)
with a warning like "[WARN] using detected local platform encoding 'xxx'. To 
ensure build reproducibility, consider adding project.build.sourceEncoding 
property to your pom"

This won't break existing builds for users who don't even know their
encoding, but will help them make the right choice: explicitly declare the
encoding in their pom.
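
A minimal sketch of the fallback-plus-warning behaviour suggested above. It
assumes a hypothetical helper class (SourceEncodingResolver and its Result
type are made up for illustration and are not part of Maven or any plugin
API); only the property name project.build.sourceEncoding and the warning
text come from this message.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not a Maven API: decides which encoding a plugin uses.
final class SourceEncodingResolver {

    /** The resolved charset plus an optional warning for the build log. */
    static final class Result {
        final Charset charset;
        final String warning; // null when the encoding was configured explicitly

        Result(Charset charset, String warning) {
            this.charset = charset;
            this.warning = warning;
        }
    }

    /**
     * @param configuredEncoding value of project.build.sourceEncoding,
     *                           or null/blank if the POM does not set it
     */
    static Result resolve(String configuredEncoding) {
        if (configuredEncoding != null && !configuredEncoding.trim().isEmpty()) {
            // Explicit configuration wins and produces no warning.
            return new Result(Charset.forName(configuredEncoding.trim()), null);
        }
        // Approach a): fall back to the platform encoding, but say so loudly.
        Charset platform = Charset.defaultCharset();
        String warning = "[WARN] using detected local platform encoding '"
                + platform.name()
                + "'. To ensure build reproducibility, consider adding "
                + "project.build.sourceEncoding property to your pom";
        return new Result(platform, warning);
    }

    public static void main(String[] args) {
        Result fallback = resolve(null);
        if (fallback.warning != null) {
            System.out.println(fallback.warning);
        }
        System.out.println(resolve("UTF-8").charset.name());
    }
}
```

With this shape, existing builds keep their current behaviour, and the
warning disappears as soon as the POM states the encoding.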

Hervé

Le mardi 29 avril 2008, Benjamin Bentmann a écrit :
> Dear community,
>
> the Maven team is currently discussing a proposal about the future handling
> of source file encoding by the various plugins, please see our wiki article
> [0] for all details.
>
> A controversial aspect of this proposal is which file encoding should be
> assumed in case the user did not specify this in the POM. This poll should
> help us to come to a well-founded decision.
>
> These are the two possible directions to go:
>
> a) Use the current platform encoding, aka the system property
>    "file.encoding".
>
> b) Use a static/fixed value that is defined by convention, i.e. is not
>    platform-dependent.
>
> Approach a) matches the current behavior of most plugins and is as such
> backwards-compatible. Approach b) on the other hand can potentially break
> builds when users update to a newer version of an affected plugin if:
> - the build relies on an encoding other than ASCII/Latin-1 and
> - this encoding is not explicitly stated in the plugin configuration
>
> The reason why b) was suggested is its positive effect on build
> reproducibility: Unlike approach a), a build will out-of-the-box deliver
> the same output for all team members regardless of their OS or locale. It
> is now to balance if this improvement is worth the potential breaks as
> illustrated above.
>
> So, please let us know:
>
> [a] Use platform default encoding, keep backward-compat
> [b] Use fixed default encoding, be platform-independent
>
> Regards,
>
>
> Benjamin Bentmann
>
>
> [0]
> http://docs.codehaus.org/display/MAVENUSER/POM+Element+for+Source+File+Encoding
>
>


Re: [POLL] Default Value for File Encoding

Posted by Thierry Lach <th...@gmail.com>.
+1 to c.

On Tue, Apr 29, 2008 at 8:51 AM, Jochen Wiedmann <jo...@gmail.com>
wrote:

> On Tue, Apr 29, 2008 at 1:23 PM, Benjamin Bentmann
> <be...@udo.edu> wrote:
>
> ---clip---



> I'd opt for
>
>    c) Use a configurable value, by default the current platform encoding.
>
> Should be
>
>  * Upwards compatible
>  * Simplify the use of Maven for people who don't need to care for that
> value.
>    (Most development teams have uniform development platforms, or at least
>    uniform default encodings.)
>  * Make reproducable builds possible for the rest.
>
>
>
>
> --
> Look, that's why there's rules, understand? So that you think before
> you break 'em.
>
>  -- (Terry Pratchett, Thief of Time)
>
>
>

Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
Jochen Wiedmann wrote:

> I'd opt for
>
>    c) Use a configurable value, by default the current platform encoding.

To my understanding, that's nothing more than variant a). Of course, we are 
talking about a configurable value. Locking down plugins to any kind of 
encoding without having a chance of customization would be a design flaw par 
excellence.

Some day in the future, each and every plugin should offer a configuration 
parameter to control the encoding for its input/output files. So that is the 
finest grained control with regard to configuration. Next up, we are 
planning on a central POM property/element where users specify the file 
encoding for all their plugins. The already mentioned wiki article outlines 
this in more detail.

This thread is only about the situation in which a user did *not* configure 
the encoding but expects the build to use some default value.  Taking this 
default from the platform or from an established convention is the remaining 
question.


Benjamin




Re: [POLL] Default Value for File Encoding

Posted by Jochen Wiedmann <jo...@gmail.com>.
On Tue, Apr 29, 2008 at 1:23 PM, Benjamin Bentmann
<be...@udo.edu> wrote:

>  These are the two possible directions to go:
>
>  a) Use the current platform encoding, aka the system property
>   "file.encoding".
>
>  b) Use a static/fixed value that is defined by convention, i.e. is not
>   platform-dependent.

I'd opt for

    c) Use a configurable value, by default the current platform encoding.

Should be

  * Upwards compatible
  * Simplify the use of Maven for people who don't need to care for that value.
    (Most development teams have uniform development platforms, or at least
    uniform default encodings.)
  * Make reproducible builds possible for the rest.




-- 
Look, that's why there's rules, understand? So that you think before
you break 'em.

 -- (Terry Pratchett, Thief of Time)



Re: [POLL] Default Value for File Encoding

Posted by ruimo <sh...@ruimo.com>.
Hi,

+1 to [a]

There seems to be no point in breaking compatibility.


Benjamin Bentmann wrote:
> 
> Dear community,
> 
> the Maven team is currently discussing a proposal about the future
> handling
> of source file encoding by the various plugins, please see our wiki
> article
> [0] for all details.
> 
> A controversial aspect of this proposal is which file encoding should be
> assumed in case the user did not specify this in the POM. This poll should
> help us to come to a well-founded decision.
> 
> These are the two possible directions to go:
> 
> a) Use the current platform encoding, aka the system property
>    "file.encoding".
> 
> b) Use a static/fixed value that is defined by convention, i.e. is not
>    platform-dependent.
> 
> Approach a) matches the current behavior of most plugins and is as such
> backwards-compatible. Approach b) on the other hand can potentially break
> builds when users update to a newer version of an affected plugin if:
> - the build relies on an encoding other than ASCII/Latin-1 and
> - this encoding is not explicitly stated in the plugin configuration
> 
> The reason why b) was suggested is its positive effect on build
> reproducibility: Unlike approach a), a build will out-of-the-box deliver
> the
> same output for all team members regardless of their OS or locale. It is
> now
> to balance if this improvement is worth the potential breaks as
> illustrated
> above.
> 
> So, please let us know:
> 
> [a] Use platform default encoding, keep backward-compat
> [b] Use fixed default encoding, be platform-independent
> 
> Regards,
> 
> 
> Benjamin Bentmann
> 
> 
> [0]
> http://docs.codehaus.org/display/MAVENUSER/POM+Element+for+Source+File+Encoding
> 
> 
> 
> 
> 



Re: [POLL] Default Value for File Encoding

Posted by Roger Ye <ro...@gmail.com>.
On 4/30/08, Benjamin Bentmann <be...@udo.edu> wrote:
>
> Paolo Compieta wrote:
>
>  - most companies have uniform OS platforms
> >
>
> I am used to scenarios where people work on Unix/Win terminals or their
> Unix/Mac/Win notebooks on their own discretion, creating quite some
> heterogenous development culture. Might be one reason why I quickly had
> locked down all encoding settings in our corporate POM...
>

I love Linux, where I can globally set UTF-8 as my default encoding.
I hate Windows, where I cannot do that, so for me only GB2312 is the
"most appropriate" choice.

Maybe I'm just not Windows-savvy, and I hope someone can tell me how to do that :P

Roger

Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
Paolo Compieta wrote:

> - most companies have uniform OS platforms

I am used to scenarios where people work on Unix/Win terminals or their
Unix/Mac/Win notebooks at their own discretion, creating quite a
heterogeneous development culture. That might be one reason why I quickly
locked down all encoding settings in our corporate POM...

> - most editors allow you select a proper charset, but they (usually)
> automatically detect the default ("file.encoding"); it'd be not
> comfortable
> changing every time the charset to a different one only because maven said
> "this is the standard"; i.e., i wouldn't exchange platform-dependence with
> implicit charset-dependence (potential drawbacks on all other kinds of
> editor - java/sql/xml/properties/..)

If the proposed default value matches your platform encoding, you're just
fine. If it doesn't, you would simply configure your POM accordingly (i.e.
configure Maven for your needs and not vice versa), and both you and, in
particular, all your co-workers are fine from then on, too. You don't
advocate editing the same file with different encodings selected in your
editor, do you?

> - big trans-national companies (should!) have centralized and
> well-configured building-machine to be asked for deliverables;

Wouldn't you want to be able to create the same build output on your own dev
machine as the output from these "centralized and well-configured
building-machine"? For that reason, the encoding should be bound to the POM
(which is shared among all participants) rather than to the OS or locale.


Benjamin




Re: [POLL] Default Value for File Encoding

Posted by Paolo Compieta <pa...@gmail.com>.
Hi all,

+1 [a]

with a few considerations (please, correct me if i'm wrong):

- backward compatibility is not a must, but a cost if not assured; runtime
charset errors (other than build-breaking ones) may be hard to detect
- most companies have uniform OS platforms; teams with non-uniform
developing environments have already faced this problem: editing a file with
a different encoding requires some thought.. far before building
- most editors let you select a proper charset, but they (usually)
automatically detect the default ("file.encoding"); it would be inconvenient
to change the charset every time only because Maven said
"this is the standard"; i.e., I wouldn't trade platform dependence for
implicit charset dependence (with potential drawbacks in all other kinds of
editor - java/sql/xml/properties/..)
- big trans-national companies (should!) have centralized and
well-configured building-machine to be asked for deliverables; those
deliverables are surely reproducible and should be deployed to
official/uniform testing and production environments

regards,
Paolo


Benjamin Bentmann wrote:
> 
> Dear community,
> 
> the Maven team is currently discussing a proposal about the future
> handling
> of source file encoding by the various plugins, please see our wiki
> article
> [0] for all details.
> 
> A controversial aspect of this proposal is which file encoding should be
> assumed in case the user did not specify this in the POM. This poll should
> help us to come to a well-founded decision.
> 
> These are the two possible directions to go:
> 
> a) Use the current platform encoding, aka the system property
>    "file.encoding".
> 
> b) Use a static/fixed value that is defined by convention, i.e. is not
>    platform-dependent.
> 
> Approach a) matches the current behavior of most plugins and is as such
> backwards-compatible. Approach b) on the other hand can potentially break
> builds when users update to a newer version of an affected plugin if:
> - the build relies on an encoding other than ASCII/Latin-1 and
> - this encoding is not explicitly stated in the plugin configuration
> 
> The reason why b) was suggested is its positive effect on build
> reproducibility: Unlike approach a), a build will out-of-the-box deliver
> the
> same output for all team members regardless of their OS or locale. It is
> now
> to balance if this improvement is worth the potential breaks as
> illustrated
> above.
> 
> So, please let us know:
> 
> [a] Use platform default encoding, keep backward-compat
> [b] Use fixed default encoding, be platform-independent
> 
> Regards,
> 
> 
> Benjamin Bentmann
> 
> 
> [0]
> http://docs.codehaus.org/display/MAVENUSER/POM+Element+for+Source+File+Encoding
> 
> 
> 
> 
> 



Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
Milos Kleint wrote:
> Just a note, both solution allow one to have a reproducible builds if
> one cares.

Absolutely. Just to clarify further: this poll is not about whether builds
can be reproducible. Setting the encoding explicitly in the POM will always
give you a reproducible build, no matter where this discussion ends. This
poll is "merely" about whether this reproducibility comes out-of-the-box or
requires explicit user configuration. Also, out-of-the-box reproducibility
here does not mean that all builds will work with our proposed default value
of Latin-1; users will likely want to override this value for their
projects. But the major point is that it will work for everybody on the
project or for nobody, no more "works (just) for me".


Benjamin




Re: [POLL] Default Value for File Encoding

Posted by Milos Kleint <mk...@gmail.com>.
Definitely a)
I can't help it, I'm a backward-compatibility guy.

Just a note: both solutions allow one to have reproducible builds if
one cares. Benjamin and Herve (and others) have done a great job of
making sure that when you set the encoding for the project it gets
applied consistently across plugins.
However, option b) can potentially break existing builds that relied on
the existing behaviour.

Milos

On Tue, Apr 29, 2008 at 1:37 PM, Jörg Schaible
<Jo...@elsag-solutions.com> wrote:
>
>  Definitely b)
>
>  Reproducible builds are an absolute requirement for a build tool.
>
>
>
>  Benjamin Bentmann wrote:
>  > Dear community,
>  >
>  > the Maven team is currently discussing a proposal about the
>  > future handling
>  > of source file encoding by the various plugins, please see
>  > our wiki article
>  > [0] for all details.
>  >
>  > A controversial aspect of this proposal is which file
>  > encoding should be
>  > assumed in case the user did not specify this in the POM.
>  > This poll should
>  > help us to come to a well-founded decision.
>  >
>  > These are the two possible directions to go:
>  >
>  > a) Use the current platform encoding, aka the system property
>  > "file.encoding".
>  >
>  > b) Use a static/fixed value that is defined by convention, i.e. is
>  > not    platform-dependent.
>  >
>  > Approach a) matches the current behavior of most plugins and
>  > is as such
>  > backwards-compatible. Approach b) on the other hand can
>  > potentially break
>  > builds when users update to a newer version of an affected plugin if:
>  > - the build relies on an encoding other than ASCII/Latin-1 and
>  > - this encoding is not explicitly stated in the plugin configuration
>  >
>  > The reason why b) was suggested is its positive effect on build
>  > reproducibility: Unlike approach a), a build will
>  > out-of-the-box deliver the
>  > same output for all team members regardless of their OS or
>  > locale. It is now
>  > to balance if this improvement is worth the potential breaks
>  > as illustrated
>  > above.
>  >
>  > So, please let us know:
>  >
>  > [a] Use platform default encoding, keep backward-compat
>  > [b] Use fixed default encoding, be platform-independent
>  >
>  > Regards,
>  >
>  >
>  > Benjamin Bentmann
>  >
>  >
>  > [0]
>  > http://docs.codehaus.org/display/MAVENUSER/POM+Element+for+Source+File+Encoding
>  >
>  >


RE: [POLL] Default Value for File Encoding

Posted by Jörg Schaible <Jo...@Elsag-Solutions.com>.
Definitely b)

Reproducible builds are an absolute requirement for a build tool.

Benjamin Bentmann wrote:
> Dear community,
> 
> the Maven team is currently discussing a proposal about the
> future handling
> of source file encoding by the various plugins, please see
> our wiki article
> [0] for all details.
> 
> A controversial aspect of this proposal is which file
> encoding should be
> assumed in case the user did not specify this in the POM.
> This poll should
> help us to come to a well-founded decision.
> 
> These are the two possible directions to go:
> 
> a) Use the current platform encoding, aka the system property   
> "file.encoding". 
> 
> b) Use a static/fixed value that is defined by convention, i.e. is
> not    platform-dependent. 
> 
> Approach a) matches the current behavior of most plugins and
> is as such
> backwards-compatible. Approach b) on the other hand can
> potentially break
> builds when users update to a newer version of an affected plugin if:
> - the build relies on an encoding other than ASCII/Latin-1 and
> - this encoding is not explicitly stated in the plugin configuration
> 
> The reason why b) was suggested is its positive effect on build
> reproducibility: Unlike approach a), a build will
> out-of-the-box deliver the
> same output for all team members regardless of their OS or
> locale. It is now
> to balance if this improvement is worth the potential breaks
> as illustrated
> above.
> 
> So, please let us know:
> 
> [a] Use platform default encoding, keep backward-compat
> [b] Use fixed default encoding, be platform-independent
> 
> Regards,
> 
> 
> Benjamin Bentmann
> 
> 
> [0]
> http://docs.codehaus.org/display/MAVENUSER/POM+Element+for+Source+File+Encoding
> 
> 


Plugin warning (was: Re: [POLL] Default Value for File Encoding)

Posted by Manos Batsis <ma...@geekologue.com>.
Benjamin Bentmann wrote:
> These are the two possible directions to go:
> 
> a) Use the current platform encoding, aka the system property
>   "file.encoding".
> 
> b) Use a static/fixed value that is defined by convention, i.e. is not
>   platform-dependent.


My vote is certainly b. However and IMHO, plugins that get the change 
should implement some API for Maven to be aware of their conformance and 
output a warning during the build in the case of a non-conforming plugin.

As an example to make up for my havent_had_coffee_yet english, this 
could be easily done by adding something like this in 2.1's AbstractMojo:

// Proposed default: a plugin counts as not encoding-aware unless it overrides this.
public boolean isSourceEncodingAware() {
	return false;
}


Maven could check all plugins at an early build stage and output a
warning for the non-conforming ones. Plugin authors would be
responsible for both overriding this and ensuring their code reads files
properly.
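
The early check could look roughly like the sketch below. It is plain Java
and deliberately does not use Maven's real classes: SketchMojo stands in for
the proposed AbstractMojo change, and EncodingAwarenessCheck, name() and the
warning text are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in for the proposed AbstractMojo addition; NOT the real Maven class.
abstract class SketchMojo {
    abstract String name();

    // Proposed default: non-conforming until a plugin explicitly overrides it.
    public boolean isSourceEncodingAware() {
        return false;
    }
}

// Hypothetical early-build check that collects one warning per unaware plugin.
final class EncodingAwarenessCheck {

    static List<String> warningsFor(List<SketchMojo> mojos) {
        List<String> warnings = new ArrayList<String>();
        for (SketchMojo mojo : mojos) {
            if (!mojo.isSourceEncodingAware()) {
                warnings.add("[WARNING] Plugin '" + mojo.name()
                        + "' does not declare source encoding support");
            }
        }
        return warnings;
    }

    // Convenience factory so the sketch can be exercised without real plugins.
    static SketchMojo mojo(final String name, final boolean aware) {
        return new SketchMojo() {
            String name() {
                return name;
            }

            @Override
            public boolean isSourceEncodingAware() {
                return aware;
            }
        };
    }

    public static void main(String[] args) {
        List<SketchMojo> mojos = Arrays.asList(
                mojo("legacy-plugin", false),
                mojo("updated-plugin", true));
        for (String warning : warningsFor(mojos)) {
            System.out.println(warning);
        }
    }
}
```

Defaulting to false means an unmodified plugin is reported, which is the
conservative choice: silence has to be earned by overriding the method.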

Cheers,

Manos





Re: [POLL] Default Value for File Encoding

Posted by Heinrich Nirschl <he...@gmail.com>.
>  b) Use a static/fixed value that is defined by convention, i.e. is not
>   platform-dependent.

I vote for b). We recently had an encoding problem when we built a
project that was developed on Windows on a Unix server. Fortunately,
it caused a syntax error so that it was detected early. I can imagine
cases where the encoding problem is just in a string. Chances are
high that such a bug will go undetected for a long time.
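
To illustrate the silent variant: a sketch, assuming the text was written as
UTF-8 and read back as Latin-1 (the actual encodings in our case may have
been different). The class and method names are invented for this example.
Nothing fails here; the string is simply wrong afterwards.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingMismatchDemo {

    // Encodes text with one charset and decodes the bytes with another,
    // which is what happens when writer and reader disagree on the encoding.
    static String misread(String text, Charset writtenAs, Charset readAs) {
        return new String(text.getBytes(writtenAs), readAs);
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file itself encoding-independent.
        String original = "Gr\u00fc\u00dfe";
        String garbled = misread(original,
                StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        // No exception is thrown; the content just differs.
        System.out.println("same text: " + original.equals(garbled));
        System.out.println(original.length() + " chars became " + garbled.length());
    }
}
```

A compiler only notices such a mismatch when the garbled bytes happen to
break the syntax; inside a string literal they pass straight through.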

Henry



Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
Benjamin Bentmann wrote:

> Unfortunately, the change for MJAVADOC-165 was already released, i.e. the
> Javadoc Plugin 2.4 uses Latin-1 as default encoding if its configuration
> does not specify otherwise. As a matter of consistency, this should be
> reverted to use platform encoding, however this would imply another
> breaking change. Given the apparent dilemma, I created MJAVADOC-185 where
> users can vote for this change if it should happen.

A similar case is the Maven JXR Plugin which uses Latin-1 as its default
encoding (since version 2.0). Votes to bring this in line with the outcome 
of this poll can be placed at
  http://jira.codehaus.org/browse/JXR-62


Benjamin




Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
Benjamin Bentmann wrote:

> A controversial aspect of this proposal is which file encoding should be
> assumed in case the user did not specify this in the POM. This poll should
> help us to come to a well-founded decision.
>
> These are the two possible directions to go:
>
> a) Use the current platform encoding, aka the system property
>   "file.encoding".
>
> b) Use a static/fixed value that is defined by convention, i.e. is not
>   platform-dependent.

More than 24 hours after the last response on this thread, I believe it's
fair to say the die is cast. The votes here on this thread and over on the
related wiki article sum up to almost 90% for approach a), which is a rather
clear outcome. I will update our proposal accordingly.

Unfortunately, the change for MJAVADOC-165 was already released, i.e. the
Javadoc Plugin 2.4 uses Latin-1 as default encoding if its configuration
does not specify otherwise. As a matter of consistency, this should be
reverted to use platform encoding, however this would imply another breaking
change. Given the apparent dilemma, I created MJAVADOC-185 where users can
vote for this change if it should happen.

I would like to thank everybody who participated in this poll and shared his
thoughts.


Benjamin Bentmann




Re: [POLL] Default Value for File Encoding

Posted by Rainer Pruy <Ra...@Acrys.COM>.

Benjamin Bentmann schrieb:
> Rainer Pruy wrote:
> 
>> This might be true for an all java world,
>> nevertheless, in case the maven default deviates from your platform one,
>> how does an editor know where to get the proper encoding for a given
>> file?
>> (It would be quite difficult to enrich *any* editor around with some
>> logic to default to "maven" encoding in case there is a pom along
>> the path. so, it might work for IDEs where all aspects are tightly
>> integrated..)
>>
>> (Personally, I would not like to be forced to dump good ol' vi (;-))
> 
> Surely, text editors shouldn't be aware of a Maven POM somewhere hanging
> around with an encoding setting burried in it, nor should people drop
> their favorite editors. I simply expect the user to tell both Maven and
> its text editor what the desired encoding is. I mean, when you work on a
> Maven project and its sources, you would like to edit a file with the
> same encoding as your colleagues do, don't you? So it's not about
> syncing your editor to Maven but syncing Maven to the convention of your
> team.

To be honest,
I usually run recode on check-out / check-in to ensure checked-in versions are consistent with the "standard" and checked-out ones
conform to my local environment.
This way my editor is doing the right thing.

(And I save some brain work figuring out which project is using which encoding at the moment. Just an aside: my world is not Maven-only,
and some projects won't change encoding just to get in sync with some tool.)

Thus I'm still with "force people to state the encoding explicitly and don't fiddle with default settings that won't solve the
problem in the first place".


> 
> 
> Benjamin
> 
> 

-- 
Rainer Pruy
Geschäftsführer

Acrys Consult GmbH & Co. KG
Untermainkai 29-30, D-60329 Frankfurt
Tel: +49-69-244506-0 - Fax: +49-69-244506-50
Web: http://www.acrys.com -  Email: office@acrys.com
Handelsregister: Frankfurt am Main, HRA 31151



Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
Rainer Pruy wrote:

> This might be true for an all java world,
> nevertheless, in case the maven default deviates from your platform one,
> how does an editor know where to get the proper encoding for a given file?
> (It would be quite difficult to enrich *any* editor around with some logic 
> to default to "maven" encoding in case there is a pom along
> the path. so, it might work for IDEs where all aspects are tightly 
> integrated..)
>
> (Personally, I would not like to be forced to dump good ol' vi (;-))

Surely, text editors shouldn't be aware of a Maven POM somewhere hanging
around with an encoding setting buried in it, nor should people drop their
favorite editors. I simply expect users to tell both Maven and their text
editor what the desired encoding is. I mean, when you work on a Maven
project and its sources, you would like to edit a file with the same
encoding as your colleagues do, wouldn't you? So it's not about syncing your
editor to Maven but syncing Maven to the convention of your team.


Benjamin 




Re: [POLL] Default Value for File Encoding

Posted by Rainer Pruy <Ra...@Acrys.COM>.
Hi Benjamin,

Benjamin Bentmann schrieb:
> Manos Batsis wrote:
> 
>> Having all files stick to a given (default) encoding will mean a
>> nightmare to all
>> platforms where such encoding is not the system one when it comes to
>> modifying or editing files.
> 
> I can't follow your arguments here. Proper text editors allow you to
> select the file encoding you save your files in, so the system default
> encoding should not matter.
> 

This might be true for an all java world,
nevertheless, in case the maven default deviates from your platform one,
how does an editor know where to get the proper encoding for a given file?
(It would be quite difficult to enrich *any* editor around with some logic to default to "maven" encoding in case there is a pom along
the path. so, it might work for IDEs where all aspects are tightly integrated..)

(Personally, I would not like to be forced to dump good ol' vi (;-))

>> we should deprecate any file operation that fails stating an explicit
>> encoding and this way encourage users to explicitly state the encoding
>> in use.
> 
> I'm not sure what you mean with "file operation".

here: reading from and writing to files

> 
> We have feature requests out for PMD and Checkstyle to detect usage of
> problematic IO APIs like java.io.FileReader and I know that already some
> work on these has been started.
> 
> As for the Maven plugins themselves and their file handling: We don't
> need to deprecate things here. Every plugin that reads/writes plain text
> files should offer an encoding parameter for the user to configure the
> correct file encoding. Work on extending unconfigurable plugins with
> such a parameter is in progress/scheduled.
> 

Sorry for not being precise enough.
I did not mean "deprecation" in the specific meaning of interface elements.

It was more about arranging for any "file operation" (see above) without an explicitly stated encoding to fail (OK, this might be too
tough, but a warning would be the minimum here).
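
As a rough sketch of the strict variant (the helper name ExplicitEncodingIO
is hypothetical, and failing hard rather than warning is just one of the two
options mentioned): a reader factory that refuses to open a file unless the
caller names an encoding.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical helper: every read must state its encoding.
final class ExplicitEncodingIO {

    // Unlike new FileReader(file), which silently uses the platform default,
    // this rejects the call before touching the file if no encoding is given.
    static Reader openReader(File file, String encoding) throws IOException {
        if (encoding == null || encoding.trim().isEmpty()) {
            throw new IllegalArgumentException(
                    "No encoding specified for reading " + file);
        }
        Charset charset = Charset.forName(encoding.trim());
        return new BufferedReader(
                new InputStreamReader(new FileInputStream(file), charset));
    }
}
```

Replacing the exception with a logged warning plus a fallback to
Charset.defaultCharset() would give the softer behaviour.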

>> c)  a) + discourage any use of files that do not state encoding
>> explicitly
> 
> I take this as a vote for a) with the intention to output a warning in
> case the encoding was not specified. Please correct me if I
> misunderstood you.
> 

You are right: As I stated at the very top of my message: +1 for a)


This poll is about the "default" that applies without any explicit setting.
a) is the least disturbing option for users and teams within a homogeneous environment.
Heterogeneous teams already face problems today and will surely come to configure the encoding quickly,
so a) remains the least disturbing choice overall.

Rainer
> 
> Benjamin
> 

-- 
Rainer Pruy
Managing Director

Acrys Consult GmbH & Co. KG
Untermainkai 29-30, D-60329 Frankfurt
Tel: +49-69-244506-0 - Fax: +49-69-244506-50
Web: http://www.acrys.com -  Email: office@acrys.com
Commercial register: Frankfurt am Main, HRA 31151



Re: [POLL] Default Value for File Encoding

Posted by Manos Batsis <ma...@geekologue.com>.
Benjamin Bentmann wrote:
> Manos Batsis wrote:
> 
>> I hate this! Someone finally agrees with me but in a misquoted email; I
>> never wrote that :-)
> 
> As I said, that was my fault of getting the reply header wrong, I apologize
> for this confusion. I didn't want to upset you Manos.

No prob, sorry if I sounded upset - I wasn't. Thanks for the thread!

Cheers,

Manos




Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
Manos Batsis wrote:

> I hate this! Someone finally agrees with me but in a misquoted email; I
> never wrote that :-)

As I said, that was my fault of getting the reply header wrong, I apologize
for this confusion. I didn't want to upset you Manos.


Benjamin




Re: [POLL] Default Value for File Encoding

Posted by Manos Batsis <ma...@geekologue.com>.
Roger Ye wrote:
> On 4/29/08, Benjamin Bentmann <be...@udo.edu> wrote:
>> Manos Batsis wrote:
>>
>>> Having all files stick to a given (default) encoding will mean a
>>> nightmare to all platforms where such encoding is not the system one
>>> when it comes to modifying or editing files.
>>>
>> I can't follow your arguments here. Proper text editors allow you to
>> select the file encoding you save your files in, so the system default
>> encoding should not matter.
> 
> 
> no offense, but this is your problem  for not being able to follow Manos's
> arguments here,


I hate this! Someone finally agrees with me but in a misquoted email; I 
never wrote that :-)


Sorry, I voted for b) plus warnings for plugins that have no clue, and also
made a rough proposal on how this could work.

Cheers,

Manos



Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
Roger Ye wrote:

> No offense, I bet you're an American and never read the joke which 
> involves trilingual, bilingual and American

I am from Germany, not sure how close that counts to being American ;-) 
Anyway, you're right, I can't remember the joke you referred to.

> please consider what if in Linux you've set LC_ALL=en_US.UTF-8, but in 
> your system no application respects this system wide setting.

I just mean that "system wide" is quite coarse-grained. Have you never needed
to change this setting on a per-application basis?
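
On the Java side the default is a per-process matter: each JVM determines
its own default charset at startup. The following is only a minimal sketch
for inspecting what a given JVM ended up with; the class name
PlatformEncodingInfo and the method describe are made up for this example.
The default is commonly influenced per JVM by passing the file.encoding
system property on the command line, but how that is honored differs between
JDK versions, so treat it as an assumption to verify on your setup.

```java
import java.nio.charset.Charset;

public class PlatformEncodingInfo {

    // Hypothetical helper: reports the two values that describe the
    // encoding this particular JVM process falls back to.
    public static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // Run this with different locale or JVM settings to compare results.
        System.out.println(describe());
    }
}
```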


Benjamin 




Re: [POLL] Default Value for File Encoding

Posted by Roger Ye <ro...@gmail.com>.
Hey,

No offense, I bet you're an American and never read the joke which involves
trilingual, bilingual and American

On 4/29/08, Benjamin Bentmann <be...@udo.edu> wrote:
>
> Manos Batsis wrote:
>
> > Having all files stick to a given (default) encoding will mean a
> > nightmare to all platforms where such encoding is not the system one
> > when it comes to modifying or editing files.
> >
>
> I can't follow your arguments here. Proper text editors allow you to
> select the file encoding you save your files in, so the system default
> encoding should not matter.


No offense, but not being able to follow Manos's arguments here is your
problem.
Please consider what happens if on Linux you've set LC_ALL=en_US.UTF-8, but
no application on your system respects this system-wide setting.

> > we should deprecate any file operation that fails stating an explicit
> > encoding and this way encourage users to explicitly state the encoding in
> > use.
> >
>
> I'm not sure what you mean with "file operation".


Easy: file reading and file writing, i.e. file I/O. Please consider such
APIs.

> We have feature requests out for PMD and Checkstyle to detect usage of
> problematic IO APIs like java.io.FileReader and I know that already some
> work on these has been started.
>
> As for the Maven plugins themselves and their file handling: We don't need
> to deprecate things here. Every plugin that reads/writes plain text files
> should offer an encoding parameter for the user to configure the correct
> file encoding. Work on extending unconfigurable plugins with such a
> parameter is in progress/scheduled.
>
> > c)  a) + discourage any use of files that do not state encoding
> > explicitly
> >
>
> I take this as a vote for a) with the intention to output a warning in
> case the encoding was not specified. Please correct me if I misunderstood
> you.


Maybe an INFO is better for you, but if your maven powered project has
developers from all over the world, you'll understand a warning is rather
important.

> Benjamin
>

Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
> Manos Batsis wrote:

This should have been "Rainer Pruy", I'm sorry.


Benjamin



Re: [POLL] Default Value for File Encoding

Posted by Benjamin Bentmann <be...@udo.edu>.
Manos Batsis wrote:

> Having all files stick to a given (default) encoding will mean a nightmare
> to all platforms where such encoding is not the system one when it comes to
> modifying or editing files.

I can't follow your arguments here. Proper text editors allow you to select 
the file encoding you save your files in, so the system default encoding 
should not matter.

> we should deprecate any file operation that fails stating an explicit 
> encoding and this way encourage users to explicitly state the encoding in 
> use.

I'm not sure what you mean with "file operation".

We have feature requests out for PMD and Checkstyle to detect usage of 
problematic IO APIs like java.io.FileReader and I know that already some 
work on these has been started.
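
To illustrate why an API like java.io.FileReader is considered problematic:
its classic constructors decode with the platform's default charset, so the
same file can read differently on two machines. The following is only a
minimal sketch, assuming a JDK 7 or later; the class name
ExplicitEncodingRead and the helper readAll are made up for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ExplicitEncodingRead {

    // Hypothetical helper: decode the file with the charset the caller
    // names, never with the platform default.
    public static String readAll(Path file, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), charset))) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("encoding-demo", ".txt");
        try {
            // The Unicode escape keeps this source file pure ASCII.
            Files.write(file, "f\u00fcr".getBytes(StandardCharsets.UTF_8));
            // Same bytes, two different decodings: only the first one
            // matches the charset the file was written in.
            System.out.println(readAll(file, StandardCharsets.UTF_8));
            System.out.println(readAll(file, StandardCharsets.ISO_8859_1));
        } finally {
            Files.delete(file);
        }
    }
}
```

Reading with the wrong charset does not fail; it silently yields different
characters, which is why such calls are hard to spot without tool support.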

As for the Maven plugins themselves and their file handling: We don't need 
to deprecate things here. Every plugin that reads/writes plain text files 
should offer an encoding parameter for the user to configure the correct 
file encoding. Work on extending unconfigurable plugins with such a 
parameter is in progress/scheduled.

> c)  a) + discourage any use of files that do not state encoding explicitly

I take this as a vote for a) with the intention to output a warning in case 
the encoding was not specified. Please correct me if I misunderstood you.
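
Reading a) plus a warning this way, the behavior could look roughly like the
sketch below. It is not taken from any existing plugin: the class
EncodingResolver, its methods and the wording of the warning are all
assumptions made for illustration. An explicitly configured encoding wins;
otherwise the platform default is used and a warning is recorded.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

public class EncodingResolver {

    // Warnings are collected instead of printed so callers decide how to
    // report them (a real plugin would presumably use its logger).
    private final List<String> warnings = new ArrayList<>();

    // Hypothetical resolution rule: explicit value first, platform
    // default plus a warning otherwise.
    public Charset resolve(String configuredEncoding) {
        if (configuredEncoding == null || configuredEncoding.trim().isEmpty()) {
            Charset fallback = Charset.defaultCharset();
            warnings.add("No file encoding configured, falling back to platform encoding "
                    + fallback.name() + "; the build is platform dependent.");
            return fallback;
        }
        // Throws an unchecked exception for illegal or unsupported names.
        return Charset.forName(configuredEncoding.trim());
    }

    public List<String> getWarnings() {
        return warnings;
    }

    public static void main(String[] args) {
        EncodingResolver resolver = new EncodingResolver();
        System.out.println(resolver.resolve("UTF-8"));
        System.out.println(resolver.resolve(null));
        for (String warning : resolver.getWarnings()) {
            System.out.println("[WARNING] " + warning);
        }
    }
}
```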


Benjamin 




Re: [POLL] Default Value for File Encoding

Posted by Rainer Pruy <Ra...@Acrys.COM>.
+1 for a)

even if b) does promise reproducible builds. Having all files stick to a given (default) encoding will mean a nightmare to all
platforms where such encoding is not the system one when it comes to modifying or editing files.

Thus, in addition to a) (allowing files to stick to whatever encoding the local system lives in),
we should deprecate any file operation that fails to state an explicit encoding and in this way encourage users to explicitly state the
encoding in use.

Actually, I'm in favour of

c)  a) + discourage any use of files that do not state encoding explicitly (probably pom 4.0.1 could require stating an explicit
encoding?)


This way backward compatibility is achieved for "old" projects.
Any new ones may use any encoding appropriate for local use (and this may change from version to version).
But on the other hand correct interpretation is ensured as there will be no doubt what encoding to use for interpreting files.

(An <encoding> element at the POM level will probably suffice for a start.)
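
Why the stated encoding matters for the files a build writes can be shown
with a small sketch. It only illustrates the general point and is not part
of the proposal; the class name EncodedOutputDiffers and the method encode
are made up. Pure ASCII text yields identical bytes in UTF-8 and ISO-8859-1,
while text with a non-ASCII character does not.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodedOutputDiffers {

    // Hypothetical helper: the bytes a build would write for this text
    // under the given charset.
    public static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        String ascii = "cafe";
        // Unicode escape for e-acute keeps this source file pure ASCII.
        String accented = "caf\u00e9";

        // ASCII-only text: both charsets produce the same byte sequence.
        System.out.println(Arrays.equals(
                encode(ascii, StandardCharsets.UTF_8),
                encode(ascii, StandardCharsets.ISO_8859_1)));

        // Non-ASCII text: the byte sequences differ, so the written file
        // depends on which charset was in effect.
        System.out.println(Arrays.toString(encode(accented, StandardCharsets.UTF_8)));
        System.out.println(Arrays.toString(encode(accented, StandardCharsets.ISO_8859_1)));
    }
}
```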

Rainer


Benjamin Bentmann wrote:
> Dear community,
> 
> the Maven team is currently discussing a proposal about the future handling
> of source file encoding by the various plugins, please see our wiki article
> [0] for all details.
> 
> A controversial aspect of this proposal is which file encoding should be
> assumed in case the user did not specify this in the POM. This poll should
> help us to come to a well-founded decision.
> 
> These are the two possible directions to go:
> 
> a) Use the current platform encoding, aka the system property
>   "file.encoding".
> 
> b) Use a static/fixed value that is defined by convention, i.e. is not
>   platform-dependent.
> 
> Approach a) matches the current behavior of most plugins and is as such
> backwards-compatible. Approach b) on the other hand can potentially break
> builds when users update to a newer version of an affected plugin if:
> - the build relies on an encoding other than ASCII/Latin-1 and
> - this encoding is not explicitly stated in the plugin configuration
> 
> The reason why b) was suggested is its positive effect on build
> reproducibility: Unlike approach a), a build will out-of-the-box deliver the
> same output for all team members regardless of their OS or locale. It is now
> to balance if this improvement is worth the potential breaks as illustrated
> above.
> 
> So, please let us know:
> 
> [a] Use platform default encoding, keep backward-compat
> [b] Use fixed default encoding, be platform-independent
> 
> Regards,
> 
> 
> Benjamin Bentmann
> 
> 
> [0]
> http://docs.codehaus.org/display/MAVENUSER/POM+Element+for+Source+File+Encoding
> 
> 
> 



Re: [POLL] Default Value for File Encoding

Posted by Paul MERLIN <pa...@nosphere.org>.
+1 for b)

Reproducible builds are _the_ shit.

As for backward compatibility, I second Nicolas on reading release notes,
upgrade guides, etc.


On Tuesday 29 April 2008 13:23:44, Benjamin Bentmann wrote:
> Dear community,
>
> the Maven team is currently discussing a proposal about the future handling
> of source file encoding by the various plugins, please see our wiki article
> [0] for all details.
>
> A controversial aspect of this proposal is which file encoding should be
> assumed in case the user did not specify this in the POM. This poll should
> help us to come to a well-founded decision.
>
> These are the two possible directions to go:
>
> a) Use the current platform encoding, aka the system property
>    "file.encoding".
>
> b) Use a static/fixed value that is defined by convention, i.e. is not
>    platform-dependent.
>
> Approach a) matches the current behavior of most plugins and is as such
> backwards-compatible. Approach b) on the other hand can potentially break
> builds when users update to a newer version of an affected plugin if:
> - the build relies on an encoding other than ASCII/Latin-1 and
> - this encoding is not explicitly stated in the plugin configuration
>
> The reason why b) was suggested is its positive effect on build
> reproducibility: Unlike approach a), a build will out-of-the-box deliver
> the same output for all team members regardless of their OS or locale. It
> is now to balance if this improvement is worth the potential breaks as
> illustrated above.
>
> So, please let us know:
>
> [a] Use platform default encoding, keep backward-compat
> [b] Use fixed default encoding, be platform-independent
>
> Regards,
>
>
> Benjamin Bentmann
>
>
> [0]
> http://docs.codehaus.org/display/MAVENUSER/POM+Element+for+Source+File+Encoding
>
>


