Posted to dev@uima.apache.org by Marshall Schor <ms...@schor.com> on 2006/11/11 17:03:22 UTC

line ends, Windows, SVN and Eclipse

Sorry for the long note, but this is important I think  :-)

There's a thread on the incubator site mentioning that changing one file 
in a web site is causing lots of other changes to other files.

People think it's caused by SVN not being set up properly to
control how line-end delimiters are treated between Windows and
Unix systems when new files are created or existing ones are edited.
Windows line ends are x'0D0A' (CR LF); Unix line ends are just x'0A' (LF).
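
As a rough illustration of the difference between the two conventions,
here is a minimal Python sketch that classifies the line endings in a
chunk of file content. The function name classify_line_endings is made
up for this note; it is not part of SVN, Eclipse, or UIMA.

```python
def classify_line_endings(data: bytes) -> str:
    """Return 'CRLF', 'LF', 'mixed', or 'none' for the given file content."""
    crlf = data.count(b"\r\n")
    # Every CR LF pair also contains an LF, so subtract to count the bare LFs.
    bare_lf = data.count(b"\n") - crlf
    if crlf and bare_lf:
        return "mixed"
    if crlf:
        return "CRLF"
    if bare_lf:
        return "LF"
    return "none"


if __name__ == "__main__":
    print(classify_line_endings(b"a\r\nb\r\n"))  # Windows-style content
    print(classify_line_endings(b"a\nb\n"))      # Unix-style content
```

A "mixed" result is the case that tends to produce the large spurious
diffs described above.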

Experiments with Eclipse show that editing an existing file preserves
its line endings. There's also an Eclipse command under
File -> convert line end delimiters to...
For new files, you can control what kinds of line ends are used.
    See: windows->prefs->general->workspace new text file line delimiter
    Using "default" for this makes .xml and .java files have 0D0A on my system.

The way to make SVN work, given that different users might be using
Windows or Unix conventions and different editors,
is to have SVN always store files in one format (Unix)
and translate on checkout and checkin. SVN supports this; see:
http://svnbook.red-bean.com/nightly/en/svn-book.html
(search for svn:eol-style)

To do this, we need to always set
svn:eol-style native on all the files that might be checked out and edited.

Apache gives this advice here:
http://www.apache.org/dev/version-control.html#https-svn, specifically
the last two paragraphs:

     Committers will need to properly configure their svn client.
     One particular issue is OS-specific line-endings for text files.
     When you add a new text file, especially when applying patches
     from Bugzilla, first ensure that the line-endings are appropriate
     for *your* system, then do ...

     svn add test.txt
     svn propset svn:eol-style native test.txt

     Your svn client can be configured to do that automatically
     for some common file types.
     Add the list <http://www.apache.org/dev/svn-eol-style.txt>
     to your ~/.subversion/config file.
     However, you should still pay attention to the messages
     from your svn client when you do 'svn commit'.

I can see my TortoiseSVN settings are not configured for setting
this on new files; I'm guessing the Eclipse SVN client is not
configured for it either, because this property is not set on our files
in the SVN repository.  (Eclipse has a default
configuration, but I don't know where that file is.  It does seem
possible to make the Eclipse config use the same TortoiseSVN config file.)

I searched uimaj-core/src/main/java for \u000d (carriage return) and found
2 files that have Windows line endings; the others have Unix style:

/uimaj-core/src/main/java/org/apache/uima/analysis_engine/impl/AnalysisEngineDescription_impl.java
/uimaj-core/src/main/java/org/apache/uima/resource/metadata/impl/MetaDataObject_impl.java
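
For anyone who wants to repeat this kind of search outside Eclipse, a
minimal Python sketch follows. The helper name files_with_crlf and the
default ".java" suffix are assumptions made for this example; the
directory is just the one mentioned above.

```python
import os


def files_with_crlf(root, suffix=".java"):
    """Return the sorted paths under root whose content contains CR LF."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            # Binary mode, so Python does no newline translation of its own.
            with open(path, "rb") as handle:
                if b"\r\n" in handle.read():
                    hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    for path in files_with_crlf("uimaj-core/src/main/java"):
        print(path)
```

Reading in binary mode matters here: in text mode Python would
normalize the line ends and hide exactly what we are looking for.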

The suggested fix is the following

(snipped from a note from Daniel Kulp:)
For tuscany/cxf/qpid, I kind of wrote a script that will go through 
everything and add a "proper" set of props:

http://svn.apache.org/repos/asf/incubator/tuscany/java/etc/set_svn_properties.sh

It sets svn:eol-style, svn:mime-type and svn:keywords "Rev Date" for a
whole set of files.
The keywords Rev and Date expand to the revision number and date of the
last revision in which the file changed.
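
To make the idea concrete, here is a minimal Python sketch of the same
approach: pick properties by file extension and build the matching
"svn propset" command lines. This is not the contents of the script
linked above; the table PROPS_BY_EXTENSION and the function
propset_commands are hypothetical, and the sketch only builds the
commands without running them.

```python
import os

# Hypothetical extension -> properties table, for illustration only.
PROPS_BY_EXTENSION = {
    ".java": {"svn:eol-style": "native", "svn:keywords": "Rev Date"},
    ".xml": {
        "svn:eol-style": "native",
        "svn:keywords": "Rev Date",
        "svn:mime-type": "text/xml",
    },
}


def propset_commands(path):
    """Build one 'svn propset NAME VALUE PATH' argument list per property."""
    _, ext = os.path.splitext(path)
    props = PROPS_BY_EXTENSION.get(ext.lower(), {})
    return [["svn", "propset", name, value, path]
            for name, value in sorted(props.items())]


if __name__ == "__main__":
    for command in propset_commands("src/Example.java"):
        print(command)
```

Files whose extension is not in the table get no commands at all, so
nothing is changed for file types nobody has looked at yet.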

I suggest we run this on our SVN tree from time to time - does this seem 
like a good idea?

Should we update our source files to include $Rev$ and $Date$ keyword tags?
That would make it easy to determine from extracted files what rev / date
they are. If we put this in as a Java static final String, we could also
tell the rev/date for components in jar files.

-Marshall


Noel J. Bergman wrote:
>> There is an issue with building the site from a windows platform that
>> causes the eols to get all mixed up.
>>     
>
> See http://www.apache.org/dev/version-control.html#https-svn, specifically
> the last two paragraphs, starting with "Committers will need to properly
> ..."  This is why Roy made his comment to Garrett expressing displeasure
> that SVN doesn't permit us to enforce it from the server side.
>
> 	--- Noel

(snipped from a note from Eric Johnson:)

I made a change this morning and also noticed that a lot of files I had
not touched were listed as modified. In looking at them the content was
unchanged.
I did have to run build -fix to get my checkins to work. There is an
issue with building the site from a windows platform that causes the
eols to get all mixed up... 




Re: line ends, Windows, SVN and Eclipse

Posted by Thilo Goetz <tw...@gmx.de>.
Marshall Schor wrote:
...
> How about making .txt files that should be treated as test-case-input 
> have some distinguishing type extension?
> That way, we could make use of subversion configuration settings, as 
> well as have shell scripts that we could
> run occasionally to fix things, based on the extension.

Well, to my mind, those files are called .txt because they *are* text 
files.  I don't think it's any easier to name files in a special way 
than to set the svn properties manually.  For aesthetic reasons, if 
nothing else, I'd vote for not renaming them.

--Thilo



Re: line ends, Windows, SVN and Eclipse

Posted by Marshall Schor <ms...@schor.com>.
Thilo Goetz wrote:
> We need to be careful and selective when setting this property.  It's 
> ok for source code, but anything else, we need to check.  For example, 
> we've had endless trouble with text files that we use as test case 
> input.  If eol-style is set to native on those, test cases behave 
> differently on Windows and Unix/Linux.
> So +1 to setting this property for all _text_ files, but -1 to setting 
> it to "native" indiscriminately.  For test case input (and possibly 
> others), the value should be set to "LF", as that's the most portable.
+1
>>
>>     Your svn client can be configured to do that automatically
>>     for some common file types.
>>     Add the list <http://www.apache.org/dev/svn-eol-style.txt>
>>     to your ~/.subversion/config file.
>>     However, you should still pay attention to the messages
>>     from your svn client when you do 'svn commit'.
>>
> See above.  This is overly general in our case, as we treat some of 
> our text files as binary.
How about making .txt files that should be treated as test-case-input
have some distinguishing type extension?  That way, we could make use of
subversion configuration settings, as well as have shell scripts that we
could run occasionally to fix things, based on the extension.

What would be a good extension to use?  .txt_unix_line_ends ?
> It's too bad that SVN has no "text encoding" property to set the code 
> page of text files.  That would be a very useful thing to check on 
> commit.
We're allowed to add our own properties to files in SVN.  If you think 
this would be useful, how about proposing property names and values for 
this?

-Marshall


Re: line ends, Windows, SVN and Eclipse

Posted by Thilo Goetz <tw...@gmx.de>.
Marshall Schor wrote:
...
> To do this, we need to always set
> svn:eol-style native on all the files that might be checked out and edited.

We need to be careful and selective when setting this property.  It's ok 
for source code, but anything else, we need to check.  For example, 
we've had endless trouble with text files that we use as test case 
input.  If eol-style is set to native on those, test cases behave 
differently on Windows and Unix/Linux.

So +1 to setting this property for all _text_ files, but -1 to setting 
it to "native" indiscriminately.  For test case input (and possibly 
others), the value should be set to "LF", as that's the most portable.
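
A small Python sketch of why this matters for test input: the same
text has a different length and different offsets once LF is
translated to CR LF, which is what a "native" checkout would do on
Windows. The helper to_crlf and the sample content are made up for
this example; they are not taken from our test cases.

```python
def to_crlf(data: bytes) -> bytes:
    """Simulate a Windows checkout of LF content under eol-style native."""
    # Normalize first so existing CR LF pairs are not doubled.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


LF_INPUT = b"first line\nsecond line\n"
CRLF_INPUT = to_crlf(LF_INPUT)

if __name__ == "__main__":
    # Total size differs by one byte per line.
    print(len(LF_INPUT), len(CRLF_INPUT))
    # The offset of the same token shifts as well.
    print(LF_INPUT.index(b"second"), CRLF_INPUT.index(b"second"))
```

Any test that compares expected offsets or lengths against such a file
will pass on one platform and fail on the other.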

> Apache gives this advice here:
> http://www.apache.org/dev/version-control.html#https-svn, specifically
> the last two paragraphs:
> 
>     Committers will need to properly configure their svn client.
>     One particular issue is OS-specific line-endings for text files.
>     When you add a new text file, especially when applying patches
>     from Bugzilla, first ensure that the line-endings are appropriate
>     for *your* system, then do ...
> 
>     svn add test.txt
>     svn propset svn:eol-style native test.txt
> 
>     Your svn client can be configured to do that automatically
>     for some common file types.
>     Add the list <http://www.apache.org/dev/svn-eol-style.txt>
>     to your ~/.subversion/config file.
>     However, you should still pay attention to the messages
>     from your svn client when you do 'svn commit'.
> 

See above.  This is overly general in our case, as we treat some of our 
text files as binary.

It's too bad that SVN has no "text encoding" property to set the code 
page of text files.  That would be a very useful thing to check on commit.
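
As a sketch of what such a check could look like, assuming the
declared encoding is available from somewhere (SVN itself defines no
such property, so where it comes from is left open here), a Python
function could simply try to decode the file content. The name
decodes_as is hypothetical.

```python
def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if data is valid in the given encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    utf8_bytes = b"G\xc3\xb6tz"   # "o umlaut" encoded as UTF-8
    latin1_bytes = b"G\xf6tz"     # the same letter encoded as ISO-8859-1
    print(decodes_as(utf8_bytes, "utf-8"))
    print(decodes_as(latin1_bytes, "utf-8"))
```

Note the limits of this kind of check: single-byte encodings such as
ISO-8859-1 accept any byte sequence, so it can only catch files that
are invalid in a stricter encoding like UTF-8.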

--Thilo