Posted to dev@subversion.apache.org by Marcin Kasperski <Ma...@softax.com.pl> on 2004/02/19 14:25:28 UTC
Client properties, checkout/update hook, encoding...
It seems my remarks about the Subversion book went to this list,
so let's continue with more general thoughts. As a short intro: I
currently use and co-administer a CVS repository (I am the person
who got my organization to adopt CVS 4 or 5 years ago), and I am
now watching Subversion and slowly considering whether it would
be a good move for us...
Let me first describe the problem we currently have. I am Polish,
which means that the texts I write contain some national
characters. Those characters are defined in the iso-8859-2
encoding and are used that way on Unix, Linux, VMS and Mac
platforms. But our friends from Microsoft decided to create and
use the so-called win-1250 encoding, where those national
characters are placed somewhere else. Now, we are developing
some cross-platform libraries and software, co-developed by
people working on Windows and Linux/Unix/VMS. And this results in
a mess: whoever writes a comment, readme, whatever, uses his or
her natural character encoding, and people working on the other
platform see strange characters instead. We routinely convert
everything to iso-8859-2, but this means Windows people always
see it wrong.
BTW: this problem has some similarity to the famous LF vs CR/LF
issue.
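To make the mismatch concrete, here is a small Python sketch (an illustration only, not something from the setup described above). It relies on Python's standard `iso-8859-2` and `cp1250` codecs, `cp1250` being Python's name for the Windows encoding called win-1250 here. The sample sentence and the `misread` helper are made up for the example.

```python
# A Polish sample sentence containing every lowercase national letter.
sample = "zażółć gęślą jaźń"

latin2_bytes = sample.encode("iso-8859-2")  # as saved on Unix/Linux
cp1250_bytes = sample.encode("cp1250")      # as saved on Windows


def misread(data: bytes, assumed: str) -> str:
    """Decode bytes under a (possibly wrong) assumed encoding."""
    return data.decode(assumed, errors="replace")


# What a Windows user sees when opening the Unix-encoded file.
seen_on_windows = misread(latin2_bytes, "cp1250")

# Only the letters whose byte value differs between the two encodings
# get garbled; the rest happen to share the same position.
differing = sorted(
    ch for ch in set(sample)
    if ch.encode("iso-8859-2") != ch.encode("cp1250")
)

print(seen_on_windows)
print(differing)
```

Only a few of the letters actually move between the two code pages, which is why such files look "almost right" on the other platform and the damage is easy to overlook.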
I think I see a fairly natural solution which Subversion could
implement to help solve this problem. What I need is the ability
for a pre-commit hook to change the committed text, an additional
checkout/update hook, and the ability to bind some kind of
property to the client (sandbox) and make it available to the
hooks. I imagine it like this:
a) For every textual file I define some kind of property (say
'natural-encoding') which tells what the natural (repository)
encoding of the file should be. Maybe some commit hook verifies
whether this is set, but this is not so important. Maybe the
natural encoding is always UTF-8 and the property is not needed.
b) In the pre-commit hook I convert the file between encodings if
the client encoding differs from the natural encoding of the
file (of course only for files which have the property
activating the whole mechanism; it is not a good idea to do this
for Word docs). This needs two Subversion features: the hook
needs to know which encoding the sandbox is using (some kind of
sandbox property forwarded to the server while committing), and
the pre-commit hook needs the ability to modify the changes
being committed.
c) Similarly, some update/checkout hook would convert the
opposite way. This needs such a hook to exist at all, to be
given the client property, and to be able to influence the file
body.
This way it seems possible for each sandbox to use its natural
character encoding, in a way similar to using its own end-of-line
marker.
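Steps (a) to (c) can be sketched in Python as follows. This is only a model of the proposed behaviour: Subversion defines no 'natural-encoding' property, and the `on_commit` and `on_update` functions and their parameters are hypothetical names, not real hook interfaces.

```python
from typing import Mapping

# Property name taken from the proposal; Subversion defines no such property.
NATURAL_ENCODING_PROP = "natural-encoding"


def convert(data: bytes, source: str, target: str) -> bytes:
    """Re-encode data from source to target; a no-op when they match."""
    if source.lower() == target.lower():
        return data
    return data.decode(source).encode(target)


def on_commit(data: bytes, props: Mapping[str, str],
              sandbox_encoding: str) -> bytes:
    """Step (b): sandbox encoding -> natural (repository) encoding."""
    natural = props.get(NATURAL_ENCODING_PROP)
    if natural is None:
        return data  # unmarked file (e.g. a Word doc): bytes stay untouched
    return convert(data, sandbox_encoding, natural)


def on_update(data: bytes, props: Mapping[str, str],
              sandbox_encoding: str) -> bytes:
    """Step (c): natural (repository) encoding -> sandbox encoding."""
    natural = props.get(NATURAL_ENCODING_PROP)
    if natural is None:
        return data
    return convert(data, natural, sandbox_encoding)


# A Windows sandbox commits a file whose natural encoding is iso-8859-2.
props = {NATURAL_ENCODING_PROP: "iso-8859-2"}
windows_file = "źródło".encode("cp1250")
stored = on_commit(windows_file, props, "cp1250")
restored = on_update(stored, props, "cp1250")
```

Note that `convert` decodes strictly, so a file containing bytes that are invalid in the declared sandbox encoding raises `UnicodeDecodeError` instead of being silently damaged; whether a real hook should reject such a commit is a design choice left open here.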
What do you think about such an idea? Or is there something else?
By the way: I think that sandbox properties, a checkout/update
hook and data modification in hooks could have more uses than
character conversion. As a quick example for the first two, a
sandbox marked with an 'official build' property could be allowed
to check out only from the tags directory...
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Re: Client properties, checkout/update hook, encoding...
Posted by Ben Collins-Sussman <su...@collab.net>.
On Thu, 2004-02-19 at 09:05, Tobias Ringström wrote:
> No it doesn't. The commit log messages are handled that way, but not the
> file contents.
Correct. The repository stores all paths and commit logs in UTF8, but
doesn't *ever* change file contents. The repository treats file
contents as a pure bytestream. The only thing that ever changes file
contents is the working copy, which might do EOL translation or keyword
substitution.
It sounds like Marcin is asking for a 3rd type of working-copy
translation, one which does charset translation.
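As a toy model of that idea (not Subversion's actual code), the Python sketch below keeps the stored bytes fixed and applies both EOL and charset translation only on the way into and out of the working copy. The function names, the parameters, and the choice of UTF-8 with LF line endings as the stored form are assumptions made for the illustration.

```python
def to_working_copy(data: bytes, repo_charset: str,
                    client_charset: str, native_eol: str) -> bytes:
    """Translate stored bytes into the client's charset and line endings."""
    text = data.decode(repo_charset)
    return text.replace("\n", native_eol).encode(client_charset)


def to_repository(data: bytes, repo_charset: str,
                  client_charset: str, native_eol: str) -> bytes:
    """Undo the working-copy translation before the bytes are sent back."""
    text = data.decode(client_charset)
    return text.replace(native_eol, "\n").encode(repo_charset)


# The repository side only ever sees this byte string.
repo_bytes = "gęś\nżółw\n".encode("utf-8")

# A Windows client gets CR/LF line endings and cp1250 characters.
wc_bytes = to_working_copy(repo_bytes, "utf-8", "cp1250", "\r\n")

# Committing an unmodified file must reproduce the stored bytes exactly.
round_trip = to_repository(wc_bytes, "utf-8", "cp1250", "\r\n")
```

The round trip is only lossless when every character in the file exists in the client's charset; a character that cannot be encoded makes `encode` raise `UnicodeEncodeError`, which is one of the cases a real implementation would have to decide how to handle.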
Re: Client properties, checkout/update hook, encoding...
Posted by Tobias Ringström <to...@ringstrom.mine.nu>.
Francois Beausoleil wrote:
>Hi !
>
>Subversion already does everything you wrote about. Without needing pre
>commit hooks or anything.
>
>The files on the server are always encoded as UTF-8, and are transported
>this way on the wire. When the WC is updated, Subversion decodes the
>UTF-8 and translates it to the currently selected platform encoding. The
>reverse is done when the file is committed.
>
>
No it doesn't. The commit log messages are handled that way, but not the
file contents.
/Tobias
Re: Client properties, checkout/update hook, encoding...
Posted by Francois Beausoleil <fb...@users.sourceforge.net>.
Hi !
Subversion already does everything you wrote about. Without needing pre
commit hooks or anything.
The files on the server are always encoded as UTF-8, and are transported
this way on the wire. When the WC is updated, Subversion decodes the
UTF-8 and translates it to the currently selected platform encoding. The
reverse is done when the file is committed.
Hope that helps !
François
Developer of Java Gui Builder
http://jgb.sourceforge.net/