Posted to dev@subversion.apache.org by Dongsheng Song <do...@gmail.com> on 2010/11/13 16:18:08 UTC

[Proposed] Split very long messages by paragraph for easy translate

Hi folks,

subversion.pot has some very long translatable messages, for example:

Apply a patch to a working copy.\n
usage: patch PATCHFILE [WCPATH]\n
\n
 Apply a unidiff patch in PATCHFILE to the working copy WCPATH.\n
 If WCPATH is omitted, '.' is assumed.\n
\n
 A unidiff patch suitable for application to a working copy can be\n
 produced with the 'svn diff' command or third-party diffing tools.\n
 Any non-unidiff content of PATCHFILE is ignored.\n
\n
 Changes listed in the patch will either be applied or rejected.\n
 If a change does not match at its exact line offset, it may be applied\n
 earlier or later in the file if a match is found elsewhere for the\n
 surrounding lines of context provided by the patch.\n
 A change may also be applied with fuzz, which means that one\n
 or more lines of context are ignored when matching the change.\n
 If no matching context can be found for a change, the change conflicts\n
 and will be written to a reject file with the extension .svnpatch.rej.\n
\n
 For each patched file a line will be printed with characters reporting\n
 the action taken. These characters have the following meaning:\n
\n
   A  Added\n
   D  Deleted\n
   U  Updated\n
   C  Conflict\n
   G  Merged (with local uncommitted changes)\n
\n
 Changes applied with an offset or fuzz are reported on lines starting\n
 with the '>' symbol. You should review such changes carefully.\n
\n
 If the patch removes all content from a file, that file is scheduled\n
 for deletion. If the patch creates a new file, that file is scheduled\n
 for addition. Use 'svn revert' to undo deletions and additions you\n
 do not agree with.\n

From the translator's point of view, this is very hard to translate and maintain.
So I propose that we split these long messages, as Mercurial does.

--
Dongsheng

Re: [Proposed] Split very long messages by paragraph for easy translate

Posted by Julian Foad <ju...@wandisco.com>.
On Sat, 2011-02-05, Stefan Sperling wrote:
> It might be more complex to generate and store multiple strings.
> But if this helps translators I think we should do it.
> We must find a nice way of splitting up help texts. Large help
> texts like this put too much burden on translators.

Looking at this concern from the outside, I have a question.  Do any of
the commonly available translation tools do a good job of assisting with
translating long strings?  I would expect that merely showing a diff of
the old and new English strings would be enough assistance in most
cases.  I can see how using a simple tool that doesn't even show a diff
would make the job very difficult.

- Julian



Re: [Proposed] Split very long messages by paragraph for easy translate

Posted by Stefan Sperling <st...@elego.de>.
On Fri, Feb 04, 2011 at 04:10:30PM +0800, Dongsheng Song wrote:
> On Fri, Feb 4, 2011 at 15:59, Dongsheng Song <do...@gmail.com> wrote:
> > On Sun, Nov 14, 2010 at 01:19, Greg Hudson <gh...@mit.edu> wrote:
> >> On Sat, 2010-11-13 at 10:31 -0500, Daniel Shahaf wrote:
> >>> Sounds reasonable.
> >>>
> >>> What changes to the source code would be required?
> >>>
> >>> Do we just change
> >>>       N_("three\n\nparagraphs\n\nhere\n")
> >>> to
> >>>       N_("three\n") N_("paragraphs\n") N_("here\n")
> >>
> >> No, that would just result in evaluating gettext on the combined string,
> >> same as before.  I can see two options for help strings in particular:
> >>
> >> 1. Rev svn_opt_subcommand_desc2_t to include an array of help strings
> >> which are translated and displayed in sequence.
> >>
> >> 2. Change print_command_info2 to look at the help string and break it up
> >> at certain boundaries (such as blank lines or numbered list entries)
> >> before translating it.
> >>
> >> (Mercurial is written in Python, so it has different constraints.)
> >>
> >
> > Changing svn_opt_subcommand_desc2_t.help from 'const char *' to 'const
> > char **' may break the ABI,
> > so I don't think we can do it in the near future.

We won't change svn_opt_subcommand_desc2_t.
Instead, we would introduce a new svn_opt_subcommand_desc3_t.

> >
> > Changing print_command_info2 has similar issues, and it's more
> > complex to generate and store strings for translation.

There would be a new print_command_info3.
It might be more complex to generate and store multiple strings.
But if this helps translators I think we should do it.
We must find a nice way of splitting up help texts. Large help
texts like this put too much burden on translators.

> > I have another approach: introduce a new function to concatenate many
> > strings, e.g.
> > const char * svn_opt__string_concat(apr_pool_t *pool, const char *s1, ...)
> >
> > The last argument should be NULL to mark the end of the list.
> > Then the very long messages:
> > N_("para1" "para2" "para3" "..."  "paraN")
> >
> > can be changed to:
> > svn_opt__string_concat(pool, N_("para1"), N_("para2"), N_("para3"), ...,
> > N_("paraN"), NULL)
> >
> > Why am I reviving this thread? Because while translating today, I found a
> > message that is 255 lines long (svn/main.c:676).
> > It's super terrible! I cannot imagine the maintenance work when
> > only one line changes!
> >
> > Any comments?
> >
> > --
> > Dongsheng Song
> >
> 
> Oops, apr_pstrcat can concatenate multiple strings, so there is no need for
> svn_opt__string_concat.
> http://apr.apache.org/docs/apr/1.4/group__apr__strings.html#g7bd80c95ffb7b3f96bc78e7b5b5b0045

Unfortunately, the help text is part of the definition of a static array.
So at the moment we cannot use apr_pstrcat() to split up the help text.
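
For illustration, here is a minimal sketch of how a static table could carry
the help text as separately marked paragraphs that are only looked up and
joined at run time. Everything in it is an assumption made for the example:
example_desc_t is a hypothetical stand-in for a future
svn_opt_subcommand_desc3_t (the real struct has more fields), MAX_PARAS is
arbitrary, and no_translation() stands in for a gettext lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N_(x) x        /* assumed: the usual no-op gettext marker */
#define MAX_PARAS 8    /* hypothetical limit, chosen for the example */

/* Hypothetical stand-in for a future svn_opt_subcommand_desc3_t. */
typedef struct example_desc_t {
  const char *name;
  const char *help[MAX_PARAS];   /* NULL-terminated list of paragraphs */
} example_desc_t;

/* Every initializer is a constant expression, so this remains a valid
   static array; each paragraph is its own marked string literal. */
static const example_desc_t example_table[] = {
  { "patch",
    { N_("Apply a patch to a working copy.\n"
         "usage: patch PATCHFILE [WCPATH]\n"),
      N_(" Apply a unidiff patch in PATCHFILE to the working copy WCPATH.\n"
         " If WCPATH is omitted, '.' is assumed.\n"),
      NULL } },
};

typedef const char *(*translate_fn)(const char *msgid);

/* Stand-in for a run-time gettext lookup: returns MSGID unchanged. */
static const char *no_translation(const char *msgid)
{
  return msgid;
}

/* Translate each paragraph with TR and join the results in BUF, putting a
   blank line between paragraphs. Returns the number of paragraphs copied;
   stops early rather than overflow BUF. */
static size_t join_help(const example_desc_t *desc, translate_fn tr,
                        char *buf, size_t size)
{
  size_t i, used = 0;

  assert(desc != NULL && tr != NULL && buf != NULL && size > 0);
  buf[0] = '\0';
  for (i = 0; i < MAX_PARAS && desc->help[i] != NULL; i++)
    {
      const char *text = tr(desc->help[i]);
      size_t len = strlen(text);
      size_t sep = (i > 0) ? 1 : 0;

      if (used + sep + len + 1 > size)
        break;
      if (sep)
        buf[used++] = '\n';
      memcpy(buf + used, text, len + 1);   /* copies the NUL as well */
      used += len;
    }
  return i;
}
```

With this layout the table stays static, and the only run-time work is the
per-paragraph lookup and the join when the help is printed.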

Stefan

Re: [Proposed] Split very long messages by paragraph for easy translate

Posted by Dongsheng Song <do...@gmail.com>.
On Fri, Feb 4, 2011 at 15:59, Dongsheng Song <do...@gmail.com> wrote:
> On Sun, Nov 14, 2010 at 01:19, Greg Hudson <gh...@mit.edu> wrote:
>> On Sat, 2010-11-13 at 10:31 -0500, Daniel Shahaf wrote:
>>> Sounds reasonable.
>>>
>>> What changes to the source code would be required?
>>>
>>> Do we just change
>>>       N_("three\n\nparagraphs\n\nhere\n")
>>> to
>>>       N_("three\n") N_("paragraphs\n") N_("here\n")
>>
>> No, that would just result in evaluating gettext on the combined string,
>> same as before.  I can see two options for help strings in particular:
>>
>> 1. Rev svn_opt_subcommand_desc2_t to include an array of help strings
>> which are translated and displayed in sequence.
>>
>> 2. Change print_command_info2 to look at the help string and break it up
>> at certain boundaries (such as blank lines or numbered list entries)
>> before translating it.
>>
>> (Mercurial is written in Python, so it has different constraints.)
>>
>
> Changing svn_opt_subcommand_desc2_t.help from 'const char *' to 'const
> char **' may break the ABI,
> so I don't think we can do it in the near future.
>
> Changing print_command_info2 has similar issues, and it's more
> complex to generate and store strings for translation.
>
> I have another approach: introduce a new function to concatenate many
> strings, e.g.
> const char * svn_opt__string_concat(apr_pool_t *pool, const char *s1, ...)
>
> The last argument should be NULL to mark the end of the list.
> Then the very long messages:
> N_("para1" "para2" "para3" "..."  "paraN")
>
> can be changed to:
> svn_opt__string_concat(pool, N_("para1"), N_("para2"), N_("para3"), ...,
> N_("paraN"), NULL)
>
> Why am I reviving this thread? Because while translating today, I found a
> message that is 255 lines long (svn/main.c:676).
> It's super terrible! I cannot imagine the maintenance work when
> only one line changes!
>
> Any comments?
>
> --
> Dongsheng Song
>

Oops, apr_pstrcat can concatenate multiple strings, so there is no need for
svn_opt__string_concat.
http://apr.apache.org/docs/apr/1.4/group__apr__strings.html#g7bd80c95ffb7b3f96bc78e7b5b5b0045

Re: [Proposed] Split very long messages by paragraph for easy translate

Posted by Dongsheng Song <do...@gmail.com>.
On Sun, Nov 14, 2010 at 01:19, Greg Hudson <gh...@mit.edu> wrote:
> On Sat, 2010-11-13 at 10:31 -0500, Daniel Shahaf wrote:
>> Sounds reasonable.
>>
>> What changes to the source code would be required?
>>
>> Do we just change
>>       N_("three\n\nparagraphs\n\nhere\n")
>> to
>>       N_("three\n") N_("paragraphs\n") N_("here\n")
>
> No, that would just result in evaluating gettext on the combined string,
> same as before.  I can see two options for help strings in particular:
>
> 1. Rev svn_opt_subcommand_desc2_t to include an array of help strings
> which are translated and displayed in sequence.
>
> 2. Change print_command_info2 to look at the help string and break it up
> at certain boundaries (such as blank lines or numbered list entries)
> before translating it.
>
> (Mercurial is written in Python, so it has different constraints.)
>

Changing svn_opt_subcommand_desc2_t.help from 'const char *' to 'const
char **' may break the ABI,
so I don't think we can do it in the near future.

Changing print_command_info2 has similar issues, and it's more
complex to generate and store strings for translation.

I have another approach: introduce a new function to concatenate many
strings, e.g.
const char * svn_opt__string_concat(apr_pool_t *pool, const char *s1, ...)

The last argument should be NULL to mark the end of the list.
Then the very long messages:
N_("para1" "para2" "para3" "..."  "paraN")

can be changed to:
svn_opt__string_concat(pool, N_("para1"), N_("para2"), N_("para3"), ...,
N_("paraN"), NULL)
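
A rough sketch of such a NULL-terminated concatenation function follows. It
is only an illustration: string_concat() is a hypothetical name, and it uses
malloc() instead of an apr_pool_t so that the example is self-contained (the
caller frees the result). In real use each argument would presumably be the
result of a run-time translation lookup rather than a bare literal.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the proposed svn_opt__string_concat().
   Concatenates S1 and all following strings up to a terminating NULL.
   Returns a malloc'ed string, or NULL if allocation fails.
   Pass the terminator as (const char *)NULL, since a bare NULL is not
   guaranteed to have pointer type in a variadic call. */
static char *string_concat(const char *s1, ...)
{
  va_list ap;
  const char *s;
  size_t total = 0;
  char *result, *p;

  /* First pass: measure the total length. */
  va_start(ap, s1);
  for (s = s1; s != NULL; s = va_arg(ap, const char *))
    total += strlen(s);
  va_end(ap);

  result = malloc(total + 1);
  if (result == NULL)
    return NULL;

  /* Second pass: copy the pieces one after another. */
  p = result;
  va_start(ap, s1);
  for (s = s1; s != NULL; s = va_arg(ap, const char *))
    {
      size_t len = strlen(s);
      memcpy(p, s, len);
      p += len;
    }
  va_end(ap);

  assert(p == result + total);
  *p = '\0';
  return result;
}
```

Because this is a function call evaluated at run time, its result cannot be
used as an initializer inside a static table.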

Why am I reviving this thread? Because while translating today, I found a
message that is 255 lines long (svn/main.c:676).
It's super terrible! I cannot imagine the maintenance work when
only one line changes!

Any comments?

--
Dongsheng Song

Re: [Proposed] Split very long messages by paragraph for easy translate

Posted by Greg Hudson <gh...@MIT.EDU>.
On Sat, 2010-11-13 at 10:31 -0500, Daniel Shahaf wrote:
> Sounds reasonable.
> 
> What changes to the source code would be required?
> 
> Do we just change
> 	N_("three\n\nparagraphs\n\nhere\n")
> to
> 	N_("three\n") N_("paragraphs\n") N_("here\n")

No, that would just result in evaluating gettext on the combined string,
same as before.  I can see two options for help strings in particular:

1. Rev svn_opt_subcommand_desc2_t to include an array of help strings
which are translated and displayed in sequence.

2. Change print_command_info2 to look at the help string and break it up
at certain boundaries (such as blank lines or numbered list entries)
before translating it.
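
As a rough sketch of option 2, the helper below breaks a help string at blank
lines so that each paragraph could be looked up on its own. The name
split_paragraphs() and its interface are assumptions made for this example,
not existing Subversion code, and only the blank-line boundary is handled
(not numbered list entries). The message catalog would also have to contain
the paragraphs as separate msgids, which this sketch does not address.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the paragraphs of HELP into OUT, at most MAX
   of them, splitting at blank lines ("\n\n"). Each paragraph keeps its own
   trailing newline; the blank line itself is dropped and would be printed
   again by the caller. Returns the number of paragraphs stored; the caller
   frees each one. Text beyond MAX paragraphs is ignored. */
static size_t split_paragraphs(const char *help, char **out, size_t max)
{
  size_t n = 0;

  assert(help != NULL && out != NULL);
  while (*help != '\0' && n < max)
    {
      const char *sep = strstr(help, "\n\n");
      size_t len = sep ? (size_t)(sep - help) + 1 : strlen(help);
      char *para = malloc(len + 1);

      if (para == NULL)
        break;
      memcpy(para, help, len);
      para[len] = '\0';
      out[n++] = para;

      help += len;
      if (sep)
        help++;   /* skip the second '\n' of the separator */
    }
  return n;
}

/* A caller would then do something like this for each paragraph:
     fputs(gettext(paras[i]), stdout);
   printing an empty line between consecutive paragraphs. */
```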

(Mercurial is written in Python, so it has different constraints.)


Re: [Proposed] Split very long messages by paragraph for easy translate

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
Sounds reasonable.

What changes to the source code would be required?

Do we just change
	N_("three\n\nparagraphs\n\nhere\n")
to
	N_("three\n") N_("paragraphs\n") N_("here\n")
?
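
For reference, a small sketch of what the compiler makes of the second form,
assuming N_() is the usual no-op marker macro that expands to its argument.
The names marked_separately and is_single_string() are made up for the
example.

```c
#include <assert.h>
#include <string.h>

#define N_(x) x   /* assumed: marks a string for extraction, does nothing else */

/* After preprocessing these are three adjacent string literals, which the
   compiler joins into one string before any run-time lookup can happen. */
static const char marked_separately[] =
  N_("three\n") N_("paragraphs\n") N_("here\n");

/* Returns 1 if the array of SIZE bytes at S holds exactly one
   NUL-terminated string that fills the whole array. */
static int is_single_string(const char *s, size_t size)
{
  assert(s != NULL && size > 0);
  return strlen(s) + 1 == size;
}
```

So whatever later translates this text sees one combined string, not three
separate ones.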

Dongsheng Song wrote on Sat, Nov 13, 2010 at 23:18:08 +0800:
> Hi folks,
> 
> subversion.pot has some very long translatable messages, for example:
> 
> Apply a patch to a working copy.\n
> usage: patch PATCHFILE [WCPATH]\n
> \n
>  Apply a unidiff patch in PATCHFILE to the working copy WCPATH.\n
>  If WCPATH is omitted, '.' is assumed.\n
> \n
>  A unidiff patch suitable for application to a working copy can be\n
>  produced with the 'svn diff' command or third-party diffing tools.\n
>  Any non-unidiff content of PATCHFILE is ignored.\n
> \n
>  Changes listed in the patch will either be applied or rejected.\n
>  If a change does not match at its exact line offset, it may be applied\n
>  earlier or later in the file if a match is found elsewhere for the\n
>  surrounding lines of context provided by the patch.\n
>  A change may also be applied with fuzz, which means that one\n
>  or more lines of context are ignored when matching the change.\n
>  If no matching context can be found for a change, the change conflicts\n
>  and will be written to a reject file with the extension .svnpatch.rej.\n
> \n
>  For each patched file a line will be printed with characters reporting\n
>  the action taken. These characters have the following meaning:\n
> \n
>    A  Added\n
>    D  Deleted\n
>    U  Updated\n
>    C  Conflict\n
>    G  Merged (with local uncommitted changes)\n
> \n
>  Changes applied with an offset or fuzz are reported on lines starting\n
>  with the '>' symbol. You should review such changes carefully.\n
> \n
>  If the patch removes all content from a file, that file is scheduled\n
>  for deletion. If the patch creates a new file, that file is scheduled\n
>  for addition. Use 'svn revert' to undo deletions and additions you\n
>  do not agree with.\n
> 
> From the translator's point of view, this is very hard to translate and maintain.
> So I propose that we split these long messages, as Mercurial does.
> 
> --
> Dongsheng
